webfetch - an alternative to wget

webfetch is a command line tool to fetch files from a web site using HTTP. It can also submit POST requests (as used in HTML forms). It allows you to spoof the content of many HTTP header items, such as cookies, authentication and user agent. It also supports retrieval through SOCKS4 and HTTP proxy firewalls. On Windows platforms, it will use the proxy settings from the registry. In a nutshell, it's an alternative to wget that's better at some things, but not as good at others. You'll have to pick.

The program is divided into a core HTTP library which can be used in other software, and a small driver program. If you reuse the library in commercial software, you are encouraged to make a contribution to the author.

It is useful for many things

Example


    webfetch http://tony.aiu.to/sa/webfetch/index.html

Download webfetch


gzipped source - webfetch-5.4.3.tar.gz
zipped source - webfetch-5.4.3.zip
windows binary
windows binary with SSL
MD5 checksums
4a05e021a82a389009acad91b4c675f8  webfetch-5.4.3.tar.gz
54f44f1a92870f103282dea0c10ed523  webfetch-5.4.3.zip
a49c3829a3eb7ac72ab992b93be15c79  webfetch.exe
2d3ebe7edea8ad5683fd56062eeccaa9  webfetchs.exe
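
One way to check a download against the checksums above is to hash the file locally and compare the result. The following is a minimal Python sketch and is not part of webfetch; the helper names md5_hex and matches_published are made up for illustration.

```python
import hashlib


def md5_hex(path, chunk_size=65536):
    """Return the MD5 digest of the file at `path` as lowercase hex."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large archives need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a local file against a checksum copied from the list above."""
    return md5_hex(path) == published_hex.strip().lower()
```

For example, matches_published("webfetch-5.4.3.tar.gz", "4a05e021a82a389009acad91b4c675f8") should be true for an intact copy of the gzipped source.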

The definitive source is located at http://tony.aiu.to/sa/webfetch/. If you use this, please send a note to source@aiu.to and we'll put you on our (low volume) email list for notifications of package updates. You can see the change log for information about the latest version.

Licensing

webfetch is released under a BSD style license. You can use it in commercial products without a problem. Just keep the copyright notices intact. We do ask, as a favor, that if you use webfetch, you add a link back to our home page on some web site which you control.

If webfetch saves you money at work, I would think it a nice gesture if you sent me a check for 10% of the amount you thought it saved you. It really would help convince my wife that all the time I spend writing code at night really has a productive side.

Futures - webfetch 5.5

Version 5.4 was released in February 2004.



The Manual Page

webfetch - fetch files from a web site

NAME

webfetch - fetch files from a web site

SYNOPSIS

webfetch {options} URL ...
webfetch {options} @commandfile
options:

[-a user:pass] [-A user:pass] [-C {get,head,delete,post}] [-c cookie] [-D] [-d] [-e] [-f from] [-H host] [-h] [-j cookiejarfile] [-l language] [-m hostname] [-n contentType] [-o OutputFile] [-P httpProxy{:port}] [-p post_args] [-R] [-r referer] [-S socksProxy{:port}] [-s] [-t n] [-u {aol{N} ie{N} netscape linux nt 95 98}] [-U userAgent] [-k allow-self-signed] [-k disallow-self-signed] [-k allow-untrusted] [-k disallow-untrusted] [-k certfile=file] [-k certpath=dir] [-k certkey=keyfile]

DESCRIPTION

Webfetch is a command line tool which fetches one or more files from a web site using HTTP. It can also submit POST requests (for HTML forms). It allows you to specify the content of many HTTP header items, such as cookies, authentication and user agent. It also supports retrieval through SOCKS 4 and HTTP proxy firewalls.

On Windows, webfetch will use the proxy settings from the registry.

When used with a command file, URLs should be specified one per line. Options may appear at any point in the file, each on a line by itself.
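
If command files are generated by a script, the layout described above (one option or URL per line) can be produced with a small helper. This is only a sketch in Python; render_command_file is a hypothetical name, and the code checks the one-item-per-line layout only, not whether webfetch accepts each option.

```python
from typing import Iterable


def render_command_file(entries: Iterable[str]) -> str:
    """Return command-file text with one option or URL per line.

    Each entry is either a URL or a single option with its argument,
    for example "-o saindex.html". Blank entries are skipped.
    """
    lines = []
    for entry in entries:
        entry = entry.strip()
        if not entry:
            continue
        if "\n" in entry or "\r" in entry:
            # An embedded line break would split one entry across two lines.
            raise ValueError(f"entry contains a line break: {entry!r}")
        lines.append(entry)
    return "".join(line + "\n" for line in lines)
```

The returned text can be written to a file such as nightly and passed to webfetch as @nightly.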

OPTIONS

-a user:pass Send an Authorization header with the given username/password combination.
-A user:pass Send a Proxy-Authorization header with the given username/password combination.
-C {get,head,delete,post} Set the HTTP command. (head implies -h).
-c cookie Send a cookie along with your request.
-D Do an HTTP DELETE.
-d Create directories as needed for saved files. If not specified, output is written to the base name of the file. Has no effect if -o is specified. Implies -s.
-e Print the status code and the HTTP status code on stderr. Each is on a separate line and tagged with either 'status' or 'http'. The status codes are:
0	success
1	Unknown Host
2	Unsupported Scheme
3	Bad Hostname
4	Bad Socks Proxy Host
5	Bad HTTP Proxy Host
6	Connection Refused by Host
10	SSL Failure
99	Other failure
-f from Set the HTTP "From" header.
-H host Set host name to provide in HTTP "Host" header. Default is to use site. This option has been removed; this document mentions it only to help decipher old scripts that use it. Rather than specifying webfetch -H www.aiu.to http://xx.xx.xx.xx/url use -m instead: webfetch -m xx.xx.xx.xx http://www.aiu.to/url
-h Write HTTP headers to standard out.
-j cookiejarfile Use cookiejarfile as a source and sink for cookies.
-l language Set Accept-Language.
-m hostname Connect to physical host (or IP address) hostname. The hostname sent in the "Host" header will be extracted from the URL. This is the preferred way to test a virtual host on a machine where DNS will not resolve correctly.
-n contentType Specify Content-Type header to send with post data. Setting it to an empty string will suppress sending the Content-Type header at all. Default: application/x-www-form-urlencoded
-o OutputFile Write output to OutputFile rather than the base of the file name. Obviously, it does not make sense to specify -o with multiple file names. Implies -s.
-P httpProxy{:port} Go through an HTTP proxy. port is optional (defaults to 8000).
-p post_args Do a POST request (rather than a GET), with the parameters given. If post_args begins with an '@' character, it is taken to mean the name of a file containing the data.
-R Follow Redirect responses.
-r referer Add referer option to GET command. Sometimes this can be used to get objects from sites which force you to come in through a front page. Typical referer strings look like 'http://site/MainPage.html'.
-S socksProxy{:port} Go through a SOCKS proxy. port is optional (defaults to 1080).
-s Save the result in a file. Default is to write to standard out. Warning! This usage was reversed in releases before 3.x.
-t n Set timeout to n seconds. (default: 60)
-u {aol{N} ie{N} netscape linux nt 95 98} Set user agent to what is sent by various popular browsers.
-U userAgent Set user agent to the specified value.
-k allow-self-signed Allow self-signed site certificates.
-k disallow-self-signed Disallow self-signed site certificates.
-k allow-untrusted Allow certs signed by untrusted CAs.
-k disallow-untrusted Disallow certs signed by untrusted CAs.
-k certfile=file Specify trusted CA certificate file.
-k certpath=dir Specify trusted CA certificate directory path. This should usually point to the directory of trusted certs provided with your openssl implementation. The default is to use the path compiled in at build time.
-k certkey=keyfile Specify client key file.
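
A script that calls webfetch with -e may want to turn the reported codes into readable messages. The sketch below uses Python and the status table listed under -e. The exact layout of the stderr lines is not specified in this manual, so the parser assumes each line starts with the tag 'status' or 'http' followed by a numeric code; treat that as an assumption to check against real output. The names parse_e_output and describe_status are made up for illustration.

```python
import re
from typing import Dict

# Status codes as listed under the -e option.
STATUS_MEANINGS = {
    0: "success",
    1: "Unknown Host",
    2: "Unsupported Scheme",
    3: "Bad Hostname",
    4: "Bad Socks Proxy Host",
    5: "Bad HTTP Proxy Host",
    6: "Connection Refused by Host",
    10: "SSL Failure",
    99: "Other failure",
}

# ASSUMPTION: a tagged line looks like "<tag><separator><number>",
# e.g. "status: 6" or "http 200". The real format may differ.
_TAGGED_LINE = re.compile(r"^\s*(status|http)\b\D*(\d+)", re.IGNORECASE)


def parse_e_output(stderr_text: str) -> Dict[str, int]:
    """Collect the 'status' and 'http' codes found in stderr text."""
    found: Dict[str, int] = {}
    for line in stderr_text.splitlines():
        match = _TAGGED_LINE.match(line)
        if match:
            found[match.group(1).lower()] = int(match.group(2))
    return found


def describe_status(code: int) -> str:
    """Map a webfetch status code to the text from the table."""
    return STATUS_MEANINGS.get(code, f"unrecognized status {code}")
```

Keeping the lookup table separate from the parser means only the regular expression needs to change if the real output format differs from the assumed one.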

EXAMPLES

webfetch http://tony.aiu.to/sa/index.html

webfetch -P proxyHost:8008 http://www.aiu.to/index.html

webfetch -u aol7,98 http://www.aiu.to/

webfetch -m testServer http://www.mywebsite.com/

webfetch -u ie5nt @nightly

nightly:
	-R
	-o saindex.html
	http://tony.aiu.to/sa
	-s 
	http://tony.aiu.to/sa/webfetch.html
	http://tony.aiu.to/sa/webfetch-5.0.0.tar.gz
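
When webfetch is driven from another program, the command lines above can be assembled as an argument list instead of a shell string. The following Python sketch uses only options documented in this manual (-P, -S, -r, -t, -R, -o). The function names build_argv and run_webfetch and their keyword parameters are hypothetical, and the sketch assumes a webfetch binary is on the PATH.

```python
import subprocess
from typing import List, Optional, Sequence


def build_argv(
    urls: Sequence[str],
    *,
    http_proxy: Optional[str] = None,   # -P httpProxy{:port}
    socks_proxy: Optional[str] = None,  # -S socksProxy{:port}
    referer: Optional[str] = None,      # -r referer
    timeout: Optional[int] = None,      # -t n (seconds)
    follow_redirects: bool = False,     # -R
    output_file: Optional[str] = None,  # -o OutputFile
    program: str = "webfetch",
) -> List[str]:
    """Build an argument list in the order 'webfetch {options} URL ...'."""
    if not urls:
        raise ValueError("at least one URL is required")
    if output_file is not None and len(urls) > 1:
        # The manual notes that -o makes no sense with multiple files.
        raise ValueError("-o cannot be combined with multiple URLs")
    argv = [program]
    if http_proxy is not None:
        argv += ["-P", http_proxy]
    if socks_proxy is not None:
        argv += ["-S", socks_proxy]
    if referer is not None:
        argv += ["-r", referer]
    if timeout is not None:
        argv += ["-t", str(timeout)]
    if follow_redirects:
        argv.append("-R")
    if output_file is not None:
        argv += ["-o", output_file]
    argv.extend(urls)
    return argv


def run_webfetch(argv: Sequence[str]) -> subprocess.CompletedProcess:
    """Run webfetch without a shell, capturing stdout and stderr as bytes."""
    return subprocess.run(list(argv), capture_output=True, check=False)
```

Passing a list to subprocess.run avoids shell quoting problems with URLs that contain characters such as & or ?.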

AUTHOR

Tony Aiuto (tony@aiu.to)

SOURCE

The definitive source is located at http://tony.aiu.to/sa/. If you use this package, please drop us a line at source@aiu.to and we'll notify you about future updates.

This manual page describes webfetch 5.4.3

CREDITS

