wget is a non-interactive command-line utility for downloading resources from a specified URL. Because it is non-interactive, wget can work in the background or before the user even logs in. The program was designed with slow or unstable connections in mind, so it retries and resumes transfers that would otherwise fail. While wget isn’t shipped with macOS, it can be easily downloaded and installed with Homebrew, a widely used package manager for the Mac.
1. Download and Install Homebrew
To install Homebrew, open a Terminal window and execute the following command taken from Homebrew’s website:
This will cause wget to follow any links found in the documents within the specified directory, recursively downloading everything under the specified URL path.
That command also includes -e robots=off, which tells wget to ignore the restrictions in the site’s robots.txt file. Ignoring robots.txt prevents abridged downloads when a site excludes parts of the path from crawlers.
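As a minimal sketch of the kind of recursive download described above, the function below wraps wget with -r (recursive), -np (do not ascend to the parent directory), and -e robots=off. The function name mirror_dir and the example.com URL are placeholders chosen for illustration and are not taken from the original command.

```shell
# Hypothetical helper: recursively download everything below one URL path.
#   -r             follow links and download recursively
#   -np            never ascend to the parent directory
#   -e robots=off  ignore the site's robots.txt restrictions
mirror_dir() {
  # $1: the URL of the directory to download
  wget -r -np -e robots=off "$1"
}
```

Calling mirror_dir https://example.com/docs/ would then start the download into the current directory.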
Other wget Flags
In addition to the flags above, the following wget flags are among the most useful:
Controlling the download
wget -X /absolute/path/to/directory will exclude a specific directory on the remote server.
wget -nH removes the hostname directories. Remember, the hostname is the part of the URL that contains the domain name and ends in a TLD like “.com.” For example, the folder named “www.w3.org” in our previous example would be skipped, starting the download with the “History” directory instead.
wget --cut-dirs=# skips the specified number of directories down the URL before starting to download files. For example, -nH --cut-dirs=1 would change the specified path of “ftp.xemacs.org/pub/xemacs/” into simply “/xemacs/,” reducing the number of empty parent directories in the local download.
wget -R index.html or wget --reject index.html will skip any files matching the specified file name. In this case it will exclude all the index files. The * character can be used as a wildcard, like “*.png,” which would skip all files with the PNG extension.
wget -i file reads the target URLs from an input file. By default the file is treated as a plain list of URLs, one per line; to have it parsed as HTML instead, add the flag --force-html.
wget -nc or wget --no-clobber will not overwrite files that already exist in the destination.
wget -c or wget --continue will continue downloads of partially downloaded files.
wget -t 10 will try to download the resource up to 10 times before failing.
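To show how these flags combine, here is a rough sketch of three small wrappers. The function names, the example.com URLs, and the file name urls.txt are all hypothetical; only the wget flags themselves come from the list above (plus -r and -np for the recursive case).

```shell
# Hypothetical helper: recursive download with a flatter local layout.
# For https://example.com/pub/docs/, -nH drops "example.com" and
# --cut-dirs=1 drops "pub", so files are saved under docs/.
fetch_tree() {
  # $1: the URL of the directory to download
  wget -r -np -nH --cut-dirs=1 -R "*.png" -nc "$1"
}

# Hypothetical helper: resume a partial download, trying up to 10 times.
resume_file() {
  # $1: the URL of the file to download
  wget -c -t 10 "$1"
}

# Hypothetical helper: download every URL listed in a plain-text file,
# one URL per line.
fetch_list() {
  # $1: path to the file containing the URLs
  wget -i "$1"
}
```

For example, fetch_list urls.txt would work through each URL in urls.txt in turn, while resume_file https://example.com/big.iso would pick up an interrupted transfer where it stopped.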
Adjusting the level of logging
wget -d enables debugging output.
wget -o path/to/log.txt writes the log output to the specified file instead of printing it to the terminal (wget normally logs to standard error).
wget -q turns off all of wget’s output, including error messages.
wget -v explicitly enables wget’s default of verbose output.
wget -nv or wget --no-verbose turns off most log messages but still displays error messages and basic information.
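The logging flags can be combined with each other. As a small sketch, the wrapper below keeps the terminal clear by sending a terse log to a file; the function name fetch_logged, the URL, and the log file name are placeholders for illustration.

```shell
# Hypothetical helper: download with reduced logging written to a file.
#   -nv  keep only errors and basic information
#   -o   write the log to the given file instead of the terminal
fetch_logged() {
  # $1: the URL to download, $2: path of the log file
  wget -nv -o "$2" "$1"
}
```

Running fetch_logged https://example.com/file.zip wget.log would leave the messages in wget.log for later review.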
While that should cover the majority of wget use cases, the downloader is capable of much more. For a full description of wget’s capabilities, you can review wget’s GNU man page online.