Wget and Curl make such a wonderful pair in Linux. Here are some useful tips:
Basic Downloads:
# Download a single file/page
wget http://required_site/file
# Download the entire site using the -r option
wget -r http://required_site/
# Download certain file types using the -A option
wget -r -A pdf,mp3 http://required_site/
# Follow external links using the -H option
wget -r -H -A pdf,mp3 http://required_site/
# Limit the sites to follow using the -D option
wget -r -H -A pdf,mp3 -D files.site.com http://required_site/
# Limit the recursion depth with the -l option (two levels here)
wget -r -l 2 http://required_site/
# Download all images from the site (-e robots=off ignores robots.txt, --no-parent stays below the start directory)
wget -e robots=off -r -l 1 --no-parent -A .gif,.jpg http://required_site/
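Curl, the other half of the pair, has no recursive mode, but for single files the rough equivalents look like this (same placeholder URL as above):
# Save under the remote file name
curl -O http://required_site/file
# Follow redirects and pick a local name (saved_file is just a placeholder)
curl -L -o saved_file http://required_site/file
# Resume a partially downloaded file
curl -C - -O http://required_site/file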
Advanced Tricks:
# Download content protected by a referer check and cookies
# Step 1: fetch the base URL and save its cookies to a file
wget --cookies=on --keep-session-cookies --save-cookies=cookie.txt http://first_page
# Step 2: get protected content using stored cookies
wget --referer=http://first_page --cookies=on --load-cookies=cookie.txt \
--keep-session-cookies --save-cookies=cookie.txt http://second_page
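Roughly the same referer-and-cookie dance works with curl (-c writes the cookie jar, -b sends it, -e sets the referer):
# Step 1: fetch the first page and store its cookies
curl -c cookie.txt http://first_page
# Step 2: fetch the protected page with the stored cookies and referer
curl -b cookie.txt -c cookie.txt -e http://first_page http://second_page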
# Mirror website to a static copy for local browsing
wget --mirror -w 2 -p --html-extension --convert-links http://required_site
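To drop the copy into a specific directory, -P takes a prefix; ./site_copy below is just a placeholder, and newer wget versions spell --html-extension as --adjust-extension:
# Mirror into a chosen directory (./site_copy is a placeholder)
wget --mirror -w 2 -p --adjust-extension --convert-links -P ./site_copy http://required_site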
# Run wget in the background (retry up to 45 times, write output to the file "log")
wget -t 45 -o log http://required_site &
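wget also has its own background switch: -b detaches immediately and, unless -o is given, writes to wget-log, so the shell's & isn't needed:
# Same download using wget's built-in background mode
wget -b -t 45 -o log http://required_site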
# Wget for FTP (anonymous login and password are handled automatically)
wget ftp://required_site
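For non-anonymous FTP the credentials can be passed explicitly; the username and password below are placeholders:
# Explicit FTP credentials
wget --ftp-user=username --ftp-password=password ftp://required_site
# Or embed them in the URL
wget ftp://username:password@required_site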
# Read the list of URLs from a file
wget -i file
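The file is just one URL per line; for example (urls.txt and the URLs are placeholders):
# Build a list of URLs, one per line
printf '%s\n' http://required_site/a http://required_site/b > urls.txt
# Download everything on the list, skipping files that already exist locally
wget -nc -i urls.txt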