Wget


Anyone ever use Wget or something similar to crawl an online journal database?

 

What if I wanted to extract every single .pdf and have them placed neatly into sub-directories, organized by journal name?


wget -r -np -A.pdf whatever-the-url-is

 

Erm, possibly. I haven't used it in years, so don't blame me if you end up downloading the whole internet. I also seem to remember that robots.txt can prevent crawling (mass downloading) on some sites.
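
For the "sub-directories by journal name" part, something along these lines might work. It is only a rough sketch: the helper name fetch_journal_pdfs, the pdfs/ folder, and the 2-second wait are my own assumptions, and it presumes each journal has its own starting URL that you pass in yourself. As far as I know, wget honours robots.txt by default when recursing, so a site that disallows crawling may give you nothing.

```shell
#!/bin/sh
# Sketch only: journal names and URLs are placeholders you supply yourself.

# Set WGET=echo to print the arguments instead of downloading (dry run).
WGET="${WGET:-wget}"

# fetch_journal_pdfs NAME URL  (hypothetical helper, not part of wget)
# Crawl below URL and save every PDF it finds flat into pdfs/NAME/.
fetch_journal_pdfs() {
    name="$1"   # directory name you choose for this journal
    url="$2"    # starting page for this journal

    # --recursive / --no-parent : follow links, but never climb above URL
    # --accept pdf              : keep only files ending in "pdf"
    # --no-directories          : do not recreate the site's folder tree
    # --directory-prefix        : put everything under pdfs/NAME
    # --wait / --random-wait    : pause between requests to go easy on the server
    "$WGET" --recursive --no-parent --accept pdf \
        --no-directories --directory-prefix "pdfs/$name" \
        --wait 2 --random-wait \
        "$url"
}
```

You would call it once per journal, e.g. fetch_journal_pdfs journal-a https://example.org/journal-a/ (a made-up URL). If the site already keeps each journal under its own path, dropping --no-directories lets wget mirror that folder layout instead.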

