getlinks reads an HTML file and emits the links in an easy-to-process format.
getlinks -s www.aiu.to getlinks.html
The latest package is getlinks-2.0.tar.Z.
Note: The manual page below was produced using man2html. It has several formatting glitches that are unavoidable at this time. A correctly formatted manual page is in the file getlinks.txt in the package.
NAME
    getlinks - extract hrefs from an HTML document

SYNOPSIS
    getlinks [-b base_ref] [-f] [-r] [-s site] file

DESCRIPTION
    Getlinks scans an HTML file for hrefs and emits a list of them.

OPTIONS
    -b base_ref
        Prepend base_ref to all locations that do not begin with /.

    -f  Do not fix up paths. Normally getlinks will compress . and ..
        components of a file name.

    -r  Leave paths relative. Normally getlinks will fix up resource
        names that do not begin with a / with either base_ref or the
        base part of the input file name.

    -s site
        Suppress hrefs that point to sites other than the designated
        one.

EXAMPLES
    getlinks -s www.aiu.to index.html

AUTHOR
    Tony Aiuto <email@example.com>

SOURCE
    The definitive source is located at
    http://tony.aiu.to/sa/getlinks-2.0.tar.Z. It may be mirrored at
    http://liii.com/~aiuto.
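To make the option descriptions above concrete, here is a rough Python sketch of the behavior they describe. It is not the getlinks source and is not taken from the package. The names get_links, HrefCollector, fix_paths and relative are hypothetical, and several details are assumptions: absolute URLs are passed through unchanged, hrefs with a scheme but no host (such as mailto:) are suppressed when a site is given, and the fallback to the base part of the input file name is left to the caller, who passes it as base_ref.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit
import posixpath


class HrefCollector(HTMLParser):
    """Collect the value of every non-empty href attribute, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def get_links(html, base_ref="", site=None, fix_paths=True, relative=False):
    """Hypothetical approximation of the documented options.

    base_ref   ~ -b: prepended to locations that do not begin with /
    site       ~ -s: drop hrefs that point at any other site
    fix_paths  ~ inverse of -f: compress . and .. components
    relative   ~ -r: leave relative paths without a base
    """
    collector = HrefCollector()
    collector.feed(html)

    links = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        if parts.scheme or parts.netloc:
            # Assumption: absolute URLs are only filtered, never rewritten.
            if site is not None and parts.netloc != site:
                continue
            links.append(href)
            continue

        path = href
        if not relative and not path.startswith("/"):
            path = posixpath.join(base_ref, path)
        if fix_paths:
            # normpath collapses "." and ".." components of the path.
            path = posixpath.normpath(path)
        links.append(path)
    return links


if __name__ == "__main__":
    page = (
        '<a href="a/../b.html">b</a>'
        '<a href="http://www.aiu.to/sa/c.html">c</a>'
        '<a href="http://example.com/z.html">z</a>'
    )
    for link in get_links(page, base_ref="/docs", site="www.aiu.to"):
        print(link)
```

With the sample page in the sketch, the call is roughly analogous to getlinks -b /docs -s www.aiu.to: the relative link is joined to /docs and compressed, and the link to the other site is dropped. Note that posixpath.normpath also strips a trailing slash, which the real program may not do.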
Comments and money to firstname.lastname@example.org