From 0183f4dd84d8cc44d617fb48436881e79e2bf0f2 Mon Sep 17 00:00:00 2001
From: Xavier Roche
Date: Mon, 19 Mar 2012 13:03:26 +0000
Subject: httrack 3.44.1

---
 html/httrack.man.html | 3294 ++++++++++++++++++++++---------------------------
 1 file changed, 1465 insertions(+), 1829 deletions(-)

diff --git a/html/httrack.man.html b/html/httrack.man.html
index 6427bef..471aa1a 100644
--- a/html/httrack.man.html
+++ b/html/httrack.man.html
@@ -1,14 +1,25 @@
- - + + + + + httrack + -

httrack

+

httrack

+ NAME
SYNOPSIS
DESCRIPTION
@@ -25,28 +36,24 @@ SEE ALSO

+ + +

NAME -

NAME

- - - - - -
-

httrack − offline browser : copy websites to a -local directory

-
+ + + +

httrack − +offline browser : copy websites to a local directory

+ +

SYNOPSIS -

SYNOPSIS

- - - - - -
-

httrack [ url ]... [ −filter ]... [ +filter ]... -[ −O, −−path ] [ −%O, + + + +

httrack [ +url ]... [ −filter ]... [ +filter ]... [ −O, +−−path ] [ −%O, −−chroot ] [ −w, −−mirror ] [ −W, −−mirror−wizard ] [ −g, @@ -140,2562 +147,2191 @@ local directory

−%U, −−user ] [ −%W, −−callback ] [ −K, −−keep−links[=N] ] [

-
+ +

DESCRIPTION -

DESCRIPTION

- - - - - -
-

httrack allows you to download a World Wide Web -site from the Internet to a local directory, building -recursively all directories, getting HTML, images, and other -files from the server to your computer. HTTrack arranges the -original site’s relative link-structure. Simply open a -page of the "mirrored" website in your browser, -and you can browse the site from link to link, as if you -were viewing it online. HTTrack can also update an existing + + + +

httrack +allows you to download a World Wide Web site from the +Internet to a local directory, building recursively all +directories, getting HTML, images, and other files from the +server to your computer. HTTrack arranges the original +site’s relative link-structure. Simply open a page of +the "mirrored" website in your browser, and you +can browse the site from link to link, as if you were +viewing it online. HTTrack can also update an existing mirrored site, and resume interrupted downloads.

-
+ +

EXAMPLES -

EXAMPLES

- - - - - -
-

httrack www.someweb.com/bob/

- - - - - -
-

mirror site www.someweb.com/bob/ and only this site

-
- - - - - -
-

httrack www.someweb.com/bob/ www.anothertest.com/mike/ -+*.com/*.jpg −mime:application/*

- - - - - -
-

mirror the two sites together (with shared links) and -accept any .jpg files on .com sites

-
- - - - - -
-

httrack www.someweb.com/bob/bobby.html +* -−r6

- - - - - -
-

means get all files starting from bobby.html, with 6 -link−depth, and possibility of going everywhere on the -web

-
- - - - - -
-

httrack www.someweb.com/bob/bobby.html -−−spider −P -proxy.myhost.com:8080

- - - - - -
-

runs the spider on www.someweb.com/bob/bobby.html using a -proxy

-
- - - - - -
-

httrack −−update

- - - - - -
-

updates a mirror in the current folder

-
- - - - - -
-

httrack

- - - - - -
-

will bring you to the interactive mode

-
- - - - - -
-

httrack −−continue

- - - - - -
-

continues a mirror in the current folder

-
+ + + +

httrack +www.someweb.com/bob/

+ +

mirror site +www.someweb.com/bob/ and only this site

+ +

httrack www.someweb.com/bob/ +www.anothertest.com/mike/ +*.com/*.jpg
+−mime:application/*

+ +

mirror the two sites together +(with shared links) and accept any .jpg files on .com +sites

+ +

httrack +www.someweb.com/bob/bobby.html +* −r6

+ +

means get all files starting +from bobby.html, with 6 link−depth, and possibility of +going everywhere on the web

+ +

httrack +www.someweb.com/bob/bobby.html −−spider −P +
+proxy.myhost.com:8080

+ +

runs the spider on +www.someweb.com/bob/bobby.html using a proxy

+ +

httrack +−−update

+ +

updates a mirror in the current +folder

+ +

httrack

+ +

will bring you to the +interactive mode

+ +

httrack +−−continue

+ +

continues a mirror in the +current folder

+ +

OPTIONS -

OPTIONS

- - - - - -
-

General options:

- - + + + +

General +options:

+ +
- - + + +<param>)

- - + + -
-

−O

-
+ +

−O

+

path for mirror/logfiles+cache (−O path mirror[,path cache and logfiles]) (−−path -<param>)

-
-

−%O

-
+ +

−%O

+

chroot path to, must be r00t (−%O root path) -(−−chroot <param>)

-
- - - - - +(−−chroot <param>)

-

Action options:

- - + +

Action +options:

+ +
- - - + + + +

*mirror web sites +(−−mirror)

- + + - - +(−−mirror−wizard)

- + + - - +(−−get−files)

- + + - - +(−−continue)

- + + - - -
+ + -

−w

-
+

−w

-

*mirror web sites (−−mirror)

-
+ + + +

−W

-

−W

-

mirror web sites, semi−automatic (asks questions) -(−−mirror−wizard)

-
+ + + +

−g

-

−g

-

just get files (saved in the current directory) -(−−get−files)

-
+ + + +

−i

-

−i

-

continue an interrupted mirror using the cache -(−−continue)

-
+ + + +

−Y

-

−Y

-

mirror ALL links located in the first level pages -(mirror links) (−−mirrorlinks)

-
- - - - - +(mirror links) (−−mirrorlinks)

-

Proxy options:

- - + +

Proxy +options:

+ +
- - +

−P

+ + - - + + +(−−httpproxy−ftp[=N])

- - + + +hostname) (−−bind <param>)

-

−P

-
-

proxy use (−P proxy:port or −P -user:pass@proxy:port) (−−proxy -<param>)

-
+ + +

proxy use (−P proxy:port +or −P user:pass@proxy:port) (−−proxy +<param>)

-

−%f

-
+ +

−%f

+

*use proxy for ftp (f0 don t use) -(−−httpproxy−ftp[=N])

-
-

−%b

-
+ +

−%b

+

use this local hostname to make/send requests (−%b -hostname) (−−bind <param>)

-
- - - - - -
-

Limits options:

- - + +

Limits +options:

+ +
- - - +

−rN

+ + - + + - - +(−−ext−depth[=N])

- + + - - +(−−max−files[=N])

- + + - - +

maximum file length for non html (N) and html (N2)

- + + - - +(−−max−size[=N])

- + + - - +hour) (−−max−time[=N])

- + + - - +(−−max−rate[=N])

- + + - - +(−−connection−per−second[=N])

- + + - - +file is deleted (−−max−pause[=N])

- + + - - -
+ -

−rN

-
-

set the mirror depth to N (* r9999) -(−−depth[=N])

-
+ + +

set the mirror depth to N (* +r9999) (−−depth[=N])

+ + + +

−%eN

-

−%eN

-

set the external links depth to N (* %e0) -(−−ext−depth[=N])

-
+ + + +

−mN

-

−mN

-

maximum file length for a non−html file -(−−max−files[=N])

-
+ + + +

−mN,N2

-

−mN,N2

-
-

maximum file length for non html (N) and html (N2)

-
+ + + +

−MN

-

−MN

-

maximum overall size that can be uploaded/scanned -(−−max−size[=N])

-
+ + + +

−EN

-

−EN

-

maximum mirror time in seconds (60=1 minute, 3600=1 -hour) (−−max−time[=N])

-
+ + + +

−AN

-

−AN

-

maximum transfer rate in bytes/seconds (1000=1KB/s max) -(−−max−rate[=N])

-
+ + + +

−%cN

-

−%cN

-

maximum number of connections/seconds (*%c10) -(−−connection−per−second[=N])

-
+ + + +

−GN

-

−GN

-

pause transfer if N bytes reached, and wait until lock -file is deleted (−−max−pause[=N])

-
+ + + +

−%mN

-

−%mN

-

maximum mms stream download time in seconds (60=1 minute, 3600=1 hour) -(−−max−mms−time[=N])

-
- - - - - +(−−max−mms−time[=N])

-

Flow control:

- - + +

Flow +control:

+ +
- - - +

−cN

+ + - + + - - +link is shutdown (−−timeout)

- + + - - +errors (*R1) (−−retries[=N])

- + + - - +(−−min−rate[=N])

- + + - - -
+ -

−cN

-
-

number of multiple connections (*c8) -(−−sockets[=N])

-
+ + +

number of multiple connections +(*c8) (−−sockets[=N])

+ + + +

−TN

-

−TN

-

timeout, number of seconds after a non−responding -link is shutdown (−−timeout)

-
+ + + +

−RN

-

−RN

-

number of retries, in case of timeout or non−fatal -errors (*R1) (−−retries[=N])

-
+ + + +

−JN

-

−JN

-

traffic jam control, minimum transfert rate (bytes/seconds) tolerated for a link -(−−min−rate[=N])

-
+ + + +

−HN

-

−HN

-

host is abandonned if: 0=never, 1=timeout, 2=slow, -3=timeout or slow (−−host−control[=N])

-
- - - - - +3=timeout or slow (−−host−control[=N])

-

Links options:

- - + +

Links +options:

+ +
- - - + + + +

*extended parsing, attempt to +parse all links, even in unknown tags or Javascript (%P0 don +t use) (−−extended−parsing[=N])

- + + - - +located outside) (−−near)

- + + - - +(−−test)

- + + - - +URL per line) (−−list <param>)

- + + - - -
+ + -

−%P

-
+

−%P

-

*extended parsing, attempt to parse all links, even in -unknown tags or Javascript (%P0 don t use) -(−−extended−parsing[=N])

-
+ + + +

−n

-

−n

-

get non−html files near an html file (ex: an image -located outside) (−−near)

-
+ + + +

−t

-

−t

-

test all URLs (even forbidden ones) -(−−test)

-
+ + + +

−%L

-

−%L

-

<file> add all URL located in this text file (one -URL per line) (−−list <param>)

-
+ + + +

−%S

-

−%S

-

<file> add all scan rules located in this text file (one scan rule per line) (−−urllist -<param>)

-
- - - - - +<param>)

-

Build options:

- - + +

Build +options:

+ +
- + + - - +

structure type (0 *original +structure, 1+: see below) (−−structure[=N])

- + + - - +"%h%p/%n%q.%t")

- + + - - +t use, %N1 use for unknown extensions, * %N2 always use)

- + + - - +(−−cached−delayed−type−check)

- + + - - +(−−mime−html)

- + + - - +(−−long−names[=N])

- + + - - +absolute URI links) (−−keep−links[=N])

- + + - - +(−−replace−external)

- + + - - +(−−disable−passwords)

- + + - - +(−−include−query−string)

- + + - - +don t generate) (−−generate−errors)

- + + - - +(−−purge−old[=N])

- + + - - +−%F "" ) (−−preserve)

+ + + +

−NN

-

−NN

-
-

structure type (0 *original structure, 1+: see below) -(−−structure[=N])

-
+ + + +

−or

-

−or

-

user defined structure (−N -"%h%p/%n%q.%t")

-
+ + + +

−%N

-

−%N

-

delayed type check, don t make any link test but wait for files download to start instead (experimental) (%N0 don -t use, %N1 use for unknown extensions, * %N2 always use)

-
+ + + +

−%D

-

−%D

-

cached delayed type check, don t wait for remote type during updates, to speedup them (%D0 wait, * %D1 don t wait) -(−−cached−delayed−type−check)

-
+ + + +

−%M

-

−%M

-

generate a RFC MIME−encapsulated full−archive (.mht) -(−−mime−html)

-
+ + + +

−LN

-

−LN

-

long names (L1 *long names / L0 8−3 conversion / L2 ISO9660 compatible) -(−−long−names[=N])

-
+ + + +

−KN

-

−KN

-

keep original links (e.g. http://www.adr/link) (K0 *relative link, K absolute links, K4 original links, K3 -absolute URI links) (−−keep−links[=N])

-
+ + + +

−x

-

−x

-

replace external html links by error pages -(−−replace−external)

-
+ + + +

−%x

-

−%x

-

do not include any password for external password protected websites (%x0 include) -(−−disable−passwords)

-
+ + + +

−%q

-

−%q

-

*include query string for local files (useless, for information purpose only) (%q0 don t include) -(−−include−query−string)

-
+ + + +

−o

-

−o

-

*generate output html file in case of error (404..) (o0 -don t generate) (−−generate−errors)

-
+ + + +

−X

-

−X

-

*purge old files after update (X0 keep delete) -(−−purge−old[=N])

-
+ + + +

−%p

-

−%p

-

preserve html files as is (identical to −K4 -−%F "" ) (−−preserve)

-
- - - - - -
-

Spider options:

- - + +

Spider +options:

+ +
- - - +

−bN

+ + - + + - - +(−−check−type[=N])

- + + - - +be aggressive) (−−parse−java[=N])

- + + - - +rules)) (−−robots[=N])

- + + - - +(−−http−10)

- + + - - +(−−keep−alive)

- + + - - +servers, but not standard!) (−−tolerant)

- + + - - +(−−updatehack)

- + + - - +//, www.foo.com==foo.com..) (−−urlhack)

- + + - - +(−−assume <param>)

- + + - - +−−assume foo.cgi=text/html

- + + - - +only) (−−protocol[=N])

- + + - - -
+ -

−bN

-
-

accept cookies in cookies.txt (0=do not accept,* -1=accept) (−−cookies[=N])

-
+ + +

accept cookies in cookies.txt +(0=do not accept,* 1=accept) (−−cookies[=N])

+ + + +

−u

-

−u

-

check document type if unknown (cgi,asp..) (u0 don t check, * u1 check but /, u2 check always) -(−−check−type[=N])

-
+ + + +

−j

-

−j

-

*parse Java Classes (j0 don t parse, bitmask: |1 parse default, |2 don t parse .class |4 don t parse .js |8 don t -be aggressive) (−−parse−java[=N])

-
+ + + +

−sN

-

−sN

-

follow robots.txt and meta robots tags (0=never,1=sometimes,* 2=always, 3=always (even strict -rules)) (−−robots[=N])

-
+ + + +

−%h

-

−%h

-

force HTTP/1.0 requests (reduce update features, only for old servers or proxies) -(−−http−10)

-
+ + + +

−%k

-

−%k

-

use keep−alive if possible, greately reducing latency for small files and test requests (%k0 don t use) -(−−keep−alive)

-
+ + + +

−%B

-

−%B

-

tolerant requests (accept bogus responses on some -servers, but not standard!) (−−tolerant)

-
+ + + +

−%s

-

−%s

-

update hacks: various hacks to limit re−transfers when updating (identical size, bogus response..) -(−−updatehack)

-
+ + + +

−%u

-

−%u

-

url hacks: various hacks to limit duplicate URLs (strip -//, www.foo.com==foo.com..) (−−urlhack)

-
+ + + +

−%A

-

−%A

-

assume that a type (cgi,asp..) is always linked with a mime type (−%A php3,cgi=text/html;dat,bin=application/x−zip) -(−−assume <param>)

-
+ + + +

−can

-

−can

-

also be used to force a specific file type: -−−assume foo.cgi=text/html

-
+ + + +

−@iN

-

−@iN

-

internet protocol (0=both ipv6+ipv4, 4=ipv4 only, 6=ipv6 -only) (−−protocol[=N])

-
+ + + +

−%w

-

−%w

-

disable a specific external mime module (−%w htsswf −%w htsjava) -(−−disable−module <param>)

-
- - - - - +(−−disable−module <param>)

-

Browser ID:

- - - - - - - +
-

−F

-
+

Browser +ID:

-

user−agent field sent in HTTP headers (−F -"user−agent name") -(−−user−agent <param>)

-
- - - - - - + + - - +

user−agent field sent in +HTTP headers (−F "user−agent name") +(−−user−agent <param>)

- + + + + + + + + + + + + - - +<param>)

- + + - - +(−−language <param>)

+ -

−%R

-
-

default referer field sent in HTTP headers -(−−referer <param>)

-
+

−F

-

−%E

-
-

from email address sent in HTTP headers -(−−from <param>)

-
+ + + +

−%R

+ + +

default referer field sent in HTTP headers +(−−referer <param>)

+ + +

−%E

+ + +

from email address sent in HTTP headers +(−−from <param>)

+ + +

−%F

-

−%F

-

footer string in Html code (−%F "Mirrored [from host %s [file %s [at %s]]]" (−−footer -<param>)

-
+ + + +

−%l

-

−%l

-

preffered language (−%l "fr, en, jp, *" -(−−language <param>)

-
- - - - - -
-

Log, index, cache

- - + +

Log, index, +cache

+ +
- + + - - +

create/use a cache for updates +and retries (C0 no cache,C1 cache is prioritary,* C2 test +update before) (−−cache[=N])

- + + - - +(−−store−all−in−cache)

- + + - - +(−−do−not−recatch)

- + + - - +(−−display)

- + + - - +(−−do−not−log)

- + + - - +(−−quiet)

- + + - - +(−−extra−log)

- + + - - +

log − debug (−−debug−log)

- - - + + + +

log on screen (−−verbose)

- - - +

−f

+ + - - - +

−f2

+ + - - - + + + +

*make an index (I0 don t make) (−−index)

- + + - - +(−−build−top−index)

- + + - - +make) (−−search−index)

+ + + +

−C

-

−C

-
-

create/use a cache for updates and retries (C0 no -cache,C1 cache is prioritary,* C2 test update before) -(−−cache[=N])

-
+ + + +

−k

-

−k

-

store all files in cache (not useful if files on disk) -(−−store−all−in−cache)

-
+ + + +

−%n

-

−%n

-

do not re−download locally erased files -(−−do−not−recatch)

-
+ + + +

−%v

-

−%v

-

display on screen filenames downloaded (in realtime) − * %v1 short version − %v2 full animation -(−−display)

-
+ + + +

−Q

-

−Q

-

no log − quiet mode -(−−do−not−log)

-
+ + + +

−q

-

−q

-

no questions − quiet mode -(−−quiet)

-
+ + + +

−z

-

−z

-

log − extra infos -(−−extra−log)

-
+ + + +

−Z

-

−Z

-
-

log − debug (−−debug−log)

-
+ + -

−v

-
+

−v

-

log on screen (−−verbose)

-
+ -

−f

-
-

*log in files (−−file−log)

-
+ + +

*log in files (−−file−log)

+ -

−f2

-
-

one single log file (−−single−log)

-
+ + +

one single log file (−−single−log)

+ + -

−I

-
+

−I

-

*make an index (I0 don t make) (−−index)

-
+ + + +

−%i

-

−%i

-

make a top index for a project folder (* %i0 don t make) -(−−build−top−index)

-
+ + + +

−%I

-

−%I

-

make an searchable index for this mirror (* %I0 don t -make) (−−search−index)

-
- - - - - -
-

Expert options:

- - + +

Expert +options:

+ +
- - - +

−pN

+ + - - - +

−p0

+ + - + + - - +

save only html files

- - - +

−p2

+ + - - - + + + +

save all files

- + + - - +

get html files before, then treat other files

- + + - - +(−−stay−on−same−dir)

- + + - - +(−−can−go−down)

- + + - - +(−−can−go−up)

- + + - - +(−−can−go−up−and−down)

- + + - - +(−−stay−on−same−address)

- + + - - +(−−stay−on−same−domain)

- + + - - +(−−stay−on−same−tld)

- + + - - +(−−go−everywhere)

- + + - - -
+ -

−pN

-
-

priority mode: (* p3) (−−priority[=N])

-
+ + +

priority mode: (* p3) +(−−priority[=N])

+ -

−p0

-
-

just scan, don t save anything (for checking links)

-
+ + +

just scan, don t save anything (for checking links)

+ + + +

−p1

-

−p1

-
-

save only html files

-
+ -

−p2

-
-

save only non html files

-
+ + +

save only non html files

+ + -

−*p3

-
+

−*p3

-

save all files

-
+ + + +

−p7

-

−p7

-
-

get html files before, then treat other files

-
+ + + +

−S

-

−S

-

stay on the same directory -(−−stay−on−same−dir)

-
+ + + +

−D

-

−D

-

*can only go down into subdirs -(−−can−go−down)

-
+ + + +

−U

-

−U

-

can only go to upper directories -(−−can−go−up)

-
+ + + +

−B

-

−B

-

can both go up&down into the directory structure -(−−can−go−up−and−down)

-
+ + + +

−a

-

−a

-

*stay on the same address -(−−stay−on−same−address)

-
+ + + +

−d

-

−d

-

stay on the same principal domain -(−−stay−on−same−domain)

-
+ + + +

−l

-

−l

-

stay on the same TLD (eg: .com) -(−−stay−on−same−tld)

-
+ + + +

−e

-

−e

-

go everywhere on the web -(−−go−everywhere)

-
+ + + +

−%H

-

−%H

-

debug HTTP headers in logfile -(−−debug−headers)

-
- - - - - +(−−debug−headers)

-

Guru options: (do NOT use if possible)

- - + +

Guru +options: (do NOT use if possible)

+ +
- - - +

−#X

+ + - + + - - +(−−debug−testfilters <param>)

- + + - - +

simplify test (−#1 ./foo/bar/../foobar)

- - - +

−#2

+ + - + + - - +(−−debug−cache <param>)

- + + - - +(−−repair−cache)

- - - +

−#d

+ + - - - +

−#E

+ + - + + - - +(−−advanced−flushlogs)

- + + - - +(−−advanced−maxfilters[=N])

- - - +

−#h

+ + - + + - - +(−−debug−scanstdin)

- + + - - -
+ -

−#X

-
-

*use optimized engine (limited memory boundary checks) -(−−fast−engine)

-
+ + +

*use optimized engine (limited +memory boundary checks) +(−−fast−engine)

+ + + +

−#0

-

−#0

-

filter test (−#0 *.gif www.bar.com/foo.gif ) -(−−debug−testfilters <param>)

-
+ + + +

−#1

-

−#1

-
-

simplify test (−#1 ./foo/bar/../foobar)

-
+ -

−#2

-
-

type test (−#2 /foo/bar.php)

-
+ + +

type test (−#2 /foo/bar.php)

+ + + +

−#C

-

−#C

-

cache list (−#C *.com/spider*.gif -(−−debug−cache <param>)

-
+ + + +

−#R

-

−#R

-

cache repair (damaged cache) -(−−repair−cache)

-
+ -

−#d

-
-

debug parser (−−debug−parsing)

-
+ + +

debug parser (−−debug−parsing)

+ -

−#E

-
-

extract new.zip cache meta−data in meta.zip

-
+ + +

extract new.zip cache meta−data in meta.zip

+ + + +

−#f

-

−#f

-

always flush log files -(−−advanced−flushlogs)

-
+ + + +

−#FN

-

−#FN

-

maximum number of filters -(−−advanced−maxfilters[=N])

-
+ -

−#h

-
-

version info (−−version)

-
+ + +

version info (−−version)

+ + + +

−#K

-

−#K

-

scan stdin (debug) -(−−debug−scanstdin)

-
+ + + +

−#L

-

−#L

-

maximum number of links (−#L1000000) -(−−advanced−maxlinks)

-
- - +(−−advanced−maxlinks)

- + + - - +(−−advanced−progressinfo)

- + + - - +

catch URL (−−catch−url)

- + + - - +(−−repair−cache)

- + + - - +(−−debug−xfrstats)

- + + - - +

wait time (−−advanced−wait)

- + + - - +(−−debug−ratestats)

- + + - - -
+ + + +

−#p

-

−#p

-

display ugly progress information -(−−advanced−progressinfo)

-
+ + + +

−#P

-

−#P

-
-

catch URL (−−catch−url)

-
+ + + +

−#R

-

−#R

-

old FTP routines (debug) -(−−repair−cache)

-
+ + + +

−#T

-

−#T

-

generate transfer ops. log every minutes -(−−debug−xfrstats)

-
+ + + +

−#u

-

−#u

-
-

wait time (−−advanced−wait)

-
+ + + +

−#Z

-

−#Z

-

generate transfer rate statictics every minutes -(−−debug−ratestats)

-
+ + + +

−#!

-

−#!

-

execute a shell command (−#! "echo -hello") (−−exec <param>)

-
- - - - - +hello") (−−exec <param>)

-

Dangerous options: (do NOT use unless you exactly know -what you are doing)

- - + +

Dangerous +options: (do NOT use unless you exactly know what you are +doing)

+ +
- - -
-

−%!

-
-

bypass built−in security limits aimed to avoid -bandwith abuses (bandwidth, simultaneous connections) -(−−disable−security−limits)

-
- - - - - -
-

−IMPORTANT

- - - - - +

−%!

+ +
-

NOTE: DANGEROUS OPTION, ONLY SUITABLE FOR EXPERTS

-
+ + +

bypass built−in security +limits aimed to avoid bandwith abuses (bandwidth, +simultaneous connections) +(−−disable−security−limits)

- - + +

−IMPORTANT

+ +

NOTE: DANGEROUS OPTION, ONLY +SUITABLE FOR EXPERTS

+ +
- - - - -
+ -

−USE

-
-

IT WITH EXTREME CARE

-
-
- - - - - +

−USE

+ + +
-

Command−line specific options:

+ + +

IT WITH EXTREME CARE

+
- - + + +

Command−line +specific options:

+ +
- - - + + + +

execute system command after +each files ($0 is the filename: −V "rm ") +(−−userdef−cmd <param>)

- + + - - +(−%U smith) (−−user <param>)

- + + - - +<param>)

+ + -

−V

-
+

−V

-

execute system command after each files ($0 is the -filename: −V "rm ") -(−−userdef−cmd <param>)

-
+ + + +

−%U

-

−%U

-

run the engine with another id when called as root -(−%U smith) (−−user <param>)

-
+ + + +

−%W

-

−%W

-

use an external library function as a wrapper (−%W myfoo.so[,myparameters]) (−−callback -<param>)

-
- - - - - -
-

Details: Option N

- - + +

Details: +Option N

+ +
- + + - +

Site−structure +(default)

- - +

−N1

+ + - - +

−N2

+ + - - +

−N3

+ + - + + - +example)

- + + - +

Images/other in web/xxx and HTML in web/HTML

- + + - +

All files in web/, with random names (gadget !)

- - +

−N100

+ + - + + - +by the site s name

- + + - +by the site s name

- + + - +by the site s name

- + + - +by the site s name

- + + - +by the site s name

- + + - +by the site s name

- + + - +directory

- + + - +directory

- + + - +directory (option set for g option)

- + + - +directory

- + + - +directory

- + + - -
+ + + +

−N0

-

−N0

-
-

Site−structure (default)

-
+ -

−N1

-
-

HTML in web/, images/other files in web/images/

-
+ + +

HTML in web/, images/other files in web/images/

+ -

−N2

-
-

HTML in web/HTML, images/other in web/images

-
+ + +

HTML in web/HTML, images/other in web/images

+ -

−N3

-
-

HTML in web/, images/other in web/

-
+ + +

HTML in web/, images/other in web/

+ + + +

−N4

-

−N4

-

HTML in web/, images/other in web/xxx, where xxx is the file extension (all gif will be placed onto web/gif, for -example)

-
+ + + +

−N5

-

−N5

-
-

Images/other in web/xxx and HTML in web/HTML

-
+ + + +

−N99

-

−N99

-
-

All files in web/, with random names (gadget !)

-
+ -

−N100

-
-

Site−structure, without www.domain.xxx/

-
+ + +

Site−structure, without www.domain.xxx/

+ + + +

−N101

-

−N101

-

Identical to N1 exept that "web" is replaced -by the site s name

-
+ + + +

−N102

-

−N102

-

Identical to N2 exept that "web" is replaced -by the site s name

-
+ + + +

−N103

-

−N103

-

Identical to N3 exept that "web" is replaced -by the site s name

-
+ + + +

−N104

-

−N104

-

Identical to N4 exept that "web" is replaced -by the site s name

-
+ + + +

−N105

-

−N105

-

Identical to N5 exept that "web" is replaced -by the site s name

-
+ + + +

−N199

-

−N199

-

Identical to N99 exept that "web" is replaced -by the site s name

-
+ + + +

−N1001

-

−N1001

-

Identical to N1 exept that there is no "web" -directory

-
+ + + +

−N1002

-

−N1002

-

Identical to N2 exept that there is no "web" -directory

-
+ + + +

−N1003

-

−N1003

-

Identical to N3 exept that there is no "web" -directory (option set for g option)

-
+ + + +

−N1004

-

−N1004

-

Identical to N4 exept that there is no "web" -directory

-
+ + + +

−N1005

-

−N1005

-

Identical to N5 exept that there is no "web" -directory

-
+ + + +

−N1099

-

−N1099

-

Identical to N99 exept that there is no "web" -directory

-
- - - - - -
-

Details: User−defined option N

- - - - - +
-

%n Name of file without file type (ex: image) %N Name of -file, including file type (ex: image.gif) %t File type (ex: -gif) %p Path [without ending /] (ex: /someimages) %h Host -name (ex: www.someweb.com) %M URL MD5 (128 bits, 32 ascii -bytes) %Q query string MD5 (128 bits, 32 ascii bytes) %r -protocol name (ex: http) %q small query string MD5 (16 bits, -4 ascii bytes) %s? Short name version (ex: %sN) %[param] -param variable in query string +directory

+ +

Details: +User−defined option N
+%n Name of file without file type (ex: image)
+%N Name of file, including file type (ex: image.gif)
+%t File type (ex: gif)
+%p Path [without ending /] (ex: /someimages)
+%h Host name (ex: www.someweb.com)
+%M URL MD5 (128 bits, 32 ascii bytes)
+%Q query string MD5 (128 bits, 32 ascii bytes)
+%r protocol name (ex: http)
+%q small query string MD5 (16 bits, 4 ascii bytes)
+%s? Short name version (ex: %sN)
+%[param] param variable in query string
%[param:before:after:empty:notfound] advanced variable extraction

- - - - - - - -
-

Details: User−defined option N and advanced -variable extraction

- - - - - -
-

%[param:before:after:empty:notfound]

-
- - + +

Details: +User−defined option N and advanced variable +extraction
+%[param:before:after:empty:notfound]

+ +
- - - - -
+ -

−param

-
-

: parameter name

-
-
- - - - - -
-

−before

- - - - - +

−param

+ + +
-

: string to prepend if the parameter was found

-
+ + +

: parameter name

+
- - + +

−before

+ +

: string to prepend if the +parameter was found

+ +
- - - - -
+ -

−after

-
-

: string to append if the parameter was found

-
-
- - - - - -
-

−notfound

- - - - - +

−after

+ + +
-

: string replacement if the parameter could not be -found

-
+ + +

: string to append if the parameter was found

+
- - + +

−notfound

+ +

: string replacement if the +parameter could not be found

+ +
- - - +

−empty

+ + - + + - - -
+ -

−empty

-
-

: string replacement if the parameter was empty

-
+ + +

: string replacement if the parameter was empty

+ + + +

−all

-

−all

-

fields, except the first one (the parameter name), can -be empty

-
- - - - - +be empty

-

Details: Option K

- - + +

Details: +Option K

+ +
- - - + + + +

foo.cgi?q=45 −> +foo4B54.html?q=45 (relative URI, default)

- + + - - +(absolute URL) (−−keep−links[=N])

- + + - - +

−> foo.cgi?q=45 (original URL)

- - - -
+ + -

−K0

-
+

−K0

-

foo.cgi?q=45 −> foo4B54.html?q=45 (relative -URI, default)

-
+ + + +

−K

-

−K

-

−> http://www.foobar.com/folder/foo.cgi?q=45 -(absolute URL) (−−keep−links[=N])

-
+ + + +

−K4

-

−K4

-
-

−> foo.cgi?q=45 (original URL)

-
+ -

−K3

-
-

−> /folder/foo.cgi?q=45 (absolute URI)

-
- - - - - -
-

Shortcuts:

- - - - - -
-

−−mirror

- - - - - +

−K3

+ +
-

<URLs> *make a mirror of site(s) (default)

-
+ + +

−> /folder/foo.cgi?q=45 (absolute URI)

- - + + +

Shortcuts: +
+−−mirror

+ +

<URLs> *make a mirror of +site(s) (default)

+ +
- + + - +URLs (−qg)

- + + - -
+ + + +

−−get

-

−−get

-

<URLs> get the files indicated, do not seek other -URLs (−qg)

-
+ + + +

−−list

-

−−list

-

<text file> add all URL located in this text file -(−%L)

-
- - - - - -
-

−−mirrorlinks

- - - - - -
-

<URLs> mirror all links in 1st level pages -(−Y)

-
- - - - - -
-

−−testlinks

- - - - - -
-

<URLs> test links in pages (−r1p0C0I0t)

-
- - - - - -
-

−−spider

- - - - - -
-

<URLs> spider site(s), to test links: reports -Errors & Warnings (−p0C0I0t)

-
- - - - - -
-

−−testsite

- - - - - -
-

<URLs> identical to −−spider

-
- - - - - -
-

−−skeleton

- - - - - -
-

<URLs> make a mirror, but gets only html files -(−p1)

-
- - - - - -
-

−−update

- - - - - -
-

update a mirror, without confirmation (−iC2)

-
- - - - - -
-

−−continue

- - - - - -
-

continue a mirror, without confirmation (−iC1)

-
- - - - - -
-

−−catchurl

- - - - - -
-

create a temporary proxy to capture an URL or a form post -URL

-
- - - - - -
-

−−clean

- - - - - -
-

erase cache & log files

-
- - - - - -
-

−−http10

- - - - - -
-

force http/1.0 requests (−%h)

-
- - - - - -
-

Details: Option %W: External callbacks -prototypes

- - - - - +(−%L)

-

see htsdefines.h

+ +

−−mirrorlinks

+ +

<URLs> mirror all links +in 1st level pages (−Y)

+ +

−−testlinks

+ +

<URLs> test links in +pages (−r1p0C0I0t)

+ +

−−spider

+ +

<URLs> spider site(s), to +test links: reports Errors & Warnings +(−p0C0I0t)

+ +

−−testsite

+ +

<URLs> identical to +−−spider

+ +

−−skeleton

+ +

<URLs> make a mirror, but +gets only html files (−p1)

+ +

−−update

+ +

update a mirror, without +confirmation (−iC2)

+ +

−−continue

+ +

continue a mirror, without +confirmation (−iC1)

+ +

−−catchurl

+ +

create a temporary proxy to +capture an URL or a form post URL

+ +

−−clean

+ +

erase cache & log files

+ +

−−http10

+ +

force http/1.0 requests +(−%h)

+ +

Details: +Option %W: External callbacks prototypes
+see htsdefines.h

+ +

FILES -

FILES

- - - - - -
-

/etc/httrack.conf

- - - - - -
-

The system wide configuration file.

-
+ + + + +

/etc/httrack.conf

+ +

The system wide configuration +file.

+ +

ENVIRONMENT -

ENVIRONMENT

- - + + + +
- + + - - +

Is being used if you defined in +/etc/httrack.conf the line path ~/websites/#

+ + + +

HOME

-

HOME

-
-

Is being used if you defined in /etc/httrack.conf the -line path ~/websites/#

-
+ +

DIAGNOSTICS -

DIAGNOSTICS

- - - - - -
-

Errors/Warnings are reported to hts−log.txt -by default, or to stderr if the -v option was -specified.

-
+ + + + +

Errors/Warnings +are reported to hts−log.txt by default, or to +stderr if the -v option was specified.

+ +

LIMITS -

LIMITS

- - - - - -
-

These are the principals limits of HTTrack for that -moment. Note that we did not heard about any other utility -that would have solved them.

- -

- Several scripts generating complex filenames may -not find them (ex: + + + +

These are the +principals limits of HTTrack for that moment. Note that we +did not heard about any other utility that would have solved +them.

+ +

- +Several scripts generating complex filenames may not find +them (ex: img.src=’image’+a+Mobj.dst+’.gif’)

- -

- Some java classes may not find some files on -them (class included)

- -

- Cgi-bin links may not work properly in some -cases (parameters needed). To avoid them: use filters like + +

- Some +java classes may not find some files on them (class +included)

+ +

- +Cgi-bin links may not work properly in some cases +(parameters needed). To avoid them: use filters like -*cgi-bin*

-
+ +

BUGS -

BUGS

- - - - - -
-

Please reports bugs to <bugs@httrack.com>. -Include a complete, self-contained example that will allow -the bug to be reproduced, and say which version of httrack -you are using. Do not forget to detail options used, OS -version, and any other information you deem necessary.

-
+ + + +

Please reports +bugs to <bugs@httrack.com>. Include a complete, +self-contained example that will allow the bug to be +reproduced, and say which version of httrack you are using. +Do not forget to detail options used, OS version, and any +other information you deem necessary.

+ +

COPYRIGHT -

COPYRIGHT

- - - - - -
-

Copyright (C) Xavier Roche and other contributors

- -

This program is free software; you can redistribute it -and/or modify it under the terms of the GNU General Public -License as published by the Free Software Foundation; either -version 2 of the License, or any later version.

- -

This program is distributed in the hope that it will be -useful, but WITHOUT ANY WARRANTY; without even the implied -warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR -PURPOSE. See the GNU General Public License for more -details.

- -

You should have received a copy of the GNU General Public -License along with this program; if not, write to the Free -Software Foundation, Inc., 59 Temple Place - Suite 330, -Boston, MA 02111-1307, USA.

-
+ + + +

Copyright (C) +Xavier Roche and other contributors

+ +

This program is +free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as +published by the Free Software Foundation; either version 2 +of the License, or any later version.

+ +

This program is +distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details.

+ +

You should have +received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, +Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, +USA.

+ +

AVAILABILITY -

AVAILABILITY

- - - - - -
-

The most recent released version of httrack can be found -at: http://www.httrack.com

-
+ + + +

The most recent +released version of httrack can be found at: +http://www.httrack.com

+ +

AUTHOR -

AUTHOR

- - - - - -
-

Xavier Roche <roche@httrack.com>

-
+ + + +

Xavier Roche +<roche@httrack.com>

+ +

SEE ALSO -

SEE ALSO

- - - - - -
-

The HTML documentation (available online at + + + +

The HTML +documentation (available online at http://www.httrack.com/html/ ) contains more detailed information. Please also refer to the httrack FAQ (available online at http://www.httrack.com/html/faq.html )

-

-- cgit v1.2.3