From aafafee47ef8653d2d38ec28804061f8e12e3478 Mon Sep 17 00:00:00 2001
From: Xavier Roche
httrack [ -url ]... [ −filter ]... [ +filter ]... [ −O, -−−path ] [ −w, -−−mirror ] [ −W, -−−mirror−wizard ] [ −g, -−−get−files ] [ −i, -−−continue ] [ −Y, -−−mirrorlinks ] [ −P, -−−proxy ] [ −%f, -−−httpproxy−ftp[=N] ] [ −%b, -−−bind ] [ −rN, -−−depth[=N] ] [ −%eN, -−−ext−depth[=N] ] [ −mN, -−−max−files[=N] ] [ −MN, -−−max−size[=N] ] [ −EN, -−−max−time[=N] ] [ −AN, -−−max−rate[=N] ] [ −%cN, +url ]... [ −filter ]... [ +filter ]... [ −O +−−path ] [ −w +−−mirror ] [ −W +−−mirror−wizard ] [ −g +−−get−files ] [ −i +−−continue ] [ −Y +−−mirrorlinks ] [ −P +−−proxy ] [ −%f +−−httpproxy−ftp[=N] ] [ −%b +−−bind ] [ −rN +−−depth[=N] ] [ −%eN +−−ext−depth[=N] ] [ −mN +−−max−files[=N] ] [ −MN +−−max−size[=N] ] [ −EN +−−max−time[=N] ] [ −AN +−−max−rate[=N] ] [ −%cN −−connection−per−second[=N] ] [ -−GN, −−max−pause[=N] ] [ -−cN, −−sockets[=N] ] [ -−TN, −−timeout[=N] ] [ -−RN, −−retries[=N] ] [ -−JN, −−min−rate[=N] ] [ -−HN, −−host−control[=N] ] [ -−%P, −−extended−parsing[=N] ] -[ −n, −−near ] [ −t, -−−test ] [ −%L, -−−list ] [ −%S, -−−urllist ] [ −NN, -−−structure[=N] ] [ −%D, +−GN −−max−pause[=N] ] [ +−cN −−sockets[=N] ] [ −TN +−−timeout[=N] ] [ −RN +−−retries[=N] ] [ −JN +−−min−rate[=N] ] [ −HN +−−host−control[=N] ] [ −%P +−−extended−parsing[=N] ] [ −n +−−near ] [ −t +−−test ] [ −%L +−−list ] [ −%S +−−urllist ] [ −NN +−−structure[=N] ] [ −%D −−cached−delayed−type−check -] [ −%M, −−mime−html ] [ -−LN, −−long−names[=N] ] [ -−KN, −−keep−links[=N] ] [ -−x, −−replace−external ] [ -−%x, −−disable−passwords ] [ -−%q, +] [ −%M −−mime−html ] [ +−LN −−long−names[=N] ] [ +−KN −−keep−links[=N] ] [ +−x −−replace−external ] [ +−%x −−disable−passwords ] [ +−%q −−include−query−string ] [ -−o, −−generate−errors ] [ -−X, −−purge−old[=N] ] [ -−%p, −−preserve ] [ −%T, -−−utf8−conversion ] [ −bN, -−−cookies[=N] ] [ −u, -−−check−type[=N] ] [ −j, -−−parse−java[=N] ] [ −sN, -−−robots[=N] ] [ −%h, -−−http−10 ] [ −%k, -−−keep−alive ] [ −%B, -−−tolerant ] [ −%s, -−−updatehack ] [ −%u, -−−urlhack ] [ 
−%A, -−−assume ] [ −@iN, -−−protocol[=N] ] [ −%w, -−−disable−module ] [ −F, -−−user−agent ] [ −%R, -−−referer ] [ −%E, -−−from ] [ −%F, -−−footer ] [ −%l, -−−language ] [ −%a, -−−accept ] [ −%X, -−−headers ] [ −C, -−−cache[=N] ] [ −k, +−o −−generate−errors ] [ +−X −−purge−old[=N] ] [ +−%p −−preserve ] [ −%T +−−utf8−conversion ] [ −bN +−−cookies[=N] ] [ −u +−−check−type[=N] ] [ −j +−−parse−java[=N] ] [ −sN +−−robots[=N] ] [ −%h +−−http−10 ] [ −%k +−−keep−alive ] [ −%B +−−tolerant ] [ −%s +−−updatehack ] [ −%u +−−urlhack ] [ −%A +−−assume ] [ −@iN +−−protocol[=N] ] [ −%w +−−disable−module ] [ −F +−−user−agent ] [ −%R +−−referer ] [ −%E +−−from ] [ −%F +−−footer ] [ −%l +−−language ] [ −%a +−−accept ] [ −%X +−−headers ] [ −C +−−cache[=N] ] [ −k −−store−all−in−cache ] [ -−%n, −−do−not−recatch ] -[ −%v, −−display ] [ −Q, -−−do−not−log ] [ −q, -−−quiet ] [ −z, -−−extra−log ] [ −Z, -−−debug−log ] [ −v, -−−verbose ] [ −f, -−−file−log ] [ −f2, -−−single−log ] [ −I, -−−index ] [ −%i, +−%n −−do−not−recatch ] +[ −%v −−display ] [ −Q +−−do−not−log ] [ −q +−−quiet ] [ −z +−−extra−log ] [ −Z +−−debug−log ] [ −v +−−verbose ] [ −f +−−file−log ] [ −f2 +−−single−log ] [ −I +−−index ] [ −%i −−build−top−index ] [ -−%I, −−search−index ] [ -−pN, −−priority[=N] ] [ -−S, +−%I −−search−index ] [ +−pN −−priority[=N] ] [ −S −−stay−on−same−dir ] [ -−D, −−can−go−down ] [ -−U, −−can−go−up ] [ -−B, +−D −−can−go−down ] [ +−U −−can−go−up ] [ +−B −−can−go−up−and−down -] [ −a, +] [ −a −−stay−on−same−address ] [ -−d, +−d −−stay−on−same−domain ] [ -−l, +−l −−stay−on−same−tld ] [ -−e, −−go−everywhere ] [ -−%H, −−debug−headers ] [ -−%!, +−e −−go−everywhere ] [ +−%H −−debug−headers ] [ +−%! −−disable−security−limits ] [ -−V, −−userdef−cmd ] [ -−%W, −−callback ] [ −K, +−V −−userdef−cmd ] [ +−%W −−callback ] [ −K −−keep−links[=N] ] [
means get all files starting
-from bobby.html, with 6 link−depth, and possibility of
+from bobby.html with 6 link−depth and possibility of
going everywhere on the web
httrack @@ -234,7 +233,7 @@ options:
path for mirror/logfiles+cache (−O path
-mirror[,path cache and logfiles]) (−−path
+mirror[path cache and logfiles]) (−−path
<param>)
@@ -265,7 +264,7 @@ options:
-mirror web sites, semi−automatic (asks questions)
+mirror web sites semi−automatic (asks questions)
(−−mirror−wizard)
−rN
−%eN
−mN
−mN,N2
−mNN2
+maximum file length for non html (N) and html (N2)
−MN
−EN
-maximum mirror time in seconds (60=1 minute, 3600=1
-hour) (−−max−time[=N])
+maximum mirror time in seconds (60=1 minute 3600=1 hour)
+(−−max−time[=N])
−AN
−%cN
−GN
-pause transfer if N bytes reached, and wait until lock
+pause transfer if N bytes reached and wait until lock
file is deleted (−−max−pause[=N])
-timeout, number of seconds after a non−responding
+timeout number of seconds after a non−responding
link is shutdown (−−timeout[=N])
-number of retries, in case of timeout or non−fatal
+number of retries in case of timeout or non−fatal
errors (*R1) (−−retries[=N])
-traffic jam control, minimum transfert rate
+traffic jam control minimum transfert rate
(bytes/seconds) tolerated for a link (−−min−rate[=N])
-host is abandonned if: 0=never, 1=timeout, 2=slow,
+host is abandonned if: 0=never 1=timeout 2=slow
3=timeout or slow (−−host−control[=N])
-*extended parsing, attempt to
-parse all links, even in unknown tags or Javascript (%P0 don
+*extended parsing attempt to
+parse all links even in unknown tags or Javascript (%P0 don
t use) (−−extended−parsing[=N])
structure type (0 *original -structure, 1+: see below) (−−structure[=N])
-delayed type check, don t make any link test but wait
-for files download to start instead (experimental) (%N0 don
-t use, %N1 use for unknown extensions, * %N2 always use)
+delayed type check don t make any link test but wait for
+files download to start instead (experimental) (%N0 don t
+use %N1 use for unknown extensions * %N2 always use)
-cached delayed type check, don t wait for remote type
-during updates, to speedup them (%D0 wait, * %D1 don t wait)
+cached delayed type check don t wait for remote type
+during updates to speedup them (%D0 wait * %D1 don t wait)
(−−cached−delayed−type−check)
keep original links (e.g. http://www.adr/link) (K0
-*relative link, K absolute links, K4 original links, K3
-absolute URI links, K5 transparent proxy link)
+*relative link K absolute links K4 original links K3
+absolute URI links K5 transparent proxy link)
(−−keep−links[=N])
-*include query string for local files (useless, for
+*include query string for local files (useless for
information purpose only) (%q0 don t include) (−−include−query−string)
accept cookies in cookies.txt -(0=do not accept,* 1=accept) (−−cookies[=N])
-check document type if unknown (cgi,asp..) (u0 don t
-check, * u1 check but /, u2 check always)
+check document type if unknown (cgiasp..) (u0 don t
+check * u1 check but / u2 check always)
(−−check−type[=N])
-*parse Java Classes (j0 don t parse, bitmask: |1 parse
-default, |2 don t parse .class |4 don t parse .js |8 don t
-be aggressive) (−−parse−java[=N])
+*parse Java Classes (j0 don t parse bitmask: |1 parse
+default |2 don t parse .class |4 don t parse .js |8 don t be
+aggressive) (−−parse−java[=N])
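The `−j` description above treats N as a bitmask. A minimal Python sketch of how such a value could be composed follows; the enum and function names are hypothetical and exist only for illustration, and the bit values are the ones listed in the option text (1, 2, 4, 8). This is not HTTrack's own code.

```python
from enum import IntFlag


class ParseJava(IntFlag):
    """Bits listed in the -j / --parse-java[=N] description (names are hypothetical)."""
    PARSE_DEFAULT = 1   # |1 parse default
    SKIP_CLASS = 2      # |2 don't parse .class
    SKIP_JS = 4         # |4 don't parse .js
    NOT_AGGRESSIVE = 8  # |8 don't be aggressive


def parse_java_option(flags: ParseJava) -> str:
    """Render the combined bitmask as a short -jN argument."""
    return f"-j{int(flags)}"


if __name__ == "__main__":
    # Parse by default, but skip .class files and avoid aggressive parsing.
    print(parse_java_option(
        ParseJava.PARSE_DEFAULT | ParseJava.SKIP_CLASS | ParseJava.NOT_AGGRESSIVE
    ))
```

Each bit is independent, so values combine with `|`; a value of 0 corresponds to `j0` (don't parse).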
follow robots.txt and meta robots tags
-(0=never,1=sometimes,* 2=always, 3=always (even strict
-rules)) (−−robots[=N])
+(0=never1=sometimes* 2=always 3=always (even strict rules))
+(−−robots[=N])
-force HTTP/1.0 requests (reduce update features, only
-for old servers or proxies)
-(−−http−10)
+force HTTP/1.0 requests (reduce update features only for
+old servers or proxies) (−−http−10)
-use keep−alive if possible, greately reducing
+use keep−alive if possible greately reducing
latency for small files and test requests (%k0 don t use) (−−keep−alive)
tolerant requests (accept bogus responses on some -servers, but not standard!) (−−tolerant)
update hacks: various hacks to limit re−transfers
-when updating (identical size, bogus response..)
+when updating (identical size bogus response..)
(−−updatehack)
url hacks: various hacks to limit duplicate URLs (strip -//, www.foo.com==foo.com..) (−−urlhack)
-assume that a type (cgi,asp..) is always linked with a
+assume that a type (cgiasp..) is always linked with a
mime type (−%A
-php3,cgi=text/html;dat,bin=application/x−zip)
+php3cgi=text/html;datbin=application/x−zip)
(−−assume <param>)
-internet protocol (0=both ipv6+ipv4, 4=ipv4 only, 6=ipv6
+internet protocol (0=both ipv6+ipv4 4=ipv4 only 6=ipv6
only) (−−protocol[=N])
-preffered language (−%l "fr, en, jp, *"
+preffered language (−%l "fr en jp *"
(−−language <param>)
accepted formats (−%l
-"text/html,image/png,image/jpeg,image/gif;q=0.9,*/*;q=0.1"
+"text/htmlimage/pngimage/jpegimage/gif;q=0.9*/*;q=0.1"
(−−accept <param>)
-Log, index,
+Log index
cache
@@ -1249,7 +1247,7 @@ options:
-just scan, don t save anything (for checking links)
@@ -1293,7 +1291,7 @@ options:
-get html files before, then treat other files
@@ -1650,14 +1648,14 @@ doing)
bypass built−in security
-limits aimed to avoid bandwidth abuses (bandwidth,
+limits aimed to avoid bandwidth abuses (bandwidth
simultaneous connections) (−−disable−security−limits)
−IMPORTANT
-NOTE: DANGEROUS OPTION, ONLY
+NOTE: DANGEROUS OPTION ONLY
SUITABLE FOR EXPERTS
-HTML in web/, images/other files in web/images/
+HTML in web/ images/other files in web/images/
-HTML in web/HTML, images/other in web/images
+HTML in web/HTML images/other in web/images
-HTML in web/, images/other in web/
+HTML in web/ images/other in web/
-HTML in web/, images/other in web/xxx, where xxx is the
-file extension (all gif will be placed onto web/gif, for
+HTML in web/ images/other in web/xxx where xxx is the
+file extension (all gif will be placed onto web/gif for
example)
-All files in web/, with random names (gadget !)
+All files in web/ with random names (gadget !)
-Site−structure, without www.domain.xxx/
+Site−structure without www.domain.xxx/
Details:
User−defined option N
%n Name of file without file type (ex: image)
-%N Name of file, including file type (ex: image.gif)
+%N Name of file including file type (ex: image.gif)
%t File type (ex: gif)
%p Path [without ending /] (ex: /someimages)
%h Host name (ex: www.someweb.com)
-%M URL MD5 (128 bits, 32 ascii bytes)
-%Q query string MD5 (128 bits, 32 ascii bytes)
+%M URL MD5 (128 bits 32 ascii bytes)
+%Q query string MD5 (128 bits 32 ascii bytes)
%k full query string
%r protocol name (ex: http)
-%q small query string MD5 (16 bits, 4 ascii bytes)
+%q small query string MD5 (16 bits 4 ascii bytes)
%s? Short name version (ex: %sN)
%[param] param variable in query string
%[param:before:after:empty:notfound] advanced variable
@@ -2042,8 +2040,8 @@ parameter could not be found
-fields, except the first one (the parameter name), can
-be empty
+fields except the first one (the parameter name) can be
+empty
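To make the user-defined structure tokens listed above concrete, here is a minimal Python sketch that expands a pattern against a URL. It is an illustration under stated assumptions, not HTTrack's implementation: the function name `expand_structure` is hypothetical, only a subset of the tokens is covered (`%n`, `%N`, `%t`, `%p`, `%h`, `%r`, `%k`, `%M`, `%Q`), and the exact string HTTrack feeds to MD5 is assumed here to be the full URL (for `%M`) and the raw query string (for `%Q`).

```python
import hashlib
import posixpath
from urllib.parse import urlsplit


def expand_structure(pattern: str, url: str) -> str:
    """Expand a subset of the %-tokens from the list above (illustrative only)."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    stem, ext = posixpath.splitext(filename)
    values = {
        "%n": stem,                    # name of file without file type
        "%N": filename,                # name of file including file type
        "%t": ext.lstrip("."),         # file type
        "%p": directory.rstrip("/"),   # path, without ending /
        "%h": parts.netloc,            # host name
        "%r": parts.scheme,            # protocol name
        "%k": parts.query,             # full query string
        # Assumption: the MD5 input is the whole URL / the raw query string.
        "%M": hashlib.md5(url.encode("utf-8")).hexdigest(),
        "%Q": hashlib.md5(parts.query.encode("utf-8")).hexdigest(),
    }
    out = []
    i = 0
    while i < len(pattern):
        token = pattern[i:i + 2]
        if token in values:
            out.append(values[token])
            i += 2
        else:
            out.append(pattern[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # Uses the example host, path and file name from the token list.
    print(expand_structure("%h%p/%n.%t",
                           "http://www.someweb.com/someimages/image.gif"))
```

With the example values from the list (`www.someweb.com`, `/someimages`, `image.gif`), the pattern `%h%p/%n.%t` rebuilds `www.someweb.com/someimages/image.gif`, and `%M` yields 32 hexadecimal characters, matching the "128 bits, 32 ascii bytes" description.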
Details:
@@ -2062,7 +2060,7 @@ Option K
foo.cgi?q=45 −>
-foo4B54.html?q=45 (relative URI, default)
+foo4B54.html?q=45 (relative URI default)
-<URLs> get the files indicated, do not seek other
+<URLs> get the files indicated do not seek other
URLs (−qg)
−−spider
-<URLs> spider site(s), to
+<URLs> spider site(s) to
test links: reports Errors & Warnings (−p0C0I0t)
@@ -2171,17 +2169,17 @@ test links: reports Errors & Warnings
−−skeleton
-<URLs> make a mirror, but
+<URLs> make a mirror but
gets only html files (−p1)
−−update
-update a mirror, without
+update a mirror without
confirmation (−iC2)
−−continue
-continue a mirror, without
+continue a mirror without
confirmation (−iC1)
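The shortcut entries above each pair a long name with the short option string it stands for. A minimal Python sketch of that mapping follows. The table is taken from the entries shown in this hunk; the helper name `expand_shortcuts` and the example URL are hypothetical, and HTTrack's real argument parser may differ.

```python
# Shortcut name -> equivalent short options, as listed in the hunk above.
SHORTCUTS = {
    "--spider": "-p0C0I0t",   # test links: reports Errors & Warnings
    "--skeleton": "-p1",      # make a mirror but get only html files
    "--update": "-iC2",       # update a mirror without confirmation
    "--continue": "-iC1",     # continue a mirror without confirmation
}


def expand_shortcuts(argv):
    """Replace known shortcut names with their short-option form."""
    return [SHORTCUTS.get(arg, arg) for arg in argv]


if __name__ == "__main__":
    # example.com is a placeholder URL, not taken from the man page.
    print(expand_shortcuts(["httrack", "http://example.com/", "--spider"]))
```

Arguments that are not in the table pass through unchanged, so URLs and filters are left as they are.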
−−catchurl
-- cgit v1.2.3