From a016accc4bbf6fdbb3353b13bddec377f6563fad Mon Sep 17 00:00:00 2001
From: Xavier Roche
httrack [
-url ]... [ −filter ]... [ +filter ]... [ −O
-−−path ] [ −w
-−−mirror ] [ −W
-−−mirror−wizard ] [ −g
-−−get−files ] [ −i
-−−continue ] [ −Y
-−−mirrorlinks ] [ −P
-−−proxy ] [ −%f
-−−httpproxy−ftp[=N] ] [ −%b
-−−bind ] [ −rN
-−−depth[=N] ] [ −%eN
-−−ext−depth[=N] ] [ −mN
-−−max−files[=N] ] [ −MN
-−−max−size[=N] ] [ −EN
-−−max−time[=N] ] [ −AN
-−−max−rate[=N] ] [ −%cN
+url ]... [ −filter ]... [ +filter ]... [ −O,
+−−path ] [ −w,
+−−mirror ] [ −W,
+−−mirror−wizard ] [ −g,
+−−get−files ] [ −i,
+−−continue ] [ −Y,
+−−mirrorlinks ] [ −P,
+−−proxy ] [ −%f,
+−−httpproxy−ftp[=N] ] [ −%b,
+−−bind ] [ −rN,
+−−depth[=N] ] [ −%eN,
+−−ext−depth[=N] ] [ −mN,
+−−max−files[=N] ] [ −MN,
+−−max−size[=N] ] [ −EN,
+−−max−time[=N] ] [ −AN,
+−−max−rate[=N] ] [ −%cN,
−−connection−per−second[=N] ] [
-−GN −−max−pause[=N] ] [
-−cN −−sockets[=N] ] [ −TN
-−−timeout[=N] ] [ −RN
-−−retries[=N] ] [ −JN
-−−min−rate[=N] ] [ −HN
-−−host−control[=N] ] [ −%P
-−−extended−parsing[=N] ] [ −n
-−−near ] [ −t
-−−test ] [ −%L
-−−list ] [ −%S
-−−urllist ] [ −NN
-−−structure[=N] ] [ −%D
+−GN, −−max−pause[=N] ] [
+−cN, −−sockets[=N] ] [
+−TN, −−timeout[=N] ] [
+−RN, −−retries[=N] ] [
+−JN, −−min−rate[=N] ] [
+−HN, −−host−control[=N] ] [
+−%P, −−extended−parsing[=N] ]
+[ −n, −−near ] [ −t,
+−−test ] [ −%L,
+−−list ] [ −%S,
+−−urllist ] [ −NN,
+−−structure[=N] ] [ −%D,
−−cached−delayed−type−check
-] [ −%M −−mime−html ] [
-−LN −−long−names[=N] ] [
-−KN −−keep−links[=N] ] [
-−x −−replace−external ] [
-−%x −−disable−passwords ] [
-−%q
+] [ −%M, −−mime−html ] [
+−LN, −−long−names[=N] ] [
+−KN, −−keep−links[=N] ] [
+−x, −−replace−external ] [
+−%x, −−disable−passwords ] [
+−%q,
−−include−query−string ] [
-−o −−generate−errors ] [
-−X −−purge−old[=N] ] [
-−%p −−preserve ] [ −%T
-−−utf8−conversion ] [ −bN
-−−cookies[=N] ] [ −u
-−−check−type[=N] ] [ −j
-−−parse−java[=N] ] [ −sN
-−−robots[=N] ] [ −%h
-−−http−10 ] [ −%k
-−−keep−alive ] [ −%B
-−−tolerant ] [ −%s
-−−updatehack ] [ −%u
-−−urlhack ] [ −%A
-−−assume ] [ −@iN
-−−protocol[=N] ] [ −%w
-−−disable−module ] [ −F
-−−user−agent ] [ −%R
-−−referer ] [ −%E
-−−from ] [ −%F
-−−footer ] [ −%l
-−−language ] [ −%a
-−−accept ] [ −%X
-−−headers ] [ −C
-−−cache[=N] ] [ −k
+−o, −−generate−errors ] [
+−X, −−purge−old[=N] ] [
+−%p, −−preserve ] [ −%T,
+−−utf8−conversion ] [ −bN,
+−−cookies[=N] ] [ −u,
+−−check−type[=N] ] [ −j,
+−−parse−java[=N] ] [ −sN,
+−−robots[=N] ] [ −%h,
+−−http−10 ] [ −%k,
+−−keep−alive ] [ −%B,
+−−tolerant ] [ −%s,
+−−updatehack ] [ −%u,
+−−urlhack ] [ −%A,
+−−assume ] [ −@iN,
+−−protocol[=N] ] [ −%w,
+−−disable−module ] [ −F,
+−−user−agent ] [ −%R,
+−−referer ] [ −%E,
+−−from ] [ −%F,
+−−footer ] [ −%l,
+−−language ] [ −%a,
+−−accept ] [ −%X,
+−−headers ] [ −C,
+−−cache[=N] ] [ −k,
−−store−all−in−cache ] [
-−%n −−do−not−recatch ]
-[ −%v −−display ] [ −Q
-−−do−not−log ] [ −q
-−−quiet ] [ −z
-−−extra−log ] [ −Z
-−−debug−log ] [ −v
-−−verbose ] [ −f
-−−file−log ] [ −f2
-−−single−log ] [ −I
-−−index ] [ −%i
+−%n, −−do−not−recatch ]
+[ −%v, −−display ] [ −Q,
+−−do−not−log ] [ −q,
+−−quiet ] [ −z,
+−−extra−log ] [ −Z,
+−−debug−log ] [ −v,
+−−verbose ] [ −f,
+−−file−log ] [ −f2,
+−−single−log ] [ −I,
+−−index ] [ −%i,
−−build−top−index ] [
-−%I −−search−index ] [
-−pN −−priority[=N] ] [ −S
+−%I, −−search−index ] [
+−pN, −−priority[=N] ] [
+−S,
−−stay−on−same−dir ] [
-−D −−can−go−down ] [
-−U −−can−go−up ] [
-−B
+−D, −−can−go−down ] [
+−U, −−can−go−up ] [
+−B,
−−can−go−up−and−down
-] [ −a
+] [ −a,
−−stay−on−same−address ] [
-−d
+−d,
−−stay−on−same−domain ] [
-−l
+−l,
−−stay−on−same−tld ] [
-−e −−go−everywhere ] [
-−%H −−debug−headers ] [
-−%!
+−e, −−go−everywhere ] [
+−%H, −−debug−headers ] [
+−%!,
−−disable−security−limits ] [
-−V −−userdef−cmd ] [
-−%W −−callback ] [ −K
+−V, −−userdef−cmd ] [
+−%W, −−callback ] [ −K,
−−keep−links[=N] ] [
means get all files starting
-from bobby.html with 6 link−depth and possibility of
+from bobby.html, with 6 link−depth, and possibility of
going everywhere on the web
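As a rough illustration of the example described above, the following Python sketch assembles such a command line without running it. It assumes that the `+*` filter is what gives the "going everywhere" behaviour and that options are typed with plain ASCII hyphens; the helper name `build_httrack_argv` and the example.com URL are hypothetical.

```python
import shlex


def build_httrack_argv(start_url, depth, go_everywhere=True, out_dir=None):
    """Assemble an httrack argument list (illustrative sketch only)."""
    argv = ["httrack", start_url]
    if go_everywhere:
        # Assumption: a "+*" filter accepts every link, on any site.
        argv.append("+*")
    # -rN sets the link depth (--depth[=N] in the synopsis).
    argv.append(f"-r{depth}")
    if out_dir is not None:
        # -O takes the mirror path (--path <param>).
        argv += ["-O", out_dir]
    return argv


if __name__ == "__main__":
    # Placeholder URL standing in for the bobby.html page from the text.
    argv = build_httrack_argv("http://www.example.com/bob/bobby.html", depth=6)
    # shlex.join quotes "+*" so a shell would not expand the asterisk.
    print(shlex.join(argv))
```

Building a list rather than a single string keeps the `+*` filter away from shell globbing if the list is later handed to `subprocess.run`.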
httrack
@@ -233,7 +234,7 @@ options:
path for mirror/logfiles+cache (−O path
-mirror[path cache and logfiles]) (−−path
+mirror[,path cache and logfiles]) (−−path
<param>)
@@ -264,7 +265,7 @@ options:
-mirror web sites semi−automatic (asks questions)
+mirror web sites, semi−automatic (asks questions)
(−−mirror−wizard)
−rN
−%eN
−mN
-−mNN2
+−mN,N2
+maximum file length for non html (N) and html (N2)
−MN
−EN
-maximum mirror time in seconds (60=1 minute 3600=1 hour)
-(−−max−time[=N])
+maximum mirror time in seconds (60=1 minute, 3600=1
+hour) (−−max−time[=N])
−AN
−%cN
−GN
-pause transfer if N bytes reached and wait until lock
+pause transfer if N bytes reached, and wait until lock
file is deleted (−−max−pause[=N])
-timeout number of seconds after a non−responding
+timeout, number of seconds after a non−responding
link is shutdown (−−timeout[=N])
-number of retries in case of timeout or non−fatal
+number of retries, in case of timeout or non−fatal
errors (*R1) (−−retries[=N])
-traffic jam control minimum transfert rate
+traffic jam control, minimum transfert rate
(bytes/seconds) tolerated for a link (−−min−rate[=N])
-host is abandonned if: 0=never 1=timeout 2=slow
+host is abandonned if: 0=never, 1=timeout, 2=slow,
3=timeout or slow (−−host−control[=N])
-*extended parsing attempt to
-parse all links even in unknown tags or Javascript (%P0 don
+*extended parsing, attempt to
+parse all links, even in unknown tags or Javascript (%P0 don
t use) (−−extended−parsing[=N])
structure type (0 *original
-structure 1+: see below) (−−structure[=N])
-delayed type check don t make any link test but wait for
-files download to start instead (experimental) (%N0 don t
-use %N1 use for unknown extensions * %N2 always use)
+delayed type check, don t make any link test but wait
+for files download to start instead (experimental) (%N0 don
+t use, %N1 use for unknown extensions, * %N2 always use)
-cached delayed type check don t wait for remote type
-during updates to speedup them (%D0 wait * %D1 don t wait)
+cached delayed type check, don t wait for remote type
+during updates, to speedup them (%D0 wait, * %D1 don t wait)
(−−cached−delayed−type−check)
keep original links (e.g. http://www.adr/link) (K0
-*relative link K absolute links K4 original links K3
-absolute URI links K5 transparent proxy link)
+*relative link, K absolute links, K4 original links, K3
+absolute URI links, K5 transparent proxy link)
(−−keep−links[=N])
-*include query string for local files (useless for
+*include query string for local files (useless, for
information purpose only) (%q0 don t include) (−−include−query−string)
accept cookies in cookies.txt
-(0=do not accept* 1=accept) (−−cookies[=N])
-check document type if unknown (cgiasp..) (u0 don t
-check * u1 check but / u2 check always)
+check document type if unknown (cgi,asp..) (u0 don t
+check, * u1 check but /, u2 check always)
(−−check−type[=N])
-*parse Java Classes (j0 don t parse bitmask: |1 parse
-default |2 don t parse .class |4 don t parse .js |8 don t be
-aggressive) (−−parse−java[=N])
+*parse Java Classes (j0 don t parse, bitmask: |1 parse
+default, |2 don t parse .class |4 don t parse .js |8 don t
+be aggressive) (−−parse−java[=N])
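The bitmask wording above may be easier to follow with a small decoding sketch. The bit values (1, 2, 4, 8) and their meanings are taken from the option text; the `ParseJava` enum, its member names and the `describe_parse_java` helper are hypothetical and are not part of httrack.

```python
from enum import IntFlag


class ParseJava(IntFlag):
    """Bits of the -jN value as described in the option text."""
    PARSE_DEFAULT = 1    # |1 parse default
    NO_CLASS = 2         # |2 don't parse .class
    NO_JS = 4            # |4 don't parse .js
    NOT_AGGRESSIVE = 8   # |8 don't be aggressive


def describe_parse_java(n):
    """Return the names of the bits set in n; j0 (don't parse) gives []."""
    return [flag.name for flag in ParseJava if n & flag.value]


if __name__ == "__main__":
    # 2 | 4 = 6: skip both .class and .js parsing.
    print(describe_parse_java(6))
```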
follow robots.txt and meta robots tags
-(0=never1=sometimes* 2=always 3=always (even strict rules))
-(−−robots[=N])
+(0=never,1=sometimes,* 2=always, 3=always (even strict
+rules)) (−−robots[=N])
-force HTTP/1.0 requests (reduce update features only for
-old servers or proxies) (−−http−10)
+force HTTP/1.0 requests (reduce update features, only
+for old servers or proxies)
+(−−http−10)
-use keep−alive if possible greately reducing
+use keep−alive if possible, greately reducing
latency for small files and test requests (%k0 don t use) (−−keep−alive)
tolerant requests (accept bogus responses on some
-servers but not standard!) (−−tolerant)
update hacks: various hacks to limit re−transfers
-when updating (identical size bogus response..)
+when updating (identical size, bogus response..)
(−−updatehack)
url hacks: various hacks to limit duplicate URLs (strip
-// www.foo.com==foo.com..) (−−urlhack)
-assume that a type (cgiasp..) is always linked with a
+assume that a type (cgi,asp..) is always linked with a
mime type (−%A
-php3cgi=text/html;datbin=application/x−zip)
+php3,cgi=text/html;dat,bin=application/x−zip)
(−−assume <param>)
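The corrected parameter shows why the commas matter: without them the extension groups cannot be told apart. A minimal sketch of reading the corrected format follows, assuming groups are separated by `;`, extensions by `,`, and the MIME type follows `=`. The function `parse_assume` is hypothetical and only mirrors the example string, not httrack's actual parser.

```python
def parse_assume(spec):
    """Map each extension in an --assume style string to its MIME type."""
    mapping = {}
    for group in spec.split(";"):
        group = group.strip()
        if not group:
            continue
        extensions, sep, mime = group.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in group: {group!r}")
        for ext in extensions.split(","):
            ext = ext.strip()
            if ext:
                mapping[ext] = mime.strip()
    return mapping


if __name__ == "__main__":
    # Example string from the option text, typed with an ASCII hyphen.
    print(parse_assume("php3,cgi=text/html;dat,bin=application/x-zip"))
```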
-internet protocol (0=both ipv6+ipv4 4=ipv4 only 6=ipv6
+internet protocol (0=both ipv6+ipv4, 4=ipv4 only, 6=ipv6
only) (−−protocol[=N])
-preffered language (−%l "fr en jp *"
+preffered language (−%l "fr, en, jp, *"
(−−language <param>)
accepted formats (−%l
-"text/htmlimage/pngimage/jpegimage/gif;q=0.9*/*;q=0.1"
+"text/html,image/png,image/jpeg,image/gif;q=0.9,*/*;q=0.1"
(−−accept <param>)
-Log index
+Log, index,
cache
@@ -1247,7 +1249,7 @@ options:
-just scan don t save anything (for checking links)
@@ -1291,7 +1293,7 @@ options:
-get html files before then treat other files
@@ -1648,14 +1650,14 @@ doing)
bypass built−in security
-limits aimed to avoid bandwidth abuses (bandwidth
+limits aimed to avoid bandwidth abuses (bandwidth,
simultaneous connections) (−−disable−security−limits)
−IMPORTANT
-NOTE: DANGEROUS OPTION ONLY
+NOTE: DANGEROUS OPTION, ONLY
SUITABLE FOR EXPERTS
@@ -1705,7 +1707,7 @@ each files ($0 is the filename: −V "rm ")
-HTML in web/ images/other files in web/images/
+HTML in web/, images/other files in web/images/
-HTML in web/HTML images/other in web/images
+HTML in web/HTML, images/other in web/images
-HTML in web/ images/other in web/
+HTML in web/, images/other in web/
-HTML in web/ images/other in web/xxx where xxx is the
-file extension (all gif will be placed onto web/gif for
+HTML in web/, images/other in web/xxx, where xxx is the
+file extension (all gif will be placed onto web/gif, for
example)
-All files in web/ with random names (gadget !)
+All files in web/, with random names (gadget !)
-Site−structure without www.domain.xxx/
+Site−structure, without www.domain.xxx/
Details:
User−defined option N
%n Name of file without file type (ex: image)
-%N Name of file including file type (ex: image.gif)
+%N Name of file, including file type (ex: image.gif)
%t File type (ex: gif)
%p Path [without ending /] (ex: /someimages)
%h Host name (ex: www.someweb.com)
-%M URL MD5 (128 bits 32 ascii bytes)
-%Q query string MD5 (128 bits 32 ascii bytes)
+%M URL MD5 (128 bits, 32 ascii bytes)
+%Q query string MD5 (128 bits, 32 ascii bytes)
%k full query string
%r protocol name (ex: http)
-%q small query string MD5 (16 bits 4 ascii bytes)
+%q small query string MD5 (16 bits, 4 ascii bytes)
%s? Short name version (ex: %sN)
%[param] param variable in query string
%[param:before:after:empty:notfound] advanced variable
@@ -2040,8 +2042,8 @@ parameter could not be found
-fields except the first one (the parameter name) can be
-empty
+fields, except the first one (the parameter name), can
+be empty
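To make the placeholder list under "User−defined option N" more concrete, here is a hedged sketch that expands a few of the simple placeholders (%n, %N, %t, %p, %h, %M, %k, %r) for a given URL. It assumes %M is the MD5 of the full URL string rendered as 32 hexadecimal characters; httrack's own hashing input may differ. The function `expand_structure` is hypothetical, and %Q, %q, %s? and the %[param] forms are left out.

```python
import hashlib
import posixpath
import re
from urllib.parse import urlsplit


def expand_structure(template, url):
    """Expand a subset of the user-defined structure placeholders."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No file type in the name: everything is the stem.
        stem, ext = filename, ""
    values = {
        "n": stem,                    # name without file type (image)
        "N": filename,                # name including file type (image.gif)
        "t": ext,                     # file type (gif)
        "p": directory.rstrip("/"),   # path without ending /
        "h": parts.hostname or "",    # host name
        # Assumption: MD5 over the whole URL, 128 bits as 32 hex characters.
        "M": hashlib.md5(url.encode("utf-8")).hexdigest(),
        "k": parts.query,             # full query string
        "r": parts.scheme,            # protocol name (http)
    }
    return re.sub(r"%([nNtphMkr])", lambda m: values[m.group(1)], template)


if __name__ == "__main__":
    url = "http://www.someweb.com/someimages/image.gif"
    print(expand_structure("%h%p/%n.%t", url))
```

Placeholders outside the handled subset are passed through unchanged, so a template using %Q or %[param] would need further work.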
Details:
@@ -2060,7 +2062,7 @@ Option K
foo.cgi?q=45 −>
-foo4B54.html?q=45 (relative URI default)
+foo4B54.html?q=45 (relative URI, default)
-<URLs> get the files indicated do not seek other
+<URLs> get the files indicated, do not seek other
URLs (−qg)
−−spider
-<URLs> spider site(s) to
+<URLs> spider site(s), to
test links: reports Errors & Warnings (−p0C0I0t)
@@ -2169,17 +2171,17 @@ test links: reports Errors & Warnings
−−skeleton
-<URLs> make a mirror but
+<URLs> make a mirror, but
gets only html files (−p1)
−−update
-update a mirror without
+update a mirror, without
confirmation (−iC2)
−−continue
-continue a mirror without
+continue a mirror, without
confirmation (−iC1)
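The shortcut entries above each name the short options they stand for. The sketch below records those equivalences as a lookup table, using only the pairs visible in this excerpt; the (−qg) shortcut is omitted because its long name does not appear here. httrack performs this expansion itself, so `expand_shortcuts` is a hypothetical helper for illustration, written with ASCII hyphens.

```python
# Long shortcut -> equivalent short options, as listed in the excerpt.
SHORTCUTS = {
    "--spider": "-p0C0I0t",   # test links, report errors and warnings
    "--skeleton": "-p1",      # mirror, but html files only
    "--update": "-iC2",       # update a mirror, without confirmation
    "--continue": "-iC1",     # continue a mirror, without confirmation
}


def expand_shortcuts(argv):
    """Replace known shortcut options with their documented equivalents."""
    return [SHORTCUTS.get(arg, arg) for arg in argv]


if __name__ == "__main__":
    # Placeholder URL; any argument not in the table is left untouched.
    print(expand_shortcuts(["httrack", "--spider", "http://www.example.com/"]))
```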
−−catchurl
-- cgit v1.2.3