From ad5b7acc19290ff91e0f42a0de448a26760fcf99 Mon Sep 17 00:00:00 2001
From: Xavier Roche
Date: Mon, 19 Mar 2012 12:36:11 +0000
Subject: Imported httrack 3.20.2

---
 HelpHtml/options.html | 363 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 363 insertions(+)
 create mode 100644 HelpHtml/options.html

(limited to 'HelpHtml/options.html')

diff --git a/HelpHtml/options.html b/HelpHtml/options.html
new file mode 100644
index 0000000..7db0516
--- /dev/null
+++ b/HelpHtml/options.html
@@ -0,0 +1,363 @@
+HTTrack Website Copier - Offline Browser
+HTTrack Website Copier
+Open Source offline browser
+
+Options
+
+  • Filters: how to use them
+    Here you can find information on filters: how to accept all gif files in a mirror, for example
+
+  • List of options
+
+w  mirror with automatic wizard
+This is the default scanning option: the engine automatically scans links according to the default options and the filters defined. It does not prompt when a "foreign" link is reached.
+
+W  semi-automatic mirror with help-wizard (asks questions)
+This option lets the engine ask the user whether a link should be mirrored when a new web site is found.
+
+g  just get files (saved in the current directory)
+This option forces the engine not to scan the files indicated - i.e. the engine only gets the files indicated.
+
+i  continue an interrupted mirror using the cache
+This option indicates to the engine that a mirror must be updated or continued.
+
+rN  recurse get with limited link depth of N
+This option sets the maximum recursion depth. The default is infinite (the engine "knows" that it should not leave the current domain).
+
+a   stay on the same address
+This is the default primary scanning option: the engine does not leave the current address unless it is permitted to (by filters, for example).
+
+d   stay on the same principal domain
+This option lets the engine follow links to all sites on the same principal domain.
+Example: a link located at www.someweb.com that goes to members.someweb.com will be followed.
+
+l   stay on the same location (.com, etc.)
+This option lets the engine follow links to all sites on the same location (.com, etc.).
+Example: a link located at www.someweb.com that goes to www.anyotherweb.com will be followed.
+Warning: this is a potentially dangerous option; limit the recursion depth with the r option.
+
+e   go everywhere on the web
+This option lets the engine follow links to any site.
+Example: a link located at www.someweb.com that goes to www.anyotherweb.org will be followed.
+Warning: this is a potentially dangerous option; limit the recursion depth with the r option.
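The four scope options above (a, d, l, e) can be compared side by side in a small sketch. It only restates the rules as this page describes them and is not HTTrack's actual code. The function name `link_in_scope` is hypothetical, and treating the "principal domain" as the last two host labels is an assumption.

```python
def link_in_scope(mode: str, start_host: str, link_host: str) -> bool:
    """Return True if a link to link_host may be followed from start_host."""
    start = start_host.lower().split(".")
    link = link_host.lower().split(".")
    if mode == "a":
        # stay on the same address: hosts must match exactly
        return start == link
    if mode == "d":
        # same principal domain (assumption: the last two labels)
        return start[-2:] == link[-2:]
    if mode == "l":
        # same location: only the final label (.com, etc.) must match
        return start[-1] == link[-1]
    if mode == "e":
        # go everywhere on the web
        return True
    raise ValueError(f"unknown scope mode: {mode!r}")


if __name__ == "__main__":
    for mode in "adle":
        print(mode, link_in_scope(mode, "www.someweb.com", "members.someweb.com"))
```

Under these assumptions, the examples given above hold: d follows members.someweb.com, l follows www.anyotherweb.com but not www.anyotherweb.org, and e follows all of them.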
+
+n   get non-html files 'near' an html file (ex: an image located outside)
+This option lets the engine fetch all files that are referenced on a page but located outside the web site.
+Example: a list of links to ZIP files on a page.
+
+t   test all URLs (even forbidden ones)
+This option lets the engine test all links that are not fetched.
+Example: to test broken links in a site
+
+x   replace external html links by error pages
+This option tells the engine to rewrite all links that are not mirrored so that they point to warning pages.
+Example: to browse a site offline, and to warn readers that they must be online when they click external links.
+
+sN  follow robots.txt and meta robots tags
+This option sets the way the engine treats "robots.txt" files. This file is often set up by webmasters to exclude cgi-bin directories or other irrelevant pages.
+Values: 
+  s0  Do not follow robots.txt rules
+  s1  Follow rules, if compatible with internal filters
+  s2  Always follow site's rules
+
+bN  accept cookies in cookies.txt
+This option enables or disables cookie handling
+  b0 do not accept cookies
+  b1 accept cookies
+
+S   stay on the same directory
+This option asks the engine to stay on the same folder level.
+Example: A link in /index.html that points to /sub/other.html will not be followed
+
+D   can only go down into subdirs
+This is the default option: the engine can go anywhere in the same directory, or in lower structures
+
+U   can only go to upper directories
+This option asks the engine to stay on the same folder level or in upper structures
+
+B   can both go up&down into the directory structure
+This option lets the engine go to any directory level
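The directory options above (S, D, U, B) can be sketched as a single check on URL paths. This is only an illustration of the rules as worded on this page, not the engine's implementation. The names `may_follow` and `_directory` are hypothetical, and comparing directories as string prefixes is an assumption.

```python
import posixpath


def _directory(path: str) -> str:
    """Return the directory part of a URL path, always ending with '/'."""
    d = posixpath.dirname(path)
    return d if d.endswith("/") else d + "/"


def may_follow(mode: str, start_path: str, link_path: str) -> bool:
    """Return True if link_path may be followed when the mirror starts at start_path."""
    start, link = _directory(start_path), _directory(link_path)
    if mode == "S":
        # stay on the same directory
        return link == start
    if mode == "D":
        # same directory or lower structures
        return link.startswith(start)
    if mode == "U":
        # same directory or upper structures
        return start.startswith(link)
    if mode == "B":
        # both up and down
        return True
    raise ValueError(f"unknown directory mode: {mode!r}")


if __name__ == "__main__":
    for mode in "SDUB":
        print(mode, may_follow(mode, "/index.html", "/sub/other.html"))
```

With the example from the S option, a link from /index.html to /sub/other.html is rejected under S and U, and accepted under D and B.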
+
+Y   mirror ALL links located in the first level pages (mirror links)
+This option is activated for the links typed in the command line
+Example: if you have a list of web sites in www.asitelist.com/index.html, then all these sites will be mirrored
+
+NN  name conversion type (0 *original structure 1,2,3 html/data in one directory)
+  N0 Site-structure (default)
+  N1 Html in web/, images/other files in web/images/
+  N2 Html in web/html, images/other in web/images
+  N3 Html in web/,  images/other in web/
+  N4 Html in web/, images/other in web/xxx, where xxx is the file extension (all gif files are placed in web/gif, for example)
+  N5 Images/other in web/xxx and Html in web/html
+
+  N99 All files in web/, with random names (gadget!)
+
+  N100 Site-structure, without www.domain.xxx/
+  N101 Identical to N1 except that "web" is replaced by the site's name
+  N102 Identical to N2 except that "web" is replaced by the site's name
+  N103 Identical to N3 except that "web" is replaced by the site's name
+  N104 Identical to N4 except that "web" is replaced by the site's name
+  N105 Identical to N5 except that "web" is replaced by the site's name
+  N199 Identical to N99 except that "web" is replaced by the site's name
+
+  N1001 Identical to N1 except that there is no "web" directory
+  N1002 Identical to N2 except that there is no "web" directory
+  N1003 Identical to N3 except that there is no "web" directory (this is the setting used by the g option)
+  N1004 Identical to N4 except that there is no "web" directory
+  N1005 Identical to N5 except that there is no "web" directory
+  N1099 Identical to N99 except that there is no "web" directory
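The layouts listed above can be summarised as a mapping from a URL to a local path. The sketch below covers N0 to N5, N100 to N105 and N1001 to N1005 as described in the list, and leaves out the random-name variants (N99, N199, N1099). The function name `local_path` and the set of extensions treated as html are assumptions, and name collisions are ignored.

```python
import posixpath

HTML_EXTENSIONS = {"html", "htm"}  # assumption: what counts as "Html" here


def local_path(n: int, host: str, url_path: str) -> str:
    """Map a URL (host + path) to a local path for name conversion type n."""
    name = posixpath.basename(url_path)
    ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    is_html = ext in HTML_EXTENSIONS

    if n == 0:      # site structure
        return host + url_path
    if n == 100:    # site structure, without www.domain.xxx/
        return url_path.lstrip("/")

    # Root directory: "web", the site's name (N10x), or none (N100x).
    root = "" if n >= 1000 else host if n >= 100 else "web"
    subdirs = {
        1: "" if is_html else "images",
        2: "html" if is_html else "images",
        3: "",
        4: "" if is_html else ext,
        5: "html" if is_html else ext,
    }
    base = n % 100
    if base not in subdirs:
        raise ValueError(f"name conversion type not covered by this sketch: {n}")
    return posixpath.join(root, subdirs[base], name)


if __name__ == "__main__":
    for n in (0, 1, 4, 104, 1003):
        print(n, local_path(n, "www.someweb.com", "/img/logo.gif"))
```

Trying a few values of n on the same URL makes the difference between the "web", site-name and no-directory families easy to see.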
+
+LN  long names
+  L0 Filenames and directory names are limited to 8 characters + 3 for extension
+  L1 No restrictions (default)
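The L0 limit (8 characters plus 3 for the extension) can be illustrated with plain truncation. This is a minimal sketch under that assumption only; the names `short_component` and `short_path` are hypothetical, and a real implementation would also have to keep truncated names unique, which this page does not describe.

```python
def short_component(name: str) -> str:
    """Truncate one file or directory name to 8 characters + 3 for the extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem:
        # no extension (or a name starting with a dot): keep 8 characters
        return name[:8]
    return stem[:8] + "." + ext[:3]


def short_path(path: str) -> str:
    """Apply the limit to every component of a '/'-separated path."""
    return "/".join(short_component(part) for part in path.split("/"))


if __name__ == "__main__":
    print(short_path("documentation/verylongfilename.html"))
```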
+
+K   keep original links (e.g. http://www.adr/link) (K0 *relative link)
+This option has only been kept for compatibility reasons
+
+pN  priority mode:
+  p0 just scan, don't save anything (for checking links)
+  p1 save only html files
+  p2 save only non html files
+  p3 save all files
+  p7 get html files first, then process other files
+
+cN  number of multiple connections (*c8)
+Sets the number of simultaneous connections
+
+O   path for mirror/logfiles+cache (-O path_mirror[,path_cache_and_logfiles])
+This option defines the path for the mirror and the log files
+Example: -O "/user/webs","/user/logs"
+
+P   proxy use (-P proxy:port or -P user:pass@proxy:port)
+This option defines the proxy used for this mirror
+Example: -P proxy.myhost.com:8080
+
+F   user-agent field (-F "user-agent name")
+This option defines the user-agent field
+Example: -F "Mozilla/4.5 (compatible; HTTrack 1.2x; Windows 98)"
+
+mN maximum file length for a non-html file
+This option defines the maximum size for non-html files
+Example: -m100000
+
+mN,N'  for non html (N) and html (N')
+This option defines the maximum size for non-html files and html files
+Example: -m100000,250000
+
+MN maximum overall size that can be downloaded/scanned
+This option defines the maximum number of bytes that can be downloaded
+Example: -M1000000
+
+EN maximum mirror time in seconds (60=1 minute, 3600=1 hour)
+This option defines the maximum time that the mirror can last
+Example: -E3600
+
+AN maximum transfer rate in bytes/second (1000=1 KB/s max)
+This option defines the maximum transfer rate
+Example: -A2000
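The M, E and A limits interact: at a capped rate, either the size limit or the time limit is reached first. The arithmetic can be sketched as follows. The function name `first_limit_reached` is hypothetical, and the sketch assumes the transfer runs at exactly the maximum rate for the whole mirror, which a real mirror rarely does.

```python
def first_limit_reached(max_bytes: int, max_seconds: int, max_rate: int):
    """Return (option, seconds, bytes) for the limit hit first at full rate.

    max_bytes   -- value given to M (overall size limit, in bytes)
    max_seconds -- value given to E (mirror time limit, in seconds)
    max_rate    -- value given to A (transfer rate limit, in bytes/second)
    """
    seconds_to_reach_size = max_bytes / max_rate
    if seconds_to_reach_size <= max_seconds:
        return ("M", seconds_to_reach_size, max_bytes)
    return ("E", max_seconds, max_rate * max_seconds)


if __name__ == "__main__":
    # The example values used on this page: -M1000000 -E3600 -A2000
    print(first_limit_reached(1000000, 3600, 2000))
```

With the example values above, the size limit is reached after 500 seconds, well before the one-hour time limit. Conversely, -A2000 with -E3600 alone allows at most 2000 x 3600 = 7200000 bytes.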
+
+GN pause transfer when N bytes are reached, and wait until the lock file is deleted
+This option asks the engine to pause every time N bytes have been transferred, and to restart when the lock file "hts-pause.lock" is deleted
+Example: -G20000000
+
+u check document type if unknown (cgi,asp..)
+This option defines the way the engine checks the file type
+  u0 do not check
+  u1 check, except for /
+  u2 check always
+
+RN number of retries, in case of timeout or non-fatal errors (*R0)
+This option sets the maximum number of tries that can be processed for a file
+
+o *generate output html file in case of error (404..) (o0 don't generate)
+This option defines whether the engine generates an html output file when an error occurs
+
+TN timeout, number of seconds after which a non-responding link is shut down
+This option defines the timeout
+Example: -T120
+
+JN traffic jam control, minimum transfer rate (bytes/second) tolerated for a link
+This option defines the minimum transfer rate
+Example: -J200
+
+HN host is abandoned if: 0=never, 1=timeout, 2=slow, 3=timeout or slow
+This option defines whether the engine abandons a host when a timeout or "too slow" error occurs
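The four H values listed above behave like two flags, one for timeouts (T) and one for slow links (J). A minimal sketch of that reading follows. The function name `should_abandon_host` is hypothetical, and the bit-flag interpretation is an assumption that simply matches the 0 to 3 table.

```python
def should_abandon_host(h: int, timed_out: bool, too_slow: bool) -> bool:
    """Decide whether a host is abandoned for a given H value.

    h         -- 0=never, 1=timeout, 2=slow, 3=timeout or slow
    timed_out -- the link did not respond within the T timeout
    too_slow  -- the transfer rate fell below the J minimum
    """
    if h not in (0, 1, 2, 3):
        raise ValueError(f"H must be 0, 1, 2 or 3, got {h}")
    return bool((h & 1 and timed_out) or (h & 2 and too_slow))


if __name__ == "__main__":
    for h in range(4):
        print(h, should_abandon_host(h, timed_out=True, too_slow=False))
```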
+
+&P extended parsing, attempt to parse all links (even in unknown tags or Javascript)
+This option activates extended parsing, which attempts to find links in unknown Html code or Javascript
+
+j *parse Java Classes (j0 don't parse)
+This option defines whether the engine parses java files to catch included files
+
+I *make an index (I0 don't make)
+This option defines whether the engine generates an index.html in the top directory
+
+X *delete old files after update (X0 keep them)
+This option defines whether, after an update, the engine deletes local files that have been deleted on the remote site or that have been excluded
+
+C *create/use a cache for updates and retries (C0 no cache)
+This option defines whether the engine generates a cache for retries and updates
+
+k  store all files in cache (not useful if files on disk)
+This option defines whether the engine stores all files in the cache
+
+V execute system command after each file ($0 is the filename: -V "rm \$0")
+This option lets the engine execute a command for each file saved on disk
+
+q  quiet mode (no questions)
+Do not ask questions (for example, to confirm an option)
+
+Q  log quiet mode (no log)
+Do not generate log files
+
+v  verbose screen mode
+Logs are printed on the screen
+
+f *log file mode
+Logs are written to two log files
+
+z  extra infos log
+Add more information to the log files
+
+Z  debug log
+Add debug information to the log files
+
+
+--mirror   *make a mirror of site(s) 
+--get   get the files indicated, do not seek other URLs
+--mirrorlinks   mirror all links located in the first level pages (identical to -Y)
+--testlinks     test links in pages
+--spider    spider site(s), to test links (reports Errors & Warnings)
+--update    update a mirror, without confirmation
+--skeleton  make a mirror, but get only html files
+
+--http10  force http/1.0 requests when possible
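Several of the options listed above are typically combined on one command line. The sketch below assembles such a command as an argument list, using only the -O, -rN, -cN and -AN options documented on this page. The function name `build_httrack_command` is hypothetical, and the executable name `httrack` and the placement of the URL before the options are assumptions.

```python
import shlex


def build_httrack_command(url, mirror_path, depth=None, connections=None, max_rate=None):
    """Return an argument list combining a few options from this page."""
    argv = ["httrack", url, "-O", mirror_path]  # assumed invocation shape
    if depth is not None:
        argv.append(f"-r{depth}")        # rN: limited link depth
    if connections is not None:
        argv.append(f"-c{connections}")  # cN: number of connections
    if max_rate is not None:
        argv.append(f"-A{max_rate}")     # AN: maximum transfer rate, bytes/second
    return argv


if __name__ == "__main__":
    command = build_httrack_command(
        "http://www.someweb.com/",
        "/user/webs,/user/logs",
        depth=3,
        connections=4,
        max_rate=2000,
    )
    # shlex.join quotes arguments where a shell would need it
    print(shlex.join(command))
```

Building a list rather than a single string avoids quoting problems when a path or user-agent value contains spaces.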
+
+
+
--
cgit v1.2.3