From 25adbdabb47499fe641c7bd9595024ff82667058 Mon Sep 17 00:00:00 2001
From: Xavier Roche
Date: Mon, 19 Mar 2012 12:51:31 +0000
Subject: httrack 3.30.1
---
 HelpHtml/faq.html | 902 ------------------------------------------------------
 1 file changed, 902 deletions(-)
 delete mode 100644 HelpHtml/faq.html

diff --git a/HelpHtml/faq.html b/HelpHtml/faq.html
deleted file mode 100644
index 1beee2b..0000000
--- a/HelpHtml/faq.html
+++ /dev/null
@@ -1,902 +0,0 @@
-HTTrack Website Copier - Open Source offline browser
-
-F A Q
-
    -Tips: -
  • In case of trouble or problems during a transfer, first check the hts-log.txt (and hts-err.txt) files to figure out what happened. These log files report all
events that may be useful to detect a problem. You can also adjust the debug level of the log files in the options
  • -The tutorial written by Fred Cohen is a very good document to read to understand how to use the engine,
how the command-line version works, and how the Windows version works, too! All options are described and explained in
clear language!
-

- - - -

- -
-
-
- -Very Frequently Asked Questions:

- -Q: HTTrack does not capture all files I want to capture!
-A: This is a frequent question, generally related to the filters. BUT first check whether your problem is related to the
robots.txt website rules.
-
-
-Okay, let me explain how to precisely control the capture process.
-
-Let's take an example:
-
-Imagine you want to capture the following site:
-www.someweb.com/gallery/flowers/
-
-HTTrack, by default, will capture all links encountered in www.someweb.com/gallery/flowers/ or in lower directories, like
www.someweb.com/gallery/flowers/roses/.
-It will not follow links to other websites, because this behaviour might cause it to capture the entire Web!
-It will not follow links located in higher directories either (for example, www.someweb.com/gallery/ itself), because this
might cause it to capture too much data.
-
-This is the default behaviour of HTTrack. BUT, of course, if you want, you can tell HTTrack to capture other directories or websites!
-
-In our example, we might also want to capture all links in www.someweb.com/gallery/trees/, and in www.someweb.com/photos/
-
-This can easily be done by using filters: go to the Options panel, select the Filters tab, and enter these lines
-(you can leave a blank space between each rule, instead of entering a carriage return):
-+www.someweb.com/gallery/trees/*
-+www.someweb.com/photos/*

-
-This means "accept all links beginning with www.someweb.com/gallery/trees/ and www.someweb.com/photos/"
-- the + means "accept" and the final * means "any characters will match after the previous ones".
-Remember the *.doc or *.zip patterns used when you want to select all files of a certain type on your computer:
it is almost the same here, except for the beginning "+"
-
-Now, we might want to exclude all links in www.someweb.com/gallery/trees/hugetrees/, because with the previous filter, -we accepted too many files. Here again, you can add a filter rule to refuse these links. Modify the previous filters to:
-+www.someweb.com/gallery/trees/*
-+www.someweb.com/photos/*
--www.someweb.com/gallery/trees/hugetrees/*

-
-You have noticed the - at the beginning of the third rule: this means "refuse links matching the rule"
-; and the rule is "any files beginning with www.someweb.com/gallery/trees/hugetrees/"
- -Voila! With these three rules, you have precisely defined what you wanted to capture.
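-
-As a command-line sketch (the URL and filters are exactly the ones from this example; the "httrack <URL> <filters>" layout follows the examples given later in this FAQ), the same capture could be launched with:
-httrack http://www.someweb.com/gallery/flowers/ +www.someweb.com/gallery/trees/* +www.someweb.com/photos/* -www.someweb.com/gallery/trees/hugetrees/*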
-
-A more complex example?
-
-Imagine that you want to accept all jpg files (files of the .jpg type) that have "blue" in their name and are located on www.someweb.com:
-+www.someweb.com/*blue*.jpg
-
-More detailed information can be found here!
-
-
- -
-General questions:
-

- -Q: Is there any 'spyware' or 'adware' in this program? Can you prove that there isn't any?
-A: No ads (banners), and absolutely no 'spy' features inside the program.
-The best proof is the software status: all sources are released, and everybody can check them. Open source is the best protection against privacy problems - HTTrack is an open source project, free of charge and free of any spy 'features'.
- -

Q: Are there any risks of viruses with this software?
-A: For the software itself:
-All official releases (at httrack.com) are checked against all known viruses, and the packaging process is also checked. Archives are stored on Un*x servers, which are not really affected by viruses.
-For files you are downloading from the WWW using HTTrack: You may encounter websites which were corrupted by viruses, and downloading data from these websites might be dangerous (as dangerous as when using a regular browser). Always ensure that the websites you are crawling are safe.
 (Note: remember that using antivirus software is a good idea when you are connected to the Internet)
- -

Q: The install is not working on NT without administrator rights!
-A: That's right. You can, however, install WinHTTrack on your own machine, and then copy your WinHTTrack folder from your Program Files folder to another machine, in a temporary directory (e.g. C:\temp\) - -

Q: Where can I find French/other languages documentation?
-A: The Windows interface is available in several languages, but the documentation is not yet translated! - -

Q: Is HTTrack working on NT/2000?
-A: Yes, it does - -

Q: What's the difference between HTTrack and WinHTTrack?
-A: WinHTTrack is the Windows release of HTTrack (with a graphic shell) - -

Q: Is HTTrack Mac compatible?
-A: No, because of a lack of time. But sources are available - -

Q: Can HTTrack be compiled on all Un*x?
-A: It should. The Makefile may need to be modified in some cases, however - -

Q: I use HTTrack for professional purposes. What about restrictions/license fees?
-A: HTTrack is covered by the GNU General Public License (GPL). There are no restrictions on using HTTrack for professional purposes,
except if you develop software which uses HTTrack components (parts of the source, or any other component).
See the license.txt file for more information - -

Q: Are there any license royalties for distributing a mirror made with HTTrack?
-A: No. - -

Q: Is a DLL/library version available?
-A: Not yet. But, again, sources are available (see license.txt for distribution information) - -

Q: Is there an X11/KDE shell available for Linux and Un*x?
-A: Yes. See the download/contribution section at www.httrack.com! - - -


-Troubleshooting:
-

- -Q: Some sites are captured very well, others aren't. Why?
-A:
-There are several reasons (and solutions) for a mirror to fail. Reading the log files (and this FAQ!) is generally a VERY good idea to figure out what occurred. -

- -There are cases, however, that can not (yet) be handled:
  • Flash sites - not handled
  • Intensive Java/Javascript sites - might be bogus/incomplete
  • Complex CGI with built-in redirects, and other tricks - very complicated to handle, and therefore might cause problems
  • Parsing problems in the HTML code (cases where the engine is fooled, for example by a false comment (<!--) with no detected closing comment (-->)). These cases are rare, but might occur. A bug report is then generally welcome!
- -Note:
-For some sites, setting the "Force old HTTP/1.0 requests" option can be useful, as this option uses more basic requests (no HEAD requests, for example).
-This will cause a performance loss, but will increase compatibility with some cgi-based sites.
-
- -
- -Q: Only the first page is caught. What's wrong?
-A: First, check the hts-log.txt file (and/or the hts-err.txt error log file) - this can give you valuable information.
-The problem can be a website that redirects you to another site (for example, www.someweb.com to public.someweb.com): in this case, use filters to accept this site
-This can also be a problem in the HTTrack options (link depth too low, for example)
- -

Q: With WinHTTrack, minimizing to the system tray sometimes causes a crash!
-A: This bug sometimes appears in the shell on some systems. If you encounter this problem, avoid minimizing the window! - -

Q: Are https URLs working?
-A: Yes, HTTrack does support (since the 3.20 release) https (secure socket layer protocol) sites - -

Q: Are ipv6 URLs working?
-A: Yes, HTTrack does support (since the 3.20 release) ipv6 sites, using A/AAAA entries, or direct v6 addresses (like http://[3ffe:b80:12:34:56::78]/) - -

Q: Files are created with strange names, like '-1.html'!
-A: Check the build options (you may have selected user-defined structure with wrong parameters!) - -

Q: When capturing real audio/video links (.ram), I only get a shortcut!
-A: Yes, but the associated .ra/.rm files should be captured together - except if the rtsp:// protocol is used (not yet supported by HTTrack), or if proper filters are needed - -

Q: Using user:password@address is not working!
-A: Again, first check the hts-log.txt and hts-err.txt error log files - they can give you valuable information
-The site may have a different authentication scheme - form-based authentication, for example.
-In this case, use the URL capture features of HTTrack; it might work
-

- -Q: When I use HTTrack, nothing is mirrored (no files). What's
happening?
-A: First, be sure that the URL typed is correct. Then, check whether you need to use a
proxy server (see the proxy options in WinHTTrack, or the -P proxy:port option in the
command line program). The site you want to mirror may only accept certain browsers. You
can change your "browser identity" with the Browser ID option in the OPTION box.
Finally, you can have a look at the hts-log.txt (and hts-err.txt) files to see what
happened.
-
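-As a command-line sketch of the answer above (proxy.mycorp.com:8080 and the browser string are made-up illustration values, and -F is assumed here to be the command-line equivalent of the "browser identity" option):
-httrack http://www.someweb.com/ -P proxy.mycorp.com:8080 -F "Mozilla/4.5 (compatible; MSIE 4.01; Windows 98)"
-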
- -
Q: There are missing files! What's happening?
-A: You may want to capture files that exist in a different folder, or on another web site.
-You may also want to capture files that are forbidden by default by the
robots.txt website rules.
-In these cases, HTTrack does not capture these links automatically; you have to tell it to do so.
-

-
  • Either use the filters.
    -Example: You are downloading http://www.someweb.com/foo/ and can not get .jpg images located -in http://www.someweb.com/bar/ (for example, http://www.someweb.com/bar/blue.jpg)
    -Then, add the filter rule +www.someweb.com/bar/*.jpg to accept all .jpg files from this location
    -You can, also, accept all files from the /bar folder with +www.someweb.com/bar/*, or only html files with +www.someweb.com/bar/*.html and so on..

    -
  • -If the problems are related to robots.txt rules that do not let you access some folders (check the logs if you are not sure),
you may want to disable the default robots.txt rules in the options (but only disable this option with great care;
some restricted parts of the website might be huge or not downloadable)
-
-
- -Q: There are corrupted images/files! How do I fix them?
-A: First check the log files to ensure that the images really exist remotely and are not fake html error pages renamed to .jpg ("Not found" errors, for example).
Rescan the website with "Continue an interrupted download" to catch images that might be broken due to various errors (a transfer timeout, for example).
Then, check whether the broken image/file name is present in the log (hts-log.txt) - in this case you will find there the reason why the file was not properly caught.
If this doesn't work, delete the corrupted files (Note: to detect corrupted images, you can browse the directories with a tool like ACDSee and then delete them)
and rescan the website as described before. HTTrack will be obliged to recatch the deleted files, and this time it should work, if they really exist remotely!
-
-
- -
Q: FTP links are not caught! What's happening?
-A: FTP files might be seen as external links, especially if they are located in an outside domain. You have either to accept all external links (see the links options, -n option) or
-only specific files (see the
filters section).
-Example: You are downloading http://www.someweb.com/foo/ and can not get ftp://ftp.someweb.com files
-Then, add the filter rule +ftp.someweb.com/* to accept all files from this (ftp) location
-
-
- -Q: I got some weird messages telling me that robots.txt does not allow several files to be captured. What's going on?
-A:
-These rules, stored in a file called robots.txt, are given by the website to specify which links or folders should not be caught by robots and spiders - for example, /cgi-bin or large image files.
They are followed by default by HTTrack, as is advised. Therefore, you may miss some files that would have been downloaded without
these rules - check your logs to see whether this is the case:
-Info: Note: due to www.foobar.com remote robots.txt rules, links begining with these path will be forbidden: /cgi-bin/,/images/ (see in the options to disable this) - -
-If you want to disable them, just change the corresponding option in the option list! (but only disable this option with great care;
some restricted parts of the website might be huge or not downloadable)
-
-
-
- -
Q: I have duplicate files! What's going on?
-A: This is generally the case for top indexes (index.html and index-2.html), isn't it?
-
-This is a common issue, but one that can not easily be avoided!
-For example, http://www.foobar.com/ and http://www.foobar.com/index.html might be the same page.
But if links in the website refer both to http://www.foobar.com/ and to http://www.foobar.com/index.html, these two pages will be caught.
And because http://www.foobar.com/ must have a name, as you may want to browse the website locally (the / would give a directory listing, NOT the index itself!),
HTTrack must find one. Therefore, two index.html files will be produced, one with the -2 suffix to show that the file had to be renamed.
-
-It might seem a good idea to consider http://www.foobar.com/ and http://www.foobar.com/index.html to be the same link, to avoid
duplicate files, wouldn't it?
NO, because the top index (/) can refer to ANY filename, and while index.html is generally the default name, index.htm can be chosen,
or index.php3, mydog.jpg, or anything you may imagine. (some webmasters are really crazy)
-
-Note: In some rare cases, duplicate data files can be found when the website redirects to another file. This issue should be rare, and might be avoided using filters. -
-
-
- -
Q: I'm downloading too many files! What can I do?
-A: This is often the case when you use too large a filter, for example +*.html, which asks the -engine to catch all .html pages (even ones on other sites!). In this case, try to use more specific filters, like +www.someweb.com/specificfolder/*.html
-If you still have too many files, use filters to exclude some files. For example, if you get too many files from www.someweb.com/big/,
use -www.someweb.com/big/* to exclude all files from this folder. Remember that the default behaviour of the engine, when
mirroring http://www.someweb.com/big/index.html, is to catch everything in http://www.someweb.com/big/. Filters are your friends,
use them!
-
-
- -
Q: The engine goes crazy, getting thousands of files! What's going on?
-A: This can happen if a loop occurs on some bogus website. For example, a page that refers to itself, with a timestamp
in the query string (e.g. http://www.someweb.com/foo.asp?ts=2000/10/10,09:45:17:147).
These are really annoying, as it is VERY difficult to detect the loop (the timestamp might be a page number).
To limit the problem: set a recursion level (for example to 6), or avoid the bogus pages (use the filters) -
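-
-As a sketch, assuming -rN is the command-line option that sets the recursion (mirror depth) level, limiting the depth to 6 would look like:
-httrack http://www.someweb.com/ -r6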
-
- -
Q: Files are sometimes renamed (the type is changed)! Why?
-A: By default, HTTrack tries to determine the type of remote files. This is useful when links like
http://www.someweb.com/foo.cgi?id=1 can be either HTML pages, images or anything else.
Locally, foo.cgi will not be recognized as an html page, or as an image, by your browser. HTTrack has to rename the file
as foo.html or foo.gif so that it can be viewed.
-
-
- -
Q: Files are sometimes *incorrectly* renamed! Why?
-A: Sometimes, some data files are seen by the remote server as html files or images: in this case HTTrack is
being fooled... and renames the file. This can generally be avoided by using the "use HTTP/1.0 requests" option.
You might also avoid this by disabling the type checking in the option panel. -
-
- -
Q: How do I rename all ".dat" files into ".zip" files?
-A: Simply use the --assume dat=application/x-zip option - -
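-
-For example, as a command-line sketch (www.someweb.com is a placeholder URL):
-httrack http://www.someweb.com/ --assume dat=application/x-zip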
-
- -
Q: I can not access several pages (access forbidden, or a redirect to another location), but I can with my browser. What's going on?
-A: You may need cookies! Cookies are specific data (for example, your username or password) that are sent to your browser once
you have logged in to certain sites, so that you only have to log in once. For example, after having entered your username on a website, you can
view pages and articles, and the next time you go to this site, you will not have to re-enter your username/password.
-To "merge" your personal cookies into an HTTrack project, just copy the cookies.txt file from your Netscape folder (or the cookies located in the Temporary Internet Files folder for IE)
into your project folder (or even the HTTrack folder)
-
-
- -
Q: Some pages can't be seen, or are displayed with errors!
-A: Some pages may include javascript or java files that are not recognized - for
example, dynamically generated filenames. There may be transfer problems, too (broken pipe, etc.). But
most mirrors do work. We are still working to improve the mirror quality of HTTrack.
-
-
- -
Q: Some Java applets do not work properly!
-A: Java applets may not work in some cases, for example if HTTrack failed to detect all included classes -or files called within the class file. Sometimes, Java applets need to be online, because remote files are -directly caught. Finally, the site structure can be incompatible with the class (always try to keep the original site structure -when you want to get Java classes)
-If there is no way to make some classes work properly, you can exclude them with the filters. -They will be available, but only online. -
-
-
- -
Q: HTTrack is taking too much time for parsing, it is very slow. What's wrong?
-A: Former (before 3.04) releases of HTTrack had problems with parsing. It was really slow, and performance - especially
with huge HTML files - was not really good. The engine is now optimized, and should parse all html files very quickly.
For example, a 10MB HTML file should be scanned in less than 3 or 4 seconds.
-
-Therefore, higher parsing times generally mean that the engine had to wait a bit while testing several links:
    -
  • Sometimes, links are malformed in pages. -"a href="/foo"" instead of "a href="/foo/"", for example, is a common mistake. It will force the engine to -make a supplemental request, and find the real /foo/ location. -
  • -

    -
  • Dynamic pages. Links with names ending in .php3, .asp or another type different from the regular
.html or .htm will require a supplemental request, too. HTTrack has to "know" the type (called the "MIME type") of a file
before forming the destination filename. Files like foo.gif are "known" to be images, ".html" are obviously HTML pages - but ".php3"
pages may be either dynamically generated html pages, images, data files...
    -
    -If you KNOW that ALL ".php3" and ".asp" pages are in fact HTML pages on a mirror, use the assume option:
    ---assume php3=text/html,asp=text/html -

-This option can be used to change the type of a file, too: the MIME type "application/x-MYTYPE" will always have the "MYTYPE" type.
-Therefore,
    ---assume dat=application/x-zip -
    -will force the engine to rename all dat files into zip files -
  • -
- - -

-
- -
Q: HTTrack stays idle for a long time without
transferring anything. What's happening?
-A: Maybe you are trying to reach some very slow sites. Try a lower TimeOut value (see the
options, or the -Txx option in the command line program). Note that you will abandon
the entire site (unless the option is unchecked) if a timeout happens. You can, with the
Shell version, skip some slow files, too.
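-
-As a command-line sketch (www.someweb.com is a placeholder, and the 15-second value is an arbitrary example, assuming -Txx takes the timeout in seconds):
-httrack http://www.someweb.com/ -T15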
-
- -
Q: I want to update a site, but it's taking too much time! What's happening?
-A: First, HTTrack always tries to minimize the download flow by interrogating the server about
file changes. But, because HTTrack has to rescan all files from the beginning to rebuild the local site structure,
it can take some time.
Besides, some servers are not very smart and always claim to have newer files, forcing HTTrack to reload them,
even if no changes have been made!
-
- -
Q: I wanted to update a site, but after the update the site disappeared!! What's going on?
-A: You may have done something wrong, but not always - -
    -
  • The site has moved: the current location only shows a notification. Therefore, all other files have been deleted to show the current state of the website!
  • -
  • The connection failed: the engine could not catch the first files, and therefore deleted everything. -To avoid that, using the option "do not purge old files" might be a good idea
  • -
  • You tried to add a site to the project BUT in fact deleted the former addresses.
    -Example: A project contains 'www.foo.com www.bar.com' and you want to add 'www.doe.com'. -Ensure that 'www.foo.com www.bar.com www.doe.com' is the new URL list, and NOT 'www.doe.com'! -
  • -
- -

- -
Q: I am behind a firewall. What can I do?
-A: You need to use a proxy, too. Ask your administrator for the proxy server's
name/port. Then, use the proxy field in HTTrack or use the -P proxy:port option
in the command line program.
-

- -

Q: HTTrack has crashed during a mirror. What's happening?
-A: We are trying to avoid bugs and problems so that the program can be as reliable as
possible. But we can not be infallible. If you encounter a bug, please check whether you have the
latest release of HTTrack, and send us an email with a detailed description of your
problem (OS type, addresses concerned, crash description, and everything you deem to be
necessary). This may help other users too.
-
-
- -
Q: I want to update a mirrored project, but HTTrack is retransferring all pages. What's going on?
-A: First, HTTrack always rescans all local pages to reconstitute the website structure, and this can take some time.
Then, it asks the server whether the files stored locally are up-to-date. On most sites, pages are not
updated frequently, and the update process is fast. But some sites have dynamically-generated pages that are considered
"newer" than the local ones... even if they are identical! Unfortunately, there is no way to avoid this problem,
which is strongly linked to the server's abilities. -
-
- -
Q: I want to continue a mirrored project, but HTTrack is rescanning all pages. What's going on?
-A: HTTrack has to (quickly) rescan all pages from the cache, without retransferring them, to rebuild the internal file structure. However, this process can take some time with huge sites
with numerous links. -
-
- -
Q: The HTTrack window sometimes "disappears" at the end of a mirrored project. What's going on?
-A: This is a known bug in the interface. It does NOT affect the quality of the mirror, however. We are still hunting it down,
but this is a clever bug... -
-
- -
Questions concerning a mirror:
- -
-
Q: I want to mirror a Web site, but there are some files outside
the domain, too. How do I retrieve them?
-A: If you just want to retrieve files that can be reached through links, just activate
the 'get files near links' option. But if you want to retrieve html pages too, you can
use either wildcards or explicit addresses; e.g. add www.someweb.com/* to accept all
files and pages from www.someweb.com.
-
-
Q: I have forgotten some URLs of files during a long
mirror... Should I redo it all?
-A: No; if you have kept the 'cache' files (in hts-cache), cached files will not be
retransferred.
-
-
Q: I just want to retrieve all ZIP files or other files in a web -site/in a page. How do I do it?
-A: You can use different methods. You can use the 'get files near a link' option if
the files are in a foreign domain. You can also use a filter address: adding +*.zip
in the URL list (or in the filter list) will accept all ZIP files, even if these files are
outside the address.
-Example: httrack www.someweb.com/someaddress.html +*.zip will allow
you to retrieve all zip files that are linked on the site.

-
-
Q: There are ZIP files in a page, but I don't want to transfer -them. How do I do it?
-A: Just filter them: add -*.zip in the filter list.
-
-
Q: I don't want to load gif files... but what may happen if I
view the page?
-A: If you have filtered gif files (-*.gif), links to gif files will be -rebuilt so that your browser can find them on the server.
-
-
Q: I get all types of files on a web site, but I didn't select
them in the filters!
-A: By default, HTTrack retrieves all types of files on authorized links. To avoid
that, define filters like
-* +<website>/*.html -+<website>/*.htm +<website>/ +*.<type wanted>
-Example: httrack www.someweb.com/index.html -* +www.someweb.com/*.htm* +www.someweb.com/*.gif +www.someweb.com/*.jpg
-
-
Q: When I use filters, I get too many files!
-A: You might be using too large a filter; for example, *.html will get ALL identified html
files. If you want to get all files on an address, use www.<address>/*.html.
-If you want to get ONLY files defined by your filters, use something like -* +www.foo.com/*, because -+www.foo.com/* will only accept selected links without forbidding other ones!
-There are lots of possibilities using filters.
-Example: httrack www.someweb.com +*.someweb.com/*.htm*
-
-
Q: When I use filters, I can't access another domain, but I -have filtered it!
-A: You may have made a mistake declaring your filters; for example, +www.someweb.com/*
-*someweb* will not work, because -*someweb* has a higher priority (because it has
been declared after +www.someweb.com)
-
-
Q: Must I add a  '+' or '-' in the filter list when I want -to use filters?
-A: YES. '+' is for accepting links and '-' to avoid them. If you forget it, HTTrack -will consider that you want to accept a filter if there is a wild card in the syntax - e.g. -+<filter> is identical to <filter> if <filter> contains a wild card (*) -(else it will be considered as a normal link to mirror)

-
-Q: I want to find file(s) in a web-site. How do I do it?
-A: You can use the filters: forbid all files (add a -* in the
filter list) and accept only html files and the file(s) you want to retrieve (BUT do not
forget to add +<website>*.html in the filter list, or pages will not be
scanned! Add the names of the files you want with a */ before them; i.e. if you want to
retrieve file.zip, add */file.zip)
-Example: httrack www.someweb.com +www.someweb.com/*.htm* +thefileiwant.zip
-
-
- -
Q: I want to download ftp files/ftp site. How do I do it?
-A: First, HTTrack is not the best tool for downloading many ftp files. Its ftp engine is basic (even if reget is
possible), and if your purpose is to download a complete ftp site, use a dedicated client.
-You can download ftp files just by typing the URL, such as ftp://ftp.somesite.com/pub/files/file010.zip and list ftp directories -like ftp://ftp.somesite.com/pub/files/
.
-Note: For the filters, use something like +ftp.somesite.com/* -
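-
-Putting this answer together, a command-line sketch of a whole ftp grab (the URL and filter are the ones quoted above) could be:
-httrack ftp://ftp.somesite.com/pub/files/ +ftp.somesite.com/*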
- -
Q: How can I retrieve .asp or .cgi sources instead of the .html result?
-A: You can't! For security reasons, web servers do not allow that. - -

Q: How can I remove these annoying <!-- Mirrored from... --> from html files?
-A: Use the footer option (-%F, or see the WinHTTrack options) - -
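-
-As a command-line sketch (www.someweb.com is a placeholder, and passing an empty footer string to suppress the comment is an assumption):
-httrack http://www.someweb.com/ -%F ""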

Q: Do I have to select between ascii/binary transfer mode?
-A: No, http files are always transferred as binary files. Ftp files, too (even if ascii mode could be selected) - -

Q: Can HTTrack perform form-based authentication?
-A: Yes. See the URL capture abilities (--catchurl for command-line release, or in the WinHTTrack interface) - -

Q: Can I redirect downloads to tar/zip archive?
-A: Yes. See the shell system command option (-V option for command-line release) - -

Q: Can I use username/password authentication on a site?
-A: Yes. Use user:password@your_url (example: http://foo:bar@www.someweb.com/private/mybox.html) - -

Q: Can I use username/password authentication for a proxy?
-A: Yes. Use user:password@your_proxy_name as your proxy name (example: smith:foo@proxy.mycorp.com) - -
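-
-Combining the two answers above into one command-line sketch (all names and passwords are the placeholders from these examples; the :8080 proxy port is a made-up value):
-httrack http://foo:bar@www.someweb.com/private/mybox.html -P smith:foo@proxy.mycorp.com:8080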

Q: Can HTTrack generate HP-UX or ISO9660 compatible files?
-A: Yes. See the build options (-N, or see the WinHTTrack options) - -

Q: Is there any SOCKS support?
-A: Not yet! - -

Q: What's this hts-cache directory? Can I remove it?
-A: NO if you want to update the site, because this directory is used by HTTrack for this purpose. -If you remove it, options and URLs will not be available for updating the site - -

Q: Can I start a mirror from my bookmarks?
-A: Yes. Drag&Drop your bookmark.html file to the WinHTTrack window (or use file://filename for command-line release) and select -bookmark mirroring (mirror all links in pages, -Y) or bookmark testing (--testlinks) - -
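-
-As a command-line sketch (the bookmark path is a made-up example; -Y is the "mirror all links in pages" option mentioned above):
-httrack file:///home/user/bookmark.html -Y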

Q: Can I convert a local website (file:// links) to a standard website?
-A: Yes. Just start from the top index (example: file://C:\foopages\index.html) and mirror the local website. -HTTrack will convert all file:// links to relative ones. - - -

Q: Can I copy a project to another folder - will the mirror work?
-A: Yes. There are no absolute links; all links are relative.
You can copy a project to another drive/computer/OS, and browse it without installing anything. - -

Q: Can I copy a project to another computer/system? Can I then update it?
-A: Absolutely! You can keep your HTTrack favorite folder (C:\My Web Sites) in your local hard disk, copy it -for a friend, and possibly update it, and then bring it back!
You can copy individual folders (projects), too: exchange -your favorite websites with your friends, or send an old version of a site to someone who has a faster connection, and -ask him to update it!
- - -
-Note: Export (Windows <-> Linux)
-The file and cache structure is compatible between Linux and Windows, but you may have to make some changes, such as adjusting paths
- - - - - -
- Windows -> Linux/Unix -
- Copy (in binary mode) the entire folder; then, to update it, enter it and run
- - httrack --update -O ./ - -

- - Note: You can then safely replace the existing folder (under Windows) with this one, because - the Linux/Unix version did not change any options
- Note: If you often switch between Windows/Linux with the same project, it might be a good idea to edit the hts-cache/doit.log file - and delete old "-O" entries, because each time you do a httrack --update -O ./ an entry is added, - causing the command line to be long -
-
- Linux/Unix -> Windows -
- Copy (in binary mode) the entire folder into your favorite Web mirror folder.
 Then, select this project, AND retype ALL URLs AND redefine all options as if you were
 creating a new project.
 This is necessary because the profile (winprofile.ini) has not been created by the Linux/Unix version.
 But do not be afraid: WinHTTrack will use the cached files to update the project!
-
- -
- -

Q: How can I grab email addresses in web pages?
-A: You can not. HTTrack has not been designed to be an email grabber, like many other (bad) products. - -
-
-
-Other problems:
-
- -Q: My problem is not listed!
-A: Feel free to
contact us! -
- -


- - -

-
-
- - - - - -