DocToHtml - Doc to HTML Converter FAQ
MS Word has the “Save as Web Page” feature, so why should I use DocToHtml?
There are several reasons, including these:
- DocToHtml produces standards-compliant, clean HTML code with style formatting.
- MS Word cannot convert multiple documents at once.
- MS Word’s “Save as Web Page” feature provides very few options.
What is the difference between HTML documents produced by DocToHtml and MS Word?
DocToHtml produces much smaller and cleaner HTML code, ready to use on your website. HTML code produced by MS Word contains a lot of MS Office-specific tags and unnecessary markup. Microsoft’s intention was to preserve all document-specific information in the generated HTML, so that the user could open the HTML document and continue working with it without losing anything. Unfortunately, this approach leads to very bloated HTML code full of Microsoft-specific markup, which is practically impossible to edit by hand and is poorly suited to a website. As a rule, most of that code can be safely stripped without any impact on the document’s appearance in the web browser. That’s what DocToHtml is intended for!
What is the difference between HTML documents produced by DocToHtml and other “clean-up” conversion tools?
Some tools try to clean up the HTML code generated by MS Word. Though this approach works, it has major drawbacks, because Word’s HTML code is too messy for some operations to be performed reliably.
For example, Microsoft Word tends to save lists without using the <ol> or <li> tags. Instead, it just adds a list item number as the first character(s) of the paragraph. After that, it might be pretty hard to decipher that a paragraph starting with a digit is logically not a paragraph at all, but a list item!
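To illustrate the ambiguity, here is a minimal sketch of the kind of heuristic a clean-up tool would have to rely on. The function name `looks_like_list_item` and the sample paragraphs are hypothetical and are not taken from any real tool.

```python
import re

# Hypothetical heuristic: a paragraph "looks like" a list item if it starts
# with digits followed by "." or ")" and then whitespace.
_NUMBERED = re.compile(r"^\s*\d+[.)]\s+")

def looks_like_list_item(paragraph_text: str) -> bool:
    return bool(_NUMBERED.match(paragraph_text))

paragraphs = [
    "1. Install the program.",                 # a real list item
    "2. Run the converter.",                   # a real list item
    "1999. That was the year of the merger.",  # an ordinary sentence
]

# The heuristic cannot tell the third paragraph from a list item,
# because the text alone carries no structural information.
flags = [looks_like_list_item(p) for p in paragraphs]
```

All three paragraphs are flagged, so any rule based only on the leading characters will occasionally turn plain text into list items (or miss real ones with unusual numbering).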
Another example is MS Word’s treatment of tables with merged cells. MS Word can produce HTML tables with a different number of cells in each row. This is a nonstandard approach, and there is no guarantee that a particular browser will interpret such a table correctly. MS Word also tends to put the “width” attribute into every table cell, which is unnecessary in most cases. To deal with these issues, it is not enough to strip tag attributes according to a set of predefined rules, which is what most clean-up utilities do. Instead, you need to restore the layout structure of the table and, based on that structure, decide for each cell whether the “width” attribute is needed. This is hard to do when all you have is the HTML code produced by Microsoft’s built-in converter.
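The irregular-row problem can at least be detected with a short check. The sketch below uses Python's standard `html.parser` module; the class `RowWidthCounter`, the function `row_widths`, and the sample table are hypothetical. It only sums `colspan` values per row and ignores `rowspan` and nested tables, so it detects the problem but does not restore the table layout.

```python
from html.parser import HTMLParser

class RowWidthCounter(HTMLParser):
    """Collects the effective column count (sum of colspan values) of each <tr>."""

    def __init__(self):
        super().__init__()
        self.row_widths = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row_widths.append(0)
        elif tag in ("td", "th") and self.row_widths:
            colspan = dict(attrs).get("colspan") or "1"
            # Fall back to 1 if the colspan value is not a plain number.
            self.row_widths[-1] += int(colspan) if colspan.isdigit() else 1

def row_widths(table_html):
    counter = RowWidthCounter()
    counter.feed(table_html)
    return counter.row_widths

# Hypothetical input: the second row has fewer cells and no colspan.
sample = (
    "<table>"
    "<tr><td width=100>A</td><td width=100>B</td><td width=100>C</td></tr>"
    "<tr><td width=200>D</td><td width=100>E</td></tr>"
    "</table>"
)
widths = row_widths(sample)          # one entry per row
is_regular = len(set(widths)) <= 1   # True only if every row spans the same columns
```

Knowing that the rows disagree is the easy part. Deciding which cell was meant to span two columns requires the original table structure, which the HTML no longer contains.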
Our program uses a totally different approach. Instead of trying to clean up the HTML code generated by MS Word, DocToHtml produces HTML code on its own. DocToHtml reads the content and properties of the input document via MS Word automation and, based on this data, generates the HTML code directly. This approach gives full control over the output, and allows us to do things that would otherwise be impossible. Combined with our strong intention to make the resulting HTML documents as small and clean as possible, it results in highly optimized output code in most cases.
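The benefit of working from document properties rather than from text can be shown with a small sketch. This is not DocToHtml's actual implementation: the `Paragraph` class, its `is_list_item` field, and the `render` function are hypothetical stand-ins for data that a converter could obtain from the document itself.

```python
from dataclasses import dataclass
from html import escape
from itertools import groupby

@dataclass
class Paragraph:
    text: str
    is_list_item: bool  # hypothetical: known from the document, not guessed from the text

def render(paragraphs):
    """Emit <p> for plain paragraphs and wrap runs of list items in <ol>."""
    parts = []
    for is_list, group in groupby(paragraphs, key=lambda p: p.is_list_item):
        if is_list:
            items = "".join(f"<li>{escape(p.text)}</li>" for p in group)
            parts.append(f"<ol>{items}</ol>")
        else:
            parts.extend(f"<p>{escape(p.text)}</p>" for p in group)
    return "".join(parts)

html_out = render([
    Paragraph("Steps:", False),
    Paragraph("Install the program.", True),
    Paragraph("Run the converter.", True),
    Paragraph("1999. That was the year of the merger.", False),
])
```

Because the list membership comes from the source data, the last paragraph stays a paragraph even though it starts with a number.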
However, this approach has one drawback: the conversion can be slow at times. For a list of recommendations on how to speed up the conversion process, please read this topic.
Do I need to have MS Word in order to perform a conversion?
Yes, currently MS Word 2000 or higher is needed for DocToHtml to do the conversion.
Does DocToHtml have any technological limitations?
Yes, DocToHtml does have some limitations. For example, it does not support some MS Word formatting features. To make things better, we regularly add features and functions most requested by our users. DocToHtml’s further development will be based on feedback from our users. For a list of unsupported features, please read this topic. If you need some new features, please e-mail your suggestions to firstname.lastname@example.org, and we’ll consider implementing them. We would appreciate your feedback!
Can DocToHtml produce output documents in the CHM format?
Sorry, but DocToHtml is only intended for creating (X)HTML documents, ready for the Web. To create a CHM document, you will have to use other tools. One of the most advanced Help and Manual authoring tools is Dr. Explain from Indigo Byte Systems. It features WYSIWYG editing; simultaneous generation of output documents in the CHM, HTML, PDF, and RTF formats; automatic annotation of screenshots; support of style templates; easy integration with your program’s source code in any language; and so on.
I need to transfer all DocToHtml settings to another computer. What should I copy?
DocToHtml stores all its settings in several files in the “DocToHtml” subfolder of the %APPDATA% folder. %APPDATA%, shorthand for Application Data, is an environment variable that designates the folder where applications are supposed to create subfolders for their settings. DocToHtml keeps a separate set of settings for each user of the computer. To browse to the Application Data folder, just type %APPDATA% (with the percent signs) in the Windows Explorer address bar instead of a regular Windows file path, and it will be expanded automatically to the actual location. On 64-bit systems, DocToHtml uses the Application Data folder intended for 32-bit applications, so the “Roaming” subfolder will be added to the AppData path. To transfer all DocToHtml settings, including user-defined HTML templates, just copy the contents of the “%APPDATA%\DocToHtml” folder to the target computer. You will also have to re-enter your registration key, because it is stored in the system registry, not in the above-mentioned folder.
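If you prefer to script the copy, the following is a minimal sketch in Python. The function name `copy_doctohtml_settings` and its parameters are hypothetical; the only details taken from the text above are the %APPDATA% variable and the “DocToHtml” subfolder name.

```python
import os
import shutil
from pathlib import Path

def copy_doctohtml_settings(destination, appdata=None):
    """Copy the DocToHtml settings folder into `destination` and return the new path."""
    # APPDATA is normally defined on Windows only; pass `appdata` explicitly elsewhere.
    appdata = appdata or os.environ["APPDATA"]
    source = Path(appdata) / "DocToHtml"
    if not source.is_dir():
        raise FileNotFoundError(f"Settings folder not found: {source}")
    target = Path(destination) / "DocToHtml"
    # dirs_exist_ok requires Python 3.8 or later; existing files are overwritten.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

On the source computer you could call it with a removable drive or network share as the destination, then copy the resulting “DocToHtml” folder into %APPDATA% on the target computer. The registration key is not covered by this copy and still has to be re-entered.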
Can I use DocToHtml on my two computers? (I have the Personal License.)
Yes, your Personal License entitles you to use DocToHtml on both of your computers. When you purchased the Personal License for DocToHtml, you were given the right to use the program on different computers. The Personal License prohibits the use of your copy of DocToHtml by other people on their computers, but you can use it yourself on more than one computer.
How can I speed up the conversion process?
In its current implementation, the program has one major drawback: a rather low conversion speed for complex input documents. The reason is that DocToHtml makes OLE automation calls to MS Word to perform the actual operations and to retrieve the original formatting of the text being converted. To learn how to improve the situation, please read the How to Speed Up the Conversion Process help topic.
All mentioned trademarks are the property of their respective owners.