DocToHtml—the List of Unsupported Formatting Features

DocToHtml does not aim to support all MS Word formatting capabilities. Instead, it targets the most commonly used ones and tries to make the output (X)HTML documents as compact and clean as possible, so that they are ready for the Web. Below is a list of some unsupported or partially supported MS Word formatting features.

  • Overlapping bookmarks or hyperlinks may be converted into overlapping <A> tags, or may be lost completely.
  • MS Word autoshapes, diagrams, and organization charts are not supported.
  • Table cell borders are converted only if all cells in a given table have the same border style. If the borders of even one cell are formatted differently, the corresponding table in the output (X)HTML document will have no borders at all.
  • All rows in a table must have the same beginning and ending horizontal coordinates. If the rows begin at different left-margin coordinates, cell width may be set incorrectly.
  • Currently, the program can split documents only at headings, not at page boundaries.
  • Animated underline styles and animated characters are not supported.
  • Currently, the program supports paragraph styles only. Character, list, and table-cell styles are not supported.
  • MS Word forms and dynamic controls within them (checkboxes, list boxes, text input fields, etc.) are not supported.
  • Applied MS Word formatting styles cannot be easily redefined to make the output (X)HTML document look different from the original MS Word doc.
  • HTML and CSS do not have equivalents for such things as complex text wrapping modes for text boxes, so such formatting is not supported.
  • Sometimes complex formatting will not be completely reproduced in the resulting HTML document. This is normal.
  • The position of “anchored” images cannot be precisely preserved.

MS Word can handle at least two types of pictures: inline pictures and floating ones. An inline picture is embedded into the text stream, and treated as one big character when applying page layout algorithms. A floating picture is attached to its “anchor” and can be moved by dragging the anchor sign.

By default, before starting the actual conversion into (X)HTML, DocToHtml internally converts all floating pictures into inline ones. You can alter this behavior via the “Process floating images” checkbox on the Images tab of the Conversion Options dialog.

If you turn this option off, DocToHtml will not convert any floating (“anchored”) images at all.

You can also convert anchored images into inline ones by hand, as follows. Right-click an image to open the context menu, select the “Format Picture” menu item, go to the Layout tab, select the “In line with text” option, and then click OK.

If you do that, the attached anchor should disappear, and the picture’s position on the page may change slightly. The same shift occurs when DocToHtml converts floating pictures into inline ones before generating the (X)HTML document. That is why pictures in the output (X)HTML document may not be positioned exactly as in the original MS Word document.

So, the first rule for preserving picture positions is to use only inline pictures. The second rule, if you need particularly precise positioning, is to use borderless tables with inline pictures inside the table cells, one picture per cell.

Suppose you want to arrange four pictures in a square: two pictures in the first row and two in the second, with each picture in the second row exactly under the one above it. To do that, insert a table with two rows and two columns into the document, and then insert your pictures into the table cells. Apply the “Center” horizontal alignment and the “Center” vertical alignment to all cells (for example, in MS Word 2003 you can do that via the Table and Cell tabs of the Table Properties dialog).

Open the “Borders and Shading” dialog and set borders to “None” for the whole table (unless you need a cell grid in the resulting HTML document).

Do not forget to set the proper options on the Table Attr tab of the Conversion Options dialog. You can simply check the (ALL) and (All CSS) checkboxes there.

After that, the output (X)HTML document will have the pictures at the same positions as in the original MS Word doc.
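To show why the table approach keeps positions stable, here is a minimal Python sketch of the kind of HTML layout a borderless table with centered cells corresponds to. It is not DocToHtml's actual output: the function name `picture_grid_html`, the inline CSS, and the image file names are all assumptions made for illustration.

```python
from html import escape


def picture_grid_html(image_paths, columns=2):
    """Return HTML for a borderless table with one centered image per cell.

    Hypothetical helper: the markup is an illustration of the layout,
    not a copy of what DocToHtml generates.
    """
    rows = []
    for start in range(0, len(image_paths), columns):
        cells = []
        for path in image_paths[start:start + columns]:
            src = escape(path, quote=True)  # keep the attribute value well-formed
            cells.append(
                '    <td style="text-align: center; vertical-align: middle;">'
                f'<img src="{src}" alt=""></td>'
            )
        rows.append("  <tr>\n" + "\n".join(cells) + "\n  </tr>")
    return (
        '<table style="border-collapse: collapse; border: none;">\n'
        + "\n".join(rows)
        + "\n</table>"
    )


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(picture_grid_html(["a.png", "b.png", "c.png", "d.png"]))
```

Because each picture sits in its own cell, the pictures in the second row stay directly under those in the first, regardless of how the browser flows the surrounding text.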

(All screenshots were taken in MS Word 2003. Other MS Word versions should have similar dialogs.)

We regularly add the features and functions most requested by our users, and DocToHtml’s further development will be based on that feedback. If you need a new feature, please e-mail your suggestions to feedback@opilsoft.com, and we’ll consider implementing them. If you notice an unsupported feature that is missing from the list above, please let us know. We would appreciate your feedback!