DocToHtml - Doc To HTML Converter
The easy way to batch convert your Word docs to clean HTML/XHTML


DocToHtml - Articles

What real users think and say about our product:


I became a webmaster not long ago, less than two years back. Once the site was built, one of my first priorities turned out to be a stable, browser-independent way to add text to the site as normal-looking HTML (without "mso-bidi-font" and other such abracadabra). Until then I had prepared documents in MS Word and simply copy-pasted them onto the site (as every novice does, I suppose). The problem appeared when I replaced Internet Explorer 8 with Internet Explorer 9: the two handle copy-pasting very differently, and the resulting HTML page looked nothing like what I had prepared in MS Word. I tried to write a converter myself but soon realized that my knowledge, though not bad, was not enough for the task; it takes real expertise. Then I found what I was looking for: "Doc to HTML Converter". It meets my needs almost completely.

I see several undeniable advantages in this program. Firstly, it takes its data from deep inside MS Word: not from what Word shows on screen, but from what Word actually stores internally. I reckon that is the major point; this approach is what makes it stable. Secondly, it is not expensive, unlike other programs of its kind. I would say they ask even less than it should cost. Thirdly, the development team consists of very diligent, careful and responsive people. They really do their best to improve the program's quality and features. These three things are the three pillars of success. Good luck!

Tatsu Takamaro

Many thanks to Tatsu Takamaro, who took the time to write this review, suggested many improvements, and pointed out a series of bugs in the program.


Many of us who publish on the Internet have longer articles we want to share with the world. Often we have written these articles in a word processor such as Microsoft Word. Importing them into WordPress, or whatever publishing system we use, is not an easy task, especially if we want to retain the formatting of the article. Microsoft Word produces fairly nonstandard, messy HTML code, so simply saving the article as HTML is not acceptable. If we instead cut and paste the article into the visual editor in WordPress, WordPress removes most of the HTML code in an attempt to get the best possible result. That is not an optimal solution either, since we usually have to go through the text and redo the formatting to get the look we want.

I am responsible for several sites where I frequently need to publish material I have received in Word format or in other word processing formats. Normally I use the HTML editor in WordPress when I create new posts or pages (I am doing so now as I type this, too). I find that the easiest, because I feel I have full control over what I put out and how it is presented. So I also think a good solution is to have the text in the cleanest possible HTML, ready to paste. But how do I convert the texts into HTML so pure and clean that I can paste them into the HTML editor and publish them without lots of cleanup work first?

I have recently tested several solutions for converting Microsoft Word documents to HTML without being completely satisfied. If it wasn't one thing that was wrong, it was another. One of the tools produced fairly nice, clean HTML but didn't handle multiline paragraphs very well. Each line of a paragraph in the original document was defined as a separate paragraph in the HTML document, and I had to go through the whole text and clean it up. I was not pleased with that.

A couple of weeks ago I happened to come across the program DocToHtml from Opilion Software. I thought it might be worth testing too, since I still had not found anything I was completely happy with. So I did.

The impression so far is very good. DocToHtml has a variety of options, and I find that it delivers good, clean HTML I can import and use without further ado.

DocToHtml installs as an add-in in Microsoft Word, and depends on Word to do the job. For those of us who have Word installed on our computer, this is okay. The program can be run through an extra menu tab in Word 2000/2003, or from the ribbon in the newer versions of Word. In addition, it can be run from the desktop or the Start menu in Windows, and it can also be integrated in the "right-click menu" in Windows Explorer. DocToHtml works perfectly in Windows 7.

I've tested a beta of the upcoming version 3.0 of DocToHtml for a few days, and I can only say that I'm very pleased so far. This is a tool I can confidently recommend!

DocToHtml is not free. A license costs $39, but in terms of saved time, reduced work and, importantly, saved frustration, this is not an unreasonable price.

For more information about DocToHtml you can go to the program's website. There you can also download trial versions of both the current release and the beta version of the upcoming edition.

Tore Johnny
The original Norwegian version of the article

Many thanks to Tore Johnny, who took the time to write this review and gave a lot of valuable advice on the program's interface and functionality.


Have you ever had to publish a lot of MS Word files on the Web? I thought it would be as simple as File > Save to HTML and then just publishing the result on the site. But it isn't. MS Word does not use normal, standardized HTML; it has its own HTML, which does not display well in any browser, not even Internet Explorer. One of my first thoughts on solving this issue was to use some HTML sanitation software or PHP code, but after googling for hours I had not found anything capable of cracking the "MS Word HTML code".

OK, I told myself, what if I use some regular expressions to clean up the mess? After lots of trial and error I simply gave up. My regex knowledge wasn't that good anyway.

So after a few days of struggling with something that should have been done by Uncle Bill's men in the first place, I gave up and started to look for desktop software that would save the day. After trying some major players (the top search results), I was glad to find DocToHtml by Opilion Software.

It had everything I was looking for and more. I immediately gave it a try and was pleasantly surprised by how good its final HTML was. I was able to successfully convert my most complex DOCX file, with many diagrams, graphs, pictures, and internal and external hyperlinks, with this saviour from Opilion, DocToHtml.

After a detailed examination of the resulting, proper HTML, I found that there was really no need for any further manual work, so I just clicked Add folder and play :)

It really was that easy, like child's play. All of my MS Word files were finally ready for publishing, and everybody wanted the recipe for it.

If only somebody had told me about this elegant solution earlier, I would not have lost so much time.

Thank you, Opilion Software, and keep up the good work!

Bogdan Cerovac
