Download CleanXHTML 1.2 for Office Word 2003.
Get the free trial copy by clicking on the Try button.
You can also purchase CleanXHTML 1.2 from the same location by clicking on the Buy button.
CleanXHTML 1.2, now requiring .NET 2.0, extends Microsoft Office Word, adding the ability to export entire documents or a selected range within a document as “clean” XHTML. This “clean” XHTML is standards-compliant XHTML Transitional markup, free of the proprietary markup found in Microsoft Office documents. The design goal here is to leverage the robust Office System platform to produce “generic” markup ready for CSS formatting by Blog tools, RSS feeds or any publishing system that depends on XHTML.
This document, an XML fragment stored in a database, is produced by CleanXHTML!
This new version, CleanXHTML 1.2, is now based on .NET 2.0 and is amazingly fast! CleanXHTML also installs as a ‘real’ Office Add-In just like any other .NET 2.0 Setup application. For more details about this version, see “Known Issues” and “Future Plans for CleanXHTML-based Products” below. Download the trial copy of CleanXHTML 1.2 and your existing registration data should upgrade you automatically. In some cases you may have to enter your registration data again. If you have further questions, feel free to contact Songhay System. Your patience with us is greatly appreciated!
The CleanXHTML menu is appended to the main menu as an Office Add-in when it is successfully installed. The menu has three items: Convert Selection…, Options… and About CleanXHTML….
This is the command that converts a selected range into XHTML. To convert an entire Word document, select the entire document (press Ctrl+A or choose Edit > Select All). When CleanXHTML finds a selection, it converts it and shows the XHTML in the Windows Form below:
Use this form to copy the XHTML to the Clipboard (with the Copy XHTML button) or save the XHTML to a file (with the Save XHTML button). The form also offers several CleanXHTML output options, summarized in the table below:
| Output Option | Remarks |
|---|---|
| Copy <body> content only | CleanXHTML outputs well-formed XML documents by default. When enabled, this option strips the XHTML document down to the XML fragment inside the <body> element. This feature is useful for producing Blog entries and other XHTML fragments. |
| Expand glyphs | When enabled, selected UTF-8 Latinate glyphs are ‘expanded’ into entities, so a glyph like é is converted into its corresponding entity. This feature is useful for HTML conversion for non-Unicode environments. |
| Convert to HTML | When enabled, the output is converted into HTML 4.x. |
| Wrap text | When enabled, the output wraps inside of its display area. |
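The “Copy <body> content only” and “Expand glyphs” options in the table above can be approximated outside of Word. The following is a minimal Python sketch, not CleanXHTML's own implementation: the function names `body_fragment` and `expand_glyphs` are hypothetical, and the use of named HTML entities (rather than numeric ones) is an assumption, since the set of glyphs CleanXHTML actually expands is not listed here.

```python
from html.entities import codepoint2name
from xml.dom import minidom


def body_fragment(xhtml: str) -> str:
    """Return the markup inside <body>, without the <body> tags themselves."""
    document = minidom.parseString(xhtml.encode("utf-8"))
    body = document.getElementsByTagName("body")[0]
    # Serialize each child node of <body> and join them into one fragment.
    return "".join(child.toxml() for child in body.childNodes).strip()


def expand_glyphs(markup: str) -> str:
    """Replace non-ASCII characters that have a named entity (assumed behavior)."""
    expanded = []
    for character in markup:
        code_point = ord(character)
        if code_point > 127 and code_point in codepoint2name:
            expanded.append(f"&{codepoint2name[code_point]};")
        else:
            expanded.append(character)
    return "".join(expanded)


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>Sample</title></head>"
        "<body><p>Caf\u00e9 <em>menu</em></p></body>"
        "</html>"
    )
    print(expand_glyphs(body_fragment(sample)))
```

Expanding glyphs after extracting the fragment keeps the XML parser out of entity handling, which matters because a bare XML parser does not know named entities unless a DTD declares them.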
The Options… command opens the Options dialog shown above. These options control which XHTML attributes are generated, largely to support CSS design. The table below summarizes them:
| Option Group | Option | Remarks |
|---|---|---|
| Block Alignment | Enable <p align=""> | When enabled, CleanXHTML will translate Word paragraph alignment (except Left alignment) into XHTML. |
| Block Alignment | Enable <p align="left"> | When enabled, CleanXHTML will translate Word paragraph Left alignment into XHTML. |
| Block Alignment | Enable Table Cell align | When enabled, CleanXHTML will translate Word Table Cell horizontal alignment into XHTML. This will output align attributes for td and th elements. |
| Block Alignment | Enable <table id=""> | When enabled, CleanXHTML will identify each Word table in XHTML with table id attributes based on the specified prefix (CleanXhtmlTable_ by default), a random sequence of letters and ordinal position. |
| Endnotes | Enable Endnotes | When enabled, CleanXHTML will translate Word Footnotes into XHTML endnotes, a div block with id='EndNotesBlock' by default. |
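The table id scheme described above (prefix, random letters, ordinal position) can be illustrated with a short Python sketch. The exact layout CleanXHTML uses is not documented here, so the separator, the number of letters, and the function name `make_table_id` are all assumptions made for illustration.

```python
import random
import string
from typing import Optional


def make_table_id(
    ordinal: int,
    prefix: str = "CleanXhtmlTable_",
    letters: int = 4,
    rng: Optional[random.Random] = None,
) -> str:
    """Build an id from a prefix, random lowercase letters and an ordinal.

    The "<prefix><letters>_<ordinal>" layout is a guess, not the
    documented CleanXHTML format.
    """
    rng = rng or random.Random()
    token = "".join(rng.choice(string.ascii_lowercase) for _ in range(letters))
    return f"{prefix}{token}_{ordinal}"


if __name__ == "__main__":
    # Ids for the first three tables in a document.
    for position in range(1, 4):
        print(make_table_id(position))
```

The random letters make collisions unlikely when fragments from several documents land on one page, while the ordinal keeps the ids readable in CSS selectors.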
| Style/Effect | CleanXHTML Output |
|---|---|
| A bold character style or the Bold style | <strong>bold character</strong> |
| An italic character style or the Emphasis style | <em>italic character</em> |
| A bold-italic style | <strong><em>bold-italic</em></strong> |
| A Strike-Through style or the Strikethrough style | <span style="text-decoration:line-through;">Strike-Through</span> |
| A 2nd Superscript effect | A 2<sup>nd</sup> Superscript style |
| An x2 Subscript effect | An x<sub>2</sub> Subscript effect |
| A Small Caps effect or the Small Caps style | <span style="font-variant:small-caps;">Small Caps</span> |
| An all caps effect or the All Caps style | <span style="text-transform:uppercase;">all caps</span> |
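To make the character-level mapping above concrete, here is a small Python sketch that wraps a run of text in the markup listed in the table. It is an illustration only: the effect keys (`"bold"`, `"italic"` and so on) and the function `wrap_run` are hypothetical names, and nothing here reflects how CleanXHTML reads formatting from Word.

```python
from html import escape

# Opening and closing markup per effect, taken from the table above.
# Dictionary order decides nesting: earlier entries end up outermost.
WRAPPERS = {
    "bold": ("<strong>", "</strong>"),
    "italic": ("<em>", "</em>"),
    "strikethrough": ('<span style="text-decoration:line-through;">', "</span>"),
    "superscript": ("<sup>", "</sup>"),
    "subscript": ("<sub>", "</sub>"),
    "small_caps": ('<span style="font-variant:small-caps;">', "</span>"),
    "all_caps": ('<span style="text-transform:uppercase;">', "</span>"),
}


def wrap_run(text: str, effects: set) -> str:
    """Wrap escaped text in the markup for every requested effect."""
    markup = escape(text, quote=False)
    active = [name for name in WRAPPERS if name in effects]
    # Apply innermost wrappers first so that bold encloses italic.
    for name in reversed(active):
        opening, closing = WRAPPERS[name]
        markup = f"{opening}{markup}{closing}"
    return markup


if __name__ == "__main__":
    print(wrap_run("bold-italic", {"bold", "italic"}))
    print("A 2" + wrap_run("nd", {"superscript"}) + " Superscript style")
```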
| Style Name | CleanXHTML Output |
|---|---|
| Block Text | The |
| Heading 1 | <h1></h1> |
| Heading 2 | <h2></h2> |
| Heading 3 | <h3></h3> |
| Heading 4 | <h4></h4> |
| Heading 5 | <h5></h5> |
| Heading 6 | <h6></h6> |
| Hyperlink | |
| Normal | See “The Options… Command” above for details about aligning the |
| List Bullet | <ul><li>List Bullet</li></ul> The List Bullet style appears with the Ctrl+Shift+L shortcut. |
| List Number | <ol><li>List Number</li></ol> |
| Style Name | CleanXHTML Output |
|---|---|
| HTML Acronym | <acronym></acronym> |
| HTML Cite | <cite></cite> |
| HTML Code | <code></code> |
| HTML Definition | <dfn></dfn> |
| HTML Keyboard | <kbd></kbd> |
| HTML Preformatted | |
| HTML Sample | <samp></samp> |
| HTML Typewriter | <tt></tt> |
| HTML Variable | <var></var> |
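The style-name tables above amount to a lookup from a Word style name to an XHTML element. The Python sketch below shows that idea for the rows whose output is listed; rows with no listed output are deliberately left out, and text in an unlisted style is returned unwrapped. The names `STYLE_ELEMENTS` and `render_styled_text` are hypothetical and not part of CleanXHTML.

```python
from html import escape

# Word style name -> XHTML element name, limited to rows listed above.
STYLE_ELEMENTS = {
    **{f"Heading {level}": f"h{level}" for level in range(1, 7)},
    "HTML Acronym": "acronym",
    "HTML Cite": "cite",
    "HTML Code": "code",
    "HTML Definition": "dfn",
    "HTML Keyboard": "kbd",
    "HTML Sample": "samp",
    "HTML Typewriter": "tt",
    "HTML Variable": "var",
}


def render_styled_text(style_name: str, text: str) -> str:
    """Wrap escaped text in the element mapped to the given style name."""
    content = escape(text, quote=False)
    element = STYLE_ELEMENTS.get(style_name)
    if element is None:
        # Unlisted style: emit the escaped text without a wrapper.
        return content
    return f"<{element}>{content}</{element}>"


if __name__ == "__main__":
    print(render_styled_text("Heading 2", "Known Issues"))
    print(render_styled_text("HTML Code", "if (a < b)"))
```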
The About CleanXHTML… command shows credits and a link to this document.
The table below summarizes built-in styles that CleanXHTML recognizes:
Generate th elements instead of td elements by formatting the entire Word Table Cell as any Header style.

| Resource | Remarks |
|---|---|
| “Upgrading My Office Solution to a VSTO SE Application-Level Add-In” | My personal, uncensored notes about the frustrating aspects of VSTO SE development. |
| Use XML to store configuration settings | Provided the germ for developing the CleanXHTML configuration system. |
| Description of Office Safe Mode for Word 2003 and Word 2002 | Introduced the concept of “Safe Mode” relating to Office applications. |
| HOW TO: Set the Mask Property and the Picture Property for an Office 2003 CommandBar Button | Directly related to designing the menu for CleanXHTML. |
| “Menu, Toolbar, or Form Suddenly Stops Working?” | A very important article explaining a bug that got me. Ouch! Hours of fun… |
| “Even More On Adding Images to Command Bar Buttons” | This was another way of highlighting the importance of System.Windows.Forms.AxHost. |
| “A Simple Resource Helper Class” | A Blog entry from a Microsoft guy that made retrieving bitmaps and other resources easier. |
| Converting WordprocessingML into HTML (for easy viewing) | Brian Jones of the Office team introduces Word 2003: XML Viewer. You can use this tool instead of CleanXHTML for document-centric “fidelity.” It contains the XSL transformation released by Microsoft that translates WordprocessingML into HTML. |
| How to use startup command line switches to start Word 2003, Word 2002, and Word 2000 | “This article describes the command-line switches that can be used to start Word and their purpose. Some of these switches are also described in Word Help.” |