Squeaky Clean was written with HTML exported from Microsoft Office in mind. It strips out all the classes, styles, strange XML and conditionals. The result no longer looks the same, but the markup is clean, which makes it easy to go back in and reimplement the styles using sensible CSS. Alternatively, you can edit this file to stop style and class attributes being removed.

Documents are converted to UTF-8 from whatever charset they started in. By default, most single-byte charsets and Unicode are supported. Installing iconv extends charset support to multi-byte charsets, such as East Asian and Arabic charsets.

This program uses an XML parser to read the HTML, so a source file that is far from XML-compliant will fail to parse.
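
As an illustration of the charset conversion described above, here is a minimal Python sketch. It is not Squeaky Clean's own code: the function name `to_utf8` is hypothetical, and it relies on Python's built-in codecs rather than iconv.

```python
import codecs


def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Decode bytes from source_charset and re-encode them as UTF-8.

    Hypothetical helper for illustration only.
    """
    # codecs.lookup raises LookupError if the charset name is not supported.
    codecs.lookup(source_charset)
    # Decode to a Unicode string first, then encode as UTF-8.
    text = raw.decode(source_charset)
    return text.encode("utf-8")
```

A caller would pass the raw file contents together with the charset the document declares, for example `to_utf8(data, "cp1252")`. An unsupported charset name surfaces as a `LookupError`, and bytes that are invalid in the given charset surface as a `UnicodeDecodeError`.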
- Basic load, clean and display of HTML.
- Log window for status and error messages.
- XML-based parsing that cleans out specified attributes and tags.
- Inline editor for cleaning up by hand.
- Options specified in 'Clean.xml' give the user some control over which attributes and elements are cleaned out.
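
The XML-based cleaning listed above could look roughly like the following Python sketch. It is an assumption-laden illustration, not the program's actual implementation: the names `STRIP_ATTRIBUTES`, `STRIP_ELEMENTS` and `clean` are hypothetical, and the real format of 'Clean.xml' is not shown here, so the settings are hard-coded.

```python
import xml.etree.ElementTree as ET

# Hypothetical settings standing in for whatever 'Clean.xml' specifies.
STRIP_ATTRIBUTES = {"class", "style"}
STRIP_ELEMENTS = {"style"}


def _remove_child(parent: ET.Element, child: ET.Element) -> None:
    """Remove child from parent, keeping any text that followed the child."""
    index = list(parent).index(child)
    tail = child.tail or ""
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


def clean(markup: str) -> str:
    """Parse markup as XML and strip the configured attributes and elements."""
    # ET.fromstring raises ET.ParseError if the markup is not well-formed XML.
    root = ET.fromstring(markup)
    # Snapshot the tree so that removing elements does not disturb iteration.
    for element in list(root.iter()):
        for name in STRIP_ATTRIBUTES:
            element.attrib.pop(name, None)
        for child in list(element):
            if child.tag in STRIP_ELEMENTS:
                _remove_child(element, child)
    return ET.tostring(root, encoding="unicode")
```

Because the input is parsed as XML, markup with unclosed or misnested tags raises `ET.ParseError` instead of being cleaned, which mirrors the parsing limitation noted earlier.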