****************************************
The long awaited high quality BUG report on AntiWord 0.37
---------------------------------------------------------

Critical bugs:

0. PDF creation ("-a") with the 16-bit binary rarely works at all: of the 3 DOCs supplied, none could be converted. What's worse, "WMP.DOC" even makes it hang (hard freeze, memory corruption near ZERO ???), partly after some random error messages. The other 2 DOCs end up in "Memory allocation failed" (see "AWSHOT2.PNG"). I could only convert a ridiculously small DOC (not included) successfully.

Conversion bugs:

1. Output from the 16-bit and 32-bit binaries is supposed to be identical, but isn't. The 16-bit output is always "slightly broken" in 1 or 2 places. Just compare "PE6X16F.TXT" vs "PE6X32F.TXT" or other pairs.

2. With "PECOFFV8.DOC" and the 16-bit binary, there is a false complaint about a "broken table", see "AWSHOT4.PNG".

3. In PDF output (if brewed at all), a few lines are in some cases falsely displayed in light yellow colour (some PDF readers display them light blue, RGB-vs-BGR, damn), check "PE6OHAWP.PNG" vs "PE6OHMSW.PNG".

4. In PDF output (if brewed at all), there are in some cases a few corrupt characters, check "PE6TIAWP.PNG" vs "PE6TIMSW.PNG".

Usability issues:

5. "It's neither in 'C:\antiword' nor in 'C:\antiword'" is neither smart nor too helpful, see "AWSHOT0.PNG", and "AWSHOT1.PNG" about fontnames. Those files will NOT be found in the directory where the executable resides (where they are placed in the ZIP). I see no point in defining 2 such fixed paths and then risking that they will be identical. Why not search for those files just in the directory of the executable (alternatively in "HOME" if defined)?

6. Sending the output to the screen by default and pressuring the user to include the ">" redirection hack is IMHO suboptimal, see "AWSHOT2.PNG".
When AntiWord fails to brew the file properly (more common with the 16-bit version), it leaves an empty or very small, useless, broken file. Why not convert "BLAH.DOC" into "BLAH.TXT" or "BLAH.PDF"? This would allow using a temporary file ("ANTI5973.TMP") and renaming it to the correct name ("BLAH.PDF") only if the conversion completes successfully, and adding some progress indicator for the conversion too. Then maybe add a "send output to stdout" option for systems where that can be useful (not DOS).

7. Calling "-f" "formatted" text is confusing, as the text is always formatted. Maybe rather call it "-y" and "add text style hints" (see also [22.])?

8. The command line is case-sensitive, see "AWSHOT3.PNG". It could also default to "a4" for "-a" PDF output rather than whine.

9. In "00README.TXT", the mentioned "Problems" could easily be avoided by including the DPMI host (for example D3X, just 9 KiB extra). See also [13.].

10. In "00README.TXT", it says "The DOS version expects its mapping files to be in DOS text format. That means that all lines must end with a carriage return, line feed combination. (CR+NL)". Would it be difficult to make all versions accept both? FASM IDE does ;-)

11. When fed an RTF file, it brews 2 error messages, one of them also complaining about "8+3". Better: check for "8+3" first and either refuse to open the file and whine about "8+3" only, or open it and then whine about possible RTF only, but not about "8+3". See also [21.].

Packaging:

12. Documentation files are mixed with files that are part of the application: "00README.TXT", "8859-1.TXT", "8859-2.TXT", "8859-5.TXT", "CP437.TXT", "CP850.TXT", "CP852.TXT", "CP862.TXT", "CP866.TXT", "FONTNAME.TXT". Idea: prefix the files that are part of the application with "AW" ("FONTNAME.TXT" -> "AWFONTNA.TXT") to make them easily distinguishable. The subdirectory "DOCS" can then be deleted. See also [15.].

Docs:

13. System requirements are not documented.
Write something like: "The 16-bit DOS version should run on 8086 or compatible with 512 KiB RAM" and "The 32-bit DOS version uses a DOS Extender (D3X) and needs at least 80386 and 2 MiB RAM. A separate DPMI host is not required." (after fixing [9.]). What about the FPU?

14. "HISTORY" and "CHANGES.TXT" could be merged, as could "00README.TXT", "FAQ", "README" and "ANTIWORD.MAN".

15. The files that are part of the application are not documented: the "88" and "CP" files are always required, but "FONTNAME.TXT" (-> "AWFONTNA.TXT") only for PDF output? See also [12.].

16. In "00README.TXT", the "More advanced installation" is strange and could maybe be removed completely.

17. PDF output is not documented. Ideas: "AntiWord creates a simple PDF file having its version set to "%PDF-1.3". Text styles and sizes are partially preserved. No text block compression is used.". For the DOS port additionally: "It can be viewed using the DJGPP port of MUPDF and even old Adobe Acro 1.0".

18. The 16-bit 8086 version is very inferior (see bugs above). While some problems could be fixed (optimize memory usage, better error messages ("Can't process this file because of 16-bit 8086 limitations")), the limits are impossible to remove completely, so document them. Nevertheless this version of course should NOT be dropped ;-)

19. Add a few lines about the last DOC "upgrade" in 2003 and its death ;-) in 2007. Add a few lines about the RTF, DOCX and ODT formats and some ideas about what to use for them instead of AntiWord.

Missing features:

20. The most painful missing feature is that images are not supported. See "PECOFFV8.DOC", "PE8X32A.PDF" and "PECOFFV8.PNG" (extracted manually). Wouldn't it be trivial to add an image extraction (into separate files) feature?
Even better, additionally, in text or PDF output do something like:

    -----------------
    | PNG image     |
    | 200x60 pixels |
    -----------------

or even better, additionally and optionally, include the images in the PDF file (is it possible to add a PNG or JPG "as-is" ??? Otherwise big problem ...).

21. There is an error about RTF files (see also [11.]). Idea: test also for "PK" and report in such cases: "This is NOT a DOC file, looks like a ZIP or DOCX or ODT file".

22. As an enhancement of the "formatted" output (see also [7.]), the output format "Wiki" could be added. Some online converters do have this feature. I don't need it badly, just an idea.

****************************************
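
Appendix to [11.] and [21.]: a minimal sketch of the signature test suggested there, written in Python only for brevity (AntiWord itself is C). The names "sniff" and "sniff_file" are made up for this illustration and are not AntiWord functions. It relies on three well-known file signatures: OLE2 compound files start with D0 CF 11 E0 A1 B1 1A E1, ZIP-based files (DOCX, ODT) start with "PK" 03 04, and RTF files start with "{\rtf".

```python
# Hypothetical helper, NOT part of AntiWord: guess a file's type from its first bytes.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file, container of Word 97-2003 DOC
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header (DOCX, ODT, plain ZIP)
RTF_MAGIC = b"{\\rtf"                           # RTF files start with the text {\rtf


def sniff(header: bytes) -> str:
    """Return a short diagnosis for the leading bytes of an input file."""
    if header.startswith(OLE2_MAGIC):
        return "OLE2 container, probably a DOC file"
    if header.startswith(ZIP_MAGIC):
        return "This is NOT a DOC file, looks like a ZIP or DOCX or ODT file"
    if header.startswith(RTF_MAGIC):
        return "This is NOT a DOC file, looks like an RTF file"
    return "unknown signature"


def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of 'path' and classify them."""
    with open(path, "rb") as f:
        return sniff(f.read(8))
```

A file without the OLE2 signature is not necessarily invalid, since older Word formats predate this container, so "unknown signature" should lead to a softer message than a flat refusal.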