From: Manuel Collado
Date: Mon, 11 Sep 2006 10:24:25 +0200
Newsgroups: comp.lang.ada
Subject: Avatox 1.0: Trouble with encoding in Windows
Message-ID: <45051d37@news.upm.es>

The XML generated by Avatox 1.0 on my Windows XP machine declares:

But it contains text fragments taken from the Ada source code, in the native CP-1252 encoding, without any translation. As a result, for Ada source with non-ASCII characters (such as accented letters) the generated XML is not well-formed and is rejected by all XML utilities.

1. The ASIS API should provide a way to know the character encoding of the source file (I think it doesn't).
2. The non-ASCII characters could be converted to XML character references (&#nnn;) by Avatox.

--
Manuel Collado
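
To illustrate suggestion 2, here is a minimal sketch in Python (not Ada, and not Avatox's actual code). It assumes the source fragment is known to be CP-1252, which is exactly the information suggestion 1 says ASIS may not provide. The function name `to_xml_text` is hypothetical. The fragment is decoded first, so the character reference carries the Unicode code point rather than the raw CP-1252 byte value.

```python
from xml.sax.saxutils import escape


def to_xml_text(raw: bytes, source_encoding: str = "cp1252") -> str:
    """Return an ASCII-only string that is safe to embed as XML character data.

    Assumption: `raw` really is encoded in `source_encoding`; bytes that are
    undefined in that code page make decode() raise UnicodeDecodeError.
    """
    # Decode first, so that references use Unicode code points, not byte values.
    text = raw.decode(source_encoding)
    # Escape the XML markup characters &, < and >.
    text = escape(text)
    # Replace every non-ASCII character with a decimal reference (&#nnn;).
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # 0xF1 is n-tilde in CP-1252 (U+00F1, decimal 241).
    print(to_xml_text(b"A\xf1o : Integer;"))
```

Because the output is pure ASCII, it stays valid whatever encoding the XML declaration names, as long as that encoding is ASCII-compatible (UTF-8 or ISO-8859-1, for instance).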