From: Björn Persson
Newsgroups: comp.lang.ada
Subject: Re: Avatox 1.0: Trouble with encoding in Windows
Date: Wed, 13 Sep 2006 19:28:38 GMT

Georg Bauhaus wrote:
> Randy Brukardt wrote:
>
>> And
>> there is no chance of any sort of agreement on source representations for
>> ASIS (or even the naming of them) if there isn't be any for Ada.
>
> Maybe a standard configuration pragma can be devised that informs
> Ada source processors of the encoding used in files/compilation
> units/...?

That would be great, seeing that filesystems in Unix and Windows (and probably most other OSes) typically can't keep track of what character encoding is used in each file.

Theoretically it's better to store the encoding outside the file so that you can know what encoding to use *before* you start reading the file. In practice this is usually impossible. The Unix approach is to have a system-wide locale setting that specifies a character encoding, and assume that all text files on the system use that encoding. This assumption collapses when you connect your system to the Internet and start exchanging data with others. Thus it's pretty much necessary to specify the encoding inside each file, like XML does.

Python has this feature. It allows an encoding declaration in a comment at the beginning of the file, and tries to be compatible with text editors:
http://docs.python.org/ref/encodings.html

-- 
Björn Persson PGP key A88682FD
omb jor ers @sv ge. r o.b n.p son eri nu
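
To illustrate the Python declaration mentioned above, here is a minimal sketch of how a source processor might read the encoding from inside the file before decoding it. It assumes a current Python 3 and uses the standard library's `tokenize.detect_encoding`; the helper name `declared_encoding` and the sample file contents are made up for this example, and nothing here is meant as a proposal for how an Ada pragma would have to work.

```python
import io
import tokenize


def declared_encoding(source_bytes: bytes) -> str:
    """Return the encoding declared in the first two lines of a Python
    source file, or Python's default when there is no declaration.

    (Hypothetical helper; only tokenize.detect_encoding is a real API.)
    """
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# A made-up source file stored as Latin-1, with an Emacs-style declaration
# in a comment on the first line.
source = "# -*- coding: latin-1 -*-\nname = 'Björn'\n".encode("latin-1")

# The declaration line is plain ASCII, so it can be read before the
# encoding of the rest of the file is known.
encoding = declared_encoding(source)
text = source.decode(encoding)
print(encoding)
print(text)
```

Decoding the same bytes as UTF-8 would fail on the "ö", which is the situation a system-wide locale assumption runs into when files come from elsewhere.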