From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sat, 16 Nov 2013 18:01:07 +0100
From: Georg Bauhaus <rm.dash-bauhaus@futureapps.de>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
 rv:17.0) Gecko/20130801 Thunderbird/17.0.8
MIME-Version: 1.0
Newsgroups: comp.lang.ada
Subject: Re: strange behaviour of utf-8 files
References: <73e0853b-454a-467f-9dc7-84ca5b9c29b2@googlegroups.com>
 <1ghx537y5gbfq.17oazom68d4n6.dlg@40tude.net>
 <9d00683c-949c-4e88-a161-ebd78b350d39@googlegroups.com>
In-Reply-To: <9d00683c-949c-4e88-a161-ebd78b350d39@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Message-ID: <5287a4d3$0$9523$9b4e6d93@newsspool1.arcor-online.net>
Organization: Arcor
NNTP-Posting-Date: 16 Nov 2013 18:01:07 CET
NNTP-Posting-Host: c4f01361.newsspool1.arcor-online.net
X-Complaints-To: usenet-abuse@arcor.de
Xref: news.eternal-september.org comp.lang.ada:17697
Date: 2013-11-16T18:01:07+01:00
List-Id: <comp.lang.ada>

On 16.11.13 16:09, Stoik wrote:

> Thanks for the answer. Your advice is certainly sound, but not very satisfactory. The whole purpose of utf-8 is to make
> things portable across platforms. If the compiler cannot deal properly with the
> source code written in the utf-8 encoding, then the whole effort that went into
> all the wide_ and wide_wide_ packages and the new packages that deal with various encodings is lost (all the Latin-x possibilities are useless anyway, at least on Windows platform). I am adjoining a trivial program which works differently according to the encoding (UTF-8 or ISO-8859-1) of the source code, printing 1 or 2 as the answer.
>
> with ada.text_io; use ada.text_io;
> procedure example is
>     S : String := "ó";
> begin
>     Put_Line (S'Length'Img);
> end;

GNAT has two switches that affect how it interprets
coded characters in source text:

for identifiers in source text, specify -gnatiC,
  where C is one of the characters listed in section 3.2.10
  of the GNAT User's Guide accompanying the compiler;

for the wide character encoding method, specify -gnatWE,
  where E is one of the characters listed in the
  same document.

With switch -gnatW8, I get

$ ./example
  1
$

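That is, the source text is understood to be encoded
in UTF-8, and 'ó' becomes the single Character'Val (243), viz.
Ada.Characters.Latin_1.LC_O_Acute. Without the switch, the two
bytes of the UTF-8 sequence are each taken as one Latin-1
character, hence the length of 2.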
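
To see the 1-versus-2 difference independently of how the source file
is encoded, here is a small sketch that contains only ASCII source
text. It assumes an Ada 2012 compiler, since it uses the
Ada.Strings.UTF_Encoding.Strings package; the procedure name
Length_Demo is made up for illustration.

```ada
with Ada.Text_IO;                        use Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Strings.UTF_Encoding.Strings;

procedure Length_Demo is

   --  One Latin-1 character, named rather than typed as a literal,
   --  so this file reads the same under any source encoding.
   S : constant String := (1 => Ada.Characters.Latin_1.LC_O_Acute);

   --  The same character as a UTF-8 byte sequence held in a String.
   U : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
         Ada.Strings.UTF_Encoding.Strings.Encode (S);

begin
   --  Length of the Latin-1 string: one Character.
   Put_Line ("Latin-1 length:" & Integer'Image (S'Length));

   --  Length of the UTF-8 form: one String element per encoded byte.
   Put_Line ("UTF-8 length:  " & Integer'Image (U'Length));

   --  Position of the character in type Character.
   Put_Line ("Position:      " &
             Integer'Image (Character'Pos (S (S'First))));
end Length_Demo;
```

With GNAT, something like gnatmake -gnat2012 length_demo.adb should
build it if Ada 2012 is not already the compiler's default. The first
length corresponds to what the quoted program reports when the literal
is decoded as UTF-8 (-gnatW8), the second to what it reports when each
byte of the UTF-8 file is taken as a separate Latin-1 character.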