From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00
	autolearn=unavailable autolearn_force=no version=3.4.4
X-Received: by 10.157.6.225 with SMTP id 88mr3418149otx.7.1470939775628;
        Thu, 11 Aug 2016 11:22:55 -0700 (PDT)
X-Received: by 10.157.2.39 with SMTP id 36mr158820otb.3.1470939775581; Thu, 11
 Aug 2016 11:22:55 -0700 (PDT)
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!news.eternal-september.org!feeder.eternal-september.org!usenet.blueworldhosting.com!feeder01.blueworldhosting.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!f6no9207811ith.0!news-out.google.com!d130ni32046ith.0!nntp.google.com!f6no9207805ith.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Thu, 11 Aug 2016 11:22:55 -0700 (PDT)
In-Reply-To: <noidr9$atk$1@gioia.aioe.org>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=2001:8a0:6a4f:fe01:b44b:9abb:7567:a522;
 posting-account=nd46uAkAAAB2IU3eJoKQE6q_ACEyvPP_
NNTP-Posting-Host: 2001:8a0:6a4f:fe01:b44b:9abb:7567:a522
References: <267bd80f-b388-4df6-b712-315ee9bda2b8@googlegroups.com>
 <noi8r3$2p6$1@gioia.aioe.org>
 <90caee48-5fa7-47d7-aad5-761e11225e2c@googlegroups.com>
 <noidr9$atk$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <4c6509a9-5ff2-4f94-b2c3-55d89ca2b076@googlegroups.com>
Subject: Re: A few questions on parsing, sockets, UTF-8 strings
From: john@peppermind.com
Injection-Date: Thu, 11 Aug 2016 18:22:55 +0000
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Xref: news.eternal-september.org comp.lang.ada:31401
Date: 2016-08-11T11:22:55-07:00
List-Id: <comp.lang.ada>

On Thursday, August 11, 2016 at 6:49:33 PM UTC+1, Dmitry A. Kazakov wrote:

> ASCII string is an UTF-8 string. The reverse if false.

You're right, ASCII uses only 0...127 as code points. But I thought that Ada fixed strings hold one byte per character, meaning that I can store UTF-8 in them? Am I mistaken about that?

> > So if I Base64 encode this directly, do I have to care about UTF-8?
>
> No, if it is strictly ASCII. Yes, if you are going to use other Unicode
> code points.

Sorry for being such a noob, but I still don't get it. If GNAT GPS is set to UTF-8 (-gnatW8 for gnatmake and source encoding in GPS preferences), doesn't that mean that if I enter a Unicode character into a fixed string literal (just String, not Wide_String or Wide_Wide_String) that the string will contain this character in the form of as many bytes as the Unicode code point requires? So if it's a two-byte UTF-8 code point, then the string will contain two bytes?

In that case, as long as I don't need to access single characters ever, could I stick with fixed strings?
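
A minimal sketch of the situation I am asking about, assuming an Ada 2012 compiler (for Ada.Strings.UTF_Encoding). The two UTF-8 bytes of U+00E9 are written out explicitly, so the result does not depend on -gnatW8 or on the editor's source encoding; how a literal typed directly into the source is treated under -gnatW8 is exactly the open question and is not shown here. The procedure and object names are only illustrative:

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure UTF8_In_String is
   --  U+00E9 (e with acute accent) spelled out as its two UTF-8 bytes.
   E_Acute : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A9#);

   --  UTF_8_String is a subtype of String, so this is still a plain
   --  fixed string; it just happens to hold UTF-8 encoded bytes.
   S : constant Ada.Strings.UTF_Encoding.UTF_8_String := "caf" & E_Acute;

   --  Decoding to Wide_Wide_String yields one element per code point.
   Decoded : constant Wide_Wide_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode (S);
begin
   --  S'Length counts String elements (bytes), not characters.
   Ada.Text_IO.Put_Line ("Bytes:      " & Integer'Image (S'Length));
   Ada.Text_IO.Put_Line ("Code points:" & Integer'Image (Decoded'Length));
end UTF8_In_String;
```

Here S'Length is 5 while Decoded'Length is 4, so indexing or slicing S works on bytes rather than characters. Passing S around as a whole, for example to a Base64 encoder, would not be affected by that difference.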