From: "Jeffrey R. Carter"
Newsgroups: comp.lang.ada
Subject: Why UTF-8 (was Re: Lower bounds of Strings)
Date: Sat, 9 Jan 2021 15:52:39 +0100

On 1/9/21 3:31 AM, Randy Brukardt wrote:
> The default String should be UTF-8, the others should be reserved for
> special cases (interfacing in particular). You don't want the default string
> type to restrict the contents, and you don't want it to waste a lot of
> space.

I don't understand this. I presume there was a time when the extra complexity of UTF-8 was a reasonable price to pay for the larger than 1-byte character range it provided, and there may be systems where it still makes sense, but with most systems these days having GB of memory and TB of storage, the simplicity of using 2 bytes per character seems worth the wasted space.
On my 4-yr-old computer I could do everything with 4-byte characters and not have a problem.

--
Jeff Carter
"Unix and C are the ultimate computer viruses."
Richard Gabriel
99