From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,e136d2bb18e6fb60
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-12-14 19:46:32 PST
Path: archiver1.google.com!postnews1.google.com!not-for-mail
From: starner@okstate.edu (David Starner)
Newsgroups: comp.lang.ada
Subject: Re: Character Sets
Date: 14 Dec 2002 19:46:32 -0800
Organization: http://groups.google.com/
Message-ID: <81f70ac6.0212141946.5e4f132@posting.google.com>
References: <mailman.1038963002.11173.comp.lang.ada@ada.eu.org>
 <81f70ac6.0212131927.4fa6b642@posting.google.com>
 <665e587a.0212141453.42386f5d@posting.google.com>
NNTP-Posting-Host: 139.78.98.169
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: posting.google.com 1039923992 18607 127.0.0.1 (15 Dec 2002 03:46:32
 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: 15 Dec 2002 03:46:32 GMT
Xref: archiver1.google.com comp.lang.ada:31839
Date: 2002-12-15T03:46:32+00:00
List-Id: <comp.lang.ada>

vgodunko@vipmail.ru (Vadim Godunko) wrote in message news:<665e587a.0212141453.42386f5d@posting.google.com>...
>
> How much memory is required to store all the data from the Unicode
> Character Database?

After stripping the converters, ICU takes up 3 MB
<http://oss.software.ibm.com/icu/userguide/icudata.html>. But that
includes a lot of locale data, and it could probably be compressed
further with some work.
There's no reason it would need to be paged into memory.

> What do you do if this constant changes? Retest all existing
> applications?

If the constant changed, then your version of the compiler changed,
and it's certainly possible that the new compiler broke your program,
constant or not. Given a stable API, a program should not break from a
change in the Unicode data, especially as the Unicode Consortium tries
not to make major changes to the data between versions.
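
To make the "stable API" point concrete, here is a minimal sketch in
Python rather than Ada, assuming the standard unicodedata module as a
stand-in for whatever character database interface an Ada
implementation might offer. The helper name is_letter is hypothetical.
The program asks the database for a property instead of hard-coding
tables, so a newer data version changes the answers for newly assigned
characters without changing the interface.

```python
import unicodedata

def is_letter(ch):
    """Return True if the database classifies ch as a letter.

    Hypothetical helper: it queries the general category at run time
    rather than embedding code point ranges in the program.
    """
    return unicodedata.category(ch).startswith("L")

# The data version is visible, but the program does not depend on it.
data_version = unicodedata.unidata_version

latin = is_letter("a")          # LATIN SMALL LETTER A
cyrillic = is_letter("\u0436")  # CYRILLIC SMALL LETTER ZHE
digit = is_letter("1")          # DIGIT ONE is category Nd, not a letter
```

Only data_version differs from one release of the data to the next;
the calls themselves stay the same.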
 
> Latest public GNAT version and GCC3/GNAT both support ISO-8859-5
> encoding in identifiers. 

Which may explain why people weren't using it in earlier versions. 

> And I don't know of any GNAT users who use
> KOI8-R/U/B encodings outside comments, character and string literals.

The problem is, source encoding is tied into the encoding that I/O
uses.

> The best way to implement localization/internationalization support
> in Ada is to define a special needs annex,

Non-BMP Unicode is not l10n/i18n; it's basic text handling, just
like the rest of Unicode. As for the character data and encodings:
sure, whatever, just so long as they're supported in some way.
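
As an illustration of why non-BMP characters are ordinary text, here
is a small sketch in Python (not Ada), again assuming the standard
unicodedata module. It takes one character above U+FFFF and shows that
it has a general category and a case mapping like any other letter,
while needing two 16-bit code units in UTF-16. The helper name
utf16_units is hypothetical.

```python
import unicodedata

# U+10400 DESERET CAPITAL LETTER LONG I lies outside the BMP.
ch = "\U00010400"

def utf16_units(s):
    """Count the 16-bit code units needed to encode s as UTF-16."""
    return len(s.encode("utf-16-be")) // 2

info = {
    "code_points": len(ch),                 # one character
    "utf16_units": utf16_units(ch),         # a surrogate pair
    "utf8_bytes": len(ch.encode("utf-8")),  # four bytes
    "category": unicodedata.category(ch),   # uppercase letter
    "lower": ch.lower(),                    # U+10428, also non-BMP
}
```

A 16-bit character type cannot hold this letter in a single element,
so plain case conversion already needs support beyond the BMP.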