comp.lang.ada
From: Manuel Collado <m.collado@fi.upm.es>
Subject: Re: Reading "normal" text files with Wide_Text_IO in GNAT
Date: Mon, 11 Dec 2006 20:49:02 +0100
Message-ID: <457db62d@news.upm.es>
In-Reply-To: <1165456975.595248.177740@l12g2000cwl.googlegroups.com>

Adam Beneschan wrote:
> Björn Persson wrote:
>> ...
>> I'd still like to know where UCS-1 is defined, and by whom.
>> http://www.iana.org/assignments/character-sets lists ISO-10646-UCS-2,
>> ISO-10646-UCS-4 and ISO-10646-UCS-Basic, but no UCS-1.
>> http://www.unicode.org/glossary/#U also has entries for UCS-2 and UCS-4,
>> but no UCS-1.
> ...
> UCS-2 and UCS-4 are representations in which if an integer N maps to a
> character, then that character is represented simply by a 2- or 4-byte
> binary representation of N (byte ordering is an issue, though).  So it
> would seem logical that UCS-1 would simply refer to a 1-byte binary
> representation of a number.  That's how it seemed to me, and I did find
> other references to this term, so I figured it was the correct term.
> But maybe it isn't official.

Well, it seems that there are no official names for simple, direct
encodings (ones not tied to a given character set). In fact, UCS-2 and
UCS-4 are names specific to Unicode (UCS means Universal Character Set).

Character encoding concepts are precisely defined in:

     http://en.wikipedia.org/wiki/Character_encoding

As you can see, the encoding issue is composed of two separate ideas: the
CEF (character encoding form) and the CES (character encoding scheme). Some
of the latter have explicit names. But the direct CEFs are so simple
that they don't need explicit names (just the size of the code value).
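
To make the CEF/CES split concrete, here is a rough sketch in Python (chosen only for brevity; the function names direct_cef and serialize are made up for this illustration and come from no standard or library). The first step maps each character to one fixed-width code value; the second lays those values out as bytes in a chosen byte order.

```python
import struct

def direct_cef(text, bits):
    """CEF step (hypothetical helper): one fixed-width code value per character."""
    limit = 1 << bits
    values = [ord(ch) for ch in text]
    for v in values:
        if v >= limit:
            raise ValueError(f"U+{v:04X} does not fit in {bits} bits")
    return values

def serialize(values, bits, big_endian=True):
    """CES step (hypothetical helper): write code values as bytes in a byte order."""
    fmt = {8: "B", 16: "H", 32: "I"}[bits]
    prefix = ">" if big_endian else "<"
    return struct.pack(f"{prefix}{len(values)}{fmt}", *values)

units = direct_cef("Ab", 16)               # [65, 98]
be = serialize(units, 16, big_endian=True)
le = serialize(units, 16, big_endian=False)
print(units, be, le)
```

The same code values give two different byte sequences, which is exactly the byte-ordering issue Adam mentions for the 2- and 4-byte cases. With 8-bit code values the two orders coincide, so no byte-order choice is needed.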

If we take UCS-2 and UCS-4 out of the Unicode world and use them as general
names for direct CEFs with 16-bit and 32-bit code values, then UCS-1
becomes the natural name for the direct CEF with 8-bit code values, whether
it is official or not.
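
As an illustration only, "UCS-1" in this informal sense could be sketched in Python as follows. The names ucs1_encode and ucs1_decode are invented for this example and are not an official API; the sketch assumes the code values are Unicode code points, in which case the result coincides with what Python's "latin-1" codec produces.

```python
def ucs1_encode(text):
    """Hypothetical helper: one byte per character, value = code point."""
    values = [ord(ch) for ch in text]
    if any(v > 0xFF for v in values):
        raise ValueError("character does not fit in an 8-bit code value")
    return bytes(values)

def ucs1_decode(data):
    """Hypothetical helper: each byte value is taken directly as a code point."""
    return "".join(chr(b) for b in data)

sample = "Bj\u00f6rn"                     # contains one non-ASCII character
raw = ucs1_encode(sample)
print(raw == sample.encode("latin-1"))    # same bytes as the Latin-1 codec
print(ucs1_decode(raw) == sample)         # round trip
```

Any character whose code value is above 255 simply cannot be represented, which is the price of the 1-byte direct form.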

Regards.
-- 
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado


