comp.lang.ada
From: "Adam Beneschan" <adam@irvine.com>
Subject: Re: Reading "normal" text files with Wide_Text_IO in GNAT
Date: 6 Dec 2006 18:02:55 -0800
Message-ID: <1165456975.595248.177740@l12g2000cwl.googlegroups.com>
In-Reply-To: lDIdh.25626$E02.10478@newsb.telia.net

Björn Persson wrote:
> Manuel Collado wrote:
> > UCS-1 means encoding each character (codepoint) as a single byte whose
> > numerical value is just the codepoint. Can be used only for codepoints in
> > the range (0..255). UCS-1 is the natural, implicit encoding of all 8-bit
> > (and 7-bit) character sets.
>
> I'd still like to know where UCS-1 is defined, and by whom.
> http://www.iana.org/assignments/character-sets lists ISO-10646-UCS-2,
> ISO-10646-UCS-4 and ISO-10646-UCS-Basic, but no UCS-1.
> http://www.unicode.org/glossary/#U also has entries for UCS-2 and UCS-4,
> but no UCS-1.

UCS-Basic may be the "official" name for what I'm talking about.
Unfortunately, I'm having trouble figuring it out.  The IANA website
you referred me to is titled "Character Sets", but some of the things
listed underneath are encoding standards (UTF-8, etc.) rather than
character sets; UCS-Basic is listed as a "subset of Unicode", however,
and Unicode is a character set (not an encoding; there are multiple
ways to encode Unicode characters, including UTF-8, UTF-16, UCS-2).  So
this page just exemplifies the sort of confusion Manuel referred to.  A
quick Google search hasn't provided any further enlightenment on
exactly what UCS-Basic is.  Specifically, I can't tell whether it's a
character set or an encoding.
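
To make the character-set-versus-encoding distinction concrete, here is
a small sketch in Python (chosen only because it is easy to run; the
sample character is an arbitrary choice of mine).  It shows a single
code point turning into different byte sequences depending on which
encoding is applied:

```python
# One character, one code point, several byte-level encodings.
ch = "é"                      # sample character (an arbitrary choice)
codepoint = ord(ch)           # the Unicode code point, an integer

encodings = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be"]
encoded = {name: ch.encode(name) for name in encodings}

print(f"U+{codepoint:04X}")
for name, data in encoded.items():
    print(f"{name:10} {data.hex()}")
```

The code point is the same in every case; only the bytes differ, which
is why a list that mixes the two kinds of names is hard to read.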

UCS-2 and UCS-4 are representations in which, if an integer N maps to a
character, that character is represented simply by the 2- or 4-byte
binary representation of N (byte ordering is an issue, though).  So it
would seem logical that UCS-1 would simply refer to a 1-byte binary
representation of a number.  That's how it seemed to me, and I did find
other references to this term, so I figured it was the correct term.
But maybe it isn't official.
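
As a rough illustration of what I mean, here is a Python sketch of the
"just write N as a fixed-width binary integer" idea.  The function name
encode_fixed_width and its parameters are hypothetical, made up for this
example; they are not taken from any standard or library.  Width 1
corresponds to what I have been calling "UCS-1", which, as noted, may
not be an official term.

```python
def encode_fixed_width(codepoint: int, width: int, byteorder: str = "big") -> bytes:
    """Represent a code point N as a fixed-width binary integer.

    Hypothetical helper for illustration: width=2 and width=4 mimic the
    UCS-2 and UCS-4 idea, width=1 the informal "UCS-1" idea.
    """
    if not 0 <= codepoint < 1 << (8 * width):
        raise ValueError(f"code point {codepoint:#x} does not fit in {width} byte(s)")
    return codepoint.to_bytes(width, byteorder)


n = ord("é")  # 0xE9, small enough to fit in one byte

one_byte = encode_fixed_width(n, 1)
two_bytes_big = encode_fixed_width(n, 2, "big")
two_bytes_little = encode_fixed_width(n, 2, "little")  # the byte-ordering issue
four_bytes = encode_fixed_width(n, 4)

# For code points 0..255, the 1-byte form is the same byte that
# ISO 8859-1 (Latin-1) uses for that character.
same_as_latin1 = one_byte == "é".encode("latin-1")
```

A code point above 255, such as ord("€"), makes the 1-byte call raise
ValueError, which matches Manuel's point that this form can only be used
for code points in the range 0..255.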

Sigh....

                               -- Adam



