From: Adam Beneschan <adam@irvine.com>
Subject: Re: Why no Ada.Wide_Directories?
Date: Tue, 18 Oct 2011 11:33:01 -0700 (PDT)
Message-ID: <2aa8f6de-2c28-47e3-b358-514b0dd5ee6d@u13g2000prm.googlegroups.com>
In-Reply-To: j7kcu7$gcg$1@dont-email.me
On Oct 18, 10:27 am, "J-P. Rosen" <ro...@adalog.fr> wrote:
> On 18/10/2011 17:34, Adam Beneschan wrote:
>
> > On Oct 18, 12:55 am, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
> > wrote:
> >> On Mon, 17 Oct 2011 18:10:35 -0700 (PDT), Adam Beneschan wrote:
> >>> I have a feeling you're fundamentally confused about what UTF-8 is, as
> >>> compared to "Latin-1". Latin-1 is a character mapping. It defines,
> >>> for all integers in the range 0..255, what character that integer
> >>> represents (e.g. 77 represents 'M', etc.). Unicode is a character
> >>> mapping that defines characters for a much larger integer range.
>
> >> No, Unicode is a standard that describes character mappings. Both UTF-8 and
> >> Latin-1 are encodings. Latin-1 as an encoding has the property that there is
> >> a 1-1 octet to code point correspondence, at the cost that some (most)
> >> code points cannot be represented by the encoding. UTF-8 lacks this
> >> property, but is capable of representing all code points.
>
> > Sigh... I guess you're right about the term "Latin-1". It appears to
> > be *both* a character mapping *and* an encoding, based on a bit of
> > Wikipedia research. The problem for me is this: what does that make
> > Latin-2, Latin-3, KOI8-R, etc.? Those seem to describe the same
> > encoding mechanism as Latin-1 (each code represented as one 8-bit
> > byte), but with different meanings for the codes in the 16#A0#..16#FF#
> > range. So the same encoding scheme seems to have multiple different
> > names. That's very confusing to me.
>
> Not 100% sure, but I think here is the picture.
> 1) Code points are always 31 bits (or maybe 30).
> 2) Below is the lower left corner of BMP (use fixed fonts!):
>
> |
> |____________________
> |         |         |
> | Latin 1 | Latin 2 |
> |_________|_________|_______
>
> The lower halves of Latin-1 and Latin-2 are identical, i.e. the same
> characters have two different code-points, differing by 256.
>
> When you use Latin-1 with 8 bit bytes, you can view this as an encoding
> with the 24 upper bits being 16#00_00_00#. When you use Latin-2 with 8
> bit bytes, you can view this as an encoding with the 24 upper bits being
> 16#00_00_01#.
>
> So in a sense, Latin-1 and Latin-2 are both character sets, and when
> represented on only 8 bits, an encoding.
>
> Does this make sense?
No, I don't think so. In Latin-2 (ISO/IEC 8859-2), the code points
16#00#..16#A0# have the same meanings as in Latin-1 and Unicode. Past
that, though, the correspondence is all over the place. Thus, 16#A1#
in Latin-2 corresponds to 16#0104# in the Unicode BMP; 16#A2# ->
16#02D8#, 16#A3# -> 16#0141#, 16#A5# -> 16#013D#, etc.
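
The mappings above can be checked mechanically. What follows is only a
minimal sketch in Python, assuming the standard "iso8859_2" and "latin_1"
codecs are a fair stand-in for the ISO/IEC 8859-2 and 8859-1 tables; the
function names are made up for illustration. It also shows that a Latin-2
byte is not simply the Latin-1 code point plus 256.

```python
def latin2_to_code_point(byte_value: int) -> int:
    """Return the Unicode code point for a single ISO/IEC 8859-2 byte."""
    return ord(bytes([byte_value]).decode("iso8859_2"))


def first_differing_byte() -> int:
    """Return the lowest byte value that Latin-1 and Latin-2 decode differently."""
    for b in range(256):
        if bytes([b]).decode("iso8859_2") != bytes([b]).decode("latin_1"):
            return b
    return 256  # not reached for these two codecs


if __name__ == "__main__":
    # Print the examples from the text in Ada based-literal notation.
    for b in (0xA1, 0xA2, 0xA3, 0xA5):
        cp = latin2_to_code_point(b)
        # The "offset by 256" idea would predict b + 16#100# instead.
        print(f"16#{b:02X}# -> 16#{cp:04X}#  (byte + 256 = 16#{b + 0x100:04X}#)")
    print(f"first byte that differs from Latin-1: 16#{first_differing_byte():02X}#")
```

Under those assumptions, the two character sets agree up to and including
16#A0#, and from 16#A1# on the Unicode values bear no fixed offset to the
byte values.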
-- Adam