comp.lang.ada
From: Adam Beneschan <adam@irvine.com>
Subject: Re: Why no Ada.Wide_Directories?
Date: Tue, 18 Oct 2011 08:34:14 -0700 (PDT)
Message-ID: <dce57c61-b582-4f1d-ba0a-ffc18e9c4c3b@p27g2000prp.googlegroups.com>
In-Reply-To: 1tggwi1yicf5z.1q3xra9r00oyb$.dlg@40tude.net

On Oct 18, 12:55 am, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Mon, 17 Oct 2011 18:10:35 -0700 (PDT), Adam Beneschan wrote:
> > I have a feeling you're fundamentally confused about what UTF-8 is, as
> > compared to "Latin-1".  Latin-1 is a character mapping.  It defines,
> > for all integers in the range 0..255, what character that integer
> > represents (e.g. 77 represents 'M', etc.).  Unicode is a character
> > mapping that defines characters for a much larger integer range.
>
> No, Unicode is a standard that describes character mappings. Both UTF-8 and
> Latin-1 are encodings. Latin-1 as an encoding has the property that there is
> a 1-1 octet to code point correspondence, at the cost that some (most)
> code points cannot be represented by the encoding. UTF-8 lacks this
> property, but is capable of representing all code points.

Sigh... I guess you're right about the term "Latin-1".  It appears to
be *both* a character mapping *and* an encoding, based on a bit of
Wikipedia research.  The problem for me is this: what does that make
Latin-2, Latin-3, KOI8-R, etc.?  Those seem to describe the same
encoding mechanism as Latin-1 (each code represented as one 8-bit
byte), but with different meanings for the codes in the 16#A0#..16#FF#
range.  So the same encoding scheme seems to have multiple different
names.  That's very confusing to me.
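
To make the point concrete, here is a small sketch in Python (not Ada; it
is used only because its standard library happens to ship all three
tables).  The codec names "latin-1", "iso8859_2" and "koi8_r" are
Python's spellings, and the helper code_point is made up for this
illustration.  It decodes the same single byte, 16#E0#, under each
character set and reports the Unicode code point it stands for:

```python
# One raw byte, interpreted under three single-byte character sets.
# The byte-to-storage mechanism is identical (one octet per character);
# only the table that assigns meaning to the octet differs.
RAW = bytes([0xE0])


def code_point(raw: bytes, charset: str) -> int:
    """Decode a single byte under `charset` and return its Unicode code point."""
    return ord(raw.decode(charset))


results = {
    name: code_point(RAW, name)
    for name in ("latin-1", "iso8859_2", "koi8_r")
}

for name, cp in results.items():
    print(f"{name:10} 16#E0# -> U+{cp:04X} {chr(cp)}")
```

Only Latin-1 maps the byte to the code point with the same numeric
value; the other two map it to entirely different characters.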

I've tended to look at character-set issues as having two independent
parts: part 1 is how do we define the correspondence between integers
and the character symbols [or other "characters" with special meanings
like control characters]; and part 2 is, once we have a sequence of
integers that correspond to those characters, how do we represent that
sequence in memory, in a file, when sending bits over a wire, etc.
The two parts appear completely independent to me, which is why I get
confused when a term like "Latin-1" is used that straddles both
parts.  (Unless we decree that Unicode is the only mapping in
existence, and things like Latin-2 or KOI8-R are encodings in which
bytes in the 16#A0#..16#FF# range represent integers which are totally
different and which are defined by the Unicode standard?)

I guess I'll have to learn what people mean by their terms.  I had
some misimpressions.

And I think we could solve a lot by making String a more abstract type
defined by its operations rather than by its representation (array of
character).  For a new language, as opposed to one in which we're
trying to maintain backward compatibility with a language designed in
the 1980s, that would be a great idea.  (I *don't* think it was a good
idea to define UTF8_String as a subtype of String, and to decide that
a String could be used as a sequence of bytes that had no direct
correspondence to any characters from a character set.  That seems
like a big compromise.  On the other hand, doing it "right" would have
been a lot of work which I wouldn't have had to do, most of it
unpaid.  So I'm hesitant to complain too much.)
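
A rough analogue of that compromise, in Python rather than Ada: if
UTF-8 bytes are stored in a container whose elements are nominally
Latin-1 characters, the element count no longer matches the character
count and each element read on its own means something else.  This is
only a sketch of the general hazard, not a model of how any Ada
implementation behaves:

```python
text = "é"                      # one character, code point U+00E9
utf8 = text.encode("utf-8")     # two bytes: 16#C3# 16#A9#

# Treat each UTF-8 byte as if it were a Latin-1 character, which is what
# happens when a byte sequence is viewed element by element as characters.
as_latin1 = utf8.decode("latin-1")

print(len(text), len(utf8), repr(as_latin1))

# The original text is recovered only by decoding the bytes as UTF-8.
round_trip = utf8.decode("utf-8")
```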

                            -- Adam


