comp.lang.ada
From: "Randy Brukardt" <randy@rrsoftware.com>
Subject: Re: Interpretation of extensions different from Unix/Linux?
Date: Tue, 18 Aug 2009 15:48:16 -0500
Message-ID: <h6f44m$d6n$1@munin.nbi.dk>
In-Reply-To: 6f80c882-fa03-4ca9-a53e-fae34cea160d@b15g2000yqd.googlegroups.com

"Adam Beneschan" <adam@irvine.com> wrote in message 
news:6f80c882-fa03-4ca9-a53e-fae34cea160d@b15g2000yqd.googlegroups.com...
On Aug 17, 3:28 pm, "Randy Brukardt" <ra...@rrsoftware.com> wrote:

>> The problem here is that String really is not the right type, but since
>> you can't have string literals for private types in Ada, you can't make
>> it a private type. (And if you could have string literals, it still
>> couldn't be used with the existing I/O packages, it would be way too
>> incompatible.)
>
>That wouldn't even be an issue if UTF-8 were strictly a "storage
>format" as you called it above.  If that were the case, you wouldn't
>need string literals for it.  I think the problem is that UTF-8 is
>something of a hybrid.  If all characters in the string are in the
>32..126 range, the "sequence of octets" stored in the UTF-8 string is
>identical to the graphic characters stored in a String.  (UTF-8 was
>designed purposefully so that would happen.)  In cases like that, it
>makes sense to use a string literal.

Well, the problem here is that it *always* makes sense to use a string 
literal. That's how you specify what you want in storage in Ada.

I think Dmitry's point is that he'd rather always see explicit conversions. 
The problem is that they don't work well -- exhibit A is unbounded strings. 
That's especially true for the use-averse like me. I hate having to write:

    A_Str := Ada.Strings.Unbounded.To_Unbounded_String ("ABC");

and surely UTF8 would be worse:

   A_Str := Ada.Strings.Unbounded_UTF_8.To_Unbounded_UTF_8_String ("ABC");
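
For comparison, here is a minimal sketch of one common way to cut the noise without a use clause: a local renaming of the conversion as a unary "+". The procedure name Conversion_Demo and the package alias SU are invented for this illustration, and this is only one possible idiom, not something proposed in the thread.

```ada
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Conversion_Demo is
   --  SU is a hypothetical local alias, used only in this sketch.
   package SU renames Ada.Strings.Unbounded;

   --  Local shorthand for the explicit conversion; the profile selects
   --  the String overload of To_Unbounded_String.
   function "+" (Source : String) return SU.Unbounded_String
     renames SU.To_Unbounded_String;

   A_Str : SU.Unbounded_String;
begin
   --  Fully explicit form, as written above.
   A_Str := SU.To_Unbounded_String ("ABC");

   --  Same conversion through the local renaming.
   A_Str := +"ABC";

   Ada.Text_IO.Put_Line (SU.To_String (A_Str));
end Conversion_Demo;
```

The conversion is still explicit at every use, so this shortens the call without making the literal implicitly typed.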

...
>Also, I'm afraid that using String can backfire.  If I understand it
>correctly, the decision was that the Name parameter of Text_IO.Open
>should be interpreted as a UTF-8 octet sequence even though it's a
>String.  But the intent is to allow string literals.  At some point,
>though, some poor innocent programmer in Germany or Spain is going to
>try to use a string literal (or a Latin-1 string variable) with an
>umlaut or an accented vowel in it and get totally screwed up since
>those characters don't represent themselves in UTF-8 encoding, and
>they'll end up puzzling over how their program created a file with a
>Chinese character in the middle of the name.  (Yeah, I know, that's
>very unlikely; most likely the UTF-8 encoding will simply be invalid.)
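
The failure mode described above can be sketched with the Ada.Strings.UTF_Encoding packages that were later standardized in Ada 2012. Treating them as a stand-in for the package under discussion is an assumption, and the file name and the procedure name Latin_1_Vs_UTF_8 are invented for the illustration.

```ada
with Ada.Strings.UTF_Encoding.Strings;
with Ada.Text_IO;

procedure Latin_1_Vs_UTF_8 is
   package UTF renames Ada.Strings.UTF_Encoding;

   --  Latin-1 "u with diaeresis" is the single octet 16#FC#.
   U_Umlaut : constant Character := Character'Val (16#FC#);

   --  A hypothetical file name held as an ordinary Latin-1 String.
   Latin_1_Name : constant String := "f" & U_Umlaut & "r.txt";

   --  Proper UTF-8 encoding: the umlaut becomes two octets.
   Encoded : constant UTF.UTF_8_String :=
     UTF.Strings.Encode (Latin_1_Name);
begin
   Ada.Text_IO.Put_Line
     ("Latin-1 length:" & Natural'Image (Latin_1_Name'Length));
   Ada.Text_IO.Put_Line
     ("UTF-8 length:  " & Natural'Image (Encoded'Length));

   --  Treating the raw Latin-1 octets as if they were UTF-8 is
   --  expected to fail, because 16#FC# is not followed by
   --  continuation octets.
   declare
      Decoded : constant String := UTF.Strings.Decode (Latin_1_Name);
   begin
      Ada.Text_IO.Put_Line
        ("Decoded without error, length" &
         Natural'Image (Decoded'Length));
   end;
exception
   when UTF.Encoding_Error =>
      Ada.Text_IO.Put_Line ("Latin-1 octets are not valid UTF-8");
end Latin_1_Vs_UTF_8;
```

For pure 7-bit names the two lengths would be equal and the decode would succeed, which is why the problem only shows up once an accented character appears.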

I've been presuming that UTF-8 encoding started with a BOM or something like 
that, else you couldn't tell it from regular Latin-1 encoding. It would be 
hard to insert a BOM into a string literal by accident!
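
As a rough sketch of what such a check could look like, the following tests whether a String starts with the three-octet UTF-8 BOM, using the BOM_8 constant from the Ada 2012 package Ada.Strings.UTF_Encoding (which postdates this message). The function name Starts_With_BOM is hypothetical, and nothing here implies that Text_IO.Open does or should behave this way.

```ada
with Ada.Strings.UTF_Encoding;
with Ada.Text_IO;

procedure BOM_Check is
   package UTF renames Ada.Strings.UTF_Encoding;

   --  Hypothetical helper: True if Item begins with the UTF-8 BOM
   --  (octets 16#EF#, 16#BB#, 16#BF#).
   function Starts_With_BOM (Item : String) return Boolean is
   begin
      return Item'Length >= UTF.BOM_8'Length
        and then Item (Item'First ..
                       Item'First + UTF.BOM_8'Length - 1) = UTF.BOM_8;
   end Starts_With_BOM;
begin
   --  A plain literal carries no BOM.
   Ada.Text_IO.Put_Line (Boolean'Image (Starts_With_BOM ("ABC")));

   --  A BOM has to be prepended deliberately.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Starts_With_BOM (UTF.BOM_8 & "ABC")));
end BOM_Check;
```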

But I do agree that this issue needs some discussion.

(Also note that a major reason for this package is to make ASIS work; there 
[as with I/O], we're stuck with existing routines that return Wide_Strings 
that are not enough to handle all possible text.)

                                        Randy.

                                              -- Adam 