From: "Alexandre E. Kopilovitch" <aek@vib.usr.pu.ru>
To: comp.lang.ada@ada-france.org
Subject: Re: U : Unbounded_String := "bla bla bla"; (was: Is the Writing...)
Date: Fri, 10 Oct 2003 01:35:12 +0400 (MSD)
Message-ID: <mailman.56.1065735568.25614.comp.lang.ada@ada-france.org>
In-Reply-To: <3F849B4A.2090008@comcast.net>; from "Robert I. Eachus" at Wed, 08 Oct 2003 23:18:59 GMT
"Robert I. Eachus" wrote:
> > BTW, when you mentioned Cyrillic_String you made me smile grimly. Do you
> > know that there are 3 live Cyrillic encodings? Do you know that, for example,
> > in Windows, the final effect of your Cyrillic encoding depends not only upon
> > encoding, but upon Regional Settings also? And there are plenty of more subtle
> > issues, which may easily hurt you when you deal with a Cyrillic encoding. So,
> > don't fancy that your Cyrillic_String will be of much help, especially if you
> > want to develop a robust product for actual field use.
>
> You are just thinking Russian,
Well, if you add Ukrainian, Bulgarian and Serbian, not to mention Belarusian
and a bunch of pseudo-Cyrillic languages from Abkhaz to Kazakh (and I can't
even get what is happening with Tatar now: I heard recently that there is even
a legal case in Constitutional court about that - Tatars want Latin-based
alphabet for their language, but federal authorities insist on Cyrillic one
only... if I understood all that properly), the situation probably will not
become better -;)
> there are even more Cyrillic character bindings for other Cyrillic languages.
I see very little potential use for them, particularly in Ada world... other
than mining raw intelligence data from newspapers, emails and websites -;) .
I can't imagine that those nations will use Ada for their accounting purposes...
or even for desktop publishing and for computer games.
> When it comes to multiple
> representations for one language Japanese is by far the worst!
I'm not sure, though. Yes, Japanese is quite impressive in this regard,
I have seen that in raw reality (my daughter, being a linguist, had some
correspondence by e-mail with several Japanese girls, and I was called for
decoding and encoding those emails - well, it took some time and effort).
But that is on the surface. When you go deep into real application problems,
the situation may change: I know well that there are subtle and unpleasant
problems with Russian encodings, and I know nothing about Japanese at that
level.
> But if
> you don't see it, try this. In Ada, I can DEFINE a Cyrillic_String type
> and bind it to one of the variants, and add other string types for other
> variants, then provide for conversions between them. The fact that
> almost all conversions are explicit makes all this possible. Let me add
> three types and show you the problem:
>
> type Unbounded_Cyrillic is new Ada.Strings.Unbounded.Unbounded_String;
> -- to make sure you don't get confused. Yeah, I know, in real life
> -- you should make the derivation private, and provide Cyrillic_String
> -- versions of some of the operations in Ada.Strings.Unbounded. Take
> --- all that as given.
> type Georgian_String is (...);
> type Unbounded_Georgian is new Ada.Strings.Unbounded.Unbounded_String;
> -- same as above.
>
> In Ada as it is now, I can say:
>
> Some_String: Unbounded_Cyrillic := To_Unbounded("Македонии");
> Other_String: Unbounded_Georgian := To_Unbounded("Македонии");
>
> In each case, there is an implicit conversion from the string_literal
> "Македонии" to the proper string type, then that type is converted to
> the proper unbounded type. But if you add additional implicit
> conversions into the mix, it all falls apart:
Oh, it seems that I see (at last!) what you mean: you assume that conversions
between encodings should be implicit! But this is far from desirable in real
applications!
> Some_String: Unbounded_Cyrillic := "Македонии";
>
> I hope you don't expect the compiler to guess which set of implicit
> conversions to apply! I am certainly not going to try to list all the
> possibilities, but for example, there is: "Македонии" to String to
> Unbounded_String to Cyrillic_String. And yes, in this case, the first
> conversion would raise Constraint_Error. But I could choose some other
> example where all the characters were in both (Latin1) String and
> Cyrillic_String. But I don't have to: "Македонии" to Georgian_String to
> Unbounded_Georgian to Unbounded_String to Cyrillic_String.
>
> Once you introduce new implicit conversions, the compiler is going to
> have to assume that they may occur anywhere. If the overloading rules
> result in only one possible match, great. But you will find that right
> now Ada has about as many implicit conversions as it can without
> creating lots of ambiguous situations. And yes, there are situations in
> Ada currently where you have to qualify expressions to avoid ambiguity.
> The most useful balance point is where everything can be done, and you
> don't have to qualify expressions too often.
I think that now I understand the difference between our views on the issue.
I understand perfectly that there should not be two competing kinds of implicit
conversions (one between encodings and another between String and Unbounded_String).
So we have to choose between them.
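
The ambiguity described in the quoted text can be sketched in present-day Ada
without any new implicit conversions. This is only an illustration under
assumptions: Cyrillic_String and Georgian_String are declared here as plain
derivations of String (the quoted Georgian_String declaration is elided, so
this is not its real definition), and the Show procedures are hypothetical.

```ada
with Ada.Text_IO;

procedure Ambiguity_Demo is
   --  Assumed definitions, for illustration only.
   type Cyrillic_String is new String;
   type Georgian_String is new String;

   procedure Show (S : Cyrillic_String) is
   begin
      Ada.Text_IO.Put_Line ("Cyrillic: " & String (S));
   end Show;

   procedure Show (S : Georgian_String) is
   begin
      Ada.Text_IO.Put_Line ("Georgian: " & String (S));
   end Show;
begin
   --  Show ("abc");  --  rejected: the literal fits both overloads
   --  A qualified expression tells the compiler which type is meant.
   Show (Cyrillic_String'("abc"));
   Show (Georgian_String'("abc"));
end Ambiguity_Demo;
```

A string literal alone carries no type, so every additional implicit
conversion adds more candidate interpretations of the same call.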
You assumed that implicit conversions between encodings are more natural and
more desirable than implicit conversions between String and Unbounded_String.
My firm opinion is exactly the opposite: conversions between encodings should be
explicit as a rule, and they all must be done within the "frontier" layer of
the application; so, I'm quite sure that while such implicit conversions between
encodings may be justified in Visual Basic and sometimes in C++, they are
entirely undesirable for Ada (as a standard feature). At the same time I see
implicit conversions between String and Unbounded_String as very natural and
desirable for real applications.
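
A minimal sketch of what "explicit conversions in the frontier layer" could
look like, under stated assumptions: the type names KOI8_String and
CP1251_String and the function To_CP1251 are hypothetical, and the loop body
is a placeholder identity mapping (valid only for the ASCII range, where the
two encodings agree), not a real translation table.

```ada
with Ada.Text_IO;

procedure Frontier_Demo is
   --  Hypothetical names: one distinct type per encoding, so values
   --  in different encodings cannot be mixed silently.
   type KOI8_String   is new String;
   type CP1251_String is new String;

   --  Explicit conversion, meant to be called only at the frontier.
   function To_CP1251 (S : KOI8_String) return CP1251_String is
      R : CP1251_String (S'Range);
   begin
      for I in S'Range loop
         --  Placeholder: real code would look each character up in a
         --  256-entry translation table instead of copying it.
         R (I) := S (I);
      end loop;
      return R;
   end To_CP1251;

   Input  : constant KOI8_String   := "abc";
   Output : constant CP1251_String := To_CP1251 (Input);
   --  Bad : constant CP1251_String := Input;  --  rejected: type mismatch
begin
   Ada.Text_IO.Put_Line (String (Output));
end Frontier_Demo;
```

Inside the application only one of these types would normally be used; the
conversion function appears only where data enters or leaves.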
I don't know the reasons for your assumption and preference... all I can
say is that my preference is certainly influenced by substantial experience
with strings in real applications, which often involved dealing with various
encodings (although not in Ada: in Fortran IV/77, COBOL 66, several
assemblers, PL/1, C/C++, Pascal/Delphi).
> Oh, since I am trying to be fair here, there is one additional implicit
> conversion that I would love to figure out how to add to the language.
> (Well, I know how to add it, I just don't think I'll ever get enough
> interest to make it happen.) That would be to add some pragmas that
> allowed character, string, or numeric literals to private types. The
> conversion directly from a character literal to Unbounded_Cyrillic
> wouldn't break anything. It also wouldn't help if you had a
> Cyrillic_String variable to put in an Unbounded_Cyrillic object.
I am not sure that I properly understand what you meant here, but anyway, I
can repeat that literals are very significant, and making it possible to have
(non-trivial) literals for private types would be a very good thing. For strings
(I mean Unbounded_Strings) this is especially important. It is the primary
need; full-scale implicit conversions between Strings and Unbounded_Strings
are also desirable, but the case of literals is certainly the most important.
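
For comparison, a common workaround in the language as it stands is to rename
To_Unbounded_String as a unary operator, which gets close to a literal for
Unbounded_String without any language change. This is a sketch of that idiom
only, not of the pragma idea from the quoted text; the choice of "+" is a
convention, not something the language requires.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Literal_Demo is
   --  Unary "+" as shorthand for the explicit conversion function.
   function "+" (S : String) return Unbounded_String
     renames To_Unbounded_String;

   U : constant Unbounded_String := +"bla bla bla";
begin
   Ada.Text_IO.Put_Line (To_String (U));
end Literal_Demo;
```

The conversion stays visible in the source as a single character, so it does
not add a new implicit conversion for the overloading rules to consider.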
Alexander Kopilovitch aek@vib.usr.pu.ru
Saint-Petersburg
Russia