comp.lang.ada
From: "Alexandre E. Kopilovitch" <aek@vib.usr.pu.ru>
To: comp.lang.ada@ada-france.org
Subject: Re: U : Unbounded_String := "bla bla bla"; (was: Is the Writing...)
Date: Fri, 10 Oct 2003 01:35:12 +0400 (MSD)
Message-ID: <mailman.56.1065735568.25614.comp.lang.ada@ada-france.org>
In-Reply-To: <3F849B4A.2090008@comcast.net>; from "Robert I. Eachus" at Wed, 08 Oct 2003 23:18:59 GMT

"Robert I. Eachus" wrote:

> > BTW, when you mentioned Cyrillic_String you made me smile grimly. Do you
> > know that there are three Cyrillic encodings in active use? Do you know
> > that, for example, in Windows, the final effect of your Cyrillic encoding
> > depends not only upon the encoding, but upon Regional Settings also? And
> > there are plenty of more subtle issues, which may easily hurt you when you
> > deal with a Cyrillic encoding. So, don't fancy that your Cyrillic_String
> > will be of much help, especially if you want to develop a robust product
> > for actual field use.
>
> You are just thinking Russian,

Well, if you add Ukrainian, Bulgarian and Serbian, not to mention Belarusian
and a bunch of pseudo-Cyrillic languages from Abkhaz to Kazakh, the situation
probably will not become better -;) (And I can't even follow what is happening
with Tatar now: I heard recently that there is even a case in the
Constitutional Court about it. The Tatars want a Latin-based alphabet for
their language, but the federal authorities insist on a Cyrillic one only...
if I understood all that properly.)

> there are even more Cyrillic character bindings for other Cyrillic languages.

I see very little potential use for them, particularly in the Ada world... other
than mining raw intelligence data from newspapers, emails and websites -;) .
I can't imagine that those nations will use Ada for their accounting...
or even for desktop publishing or computer games.

> When it comes to multiple 
> representations for one language Japanese is by far the worst!

I'm not sure, though. Yes, Japanese is quite impressive in this regard;
I have seen that first-hand (my daughter, a linguist, corresponded by
e-mail with several Japanese girls, and I was called in to decode and
encode those emails - well, it took some time and effort).
But that is on the surface. When you go deep into real application problems,
the situation may change: I know well that there are subtle and unpleasant
problems with Russian encodings, and I know nothing about Japanese at that
level.

>  But if 
> you don't see it, try this.  In Ada, I can DEFINE a Cyrillic_String type 
> and bind it to one of the variants, and add other string types for other 
> variants, then provide for conversions between them.  The fact that 
> almost all conversions are explicit makes all this possible.  Let me add 
> three types and show you the problem:
>
>   type Unbounded_Cyrillic is new Ada.Strings.Unbounded.Unbounded_String;
>   -- to make sure you don't get confused.  Yeah, I know, in real life
>   -- you should make the derivation private, and provide Cyrillic_String
>   -- versions of some of the operations in Ada.Strings.Unbounded.  Take
>   -- all that as given.
>   type Georgian_String is (...);
>   type Unbounded_Georgian is new Ada.Strings.Unbounded.Unbounded_String;
>   -- same as above.
>
>   In Ada as it is now, I can say:
>
>   Some_String: Unbounded_Cyrillic := To_Unbounded("Македонии");
>   Other_String: Unbounded_Georgian := To_Unbounded("Македонии");
>
> In each case, there is an implicit conversion from the string_literal 
> "Македонии" to the proper string type, then that type is converted to 
> the proper unbounded type.  But if you add additional implicit 
> conversions into the mix, it all falls apart:

Oh, it seems that I see (at last!) what you mean: you assume that conversions
between encodings should be implicit! But this is far from desirable in real
applications!

>   Some_String: Unbounded_Cyrillic := "Македонии";
>
> I hope you don't expect the compiler to guess which set of implicit 
> conversions to apply!  I am certainly not going to try to list all the 
> possibilities, but for example, there is: "Македонии" to String to 
> Unbounded_String to Cyrillic_String.  And yes, in this case, the first 
> conversion would raise Constraint_Error.  But I could choose some other 
> example where all the characters were in both (Latin1) String and 
> Cyrillic_String.  But I don't have to: "Македонии" to Georgian_String to 
> Unbounded_Georgian to Unbounded_String to Cyrillic_String.
>
> Once you introduce new implicit conversions, the compiler is going to 
> have to assume that they may occur anywhere.  If the overloading rules 
> result in only one possible match, great.  But you will find that right 
> now Ada has about as many implicit conversions as it can without 
> creating lots of ambiguous situations.  And yes, there are situations in 
> Ada currently where you have to qualify expressions to avoid ambiguity. 
> The most useful balance point is where everything can be done, and you 
> don't have to qualify expressions too often.

I think that now I understand the difference between our views on the issue.

I understand perfectly that there should not be two competing kinds of implicit
conversions (one between encodings and another between String and Unbounded_String).
So we have to choose between them.

You assumed that implicit conversions between encodings are more natural and
more desirable than implicit conversions between String and Unbounded_String.
My firm opinion is exactly the opposite: conversions between encodings should be
explicit as a rule, and they all must be done within the "frontier" layer of
the application. So I'm quite sure that while implicit conversions between
encodings may be justified in Visual Basic and sometimes in C++, they are
entirely undesirable for Ada (as a standard feature). At the same time I see
implicit conversions between String and Unbounded_String as very natural and
desirable for real applications.

I don't know the reasons for your assumption and preference... all I can
say is that my preference is certainly influenced by substantial experience
with strings in real applications, which often involved dealing with various
encodings (although not in Ada: in Fortran IV/77, COBOL 66, several
assemblers, PL/1, C/C++ and Pascal/Delphi).

> Oh, since I am trying to be fair here, there is one additional implicit 
> conversion that I would love to figure out how to add to the language. 
> (Well, I know how to add it, I just don't think I'll ever get enough 
> interest to make it happen.)  That would be to add some pragmas that 
> allowed  character, string, or numeric literals to private types.  The 
> conversion directly from a character literal to Unbounded_Cyrillic 
> wouldn't break anything.  It also wouldn't help if you had a 
> Cyrillic_String variable to put in an Unbounded_Cyrillic object.

I am not sure that I understand properly what you meant here, but anyway, I
can repeat that literals are very significant, and making it possible to have
(non-trivial) literals for private types would be a very good thing. For strings
(I mean Unbounded_Strings) this is especially important. It is the primary
need; full-scale implicit conversions between Strings and Unbounded_Strings
are also desirable, but the case of literals is certainly the most important.



Alexander Kopilovitch                      aek@vib.usr.pu.ru
Saint-Petersburg
Russia



