comp.lang.ada
From: Georg Bauhaus <rm.dash-bauhaus@futureapps.de>
Subject: Re: What about a glob standard method in Ada.Command_Line ?
Date: Wed, 25 Aug 2010 13:09:41 +0200
Message-ID: <4c74f9f6$0$6772$9b4e6d93@newsspool3.arcor-online.net>
In-Reply-To: <1r82cxcws3pc9$.r40m8l3ttil7$.dlg@40tude.net>

On 25.08.10 11:28, Dmitry A. Kazakov wrote:
> On Wed, 25 Aug 2010 10:57:45 +0200, Georg Bauhaus wrote:
> 
>> Does "wildcard" include both Latin-xyz character ü and UTF-8 ü?
> 
> Wildcard * matches ANY SEQUENCE OF CODE POINTS.

"Wildcard" has no universal definition; I keep asking
for one.  ISO/IEC 8652 certainly deserves one.

Is * greedy? Does it respect locale? ...  But more importantly,
is a particular wildcard the crucial part at all?  No.  Again,
when the Pattern_Type is properly defined, there are no questions.

>> Yes.  Many Wildcards do. And it can be handled.
> 
> Wrong, it cannot be implemented to work on both Latin-1 and UTF-8.

Not wrong, insofar as R* typically matches anything that starts
with R, which is what you asked for.  Define sets of strings:

- s ∈ S1 where s has a unique representation in any possible
 encoding (based on octets, e.g.)

- s ∈ S2 where s has a unique representation in some encoding
 to be determined by context

S1 is typically empty only if one mentions EBCDIC or silly
"GIF text". (When I once had to extract numbers and names from
raw videotext, a simple RE was still the tool of choice...)
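
A minimal sketch of the S1/S2 distinction, in Python for brevity rather
than Ada. The helper name in_s1 and the list of candidate encodings are
assumptions made for illustration; S1 as defined above ranges over any
possible encoding, which this sketch narrows to two.

```python
# Assumed set of encodings the environment might be using.
CANDIDATE_ENCODINGS = ("latin-1", "utf-8")


def in_s1(s: str) -> bool:
    """True if s yields the same octets under every candidate encoding."""
    octets = {s.encode(enc) for enc in CANDIDATE_ENCODINGS}
    return len(octets) == 1


print(in_s1("R"))       # ASCII only: identical octets in both encodings
print(in_s1("Rücken"))  # ü is FC in Latin-1 but C3 BC in UTF-8
```

Strings for which in_s1 is false fall into S2: their octets depend on an
encoding that has to come from context.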

If you now want R* to decide the encoding of:

>    61 C3 B6 = aÃ¶    (in Latin-1)
>    61 C3 B6 = aö      (in UTF-8)

then use programming, use context information with S2, or else approach
your customers with the suggestion that you can't solve the problem
because the external environment is not as ideal as it should be...
I don't think a computing device can establish an oracle.
But none is needed!
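
To make the ambiguity concrete, here is a small Python sketch (Python
chosen for brevity; the variable names are illustrative). The same three
octets decode without error under both encodings, so the octets alone
cannot say which one was meant.

```python
octets = bytes([0x61, 0xC3, 0xB6])

as_latin1 = octets.decode("latin-1")  # three characters: a, Ã, ¶
as_utf8 = octets.decode("utf-8")      # two characters: a, ö

print(len(as_latin1), len(as_utf8))   # prints "3 2"
```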

If the external environment does not specify an encoding, and
your algorithm cannot work without one, then only
normative ontology can add one. So try, add an encoding.

You can write an RE that matches in the following order:

   61 F6	-- aö in Latin-1
   61 C3 B6	-- aö in UTF-8
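
One possible shape for such an RE, sketched with Python's re module
working on raw octets. The names A_OUML and find_a_ouml are hypothetical;
the alternation is tried left to right, so the Latin-1 form is attempted
before the UTF-8 form.

```python
import re

# 'a' followed by ö, either as Latin-1 (F6) or as UTF-8 (C3 B6).
A_OUML = re.compile(rb"a(?:\xF6|\xC3\xB6)")


def find_a_ouml(raw: bytes):
    """Return the first matching octet sequence in raw, or None."""
    m = A_OUML.search(raw)
    return m.group(0) if m else None
```

The pattern never decodes its input; it only recognizes octet sequences
that would spell "aö" in one of the expected encodings.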

An RE like this is what is needed, not a decision about the encoding
of 61 C3 B6 made without any encoding information.  Suppose I know the
file name has a substring "Rücken".  Therefore I look for strings that
contain at least one member of the set of octet sequences that
represent "Rücken" in some expected encoding. Then I inspect the
findings to see whether one is good.
Done.
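
The search just described can be sketched as follows, again in Python
for brevity. The function names, the sample file names, and the list of
expected encodings are assumptions made for illustration.

```python
# Assumed encodings in which the file names might have been written.
EXPECTED_ENCODINGS = ("latin-1", "utf-8")


def encoded_forms(text, encodings=EXPECTED_ENCODINGS):
    """All octet sequences that spell text in one of the encodings."""
    return {text.encode(enc) for enc in encodings}


def candidates(raw_names, text):
    """Raw file names containing text in at least one expected encoding."""
    forms = encoded_forms(text)
    return [name for name in raw_names if any(f in name for f in forms)]


names = [
    b"R\xfccken.txt",            # "Rücken" in Latin-1
    b"Mein_R\xc3\xbccken.txt",   # "Rücken" in UTF-8
    b"Ruecken.txt",              # transliterated; not a candidate
]
hits = candidates(names, "Rücken")
print(hits)
```

The result is only a list of findings; deciding whether one of them is
the wanted file remains a separate inspection step.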

Georg



