comp.lang.ada
From: Georg Bauhaus <rm-host.bauhaus@maps.futureapps.de>
Subject: Re: What about a glob standard method in Ada.Command_Line ?
Date: Wed, 25 Aug 2010 10:57:45 +0200
Message-ID: <4c74db09$0$6890$9b4e6d93@newsspool2.arcor-online.net>
In-Reply-To: <bxz3ymdyul62$.16pl56dd68msl.dlg@40tude.net>

On 8/25/10 9:55 AM, Dmitry A. Kazakov wrote:

>>> Does the wildcard pattern "R*"
>>
>> In what RE syntax?
>
> It is a wildcard pattern. Wildcards is the most frequently used pattern
> language.

Does "wildcard" include both Latin-xyz character ü and UTF-8 ü?
Yes.  Many Wildcards do. And it can be handled.

See whether or not encoding matters in the following program.



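with GNAT.SPITBOL.Patterns;  use GNAT.SPITBOL.Patterns;
with Ada.Characters.Latin_1;  use Ada.Characters.Latin_1;

--  Stores the first "Rücken" or "rücken" found in Text in Result, whether
--  the "ü" is encoded as Latin-1 or as UTF-8.  Raises Constraint_Error
--  if Text contains neither.
procedure Find_Ruecken (Text : String; Result : VString_Var) is
    --  "ü" as the two octets of its UTF-8 encoding
    In_UTF_8 : constant String := (Character'Val(16#c3#),
                                   Character'Val(16#bc#));
    Ue : Pattern;
begin
    --  "**" assigns the matched substring to Result once the match succeeds
    Ue :=  (Any("Rr") & (In_UTF_8 or LC_U_Diaeresis) & "cken") ** Result;

    if not Match (Text, Ue) then
       raise Constraint_Error;
    end if;
end Find_Ruecken;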

with GNAT.SPITBOL;  use GNAT.SPITBOL;
with Ada.Text_IO;
with Find_Ruecken;

procedure Test_Find_Ruecken is
    Found : VString;
begin
    Find_Ruecken(Text => "Recken, die Rücken ohne Rückgrat drücken",
                 Result => Found);
    Ada.Text_IO.Put_Line ("Found """ & S(Found) & '"');
end Test_Find_Ruecken;


>>> match "readme"? Does it match "Rücken", when
>>> ü is (16#c3#, 16#bc#) (UTF-8)?
>>
>> When the Pattern_Type is properly defined, there are no questions.
>
> How do define it properly? Does it match Latin-1's ü, UTF-8's ü, UTF-16's
> ü, UTF-32's ü? Don't you get that it cannot be done without abstracting
> *encoding* away?

When the Pattern_Type is properly defined, there are no questions.

Since I have to process a lot of text files and text streams
of unknown encoding, I'm used to REs that just find "Rücken"
in whatever encoding.  That's called programming.  Think of Google
or Yahoo or Bing searching the WWW and tons of email ...
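
A minimal sketch of the same idea written as a regular expression rather
than a SPITBOL pattern, assuming GNAT's GNAT.Regpat package is available.
The procedure name Regpat_Ruecken and the sample strings are made up for
illustration. The alternation lists the UTF-8 octet pair and the Latin-1
character for ü, so one expression should match either encoding:

```ada
with Ada.Text_IO;
with GNAT.Regpat;  use GNAT.Regpat;

--  Hypothetical example, not part of the program above.
procedure Regpat_Ruecken is
   --  "ü" as UTF-8 (two octets) and as Latin-1 (one octet)
   UTF_8_Ue   : constant String := (Character'Val (16#C3#),
                                    Character'Val (16#BC#));
   Latin_1_Ue : constant String := (1 => Character'Val (16#FC#));

   --  [Rr], then either encoding of "ü", then "cken"
   Re : constant Pattern_Matcher :=
     Compile ("[Rr](" & UTF_8_Ue & "|" & Latin_1_Ue & ")cken");

   procedure Report (Label : String; Text : String) is
      Found : Match_Array (0 .. 0);  --  slot 0 holds the whole match
   begin
      Match (Re, Text, Found);
      if Found (0) = No_Match then
         Ada.Text_IO.Put_Line (Label & ": no match");
      else
         Ada.Text_IO.Put_Line
           (Label & ": match at octets"
            & Natural'Image (Found (0).First) & " .."
            & Natural'Image (Found (0).Last));
      end if;
   end Report;
begin
   Report ("Latin-1", "die R" & Latin_1_Ue & "cken");
   Report ("UTF-8",   "die R" & UTF_8_Ue & "cken");
   Report ("ASCII",   "die Recken");
end Regpat_Ruecken;
```

The test strings are assembled from explicit octets so that the result
does not depend on the encoding of the source file itself.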

There is no such thing as clean external data.
That includes file names.


Georg


