FAQ and string functions

comp.lang.ada

* FAQ and string functions
@ 2002-07-30  6:32 Oleg Goodyckov
  2002-07-30  8:52 ` Colin Paul Gloster
  2002-07-30 13:48 ` Ted Dennison
  0 siblings, 2 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-07-30  6:32 UTC (permalink / raw)


Hi all!

Can anybody tell me where I can find the FAQ of this newsgroup?
Does it say anything about string processing functions?
If not, can anybody tell me whether a set of string processing functions
exists for Ada and, if so, where I can get it? I mean string processing in
the Perl manner: splitting a string into a list of tokens by a delimiter
and joining it back, push, pop, slice and so on.

Thanx!




* Re: FAQ and string functions
  2002-07-30  6:32 FAQ and string functions Oleg Goodyckov
@ 2002-07-30  8:52 ` Colin Paul Gloster
  2002-07-30 13:48 ` Ted Dennison
  1 sibling, 0 replies; 86+ messages in thread
From: Colin Paul Gloster @ 2002-07-30  8:52 UTC (permalink / raw)


Oleg Goodyckov wrote:
"Hi all!

Can anybody tell me where I can find the FAQ of this newsgroup?"

Unlike in many other newsgroups, a FAQ has not been posted here for several
years (news:comp.lang.ada has multi-part FAQ lists for different topics, and
most of them were last posted in 1996). You can usually find
the FAQs at HTTP://WWW.FAQs.org/ (the main ones at
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/comp-lang-ada/part1/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/comp-lang-ada/part2/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/comp-lang-ada/part3/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/programming/part1/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/programming/part2/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/programming/part3/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/programming/part4/
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/learning/
and all of them from
HTTP://WWW.FAQs.org/faqs/computer-lang/Ada/ ) but that website seems to
be offline at the moment. While you are waiting for it to come back,
you can get them by searching from
HTTP://groups.Google.com/advanced_group_search?hl=ru
(e.g. I chose FAQ as the word to find and to look in comp.lang.ada
and one of the results was
HTTP://groups.Google.com/groups?q=FAQ+group:comp.lang.ada&hl=ru&lr=&ie=UTF-8&inlang=ru&selm=3fm9jh%24a72%40disunms.epfl.ch&rnum=3
) or by using HTTP://WWW.Google.com/ to search, but when you click
on a result, click on "Cached" instead of clicking on the title
(e.g.
HTTP://216.239.51.100/search?q=cache:7MEJnC8e1MwC:www.faqs.org/faqs/computer-lang/Ada/++site:www.faqs.org+%22faqs.org%22+Ada&hl=en&ie=UTF-8
and
HTTP://216.239.39.100/search?q=cache:RZYKy6nsfDwC:www.faqs.org/faqs/computer-lang/Ada/programming/part1/++site:www.faqs.org+%22faqs.org%22+Ada&hl=en&ie=UTF-8
).

As for your questions about Perl-style string
manipulation, I will leave that to someone with
more interest in that sort of thing.




* Re: FAQ and string functions
  2002-07-30  6:32 FAQ and string functions Oleg Goodyckov
  2002-07-30  8:52 ` Colin Paul Gloster
@ 2002-07-30 13:48 ` Ted Dennison
  2002-07-31  4:52   ` Brian May
  2002-07-31  7:46   ` Oleg Goodyckov
  1 sibling, 2 replies; 86+ messages in thread
From: Ted Dennison @ 2002-07-30 13:48 UTC (permalink / raw)


Oleg Goodyckov <og@videoproject.kiev.ua> wrote in message news:<20020730093206.A8550@videoproject.kiev.ua>...
> Can anybody tell me where I can find the FAQ of this newsgroup?

Unfortunately, we currently do not have an up-to-date FAQ.

> If not, can anybody tell me whether a set of string processing functions
> exists for Ada and, if so, where I can get it? I mean string processing in
> the Perl manner: splitting a string into a list of tokens by a delimiter
> and joining it back, push, pop, slice and so on.

There is all sorts of stuff like that in the Ada.Strings.* packages.
See section A.4 of the online LRM at
http://adaic.org/standards/95lrm/html/RM-TOC.html . More generally,
read *everything* in Annexes A and K before asking any more "does Ada
have XYZ?" questions.

Strings in Ada are particularly tricky for newcomers. One trap is
underuse of perfectly sized string constants. Most strings get their
value once and never need it changed. If that is the case, it's fairly
easy to declare the string at the point its value is known, with the
perfect bounds, by using a string constant (possibly inside a
"declare" block). If you do that, it becomes very easy to deal with.

Also, note that a lot of string handling stuff that requires routines
in other languages is trivial in Ada. For instance, you can take a
slice of any array (including strings) with something like
"Array_Name(5..8)". Any two arrays of the same type (including
strings) can be concatenated with "&". Most numeric types (and
enumerations) can be converted to strings with the 'Image attribute.
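
A minimal sketch of the features just mentioned (a constant sized by its initial value, a slice, "&" and 'Image), assuming only the standard library. The names String_Basics, Greeting and Label are made up for illustration:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure String_Basics is
   --  The bounds (1 .. 10) come from the initial value.
   Greeting : constant String := "Hello, Ada";
begin
   declare
      --  Declared where the value is known, so it gets exactly the
      --  length it needs: a slice, concatenated with an 'Image result.
      Label : constant String := Greeting (8 .. 10) & Integer'Image (95);
   begin
      Put_Line (Label);              --  Ada 95
      Put_Line (Greeting (1 .. 5));  --  Hello
   end;
end String_Basics;
```

Integer'Image of a non-negative value starts with a space, which is why no separator is written between "Ada" and the number.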

-- 
T.E.D.
Home     -  mailto:dennison@telepath.com (Yahoo: Ted_Dennison)
Homepage -  http://www.telepath.com/~dennison/Ted/TED.html




* Re: FAQ and string functions
  2002-07-30 13:48 ` Ted Dennison
@ 2002-07-31  4:52   ` Brian May
  2002-08-01 16:09     ` Ted Dennison
  2002-07-31  7:46   ` Oleg Goodyckov
  1 sibling, 1 reply; 86+ messages in thread
From: Brian May @ 2002-07-31  4:52 UTC (permalink / raw)


> Strings in Ada are particularly tricky for newcomers. One trap is
> underuse of perfectly sized string constants. Most strings get their
> value once and never need it changed. If that is the case, it's fairly
> easy to declare the string at the point its value is known, with the
> perfect bounds, by using a string constant (possibly inside a
> "declare" block). If you do that, it becomes very easy to deal with.

I have two questions concerning strings that I can't find answers for
in the RM:

1. How do I split a string up into tokens and iterate through the list
of tokens (compare with C's strtok function)?

2. Can this ugly looking code be simplified?

      declare
         I    : Iterator;
      begin
         I := Element_Iterator(Class_Element,"parent");
         while More(I) loop
            declare
               My_Node : Node renames Value(I);
               Class_Ref : DOM_String renames
                 Get_Attribute(My_Node,"ref");
            begin
               Put_Line(Class_Name(Package_Id,Class_Ref)&"($1,$2,$3)");
            end;
            Next(I);
         end loop;
         Free(I);
      end;

where Element_Iterator, More, Value, Next, and Free are my own
subprograms to iterate over elements in a DOM tree in xmlada (without
these the code was a real mess). DOM_String is similar to a string
(in fact, I think internally it is defined as a string, but don't
quote me on that).

Ideally I don't want to have the nested declare block for the sole
purpose of saving the result from Get_Attribute, but I am not sure
this is possible.

E.g. I would like to be able to say

      declare
         I    : Iterator;
         Class_Ref : DOM_String;
      begin
         I := Element_Iterator(Class_Element,"parent");
         while More(I) loop
            Class_Ref := Get_Attribute(Value(I),"ref");
            Put_Line(Class_Name(Package_Id,Class_Ref)&"($1,$2,$3)");
            Next(I);
         end loop;
         Free(I);
      end;

but do it in a legal way. I think this would make the code more
readable.
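
One possible way to get close to the second form, sketched under the assumption that DOM_String is an unconstrained String subtype (the poster is not sure of this). An object of such a type needs its bounds fixed at its declaration, so a plain `Class_Ref : DOM_String;` is rejected, whereas an Ada.Strings.Unbounded.Unbounded_String can be reassigned with values of different lengths. Attribute below is a hypothetical stand-in for Get_Attribute, not part of xmlada:

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Reuse_Variable is
   --  Hypothetical stand-in for Get_Attribute: the result length varies.
   function Attribute (N : Positive) return String is
      Result : constant String (1 .. N) := (others => '*');
   begin
      return Result;
   end Attribute;

   --  One variable for the whole loop; it can hold a value of any length.
   Class_Ref : Unbounded_String;
begin
   for N in 1 .. 3 loop
      Class_Ref := To_Unbounded_String (Attribute (N));
      Ada.Text_IO.Put_Line (To_String (Class_Ref) & "($1,$2,$3)");
   end loop;
end Reuse_Variable;
```

The cost is the conversion in each direction; the nested declare block with a constant avoids that, so which form reads better is a judgment call.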

Any comments?
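
For the first question, a strtok-like loop can be sketched with Ada.Strings.Fixed.Find_Token from the standard library (RM A.4.3). This is only an illustration: Token_Loop and all of its local names are made up, and a single space is assumed as the delimiter:

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;

procedure Token_Loop is
   Line       : constant String := "alpha beta  gamma";
   Delimiters : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (' ');
   From       : Positive := Line'First;  --  where the next search starts
   First      : Positive;
   Last       : Natural;
   Count      : Natural := 0;
begin
   loop
      --  Find the next run of characters outside the delimiter set.
      Ada.Strings.Fixed.Find_Token
        (Line (From .. Line'Last), Delimiters,
         Ada.Strings.Outside, First, Last);
      exit when Last = 0;                --  no further token
      Count := Count + 1;
      Ada.Text_IO.Put_Line (Line (First .. Last));
      From := Last + 1;
   end loop;
   Ada.Text_IO.Put_Line ("Tokens:" & Natural'Image (Count));
end Token_Loop;
```

Because Find_Token looks for maximal runs of non-delimiters, consecutive delimiters do not produce empty tokens here.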




* Re: FAQ and string functions
  2002-07-30 13:48 ` Ted Dennison
  2002-07-31  4:52   ` Brian May
@ 2002-07-31  7:46   ` Oleg Goodyckov
  2002-07-31  9:04     ` Lutz Donnerhacke
                       ` (4 more replies)
  1 sibling, 5 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-07-31  7:46 UTC (permalink / raw)

On Tue, Jul 30, 2002 at 06:48:43AM -0700, Ted Dennison wrote:
> There is all sorts of stuff like that in the Ada.Strings.* packages.
> See section A.4 of the online LRM at
> http://adaic.org/standards/95lrm/html/RM-TOC.html . More generally,
> read *everything* in Annexes A and K before asking any more "does Ada
> have XYZ?" questions.

I read it before asking. Moreover, I asked because I had read it.

> Strings in Ada are particularly tricky for newcomers. One trap is
> underuse of perfectly sized string constants. Most strings get their
> value once and never need it changed. If that is the case, it's fairly
> easy to declare the string at the point its value is known, with the
> perfect bounds, by using a string constant (possibly inside a
> "declare" block). If you do that, it becomes very easy to deal with.

Thanks a lot. But that is not my case.

> Also, note that a lot of string handling stuff that requires routines
> in other languages is trivial in Ada. For instance, you can take a
> slice of any array (including strings) with something like
> "Array_Name(5..8)". Any two arrays of the same type (including
> strings) can be concatenated with "&". Most numeric types (and
> enumerations) can be converted to strings with the 'Image attribute.

All of that I know. And all of that is very inconvenient for handling
strings. The demo files of the Booch Components include the file
bcwords.ada. Look at it. It contains a full program for counting and
printing the frequencies of the words found in any given text file. But
this file, bcwords.ada, begins with a 15-line Perl program that does the
same work. Look there, compare the sizes of the Perl and Ada programs, and
think: why is the difference so dramatically in favor of Perl?

If we look at that Ada program carefully, we'll see that half of it is
taken up by the subprogram Get_Next_Word. What does it do? It's clear from
the name: it parses the next word from a line. How is that done in the Perl
program? Simply by splitting the line into words with a space as the
delimiter. So, while in Ada we must write a slice, "Array_Name(5..8)", a
loop, an if, and other very important stuff, in Perl we say
@list=split(/ /,String) and that's all.  Is this something special to
Perl? No. It can be implemented in Ada. And I'll say more: without this,
Ada will never be a convenient language. As long as people must write a
program to split a string like "x=2*3" instead of writing
split("=","x=2*3"), people will write in Perl, not Ada.

So, for all the diversity of GENERIC string handling tools in Ada,
a convenient tool is not present.
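
For the specific "x=2*3" case, the standard library gets reasonably close without a dedicated split. A minimal sketch using Ada.Strings.Fixed.Index and two slices (Split_Pair, Source and Pos are illustrative names, and a single "=" is assumed):

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;

procedure Split_Pair is
   Source : constant String := "x=2*3";
   --  Index returns 0 when the pattern does not occur.
   Pos    : constant Natural := Ada.Strings.Fixed.Index (Source, "=");
begin
   if Pos = 0 then
      Ada.Text_IO.Put_Line ("no delimiter found");
   else
      Ada.Text_IO.Put_Line ("name:  " & Source (Source'First .. Pos - 1));
      Ada.Text_IO.Put_Line ("value: " & Source (Pos + 1 .. Source'Last));
   end if;
end Split_Pair;
```

This handles exactly one delimiter; a general split into a list of tokens still needs a loop or a library routine, which is the point being argued in this thread.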


* Re: FAQ and string functions
  2002-07-31  7:46   ` Oleg Goodyckov
@ 2002-07-31  9:04     ` Lutz Donnerhacke
  2002-07-31  9:39       ` Pascal Obry
  2002-07-31 16:50       ` Oleg Goodyckov
  2002-07-31 20:16     ` Simon Wright
                       ` (3 subsequent siblings)
  4 siblings, 2 replies; 86+ messages in thread
From: Lutz Donnerhacke @ 2002-07-31  9:04 UTC (permalink / raw)


* Oleg Goodyckov wrote:
>Perl we say @list=split(/ /,String) and that's all.  Is this something
>special to Perl? No. It can be implemented in Ada. And I'll say more:
>without this, Ada will never be a convenient language. As long as people
>must write a program to split a string like "x=2*3" instead of writing
>split("=","x=2*3"), people will write in Perl, not Ada.

Split is a library function in Perl, not a system call. You can have this
library call in Ada, too. Do you consider Perl unusable because MIME::Parser
is not in the core language?

\f
with Split, Ada.Text_IO, Whole;

procedure Test_Split is
   function Get_Whole_Line is new Whole.Line (Ada.Text_IO.Get_Line);
begin
   loop
      declare
         line : constant String := Get_Whole_Line;
         rang : constant Split.String_Ranges := Split.Split (' ', line);
      begin
         Ada.Text_IO.Put_Line ("Words:" & Integer'Image (rang'Length));
         for i in rang'Range loop
            Ada.Text_IO.Put (line (rang (i).first .. rang (i).last));
            if i < rang'Last then
               Ada.Text_IO.Put (", ");
            end if;
         end loop;
         Ada.Text_IO.New_Line;
      end;
   end loop;
exception
   when Ada.Text_IO.End_Error => null;
end Test_Split;
\f
with Ada.Strings.Maps;

package Split is
   pragma Preelaborate (Split);

   type Ranges is record
      first : Positive;
      last  : Positive;
   end record;
   type String_Ranges is array (Positive range <>) of Ranges;

   function Split (
     Source : String;
     Set    : Ada.Strings.Maps.Character_Set;
     Test   : Ada.Strings.Membership
   ) return String_Ranges;

   function Split (Terminator : Character; Source : String)
     return String_Ranges;
end Split;
\f
with Ada.Strings.Fixed;

package body Split is
   Null_Ranges : String_Ranges (Positive'First .. Positive'First - 1);

   function Split (Source : String;
     Set    : Ada.Strings.Maps.Character_Set;
     Test   : Ada.Strings.Membership) return String_Ranges is
      First, Last : Natural;
   begin
      Ada.Strings.Fixed.Find_Token (Source, Set, Test, First, Last);
      if Last = Natural'First then
         return Null_Ranges;
      else
         return Ranges'(First, Last) &
           Split (Source (Last + 1 .. Source'Last), Set, Test);
      end if;
   end Split;

   function Split (Terminator : Character; Source : String)
     return String_Ranges is
   begin
      return Split (Source,
        Ada.Strings.Maps.To_Set (Terminator), Ada.Strings.Outside);
   end Split;
end Split;
\f
package Whole is
   pragma Preelaborate (Whole);

   buffsize : Positive := 80;

   generic
      with procedure Get (buff : out String; last : out Natural);
   function Line return String;
end Whole;
\f
package body Whole is
   function Line return String is
      buff : String (Positive'First .. Positive'First + buffsize);
      last : Natural;
   begin
      Get (buff, last);
      if last < buff'Last then
         return buff (buff'First .. last);
      else
         return buff & Line;
      end if;
   end Line;
end Whole;




* Re: FAQ and string functions
  2002-07-31  9:04     ` Lutz Donnerhacke
@ 2002-07-31  9:39       ` Pascal Obry
  2002-07-31 15:06         ` Oleg Goodyckov
  2002-07-31 16:50       ` Oleg Goodyckov
  1 sibling, 1 reply; 86+ messages in thread
From: Pascal Obry @ 2002-07-31  9:39 UTC (permalink / raw)



lutz@iks-jena.de (Lutz Donnerhacke) writes:

> * Oleg Goodyckov wrote:
> >Perl we say @list=split(/ /,String) and that's all.  Is this something
> >special to Perl? No. It can be implemented in Ada. And I'll say more:
> >without this, Ada will never be a convenient language. As long as people
> >must write a program to split a string like "x=2*3" instead of writing
> >split("=","x=2*3"), people will write in Perl, not Ada.

To split a string look at the String_Cutter package on my homepage or in the
AWS distribution.

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|         http://perso.wanadoo.fr/pascal.obry
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595




* Re: FAQ and string functions
  2002-07-31  9:39       ` Pascal Obry
@ 2002-07-31 15:06         ` Oleg Goodyckov
  0 siblings, 0 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-07-31 15:06 UTC (permalink / raw)


On Wed, Jul 31, 2002 at 11:39:40AM +0200, Pascal Obry wrote:
> 
> lutz@iks-jena.de (Lutz Donnerhacke) writes:
> 
> > * Oleg Goodyckov wrote:
> > >Perl we say @list=split(/ /,String) and that's all.  Is this something
> > >special to Perl? No. It can be implemented in Ada. And I'll say more:
> > >without this, Ada will never be a convenient language. As long as people
> > >must write a program to split a string like "x=2*3" instead of writing
> > >split("=","x=2*3"), people will write in Perl, not Ada.
> 
> To split a string look at the String_Cutter package on my homepage or in the
> AWS distribution.
> 
> Pascal.

Thanx a lot!
I'll try ASAP.

Oleg.




* Re: FAQ and string functions
  2002-07-31 22:04     ` Dmitry A.Kazakov
@ 2002-07-31 15:23       ` Oleg Goodyckov
  2002-08-01 21:57         ` Dmitry A.Kazakov
  0 siblings, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-07-31 15:23 UTC (permalink / raw)


On Thu, Aug 01, 2002 at 12:04:38AM +0200, Dmitry A.Kazakov wrote:
> Oleg Goodyckov wrote:
> 
> > If we look at that Ada program carefully, we'll see that half of it is
> > taken up by the subprogram Get_Next_Word. What does it do? It's clear
> > from the name: it parses the next word from a line. How is that done in
> > the Perl program? Simply by splitting the line into words with a space
> > as the delimiter. So, while in Ada we must write a slice,
> > "Array_Name(5..8)", a loop, an if, and other very important stuff, in
> > Perl we say @list=split(/ /,String) and that's all.  Is this something
> > special to Perl? No. It can be implemented in Ada. And I'll say more:
> > without this, Ada will never be a convenient language.
> 
> For which use? I would definitely not use something like split for parsing. 
> It is extremely inefficient. Ada was not designed for write-once-use-once 
> programs.

Ok! How about write-once-use-always? For text data analysis applications.

> > As long as people must write a program to split a string like "x=2*3"
> > instead of writing split("=","x=2*3"), people will write in Perl, not
> > Ada.
> 
> And what would you do in the case "x=/* An error, should be := */ 2*" and 
> "3" continues on the next line?

Nothing. I know that my data is as described. If it is not, the data is
corrupted and must be thrown out. It's simple.
But what would you do in the case where the data is still correct?
You'll build a PROGRAMMMM, instead of writing "split(/=/,"x=2*3")".




* Re: FAQ and string functions
  2002-07-31  9:04     ` Lutz Donnerhacke
  2002-07-31  9:39       ` Pascal Obry
@ 2002-07-31 16:50       ` Oleg Goodyckov
  1 sibling, 0 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-07-31 16:50 UTC (permalink / raw)


On Wed, Jul 31, 2002 at 09:04:54AM +0000, Lutz Donnerhacke wrote:
> * Oleg Goodyckov wrote:
> >Perl we say @list=split(/ /,String) and that's all.  Is this something
> >special to Perl? No. It can be implemented in Ada. And I'll say more:
> >without this, Ada will never be a convenient language. As long as people
> >must write a program to split a string like "x=2*3" instead of writing
> >split("=","x=2*3"), people will write in Perl, not Ada.
> 
> Split is a library function in Perl, not a system call. You can have this
> library call in Ada, too. Do you consider Perl unusable because MIME::Parser
> is not in the core language?

[Ada-text skipped]

Good! Very good! High skill!
Thanx a lot!

Oleg.




* Re: FAQ and string functions
  2002-07-31  7:46   ` Oleg Goodyckov
  2002-07-31  9:04     ` Lutz Donnerhacke
@ 2002-07-31 20:16     ` Simon Wright
  2002-07-31 20:56       ` Robert A Duff
  2002-07-31 22:04     ` Dmitry A.Kazakov
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 86+ messages in thread
From: Simon Wright @ 2002-07-31 20:16 UTC (permalink / raw)


Oleg Goodyckov <og@videoproject.kiev.ua> writes:

> All of that I know. And all of that is very inconvenient for
> handling strings. The demo files of the Booch Components include the
> file bcwords.ada. Look at it. It contains a full program for counting
> and printing the frequencies of the words found in any given text
> file. But this file, bcwords.ada, begins with a 15-line Perl program
> that does the same work. Look there, compare the sizes of the Perl
> and Ada programs, and think: why is the difference so dramatically in
> favor of Perl?

I didn't write Get_Next_Word (nor do I claim responsibility for the
problem that code has with improperly-terminated files :-)

I would be surprised if the part of the Perl code that *implements*
split was short.

When I *use* Get_Next_Word it only takes a line ...




* Re: FAQ and string functions
  2002-07-31 20:16     ` Simon Wright
@ 2002-07-31 20:56       ` Robert A Duff
  2002-08-01  0:11         ` Darren New
                           ` (2 more replies)
  0 siblings, 3 replies; 86+ messages in thread
From: Robert A Duff @ 2002-07-31 20:56 UTC (permalink / raw)

Simon Wright <simon@pushface.org> writes:

> I didn't write Get_Next_Word (nor do I claim responsibility for the
> problem that code has with improperly-terminated files :-)
> 
> I would be surprised if the part of the Perl code that *implements*
> split was short.
> 
> When I *use* Get_Next_Word it only takes a line ...

Yeah, but in Ada you have to implement it (or rummage around on the net
to see if somebody already did, or see if your compiler vendor supports
it), whereas with Perl, it's already there.

Suppose I'm writing a long-lived application that does string fiddling.
Should I choose Ada (because it has good type checking and whatnot,
which helps make my code maintainable), or should I choose Perl, because
it has useful string fiddling ops available (*portably* available)?
It's annoying to have to make that choice, because the two issues are
orthogonal (there's no reason why a language can't be good in both
ways).

So I think the original poster's complaint is reasonable.  The complaint
is, "X is not available"; the response, "well, you can write X yourself"
is not impressive.

To be honest, I would never choose Perl for *anything*, because I value
various "ilities" over having some useful operations available.
(I think Perl is an abomination.)  But I can understand why some folks
make the opposite choice.

By the way, a partial answer to the original poster's question is to
look at the various GNAT packages, such as SNOBOL.  I have no idea
whether they do what you want, but they do some kinds of string
manipulation.  They may be compiler dependent, or they may be useful
with other compilers.

- Bob


* Re: FAQ and string functions
  2002-07-31  7:46   ` Oleg Goodyckov
  2002-07-31  9:04     ` Lutz Donnerhacke
  2002-07-31 20:16     ` Simon Wright
@ 2002-07-31 22:04     ` Dmitry A.Kazakov
  2002-07-31 15:23       ` Oleg Goodyckov
  2002-08-01 14:29     ` Ted Dennison
  2002-08-02  1:04     ` tmoran
  4 siblings, 1 reply; 86+ messages in thread
From: Dmitry A.Kazakov @ 2002-07-31 22:04 UTC (permalink / raw)


Oleg Goodyckov wrote:

> If we look at that Ada program carefully, we'll see that half of it is
> taken up by the subprogram Get_Next_Word. What does it do? It's clear
> from the name: it parses the next word from a line. How is that done in
> the Perl program? Simply by splitting the line into words with a space as
> the delimiter. So, while in Ada we must write a slice,
> "Array_Name(5..8)", a loop, an if, and other very important stuff, in
> Perl we say @list=split(/ /,String) and that's all.  Is this something
> special to Perl? No. It can be implemented in Ada. And I'll say more:
> without this, Ada will never be a convenient language.

For which use? I would definitely not use something like split for parsing. 
It is extremely inefficient. Ada was not designed for write-once-use-once 
programs.

> As long as people must write a program to split a string like "x=2*3"
> instead of writing split("=","x=2*3"), people will write in Perl, not
> Ada.

And what would you do in the case "x=/* An error, should be := */ 2*" and 
"3" continues on the next line?

> So, for all the diversity of GENERIC string handling tools in Ada,
> a convenient tool is not present.

Maybe because it is not so awfully convenient? (:-))

-- 
Regards,
Dmitry Kazakov
www.dmitry-kazakov.de




* Re: FAQ and string functions
  2002-07-31 20:56       ` Robert A Duff
@ 2002-08-01  0:11         ` Darren New
  2002-08-01  1:08           ` tmoran
                             ` (2 more replies)
  2002-08-01 11:09         ` Oleg Goodyckov
  2002-08-01 14:57         ` Georg Bauhaus
  2 siblings, 3 replies; 86+ messages in thread
From: Darren New @ 2002-08-01  0:11 UTC (permalink / raw)

Robert A Duff wrote:
> Yeah, but in Ada you have to implement it
> Suppose I'm writing a long-lived application that does string fiddling.

If you're writing a long-lived application, the time it takes to write the
string-fiddling subroutines (once) is minor.

> So I think the original poster's complaint is reasonable.  The complaint
> is, "X is not available"; the response, "well, you can write X yourself"
> is not impressive.

This is true, but becomes a smaller complaint as code size grows. It does
make it harder to do something simple in Ada.

Of course, doing real-time interrupt handling in Perl is a bit of a
nightmare, too. :-)

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
** http://home.san.rr.com/dnew/DNResume.html **
** http://images.fbrtech.com/dnew/ **

Things to be thankful for, #37:
   No sausage was served at the Last Supper.


* Re: FAQ and string functions
  2002-08-01  0:11         ` Darren New
@ 2002-08-01  1:08           ` tmoran
  2002-08-01  9:25           ` Brian May
  2002-08-01 11:20           ` Oleg Goodyckov
  2 siblings, 0 replies; 86+ messages in thread
From: tmoran @ 2002-08-01  1:08 UTC (permalink / raw)


> If you're writing a long-lived application, the time it takes to write the
> string-fiddling subroutines (once) is minor.
  If you've been using the language for years you've built yourself a
private library of such things.  If you're new to the language (or
the language is new to the world) then you depend on its library or
spend time writing your own routines.




* Re: FAQ and string functions
  2002-08-01  0:11         ` Darren New
  2002-08-01  1:08           ` tmoran
@ 2002-08-01  9:25           ` Brian May
  2002-08-01 11:20           ` Oleg Goodyckov
  2 siblings, 0 replies; 86+ messages in thread
From: Brian May @ 2002-08-01  9:25 UTC (permalink / raw)


Darren New <dnew@san.rr.com> wrote in message news:<3D487CDA.24D9B1AE@san.rr.com>...
> If you're writing a long-lived application, the time it takes to write the
> string-fiddling subroutines (once) is minor.

At the linux.conf.au Linux conference this year in Perth, Australia,
there was a speaker who gave a talk on fallacies in modern
computing.

One of the problems he raised was programmers reinventing the wheel by
writing algorithms that have already been written before.

Not only does this waste time on writing and debugging the
routine, but chances are that you have not used a very efficient
algorithm for it, because of the extra time required to research and
implement the most efficient algorithm.

This IMHO is Ada's major limitation. There is no standard set of
routines to use for basic data management. While different individuals
have written different routines for their own use, again there is no
standard. Also, even though authors often put their source code online,
they often forget to license it to allow you to use their code in your
programs.

(hmmm.. I think I really should subscribe properly to this newsgroup or
I am going to miss replies...)




* Re: FAQ and string functions
  2002-07-31 20:56       ` Robert A Duff
  2002-08-01  0:11         ` Darren New
@ 2002-08-01 11:09         ` Oleg Goodyckov
  2002-08-01 14:08           ` Frank J. Lhota
  2002-08-01 14:57         ` Georg Bauhaus
  2 siblings, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-01 11:09 UTC (permalink / raw)

On Wed, Jul 31, 2002 at 08:56:10PM +0000, Robert A Duff wrote:
> Suppose I'm writing a long-lived application that does string fiddling.
> Should I choose Ada (because it has good type checking and whatnot,
> which helps make my code maintainable), or should I choose Perl, because
> it has useful string fiddling ops available (*portably* available)?
> It's annoying to have to make that choice, because the two issues are
> orthogonal (there's no reason why a language can't be good in both
> ways).

Yes!

> So I think the original poster's complaint is reasonable.  The complaint
> is, "X is not available"; the response, "well, you can write X yourself"
> is not impressive.

Yes!

> To be honest, I would never choose Perl for *anything*, because I value
> various "ilities" over having some useful operations available.
> (I think Perl is an abomination.)  But I can understand why some folks
> make the opposite choice.

O yes.

> By the way, a partial answer to the original poster's question is to
> look at the various GNAT packages, such as SNOBOL.  I have no idea
> whether they do what you want, but they do some kinds of string
> manipulation.  They may be compiler dependent, or they may be useful
> with other compilers.

Thanks. I've looked at those packages and found a little.

It is very interesting (for me at least) how much of all the capabilities
of the string manipulation functions written for Ada is really used. There
are so many functions and procedures and so many variants of them... How
many of them are actually used? This is a rhetorical question. I'm not
expecting an answer.

Since the days of PL/1 I have had a lasting aversion to string processing,
because using SUBSTRING, INDEX etc. is a true perversion. It is very hard
to work with. Instead, simply splitting a string into a list of tokens with
one operator miraculously turns string processing into ordinary work. For
me that was a very big surprise. I've found that this simple trick solves
most problems in most cases. For example, after splitting I have the number
of tokens in my string. In many cases that is very important information
for diagnostic purposes and is frequently enough for making a decision.
Then I can go on to split each string obtained from the previous split and
work on it independently of the rest of the original string. This makes
the algorithms more transparent and simple. The same work done with
combinations of SUBSTRING and INDEX looks awful.
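
When only the number of fields is needed, as in the diagnostic use described above, one standard-library option is Ada.Strings.Fixed.Count. A minimal sketch, assuming comma-separated input; Field_Count, Line and Fields are illustrative names:

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;

procedure Field_Count is
   Line   : constant String := "a,b,,d";
   --  Every delimiter separates two fields, so fields = delimiters + 1.
   --  Empty fields (as between the two adjacent commas) are counted too.
   Fields : constant Natural := Ada.Strings.Fixed.Count (Line, ",") + 1;
begin
   Ada.Text_IO.Put_Line ("Fields:" & Natural'Image (Fields));  --  Fields: 4
end Field_Count;
```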


* Re: FAQ and string functions
  2002-08-01  0:11         ` Darren New
  2002-08-01  1:08           ` tmoran
  2002-08-01  9:25           ` Brian May
@ 2002-08-01 11:20           ` Oleg Goodyckov
  2002-08-01 15:43             ` Darren New
  2 siblings, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-01 11:20 UTC (permalink / raw)

On Thu, Aug 01, 2002 at 12:11:46AM +0000, Darren New wrote:
> Robert A Duff wrote:
> > Yeah, but in Ada you have to implement it
> > Suppose I'm writing a long-lived application that does string fiddling.
> 
> If you're writing a long-lived application, the time it takes to write the
> string-fiddling subroutines (once) is minor.

"Long-lived" or "long-written"? In the second case I'll agree with you.
People who are paid over a long time can allow themselves to write
(once) some additional packages (and improve them for the rest of the
time), and the time spent writing them will be minor compared to the
lifetime.

But it is a bit more acceptable to use something that is also used by
others. One head is good, two are better. Or not?


* Re: FAQ and string functions
  2002-08-01 21:57         ` Dmitry A.Kazakov
@ 2002-08-01 13:10           ` Oleg Goodyckov
  2002-08-02 23:29             ` Dmitry A.Kazakov
  0 siblings, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-01 13:10 UTC (permalink / raw)


On Thu, Aug 01, 2002 at 11:57:04PM +0200, Dmitry A.Kazakov wrote:
> > Ok! How about write-once-use-always? For text data analyze applications.
> 
> Then, maybe it is worth to consider more advanced parsing techniques than 
> split? There are numerous Ada implementations of pattern matching. There 
> are also Ada subprograms to recognize data types in a string stream. It is 
> relatively easy to parse and evaluate expressions with brackets and 
> prioritized operations in Ada (an implementation of the twin-stack 
> algorithm is quite short) and note no things like split involved.

Maybe. In dreams almost anything is possible.

> >> > As long as, to split a string like
> >> > "x=2*3", people must write a program instead of
> >> > split("=","x=2*3"), people will write in Perl, not Ada.
> >> 
> >> And what would you do in the case "x=/* An error, should be := */ 2*" and
> >> "3" continues on the next line?
> > 
> > Nothing. I know I have data as described. If not, the data is corrupted and
> > must be thrown out. It's simple.
> 
> To do so you should have an ability to recognize errors.

In the next couple of steps of the program the error will be recognized and
an exception raised. No problem.

> > But what would you do in the case, when data is correct yet?
> 
> In the given example data are correct. /*...*/ was a comment containing a 
> symbol supposed to be a delimiter. My point was that for almost any 
> real-life text parsing application, split is useless.

Why then are most of my tasks much easier to solve using split rather than
substring and the like? Maybe they are not from real life?

> > You'll build PROGRAMMMM, instead write "split(/=/,"x=2*3")".
> 
> You still need a program to process the output of split. Is its output the 
> final outcome? I suppose it is not. So there should be a loop to iterate 
> through the list returned by split. Where is then a difference between 
> split + loop, and loop with Get_Next_Word inside? 

The difference is like the difference between RANDOM and SEQUENTIAL access to
data.

In many cases it is not necessary to analyze the whole string; it is enough to
know the number of tokens, or a few of them at well-known positions.

> IMO, the difference is 
> that the second is faster and easier to understand.

Really? Have you seen that program (bcwords.ada)? And you'd claim that
that Ada program is easier to understand? :-))))))
I have nothing to say...


Sorry for the long quotation. But look, everybody: who will risk saying that
the Ada program below is simpler and easier to understand than the equivalent
Perl program (which is more than 10 times smaller)? Don't think about how
much time that program needs to be written and debugged. Think how much
time it takes simply to type it into a text editor correctly. :-)))))


--  This demonstration is a response to a suggestion by John English
--  <J.English@bton.ac.uk> about assessing different component libraries, in
--  the context of the Ada Standard Component Library WG
--  (http://www.suffix.com/Ada/SCL/). It uses part of Corey Minyard
--  <minyard@acm.org>'s solution (why reinvent that wheel?!)

--  John said:

--  As a way of objectively assessing the merits of different approaches,
--  perhaps the way to do this is to code some examples; one of my
--  favourites for this is a program to list the 10 most common words in a
--  file with the number of occurrences of each, where the length of words
--  and the size of the file can be arbitrarily large. In Perl it might look
--  something like this:

--   while (<>) {                     # for each line in the input file(s)
--     chomp;                         # trim the end of the line
--     tr/A-Z/a-z/;                   # fold uppercase to lowercase
--     @words = split /\W+/;          # break the line into words
--     foreach (@words) {
--       if (/^\w+$/) {               # ignore non-words
--         $wordlist{$_}++;           # increment count in associative array
--       }                            # (key = word, val = no. of occurrences)
--     }
--   }
--   $times = 0;
--   foreach (sort {$wordlist{$b} <=> $wordlist{$a}} (keys %wordlist)) {
--     last if (++$times > 10);       # exit loop after 10 iterations
--     print "$_ : $wordlist{$_}\n";  # process array in descending order
--   }                                # of value, printing keys and values

--  What would this look like in Ada using each of the libraries you've
--  listed?  Does anyone else have favourite examples like this?

--  $Id: bcwords.ada,v 1.6 2001/09/23 15:25:10 simon Exp $

with Ada.Strings.Unbounded;
with Ada.Text_IO;
with Word_Parser;
with Word_Count_Support;

procedure Word_Count is
   Word_Found : Boolean;
   File_Done : Boolean;
   Word : Ada.Strings.Unbounded.Unbounded_String;
   Word_Bag : Word_Count_Support.BU.Bag;
   Word_Tree : Word_Count_Support.ST.AVL_Tree;
   Word_Bag_Iter : Word_Count_Support.Containers.Iterator'Class
     := Word_Count_Support.BU.New_Iterator (Word_Bag);
   procedure Word_Processor (Item : Ada.Strings.Unbounded.Unbounded_String;
                             Ok : out Boolean);
   procedure Word_Processor (Item : Ada.Strings.Unbounded.Unbounded_String;
                             Ok : out Boolean) is
      Dummy : Boolean;
   begin
      Word_Count_Support.ST.Insert
        (Word_Tree,
         Word_Count_Support.Word_Stat'
         (Word => Item,
          Count => Word_Count_Support.BU.Count (Word_Bag, Item)),
         Dummy);
      Ok := True;
   end Word_Processor;
   procedure Word_Bag_Visitor
   is new Word_Count_Support.Containers.Visit (Word_Processor);
   Number_Output : Natural := 0;
   procedure Tree_Processor (Item : Word_Count_Support.Word_Stat;
                             OK : out Boolean);
   procedure Tree_Processor (Item : Word_Count_Support.Word_Stat;
                             OK : out Boolean) is
   begin
      Ada.Text_IO.Put_Line
        (Ada.Strings.Unbounded.To_String (Item.Word)
         & " =>"
         & Positive'Image (Item.Count));
      Number_Output := Number_Output + 1;
      OK := Number_Output < 10;          --  this is where we select the top 10
   end Tree_Processor;
   procedure Tree_Visitor is new Word_Count_Support.ST.Visit (Tree_Processor);
begin
   loop
      Word_Parser.Get_Next_Word
        (Ada.Text_IO.Standard_Input, Word, Word_Found, File_Done);
      exit when not Word_Found;
      Word_Count_Support.Bags.Add (Word_Bag, Word);
   end loop;
   Word_Count_Support.Containers.Reset (Word_Bag_Iter);
   Word_Bag_Visitor (Word_Bag_Iter);
   Tree_Visitor (Word_Tree);
end Word_Count;
with Ada.Strings.Unbounded;
with BC.Containers;
with BC.Containers.Bags;
with BC.Containers.Bags.Unbounded;
with BC.Containers.Trees;
with BC.Containers.Trees.AVL;
with Global_Heap;

package Word_Count_Support is

   package Containers is new BC.Containers
     (Item => Ada.Strings.Unbounded.Unbounded_String,
        "=" => Ada.Strings.Unbounded."=");

   package Bags is new Containers.Bags;

   function Hash (S : Ada.Strings.Unbounded.Unbounded_String) return Positive;

   package BU is new Bags.Unbounded (Hash => Hash,
                                     Buckets => 1,
                                     Storage => Global_Heap.Storage);

   type Word_Stat is record
      Word : Ada.Strings.Unbounded.Unbounded_String;
      Count : Positive;
   end record;

   function ">" (L, R : Word_Stat) return Boolean;
   function "=" (L, R : Word_Stat) return Boolean;

   package Stat_Containers is new BC.Containers (Word_Stat);

   package Trees is new Stat_Containers.Trees;

   package ST is new Trees.AVL
     ("<" => ">",     --  we need the most popular first
      Storage => Global_Heap.Storage);

end Word_Count_Support;
package body Word_Count_Support is

   --  This is extraordinarily lazy, of course we should really invent
   --  some better hash function!
   function Hash
     (S : Ada.Strings.Unbounded.Unbounded_String) return Positive is
   begin
      return 1;
   end Hash;

   function ">" (L, R : Word_Stat) return Boolean is
      use type Ada.Strings.Unbounded.Unbounded_String;
   begin
      return L.Count > R.Count
        or else (L.Count = R.Count
                 and then L.Word > R.Word);
   end ">";

   function "=" (L, R : Word_Stat) return Boolean is
      use type Ada.Strings.Unbounded.Unbounded_String;
   begin
      return L.Count = R.Count
        and then L.Word = R.Word;
   end "=";

end Word_Count_Support;
--  by Corey Minyard
package body Word_Parser is

   Big_A_Pos   : Integer := Character'Pos ('A');
   Small_A_Pos : Integer := Character'Pos ('a');

   procedure Xlat_To_Lower_Case (C : in out Character);
   procedure Xlat_To_Lower_Case (C : in out Character) is
   begin
      if (C in 'A' .. 'Z') then
         C := Character'Val (Character'Pos (C) - Big_A_Pos + Small_A_Pos);
      end if;
   end Xlat_To_Lower_Case;

   procedure Get_Next_Word
     (File       : in File_Type;
      Word       : out Ada.Strings.Unbounded.Unbounded_String;
      Word_Found : out Boolean;
      File_Done  : out Boolean) is

      Tmp_Str    : String (1 .. 10);
      Word_Pos   : Positive := Tmp_Str'First;
      Input_Char : Character;
      In_Word    : Boolean := False;
   begin
      --  Start with an empty word.
      Word := Ada.Strings.Unbounded.To_Unbounded_String ("");

      File_Done := False;
      Word_Found := False;

      if (End_Of_File (File)) then
         Word_Found := False;
         File_Done := True;
      else
         loop
            Get (File, Input_Char);
            Xlat_To_Lower_Case (Input_Char);

            if (not In_Word) then
               if (Input_Char in 'a' .. 'z') then
                  In_Word := True;
                  Word_Found := True;
                  Tmp_Str (Word_Pos) := Input_Char;
                  Word_Pos := Word_Pos + 1;
               end if;
            elsif (Input_Char in 'a' .. 'z') then
               Tmp_Str (Word_Pos) := Input_Char;
               if (Word_Pos = Tmp_Str'Last) then
                  Word := Word & Tmp_Str;
                  Word_Pos := Tmp_Str'First;
               else
                  Word_Pos := Word_Pos + 1;
               end if;
            else
               exit;
            end if;

            if (End_Of_File (File)) then
               File_Done := True;
               exit;
            elsif (End_Of_Line (File) and In_Word) then
               exit;
            end if;
         end loop;

         if (Word_Pos /= Tmp_Str'First) then
            --  If we have some stuff left in the temporary string, put it into
            --  the word.
            Word := Word & Tmp_Str (Tmp_Str'First .. Word_Pos - 1);
         end if;
      end if;
   end Get_Next_Word;

end Word_Parser;
--  by Corey Minyard
with Ada.Strings.Unbounded; use type Ada.Strings.Unbounded.Unbounded_String;
with Ada.Text_IO; use Ada.Text_IO;
package Word_Parser is

   procedure Get_Next_Word
     (File       : in File_Type;
      Word       : out Ada.Strings.Unbounded.Unbounded_String;
      Word_Found : out Boolean;
      File_Done  : out Boolean);

end Word_Parser;




^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 11:09         ` Oleg Goodyckov
@ 2002-08-01 14:08           ` Frank J. Lhota
  2002-08-01 15:06             ` Robert A Duff
  2002-08-01 16:05             ` Oleg Goodyckov
  0 siblings, 2 replies; 86+ messages in thread
From: Frank J. Lhota @ 2002-08-01 14:08 UTC (permalink / raw)


"Oleg Goodyckov" <og@videoproject.kiev.ua> wrote in message
news:20020801140909.I1080@videoproject.kiev.ua...
> Since the days of PL/1 I have had a steady aversion to string processing,
> because using SUBSTRING, INDEX etc. is a true perversion. They are very hard
> to work with. Instead, simply splitting a string into a list of tokens with
> one operator miraculously turns string processing into ordinary work. For me
> it was a very big surprise. I've found this simple trick solves most
> problems in most cases. For example, after splitting I have the number of
> tokens in my string. In many cases this is very important information for
> diagnostic purposes and is frequently enough for making a decision. Then I
> can subsequently split every string obtained from the previous split and work
> independently of the other contents of the original string. This makes the
> algorithms more transparent and simple. The same work done with combinations
> of SUBSTRING and INDEX looks awful.

It sounds like you would enjoy using the GNAT.SNOBOL package. Although it
comes packaged with the GNAT compiler, the source code is available, and you
should be able to compile it with any Ada 95 compiler with little or no
modification. This package will allow you to do string processing in the
SNOBOL4 style, where indices are almost never used.





^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-07-31  7:46   ` Oleg Goodyckov
                       ` (2 preceding siblings ...)
  2002-07-31 22:04     ` Dmitry A.Kazakov
@ 2002-08-01 14:29     ` Ted Dennison
  2002-08-01 16:47       ` Oleg Goodyckov
  2002-08-02  1:04     ` tmoran
  4 siblings, 1 reply; 86+ messages in thread
From: Ted Dennison @ 2002-08-01 14:29 UTC (permalink / raw)


Oleg Goodyckov <og@videoproject.kiev.ua> wrote in message news:<20020731104643.C1083@videoproject.kiev.ua>...
> strings. Among the demo files in the Booch Components set there is a file
> bcwords.ada. Look at it. It contains a full program for counting and
> printing the frequencies of words found in any given text file. But this file
> - bcwords.ada - begins with a 15-line Perl program, which does the same
> work. Look there, compare the sizes of the Perl and Ada programs, and think:
> why is the difference so dramatically big in favor of Perl?

That's odd. All you'd really have to do to do this in Ada would be:
1) Read file into a string
2) Call Ada.Strings.Fixed.Count on the string

Doing 1 could be tricky, if you have no idea how big the file is. But
there are techniques for dealing with that. My personal favorite is
the recursive growing string trick. That only takes about 5 lines of
code. You could also use Ada.Streams.Stream_IO to open and find the
length of the file and String'Read to read it all into one perfectly
sized string.
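
A minimal sketch of both techniques mentioned above, for illustration only.
The names Read_Demo, Get_Whole_Line and Read_Whole_File are made up for this
example, and the 80-character chunk size is an arbitrary assumption. The
Stream_IO variant also assumes that one Character occupies one stream
element, which is true of common implementations but is not guaranteed.

```ada
with Ada.Command_Line;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Read_Demo is

   --  Recursive growing string: read one fixed-size chunk and, if the
   --  chunk was filled completely, append whatever the recursive call
   --  returns for the rest of the line.  End_Error propagates if the
   --  file is already exhausted.
   function Get_Whole_Line (File : Ada.Text_IO.File_Type) return String is
      Buffer : String (1 .. 80);
      Last   : Natural;
   begin
      Ada.Text_IO.Get_Line (File, Buffer, Last);
      if Last < Buffer'Last then
         return Buffer (1 .. Last);
      else
         return Buffer & Get_Whole_Line (File);
      end if;
   end Get_Whole_Line;

   --  Stream_IO variant: ask for the file size first, then read the
   --  whole file into one exactly sized string.
   function Read_Whole_File (Name : String) return String is
      package SIO renames Ada.Streams.Stream_IO;
      File : SIO.File_Type;
   begin
      SIO.Open (File, SIO.In_File, Name);
      declare
         Contents : String (1 .. Natural (SIO.Size (File)));
      begin
         String'Read (SIO.Stream (File), Contents);
         SIO.Close (File);
         return Contents;
      end;
   end Read_Whole_File;

begin
   if Ada.Command_Line.Argument_Count = 1 then
      --  A file name was given: dump the whole file.
      Ada.Text_IO.Put (Read_Whole_File (Ada.Command_Line.Argument (1)));
   else
      --  Otherwise echo one line of arbitrary length from standard input.
      Ada.Text_IO.Put_Line (Get_Whole_Line (Ada.Text_IO.Standard_Input));
   end if;
end Read_Demo;
```

Both functions return the result on the stack, so for very large files a
heap-allocated buffer may be the safer choice.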

Doing 2 is a one-liner.

> Perl we say @list=split(/ /,String) and that's all.  Is this Perl's own
> speciality? No. It can be implemented in Ada. And I say more - without this

Well, it should be hardly shocking that the "standard" Perl has more
string handling goodies than the standard Ada. Perl was made for
string handling. If it can't even beat a compiled general-purpose
programming language like Ada in this regard, it should give up its
scripting credentials and go run security in airports or something.

However, you are right that much of this *can* be done in Ada. If you
want more powerful string parsing capabilities, there are lots of
options for you.

One good one (which I see has already been mentioned) is the Gnat
string handling packages. They provide very powerful pattern matching
capability, which is probably the style of working you are used to if
you are a Perl user. The main drawback to this is that it isn't
available to you if you are using a compiler other than Gnat (iow: it's
not compiler-portable).

Another, even more powerful option, if you need something really
sophisticated, is the OpenToken packages
(http://www.telepath.com/~dennison/Ted/OpenToken/OpenToken.html ).
They are written in standard Ada, and should be portable to any
compiler (although there have been minor issues in the past).

> So, for all the diversity of GENERIC string handling tools in Ada,
> a convenient tool is not present.

Well, we are currently looking at beefing up the standard library for
the next version of the language. Perhaps you are saying you believe
string handling needs attention?


-- 
T.E.D.
Home     -  mailto:dennison@telepath.com (Yahoo: Ted_Dennison)
Homepage -  http://www.telepath.com/~dennison/Ted/TED.html



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-07-31 20:56       ` Robert A Duff
  2002-08-01  0:11         ` Darren New
  2002-08-01 11:09         ` Oleg Goodyckov
@ 2002-08-01 14:57         ` Georg Bauhaus
  2 siblings, 0 replies; 86+ messages in thread
From: Georg Bauhaus @ 2002-08-01 14:57 UTC (permalink / raw)


Robert A Duff <bobduff@shell01.theworld.com> wrote:
: 
: By the way, a partial answer to the original poster's question is to
: look at the various GNAT packages, such as SNOBOL.  I have no idea
: whether they do what you want, but they do some kinds of string
: manipulation.  They may be compiler dependent, or they may be useful
: with other compilers.

I haven't seen a builtin split operation in any SNOBOL4, but
I'm quite certain that a lot of split operations have been done
or are done in SNOBOL4 programs, where that is necessary. :)

Same in SETL2 or ICON.

Georg



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 14:08           ` Frank J. Lhota
@ 2002-08-01 15:06             ` Robert A Duff
  2002-08-01 16:05             ` Oleg Goodyckov
  1 sibling, 0 replies; 86+ messages in thread
From: Robert A Duff @ 2002-08-01 15:06 UTC (permalink / raw)


"Frank J. Lhota" <NOSPAM.lhota.adarose@verizon.net> writes:

> It sounds like you would enjoy using the GNAT.SNOBOL package. Although it
> comes packaged with the GNAT compiler, the source code is available, and you
> should be able to compile it with any Ada 95 compiler with little or no
> modification. This package will allow you to do string processing in the
> SNOBOL4 style, where indices are almost never used.

Actually, I think it uses 'Unrestricted_Access, which is not supported
by most Ada compilers.

It's actually GNAT.Spitbol, not Snobol.  I got it wrong in my earlier
post.

- Bob



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 11:20           ` Oleg Goodyckov
@ 2002-08-01 15:43             ` Darren New
  2002-08-01 21:37               ` Robert A Duff
  2002-08-02  8:01               ` Oleg Goodyckov
  0 siblings, 2 replies; 86+ messages in thread
From: Darren New @ 2002-08-01 15:43 UTC (permalink / raw)


Oleg Goodyckov wrote:
> "Long-lived" or "long-writting"? In second case I'll agree with you.

Your experience may differ, but in my experience, I've never spent less than
a month writing a program that is still running 20 years later. :-)

Of course, having more sophisticated stuff built in (or at least predefined)
would be useful. But using Perl where Ada is appropriate because of a
function like split() is probably inappropriate.

Why there isn't an unbounded array type built in (given that unbounded
string *is* built in) is beyond me.

And of course, GNAT *does* come with all that good stuff, and it can be done
portably (even if the GNAT sources aren't portable, which I don't know
whether or not they are).

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
** http://home.san.rr.com/dnew/DNResume.html **
** http://images.fbrtech.com/dnew/ **

Things to be thankful for, #37:
   No sausage was served at the Last Supper.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 14:08           ` Frank J. Lhota
  2002-08-01 15:06             ` Robert A Duff
@ 2002-08-01 16:05             ` Oleg Goodyckov
  1 sibling, 0 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-01 16:05 UTC (permalink / raw)


On Thu, Aug 01, 2002 at 02:08:59PM +0000, Frank J. Lhota wrote:
> "Oleg Goodyckov" <og@videoproject.kiev.ua> wrote in message
> news:20020801140909.I1080@videoproject.kiev.ua...
> > Since the days of PL/1 I have had a steady aversion to string processing,
> > because using SUBSTRING, INDEX etc. is a true perversion. They are very hard
> > to work with. Instead, simply splitting a string into a list of tokens with
> > one operator miraculously turns string processing into ordinary work. For me
> > it was a very big surprise. I've found this simple trick solves most
> > problems in most cases. For example, after splitting I have the number of
> > tokens in my string. In many cases this is very important information for
> > diagnostic purposes and is frequently enough for making a decision. Then I
> > can subsequently split every string obtained from the previous split and work
> > independently of the other contents of the original string. This makes the
> > algorithms more transparent and simple. The same work done with combinations
> > of SUBSTRING and INDEX looks awful.
> 
> It sounds like you would enjoy using the GNAT.SNOBOL package. Although it
> comes packaged with the GNAT compiler, the source code is available, and you
> should be able to compile it with any Ada 95 compiler with little or no
> modification. This package will allow you to do string processing in the
> SNOBOL4 style, where indices are almost never used.

Spitbol and Spitbol.Patterns.
But only the second has a procedure which can not only find a substring
matching a pattern but also tell me its position. But this is too little. It
is impossible to split a string by delimiters in one operation.
Or is it possible?



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-07-31  4:52   ` Brian May
@ 2002-08-01 16:09     ` Ted Dennison
  2002-08-02  0:21       ` Brian May
  0 siblings, 1 reply; 86+ messages in thread
From: Ted Dennison @ 2002-08-01 16:09 UTC (permalink / raw)


bam@snoopy.apana.org.au (Brian May) wrote in message news:<29e5ffff.0207302052.465a3193@posting.google.com>...
> I have two questions concerning strings that I can't find answers for
> in the RM:
> 
> 1. How do I split a string up into tokens and iterate through the list
> of tokens (compare with C's strtok function)?

The analogous routine is Ada.Strings.*.Find_Token (where * is your
choice of "Fixed", "Bounded", or "Unbounded").
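
A small sketch of a strtok-style loop built on Ada.Strings.Fixed.Find_Token,
for illustration. The test string, the separator set and the name Token_Demo
are assumptions made up for this example. Find_Token returns Last = 0 when no
further token exists, which is used here to leave the loop.

```ada
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;
with Ada.Text_IO;

procedure Token_Demo is
   Line       : constant String := "alpha, beta;gamma  delta";
   Separators : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (" ,;");
   From  : Positive := Line'First;   --  where the next search starts
   First : Positive;                 --  first index of the token found
   Last  : Natural;                  --  last index, or 0 if none found
begin
   while From <= Line'Last loop
      --  Find the next run of characters that are NOT separators.
      Ada.Strings.Fixed.Find_Token
        (Source => Line (From .. Line'Last),
         Set    => Separators,
         Test   => Ada.Strings.Outside,
         First  => First,
         Last   => Last);
      exit when Last = 0;
      Ada.Text_IO.Put_Line (Line (First .. Last));
      From := Last + 1;
   end loop;
end Token_Demo;
```

Unlike strtok, this does not modify the source string; each token is just a
slice of the original.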

> 2. Can this ugly looking code be simplified?
> 
>       declare
>          I    : Iterator;
>       begin
>          I := Element_Iterator(Class_Element,"parent");
>          while More(I) loop
>             declare
>                My_Node : Node renames Value(I);
>                Class_Ref : DOM_String renames
>                  Get_Attribute(My_Node,"ref");
>             begin
>                Put_Line(Class_Name(Package_Id,Class_Ref)&"($1,$2,$3)");
>             end;
>             Next(I);
>          end loop;
>          Free(I);
>       end;

I'm not a big fan of these kinds of renames. Why not instead just
write:

Put_Line (Class_Name (Package_ID, Get_Attribute (Value (I), "ref")) &
          "($1,$2,$3)");

I think that's much easier to read than your 4 line (semicolon)
declare block.

If you insist on splitting it up, turn those renames into constants.
I'd still skip the "My_Node" one, unless you are going to use it in
more than one place.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 14:29     ` Ted Dennison
@ 2002-08-01 16:47       ` Oleg Goodyckov
  2002-08-02 14:05         ` Ted Dennison
  0 siblings, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-01 16:47 UTC (permalink / raw)

On Thu, Aug 01, 2002 at 07:29:33AM -0700, Ted Dennison wrote:
> 
> Well, it should be hardly shocking that the "standard" Perl has more
> string handling goodies than the standard Ada. Perl was made for
> string handling. If it can't even beat a compiled general-purpose

As far as I know, Perl was made for rich, wide expressive ability.

> programming language like Ada in this regard, it should give up its
> scripting credentials and go run security in airports or something.

Perl can do that once Ada is provided with a simple split/join function.
For a start.

> However, you are right that much of this *can* be done in Ada. If you
> want more powerful string parsing capabilities, there are lots of
> options for you.

A great many...  But sadly nothing appropriate.

> One good one (which I see has already been mentioned) is the Gnat
> string handling packages. They provide very powerful pattern matching
> capability, which is probably the style of working you are used to if
> you are a Perl user. The main drawback to this is that it isn't
> available to you if you are using a compiler other than Gnat (iow: it's
> not compiler-portable).

Please understand, pattern matching is a very important and good ability.
But splitting is a little bit different. Even entirely different.
The main purpose of splitting is to allow DIRECT (RANDOM) ACCESS to any token
(in the sense defined by the pattern) in a string. And for now (for me) nothing
more.  Split the string, count the items, process them independently and join
them back into a string. This is a very basic and simple operation.
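
For illustration, a rough sketch of how such a split/join pair could be
written with only the standard Ada 95 string packages. Token_List, Split and
Join are hypothetical names invented for this example; they are not part of
any standard or GNAT library. The sketch splits on a literal delimiter
string, not on a pattern, and keeps empty fields.

```ada
with Ada.Strings.Fixed;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Split_Join_Demo is

   type Token_List is array (Positive range <>) of Unbounded_String;

   --  Split Source at every occurrence of Delimiter.
   --  An empty Delimiter makes Index raise Ada.Strings.Pattern_Error.
   function Split (Source, Delimiter : String) return Token_List is
      Pos : constant Natural := Ada.Strings.Fixed.Index (Source, Delimiter);
   begin
      if Pos = 0 then
         --  No delimiter left: the remainder is the last token.
         return (1 => To_Unbounded_String (Source));
      else
         --  First token, followed by the tokens of the remainder.
         return To_Unbounded_String (Source (Source'First .. Pos - 1))
           & Split (Source (Pos + Delimiter'Length .. Source'Last),
                    Delimiter);
      end if;
   end Split;

   --  Concatenate the tokens again, with Delimiter between them.
   function Join (Tokens : Token_List; Delimiter : String) return String is
      Result : Unbounded_String;
   begin
      for I in Tokens'Range loop
         if I > Tokens'First then
            Append (Result, Delimiter);
         end if;
         Append (Result, Tokens (I));
      end loop;
      return To_String (Result);
   end Join;

   Fields : Token_List := Split ("x=2*3", "=");

begin
   --  Number of tokens, direct access by position, then join back.
   Ada.Text_IO.Put_Line ("count:" & Integer'Image (Fields'Length));
   Ada.Text_IO.Put_Line ("second: " & To_String (Fields (2)));
   Fields (2) := To_Unbounded_String ("6");
   Ada.Text_IO.Put_Line (Join (Fields, "="));
end Split_Join_Demo;
```

The recursive Split copies the result array at every level, so it is a
convenience sketch and not tuned for long inputs.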

> Another, even more powerful option, if you need something really
> sophisticated, is the OpenToken packages
> (http://www.telepath.com/~dennison/Ted/OpenToken/OpenToken.html ).

Ok. I'll try.

> They are written in standard Ada, and should be portable to any
> compiler (although there have been minor issues in the past).
> 
> > So, for all the diversity of GENERIC string handling tools in Ada,
> > a convenient tool is not present.
> 
> Well, we are currently looking at beefing up the standard library for
> the next version of the language. Perhaps you are saying you believe
> string handling needs attention?

Honestly speaking, I'm very surprised that so basic and simple an operation as
splitting/joining a string is not present in Ada natively. And the biggest
surprise for me is that most people don't understand my complaint.
So maybe the Ada community solves tasks which are very far from string
handling (I'm a novice), and string handling functions are not very necessary
in that kind of work. I don't know any more.
For a novice like me string handling is very poor without the split/join pair.
It is the only condition (except maybe the not very good organization of the
package documentation) which prevents using Ada everywhere. For me.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 15:43             ` Darren New
@ 2002-08-01 21:37               ` Robert A Duff
  2002-08-03  0:42                 ` Ted Dennison
  2002-08-02  8:01               ` Oleg Goodyckov
  1 sibling, 1 reply; 86+ messages in thread
From: Robert A Duff @ 2002-08-01 21:37 UTC (permalink / raw)

Darren New <dnew@san.rr.com> writes:

> Your experience may differ, but in my experience, I've never spent less than
> a month writing a program that is still running 20 years later. :-)

We have scripts written in Perl (etc) that have lasted many years (since
before I joined the company).  I suspect they were written in less than
a month (each).  More like an hour each.  But we've spent untold hours
maintaining them.

I'm talking about things like, "zip up a bunch of files for a release,
put it in the version 1.234 release directory, run some regression
tests, and then send some mail to somebody-or-other notifying them of
something-or-other."

I wish they were all written in a language that had Perl's features for
doing stuff, and Ada's features for making it work reliably.

- Bob

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-07-31 15:23       ` Oleg Goodyckov
@ 2002-08-01 21:57         ` Dmitry A.Kazakov
  2002-08-01 13:10           ` Oleg Goodyckov
  0 siblings, 1 reply; 86+ messages in thread
From: Dmitry A.Kazakov @ 2002-08-01 21:57 UTC (permalink / raw)

Oleg Goodyckov wrote:

> On Thu, Aug 01, 2002 at 12:04:38AM +0200, Dmitry A.Kazakov wrote:
>> Oleg Goodyckov wrote:
>> 
>> > If we look at that Ada program carefully, we'll see that half of it
>> > is taken up by the subprogram Get_Next_Word. What does it do? It's clear
>> > from the name - it parses the next word from the line. How is this done
>> > in the Perl program? Simply - by splitting the line into words with space
>> > as the delimiter. So, while in Ada we must make a slice,
>> > "Array_Name(5..8)", a loop, an if, and other very important stuff, in
>> > Perl we say @list=split(/ /,String) and that's all.  Is this Perl's own
>> > speciality? No. It can be implemented in Ada. And I say more - without
>> > this Ada will never be a convenient language.
>> 
>> For which use? I would definitely not use something like split for
>> parsing. It is extremely inefficient. Ada was not designed for
>> write-once-use-once programs.
> 
> Ok! How about write-once-use-always? For text data analyze applications.

Then, maybe it is worth to consider more advanced parsing techniques than 
split? There are numerous Ada implementations of pattern matching. There 
are also Ada subprograms to recognize data types in a string stream. It is 
relatively easy to parse and evaluate expressions with brackets and 
prioritized operations in Ada (an implementation of the twin-stack 
algorithm is quite short) and note no things like split involved.

>> > As long as, to split a string like
>> > "x=2*3", people must write a program instead of
>> > split("=","x=2*3"), people will write in Perl, not Ada.
>> 
>> And what would you do in the case "x=/* An error, should be := */ 2*" and
>> "3" continues on the next line?
> 
> Nothing. I know I have data as described. If not, the data is corrupted and
> must be thrown out. It's simple.

To do so you should have an ability to recognize errors.

> But what would you do in the case, when data is correct yet?

In the given example data are correct. /*...*/ was a comment containing a 
symbol supposed to be a delimiter. My point was that for almost any 
real-life text parsing application, split is useless.

> You'll build PROGRAMMMM, instead write "split(/=/,"x=2*3")".

You still need a program to process the output of split. Is its output the 
final outcome? I suppose it is not. So there should be a loop to iterate 
through the list returned by split. Where is then a difference between 
split + loop, and loop with Get_Next_Word inside? IMO, the difference is 
that the second is faster and easier to understand.

-- 
Regards,
Dmitry Kazakov
www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 16:09     ` Ted Dennison
@ 2002-08-02  0:21       ` Brian May
  2002-08-02  1:56         ` tmoran
  2002-08-02 13:59         ` Ted Dennison
  0 siblings, 2 replies; 86+ messages in thread
From: Brian May @ 2002-08-02  0:21 UTC (permalink / raw)


dennison@telepath.com (Ted Dennison) wrote in message news:<4519e058.0208010809.6a4c5e22@posting.google.com>...
> 
> The analogous routine is Ada.Strings.*.Find_Token (where * is your
> choice of "Fixed", "Bounded", or "Unbounded").

I haven't been able to find any documentation on how to use these
functions...

(except the function declaration).

Also something else I have been trying to work out, is how to replace,
say all instances of "/" in a string with, say "__".

There is a replace function for the strings, I am a bit doubtful though if
you can use it to expand the size of a fixed string...

So I have a number of ideas of the best way, I am not really sure which
is the most efficient though.
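
One possible approach, sketched for illustration: build the result in an
Unbounded_String, so the growth from "/" to "__" is not a problem, and use
Ada.Strings.Fixed.Index to find each occurrence. Replace_All and Replace_Demo
are hypothetical names for this example, not library routines, and no claim
is made that this is the most efficient way.

```ada
with Ada.Strings.Fixed;
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Replace_Demo is

   --  Return Source with every occurrence of Pattern replaced by By.
   --  An empty Pattern makes Index raise Ada.Strings.Pattern_Error.
   function Replace_All (Source, Pattern, By : String) return String is
      use Ada.Strings.Unbounded;
      Result : Unbounded_String;
      From   : Positive := Source'First;   --  start of the unscanned part
      Pos    : Natural;                    --  index of the next match, or 0
   begin
      while From <= Source'Last loop
         Pos := Ada.Strings.Fixed.Index (Source (From .. Source'Last), Pattern);
         exit when Pos = 0;
         Append (Result, Source (From .. Pos - 1));   --  text before the match
         Append (Result, By);                         --  the replacement
         From := Pos + Pattern'Length;                --  skip over the match
      end loop;
      --  Whatever is left after the last match (possibly nothing).
      Append (Result, Source (From .. Source'Last));
      return To_String (Result);
   end Replace_All;

begin
   Ada.Text_IO.Put_Line (Replace_All ("usr/local/lib", "/", "__"));
end Replace_Demo;
```

The function returns a new String of whatever length is needed, so the
fixed size of the original string never has to change.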

> I'm not a big fan of these kinds of renames. Why not instead just
> write:
> 
> Put_Line (Class_Name (Package_ID, Get_Attribute (Value (I), "ref")) &
>           "($1,$2,$3)");

Because if I need to use it multiple times, I think that means it will
have to call get_attribute multiple times, won't it?

Besides, I don't think the above makes it very clear what the significance
is of the results of each function call.

> I think that's much easier to read than your 4 line (semicolon)
> declare block.
> 
> If you insist on splitting it up, turn those renames into constants.
> I'd still skip the "My_Node" one, unless you are going to use it in
> more than one place.

Are constants better than renames? If so, why?



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-07-31  7:46   ` Oleg Goodyckov
                       ` (3 preceding siblings ...)
  2002-08-01 14:29     ` Ted Dennison
@ 2002-08-02  1:04     ` tmoran
  4 siblings, 0 replies; 86+ messages in thread
From: tmoran @ 2002-08-02  1:04 UTC (permalink / raw)


> As long as, to split a string like "x=2*3", people must write a
> program instead of split("=","x=2*3"), people will write in Perl, not Ada.
  As they should.  If you want to go a half block down the street, you
ought to walk, not drive.  Cars are not designed to be fast or efficient
at half-block transportation.  Similarly, if you just have a simple
scanner and don't worry about erroneous input or speed of execution or
whether the program it's part of can be supported and modified over the
next 10 years, by all means use Perl for the job.  Don't object that
nobody will ever drive a car because they need to have keys, and get in
and close the door, and know how to drive, and perhaps scrape the ice
off the window, and find a parking place at the end of the trip.  Those
are good reasons to prefer walking a half block, but the considerations
for a 20 mile trip or a 200 mile trip will probably lead you to drive.
If you want something robust, efficient, and maintainable for a long
life, Ada is more likely appropriate.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02  0:21       ` Brian May
@ 2002-08-02  1:56         ` tmoran
  2002-08-02 13:59         ` Ted Dennison
  1 sibling, 0 replies; 86+ messages in thread
From: tmoran @ 2002-08-02  1:56 UTC (permalink / raw)


> > choice of "Fixed", "Bounded", or "Unbounded").
> I haven't been able to find any documentation on how to use these
> functions...
  The Ada.Strings.* packages are described in Annex A (section A.4) of the
ALRM (available on-line) and ought to be in any complete Ada textbook.
  For iterating over tokens Ada.Strings.Fixed.Index might be more
convenient, as in:
with Ada.Strings.Fixed,
     Ada.Text_IO;
procedure Split is
  Test  : constant String := "These are the times that try men's souls";
  Left  : Natural := Test'First;  -- start of the current token
  Right : Natural;                -- position of the next blank, 0 if none
begin
   while Left <= Test'Last loop
     Right := Ada.Strings.Fixed.Index(Test(Left .. Test'Last), " ");
     if Right = 0 then
       Right := Test'Last + 1;    -- no more blanks: token runs to the end
     end if;
     Ada.Text_IO.Put_Line(Test(Left .. Right - 1));
     Left := Right + 1;
   end loop;
end Split;
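
For comparison, here is a sketch of the same loop using Find_Token, which
the earlier message asked about. It assumes blanks are the only
separators; the names Split_Tokens, Text, Blanks and From are made up for
the example and are not from the original post.

```ada
with Ada.Strings.Fixed;
with Ada.Strings.Maps;
with Ada.Text_IO;

procedure Split_Tokens is
   Text   : constant String := "These are the times that try men's souls";
   -- The set of separator characters (an assumption: blanks only).
   Blanks : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (' ');
   From   : Positive := Text'First;  -- where the next search starts
   First  : Positive;                -- first index of the token found
   Last   : Natural;                 -- last index, or 0 if no token is left
begin
   while From <= Text'Last loop
      -- Find the first run of characters that are Outside the Blanks set.
      Ada.Strings.Fixed.Find_Token
        (Source => Text (From .. Text'Last),
         Set    => Blanks,
         Test   => Ada.Strings.Outside,
         First  => First,
         Last   => Last);
      exit when Last = 0;
      Ada.Text_IO.Put_Line (Text (First .. Last));
      From := Last + 1;
   end loop;
end Split_Tokens;
```

Unlike the Index version, a run of several blanks is skipped as a whole
here, so it does not produce empty lines.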

>Also something else I have been trying to work out, is how to replace,
>say all instances of "/" in a string with, say "__".
with Ada.Text_IO,
     Ada.Strings.Fixed;
procedure Rep is
  Test : constant String := "now is the/a time for all/some good//men/women";
  function Replace(Source, Drop, Add : String) return String is
  -- replace each occurrence of Drop in Source with Add
  -- Note: The modified string is not rescanned, so, for instance,
  -- Replace("Hello World", Drop=>" ", Add=>"  "); will not
  -- attempt to insert an infinite number of blanks.
    Si : constant Natural := Ada.Strings.Fixed.Index(Source, Drop);
  begin
    if Si = 0 then
      return Source;
    else
      return Source(Source'First .. Si - 1)
             & Add
             & Replace(Source(Si + Drop'Length .. Source'Last), Drop, Add);
    end if;
  end Replace;
begin
  Ada.Text_IO.Put_Line(Replace(Test, Drop => "/", Add => "__"));
end Rep;



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 15:43             ` Darren New
  2002-08-01 21:37               ` Robert A Duff
@ 2002-08-02  8:01               ` Oleg Goodyckov
  2002-08-02 16:09                 ` Darren New
  1 sibling, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-02  8:01 UTC (permalink / raw)

On Thu, Aug 01, 2002 at 03:43:42PM +0000, Darren New wrote:
> Oleg Goodyckov wrote:
> > "Long-lived" or "long-writting"? In second case I'll agree with you.
> 
> Your experience may differ, but in my experience, I've never spent less than
> a month writing a program that is still running 20 years later. :-)

True, my experience includes no such cases, because I have only 15 years
of experience in total.

But guys, how many such "long-lived" applications have you seen in our
world of low-cost, use-and-throw-away products, that you can allow
yourselves to maintain them for 20 years? Maybe I'm mistaken, but today,
when new microprocessors are born every year, the things that need
support the longest are the language compilers themselves.

Data becomes old before you develop the tools to process it. Ada is too
static a language for our very dynamic life (but maybe life is just too
different in West and East :)

It seems the epoch when the words "programs live longer than we expect"
were true is coming to its end. Not because programs become old too fast,
but because the data these programs are meant to process becomes old too
fast.

> Of course, having more sophisticated stuff built in (or at least predefined)
> would be useful. But using Perl where Ada is appropriate because of a
> function like split() is probably inappropriate.
> 
> Why there isn't an unbounded array type built in (given that unbounded
> string *is* built in) is beyond me.
> 
> And of course, GNAT *does* come with all that good stuff, and it can be done
> portably (even if the GNAT sources aren't portable, which I don't know
> whether or not they are).

Ok! I've understood! Ada is a language for processing eternal data.
Sounds good: "Ada is the language for programming eternity" :)))

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02  0:21       ` Brian May
  2002-08-02  1:56         ` tmoran
@ 2002-08-02 13:59         ` Ted Dennison
  1 sibling, 0 replies; 86+ messages in thread
From: Ted Dennison @ 2002-08-02 13:59 UTC (permalink / raw)


bam@snoopy.apana.org.au (Brian May) wrote in message news:<29e5ffff.0208011621.3193755f@posting.google.com>...
> There is a replace function for the strings, I am a bit doubtful
> though if you can use it to expand the size of a fixed string...

To do that kind of thing, you will generally need to use slicing and
catenation ("&"). This is not really "expanding the size of a fixed
string"; it's creating a new (possibly fixed) string based on an old
one.

Recursion is often useful for this kind of thing. For instance
(warning: not compiled or tested):

with Ada.Strings.Fixed;

function Transform (Source       : String;
                    From_Pattern : String;
                    To_Pattern   : String) return String is
   -- Find the location of the source pattern
   Pattern_Start : constant Natural :=
      Ada.Strings.Fixed.Index (Source => Source, Pattern => From_Pattern);
begin
   if Pattern_Start = 0 then
      return Source;
   else
      -- Return the string before the pattern, the new pattern, and a
      -- transformation of the string after the pattern.
      return Source (Source'First .. Pattern_Start - 1) & To_Pattern &
         Transform (Source (Pattern_Start + From_Pattern'Length .. Source'Last),
                    From_Pattern, To_Pattern);
   end if;
end Transform;

> > I'm not a big fan of these kinds of renames. Why not instead just
> > write:
> > 
> > Put_Line (Class_Name (Package_ID, Get_Attribute (Value (I), "ref")) &
> >           "($1,$2,$3)");
> 
> Because if I need to use it multiple times, I think that means it will
> have to call get_attribute multiple times, won't it?

Quite true. I only said this because each was only used once in the
code you presented. If you needed multiple calls to Get_Attribute with
the same parameters (presumably returning the same result), then
assigning that result into a constant once would probably be better.

> Besides, I don't think the above makes it very clear what the
> significance is of the results of each function call.

Fair enough. But as someone reading your code the first time, I have
to say I found your transformation more confusing than the original.
Perhaps that's just me though...

> > If you insist in splitting it up, turn those renames into constants.
> Are constants better than renames? If so, why?


The problem with renames is that they can easily lead to aliasing. You
might end up having accesses, or worse, updates, or even *both*, to
both the original version and the aliased version. That kind of thing
makes it really difficult to follow data flow in your program. In the
case of renamed functions, it also obfuscates control flow.
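
A small sketch of the aliasing concern, using a plain variable rather
than the function-call renames from the code under discussion. The names
Rename_Vs_Constant, Counter, Alias and Snapshot are invented for
illustration only.

```ada
with Ada.Text_IO;

procedure Rename_Vs_Constant is
   Counter  : Integer := 1;

   Alias    : Integer renames Counter;      -- a second name for the same object
   Snapshot : constant Integer := Counter;  -- a copy of the value, taken here
begin
   Counter := 2;  -- visible through Alias as well; Snapshot keeps the old value
   Ada.Text_IO.Put_Line ("Alias    =" & Integer'Image (Alias));
   Ada.Text_IO.Put_Line ("Snapshot =" & Integer'Image (Snapshot));
end Rename_Vs_Constant;
```

After the assignment Alias reports 2 while Snapshot still reports 1,
which is the kind of two-names-one-object data flow the paragraph above
warns about.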



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 16:47       ` Oleg Goodyckov
@ 2002-08-02 14:05         ` Ted Dennison
  2002-08-02 16:11           ` Darren New
  2002-08-05  7:18           ` Oleg Goodyckov
  0 siblings, 2 replies; 86+ messages in thread
From: Ted Dennison @ 2002-08-02 14:05 UTC (permalink / raw)


Oleg Goodyckov <og@videoproject.kiev.ua> wrote in message news:<20020801194720.Q1080@videoproject.kiev.ua>...
> Honestly speaking, I'm very surprised that so basic and simple an
> operation as splitting/joining of strings is not present in Ada natively. And most

Well, stated that way, it *is* present. Splitting strings (or any
other array) is done with slices, and joining them (any array) is done
with the "&" operator. Those are indeed the basic operations available
to arrays (along with indexing, and bounds attributes like 'first). As
I understand it, your problem isn't with splitting and joining, it's
with figuring out where to do the splitting and joining. Correct?
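
To make the point concrete, a minimal sketch of splitting with slices and
joining with "&", assuming the cut position is already known. Slice_And_Join,
Cut, Name and Value are names chosen for this example, not library names.

```ada
with Ada.Text_IO;

procedure Slice_And_Join is
   Source : constant String   := "x=2*3";
   Cut    : constant Positive := 2;  -- index of '=', assumed known here
   Name   : constant String   := Source (Source'First .. Cut - 1);
   Value  : constant String   := Source (Cut + 1 .. Source'Last);
begin
   Ada.Text_IO.Put_Line (Name);                  -- the slice before '='
   Ada.Text_IO.Put_Line (Value);                 -- the slice after '='
   Ada.Text_IO.Put_Line (Name & ":=" & Value);   -- joined back with "&"
end Slice_And_Join;
```

Finding Cut is the part left open; Ada.Strings.Fixed.Index, as used
elsewhere in this thread, is one way to get it.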


-- 
T.E.D.
Home     -  mailto:dennison@telepath.com (Yahoo: Ted_Dennison)
Homepage -  http://www.telepath.com/~dennison/Ted/TED.html



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02  8:01               ` Oleg Goodyckov
@ 2002-08-02 16:09                 ` Darren New
  0 siblings, 0 replies; 86+ messages in thread
From: Darren New @ 2002-08-02 16:09 UTC (permalink / raw)

Oleg Goodyckov wrote:
> But guys, how many such "long-lived" applications have you seen in our
> world of low-cost, use-and-throw-away products,

I think most Boeing aircraft would not count as "use and throw away"
products. Hmmm... I take that back. Does Boeing make rocket boosters? ;-)

What you may be missing is the vast quantity of programming going on that
you never see. The software that runs your automobile dashboard. The
software that routes your baggage at the airport. The software planning
which gates to open at the dam. The software running the phone switch you're
dialed in to. The software keeping track of which wires your phone uses, and
under what streets they run. The software keeping track of whether you've
paid your bill, and whether you moved after not paying your bill, and etc.
And etc. All these programs run for years and years and years, and they're
all big enough that something like Perl isn't going to hack it. 

Seriously, yes, Perl isn't bad for a program you don't need to look at in 3
months. Personally, I use Tcl for programs I work on myself. I don't think
something like Tcl (or especially Perl) scales well enough that you can have
500 programmers from 20 different countries in 10 different timezones
working on the same program. I expect Ada is pretty good at that, tho.

> that you can allow yourselves to
> maintain them for 20 years? Maybe I'm mistaken, but today, when new
> microprocessors are born every year, the things that need support the
> longest are the language compilers themselves.

You're in a very narrow world, really. At the latest report I saw, ARM sells
more processors (processor cores, really) each year than Intel sells
Pentiums.

> Sounds good: "Ada is the language for programming eternity" :)))

Well, it was certainly designed more for that than Perl was. Perl, after
all, was a throw-away-report-generator, so of course it's going to have good
string chopping code and lousy maintainability.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
** http://home.san.rr.com/dnew/DNResume.html **
** http://images.fbrtech.com/dnew/ **

Things to be thankful for, #37:
   No sausage was served at the Last Supper.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02 14:05         ` Ted Dennison
@ 2002-08-02 16:11           ` Darren New
  2002-08-03  0:30             ` Ted Dennison
  2002-08-05  7:18           ` Oleg Goodyckov
  1 sibling, 1 reply; 86+ messages in thread
From: Darren New @ 2002-08-02 16:11 UTC (permalink / raw)

Ted Dennison wrote:
> I understand it, your problem isn't with splitting and joining, it's
> with figuring out where to do the splitting and joining. Correct?

More like his problem is not having ubiquitous variable-sized arrays so you
could write a function that does the splitting and returns the result as an
array. That's the sort of thing that makes Perl quicker to put together than
Ada - you don't have to first program the component library before you can
use it. :-)

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
** http://home.san.rr.com/dnew/DNResume.html **
** http://images.fbrtech.com/dnew/ **

Things to be thankful for, #37:
   No sausage was served at the Last Supper.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02 23:29             ` Dmitry A.Kazakov
@ 2002-08-02 16:35               ` Oleg Goodyckov
  2002-08-05 11:50                 ` Dmitry A. Kazakov
  0 siblings, 1 reply; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-02 16:35 UTC (permalink / raw)

On Sat, Aug 03, 2002 at 01:29:23AM +0200, Dmitry A.Kazakov wrote:
> 
> My implementation (for parsing unit expressions) is about 0.5K lines long. 
> Is that much?

500 bytes?

> > Within the next couple of steps of the program the error will be
> > recognized and an exception raised. No problem.
> 
> Usually at that point, there is nothing to say about the error and its 
> location. Like an old Borland Pascal compiler, which promptly reported 
> "Line X, error in expression" for almost any error.

In my view it is not right to process EVERY error in the input data. For
me it is more effective to process only correct data (which can be
reliably recognized) and simply throw everything else away.

> > The difference is like the difference between RANDOM and SEQUENTIAL
> > access to data.
> 
> This is a good point. There is also a technical term for that. There are 
> global and local methods of processing texts, images etc. Global methods 
> (split is one) work well only for small amounts of data.

What are global and local methods brought in here for? To conclude that
"global methods almost never work well", so nobody needs them at all?
Application config files: are they a small amount of data? Yes. But they
exist in every application. And to parse them, splitting a string into
several independent fields is a much more effective and convenient way
than doing some sequential syntactic analysis.

> > In many cases it is not necessary to analyze the whole string; it is
> > enough to know the count of tokens, or a few of them at well-known positions.
> 
> Well, pattern matching does the work. Others have pointed that out. Note also, 

Of course, touching your left ear with your right hand is possible.
Somewhat stupid, but not impossible. Entirely acceptable in many cases.

> that as the complexity of syntax increases it becomes almost impossible at 
> some point to write a correct pattern and prove that it is correct.

What "complexity of syntax"? The syntax could not be simpler: fields
with separators (of one type) between them. Take a record, split it at
the separators and enjoy.
No! Give me a syntax...

> [ example snipped ]
> 
> First, the example is not realistic but illustrative. A real-life example 
> would take into account different spellings, typos, proper nouns, 
> multi-word tokens etc. It would probably work with a database, it would 
> surely avoid unbounded strings (heap allocation) and so on and so forth. I 
> doubt that a Perl implementation of all that would be simpler or shorter 
> than in Ada.

Really? Empty words. Try it and show me. In the snipped example I've seen
one attempt. Show me another, better one.
The task solved in the snipped example has a name: building a histogram
of word occurrences. Why do you call this task unrealistic?

> Second, the 80% of the example code is dealing with s/w components like 
> containers etc. This has nothing to do with text processing. What is really 
> dedicated to parsing is quite short and transparent.

So, if that 80% of the code is thrown out, will the program still work?
Or is it necessary after all?

> You might argue that Ada should have standard components standard (:-)), it 
> is questionable, but as you see (Ada Standard Component Library) there is 
> work going on in the direction of having those components, though maybe not as 
> a part of the standard.

So my words make sense? Why then do you argue?

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 13:10           ` Oleg Goodyckov
@ 2002-08-02 23:29             ` Dmitry A.Kazakov
  2002-08-02 16:35               ` Oleg Goodyckov
  0 siblings, 1 reply; 86+ messages in thread
From: Dmitry A.Kazakov @ 2002-08-02 23:29 UTC (permalink / raw)

Oleg Goodyckov wrote:

> On Thu, Aug 01, 2002 at 11:57:04PM +0200, Dmitry A.Kazakov wrote:
>> > Ok! How about write-once-use-always? For text data analyze
>> > applications.
>> 
>> Then, maybe it is worth to consider more advanced parsing techniques than
>> split? There are numerous Ada implementations of pattern matching. There
>> are also Ada subprograms to recognize data types in a string stream. It
>> is relatively easy to parse and evaluate expressions with brackets and
>> prioritized operations in Ada (an implementation of the twin-stack
>> algorithm is quite short), and note that nothing like split is involved.
> 
> Maybe. In dreams almost anything is possible.

My implementation (for parsing unit expressions) is about 0.5K lines long. 
Is that much?

>> >> > As long as splitting a string like
>> >> > "x=2*3" means writing a whole program instead of
>> >> > split("=","x=2*3"), people will write in Perl, not Ada.
>> >> 
>> >> And what would you do in the case "x=/* An error, should be := */ 2*"
>> >> and "3" continues on the next line?
>> > 
>> > Nothing. I know: I have data as described. If not, the data is corrupted
>> > and must be thrown out. It's simple.
>> 
>> To do so you should have an ability to recognize errors.
> 
> Within the next couple of steps of the program the error will be
> recognized and an exception raised. No problem.

Usually at that point, there is nothing to say about the error and its 
location. Like an old Borland Pascal compiler, which promptly reported 
"Line X, error in expression" for almost any error.

>> > But what would you do in the case, when data is correct yet?
>> 
>> In the given example data are correct. /*...*/ was a comment containing a
>> symbol supposed to be a delimiter. My point was that for almost any
>> real-life text parsing application, split is useless.
> 
> Why then are most of my tasks much easier to solve using split than
> substring and the like? Maybe they are not from real life?

Well, one customer does not count. I also wish some (other) things to be 
changed / added in Ada, but as Robert Dewar usually correctly points out, my 
wish is no more than mine. You should convince a lot more people than only 
yourself before your wish becomes a part of the standard. Distressing, but 
how could it be otherwise?

>> > You'll build a PROGRAMMMM, instead of writing "split(/=/,"x=2*3")".
>> 
>> You still need a program to process the output of split. Is its output
>> the final outcome? I suppose it is not. So there should be a loop to iterate
>> through the list returned by split. Where then is the difference between
>> split + loop, and a loop with Get_Next_Word inside?
> 
> The difference is like the difference between RANDOM and SEQUENTIAL
> access to data.

This is a good point. There is also a technical term for that. There are 
global and local methods of processing texts, images etc. Global methods 
(split is one) work well only for small amounts of data.

> In many cases it is not necessary to analyze the whole string; it is
> enough to know the count of tokens, or a few of them at well-known positions.

Well, pattern matching does the work. Others have pointed that out. Note also, 
that as the complexity of syntax increases it becomes almost impossible at 
some point to write a correct pattern and prove that it is correct.

>> IMO, the difference is
>> that the second is faster and easier to understand.
> 
> Really? Have you seen that program (bcwords.ada)? And you'll assert that
> that Ada program is easier to understand? :-))))))
> I have nothing to say...
> 
> Sorry for the long quotation. But look, everybody: who will risk saying
> that the Ada program below is simpler and easier to understand than the
> equivalent Perl program (which is more than 10 times smaller)? Don't think
> about how much time that program needs to be written and debugged. Think how
> much time it takes simply to type it into a text editor correctly. :-)))))

[ example snipped ]

First, the example is not realistic but illustrative. A real-life example 
would take into account different spellings, typos, proper nouns, 
multi-word tokens etc. It would probably work with a database, it would 
surely avoid unbounded strings (heap allocation) and so on and so forth. I 
doubt that a Perl implementation of all that would be simpler or shorter 
than in Ada.

Second, the 80% of the example code is dealing with s/w components like 
containers etc. This has nothing to do with text processing. What is really 
dedicated to parsing is quite short and transparent.

You might argue that Ada should have standard components standard (:-)), it 
is questionable, but as you see (Ada Standard Component Library) there is 
work going on in the direction of having those components, though maybe not as 
a part of the standard.

-- 
Regards,
Dmitry Kazakov
www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02 16:11           ` Darren New
@ 2002-08-03  0:30             ` Ted Dennison
  2002-08-03  0:58               ` Darren New
  0 siblings, 1 reply; 86+ messages in thread
From: Ted Dennison @ 2002-08-03  0:30 UTC (permalink / raw)

Darren New wrote:
> Ted Dennison wrote:
> 
>>I understand it, your problem isn't with splitting and joining, it's
>>with figuring out where to do the splitting and joining. Correct?
> 
> 
> More like his problem is not having ubiquitous variable-sized arrays so you
> could write a function that does the splitting and returns the result as an
> array. That's the sort of thing that makes Perl quicker to put together than

But we *do*, with Ada.Strings.Unbounded. However, you generally don't 
need to work with variable-sized arrays.

I really don't see what the problem with fixed array slicing is.

I'll agree that there is probably some room for improvement with Ada's 
string libraries. But slicing and joining are there just fine right now.

It looks like what he is talking about would be something like the 
following:

(in Ada.Strings.Unbounded)

type String_Pair is array (Left..Right) of Unbounded_String;

function Split (Around : String; Source : Unbounded_String)
   return String_Pair;

The issue *can't* be just that Ada doesn't have this one routine, 
because there are an infinite number of possible string handling 
routines that Ada (and Perl) doesn't have. The issue has to be that 
"common task XYZ is unreasonably tough in Ada due to lack of support". I 
don't see that case proven yet (but it could be).

The example task to use this for was (I believe) parsing an entire file 
into tokens around a given separator string. The above "Split" doesn't 
do that; it's just part of an idiom for doing that which some Perl users 
might be accustomed to using. I don't think the whole task would be 
significantly tougher (or slower) if you were to use 
Ada.Strings.Fixed.Index in your solution than it would be if you were to 
use our theoretical Ada.Strings.Unbounded.Split as the workhorse.

And I *have* written such routines before. It doesn't always happen, but 
I've been known to learn from the experience. :-)

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-01 21:37               ` Robert A Duff
@ 2002-08-03  0:42                 ` Ted Dennison
  2002-08-03 13:51                   ` Robert A Duff
                                     ` (2 more replies)
  0 siblings, 3 replies; 86+ messages in thread
From: Ted Dennison @ 2002-08-03  0:42 UTC (permalink / raw)

Robert A Duff wrote:
> I'm talking about things like, "zip up a bunch of files for a release,
> put it in the version 1.234 release directory, run some regression
> tests, and then send some mail to somebody-or-other notifying them of
> something-or-other."
> 
> I wish they were all written in a language that had Perl's features for
> doing stuff, and Ada's features for making it work reliably.

Generally, I've found it best to rewrite any shell scripts that get 
beyond a screen or two in Ada, using the "System" call (or its 
equivalent on that OS) to execute commands. Maintaining large TCL or sh 
scripts just isn't worth the hassle (a little lesson from the school of 
hard knocks here).
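
One way this can look, as a sketch only: it assumes a C library with
system() is available to link against, and C_System and Run_Command are
names picked for the example. The echoed command stands in for a real
release step.

```ada
with Ada.Text_IO;
with Interfaces.C;

procedure Run_Command is
   use type Interfaces.C.int;

   -- Binding to the C library's system(); the Ada-side name is our choice.
   function C_System (Command : Interfaces.C.char_array)
                      return Interfaces.C.int;
   pragma Import (C, C_System, "system");

   Status : Interfaces.C.int;
begin
   -- To_C appends the terminating nul that the C side expects.
   Status := C_System (Interfaces.C.To_C ("echo release step"));
   if Status /= 0 then
      Ada.Text_IO.Put_Line ("command failed");
   end if;
end Run_Command;
```

The meaning of the returned status is OS-dependent, so a real script
replacement would probably want to interpret it more carefully than this.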

It would be nice to have a strongly-typed "make" language though. I 
can't really figure out a good way to do rule-based systems like 
rebuilding tools in Ada. So I have to learn all the gnarly dark corners 
in Make. Yech.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:30             ` Ted Dennison
@ 2002-08-03  0:58               ` Darren New
  2002-08-03  2:04                 ` Dale Stanbrough
                                   ` (2 more replies)
  0 siblings, 3 replies; 86+ messages in thread
From: Darren New @ 2002-08-03  0:58 UTC (permalink / raw)

Ted Dennison wrote:
> > More like his problem is not having ubiquitous variable-sized arrays so you
> > could write a function that does the splitting and returns the result as an
> > array. That's the sort of thing that makes Perl quicker to put together than
> 
> But we *do*, with Ada.Strings.Unbounded.

Errr, Ada.Strings.Unbounded is character arrays. Hardly ubiquitous. I think
if there had been a generic Ada.Unbounded that exports the same sort of
array stuff that Ada.Strings.Unbounded supports, all kinds of things could
have much more obvious interfaces.

> However, you generally don't
> need to work with variable-sized arrays.

No. *You* don't generally need to work with variable-sized arrays. :-) The
kinds of work done by people who find Perl effective *do* work with
variable-sized arrays, extensively.

> I really don't see what the problem with fixed array slicing is.
> 
> I'll agree that there is probably some room for improvement with Ada's
> string libraries. But slicing and joining are there just fine right now.

Yes. But when you chop up a string, what do you do with the result? You
can't just declare an array in Ada and say "I'm not sure how much this will
hold" or "add one more element to the end of this array" like you can with
Ada.Strings.Unbounded.

> It looks like what he is talking about

... is a specific case of a wider problem.

> (in Ada.Strings.Unbounded)
> 
> type String_Pair is array (Left..Right) of Unbounded_String;
> 
> function Split (Around : String; Source : Unbounded_String) return
> String_Pair;

No, I think the issue is (in part) when you say (for example) here's a
string, return me an array of strings, where each component of the returned
array is a whitespace-delimited word from the input string. So 
  Words("one two three four")
would return an array A such that
  A[1] = "one", A[2] = "two", and so on.

The problem is that there's no Ada.* declaration for anything like A.

At least, that's one of the problems I see.

> And I *have* written such routines before. It doesn't always happen, but
> I've been known to learn from the experience. :-)

Sure. I think the problem is that there's a host of low-efficiency
operations in Perl that take advantage of built-in data structures. That Ada
offers fixed strings, bounded strings, and unbounded strings indicates that
it has a focus on efficiency that something like Perl doesn't. If Ada didn't
have unbounded strings, people would have to keep reimplementing it. Ada
doesn't have unbounded arrays, and people have to keep reimplementing that
(when they need it). The assign-to-a-local-in-a-declaration doesn't really
work well when you have long-lived arrays. 

I've been working in scripting languages for the last few years, and I see a
lack in Ada of basic simple data structures, like variable sized arrays,
content-addressable arrays, and a few other things like that. I can see how
someone coming from Perl could miss all that. Once you've written programs
using built-in hashtables, arrays, etc, it's difficult to look at a language
that doesn't use such things and see how to do simple things. And that it
isn't built in means it's not going to get used everywhere it should. Even
if you build a library for UnboundedArrays, the (pulls example out of left
ear) MIME-parsing library isn't going to return an UnboundedArray compatible
with the one that goes into the XML parser. The MIME library's output
strings might be Ada.Strings.Unbounded, and the XML parser's input strings
might be Ada.Strings.Unbounded, but if you want to pass the array of lines
that's the body of the message into the array of lines that's the XML
parser's input, you're going to need to do conversions.

Yes, you *could* build all that. But from a "newbie" point of view, having
multitasking with extensive typing and all that, but lacking something as
simple as a variable-length array, really slows down learning the language,
because you're constantly stumbling when you're trying to do *simple* stuff.

Of course, Ada has excellent numeric support, type support, multithreading,
etc etc etc. It also looks like the support for large-scale programming is
excellent, altho I haven't had a chance to test that out.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a slab of beef
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:58               ` Darren New
@ 2002-08-03  2:04                 ` Dale Stanbrough
  2002-08-03  2:32                 ` Ted Dennison
  2002-08-05 13:24                 ` Stephen Leake
  2 siblings, 0 replies; 86+ messages in thread
From: Dale Stanbrough @ 2002-08-03  2:04 UTC (permalink / raw)

Darren New wrote:

> No, I think the issue is (in part) when you say (for example) here's a
> string, return me an array of strings, where each component of the returned
> array is a whitespace-delimited word from the input string. So 
>   Words("one two three four")
> would return an array A such that
>   A[1] = "one", A[2] = "two", and so on.

When I was writing Ada web-based programs that split up text files, 
I ended up writing a set of string splitting functions, but they still
retained a functional interface (i.e. Select_Field ("one two three", 2)
would give "two"). Having a predefined type that I could dump
the fields into would have been handy. I ended up writing some container
code, which of course highlights the poverty of the standardised Ada
offerings.
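
A guess at what such a functional interface could look like; this
Select_Field is a hypothetical reconstruction for illustration, not the
code described above, and it assumes fields are separated by single
blanks.

```ada
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Select_Demo is
   -- Hypothetical: returns the Which-th blank-separated field of Source,
   -- or "" when Source has fewer fields than that.
   function Select_Field (Source : String; Which : Positive) return String is
      Blank : constant Natural := Ada.Strings.Fixed.Index (Source, " ");
   begin
      if Which = 1 then
         if Blank = 0 then
            return Source;                          -- last field
         else
            return Source (Source'First .. Blank - 1);
         end if;
      elsif Blank = 0 then
         return "";                                 -- ran out of fields
      else
         -- Skip the first field and look in the remainder.
         return Select_Field (Source (Blank + 1 .. Source'Last), Which - 1);
      end if;
   end Select_Field;
begin
   Ada.Text_IO.Put_Line (Select_Field ("one two three", 2));
end Select_Demo;
```

Each call rescans from the start of the string, which is fine for short
lines but is part of why a container of fields would have been handier.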

> Sure. I think the problem is that there's a host of low-efficiency
> operations in Perl that take advantage of built-in data structures. That Ada
> offers fixed strings, bounded strings, and unbounded strings indicates that
> it has a focus on efficiency that something like Perl doesn't. If Ada didn't
> have unbounded strings, people would have to keep reimplementing it.

As they constantly did with Ada83!

> Ada
> doesn't have unbounded arrays, and people have to keep reimplementing that
> (when they need it). The assign-to-a-local-in-a-declaration doesn't really
> work well when you have long-lived arrays. 

Ada -can- (sort of :-) have unbounded arrays, and it's not that hard to
implement. e.g. 

   type Unbounded_Array is array (Positive range <>) of Unbounded_String;

and then...

   declare
      fields : Unbounded_Array := Split ("one two three");
   begin
      ...

Alternatively you return a pointer to the object, allowing it to be more
long lived.
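
A minimal sketch of what such a Split could look like, assuming
blank-delimited words only. The body of Split below is hypothetical (it
is not given in the post); it takes the first word and catenates it
with the words of the remainder, so the result is always exactly the
right size.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Split_Demo is

   type Unbounded_Array is array (Positive range <>) of Unbounded_String;

   --  Hypothetical helper: returns the blank-delimited words of Source.
   function Split (Source : String) return Unbounded_Array is
      First : Natural := Source'First;
      Last  : Natural;
   begin
      --  Skip leading blanks.
      while First <= Source'Last and then Source (First) = ' ' loop
         First := First + 1;
      end loop;

      if First > Source'Last then
         --  Nothing left: return an empty array (null range 1 .. 0).
         return (1 .. 0 => Null_Unbounded_String);
      end if;

      --  Find the last character of the current word.
      Last := First;
      while Last < Source'Last and then Source (Last + 1) /= ' ' loop
         Last := Last + 1;
      end loop;

      --  Element & array catenation builds the result one word at a time.
      return To_Unbounded_String (Source (First .. Last))
             & Split (Source (Last + 1 .. Source'Last));
   end Split;

   Fields : constant Unbounded_Array := Split ("one two three");

begin
   for I in Fields'Range loop
      Put_Line (Positive'Image (I) & " => " & To_String (Fields (I)));
   end loop;
end Split_Demo;
```

The recursion copies the partial result at every level, so this is a
convenience sketch rather than something tuned for long inputs.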

Dale

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:58               ` Darren New
  2002-08-03  2:04                 ` Dale Stanbrough
@ 2002-08-03  2:32                 ` Ted Dennison
  2002-08-03  2:47                   ` Darren New
  2002-08-03  5:07                   ` achrist
  2002-08-05 13:24                 ` Stephen Leake
  2 siblings, 2 replies; 86+ messages in thread
From: Ted Dennison @ 2002-08-03  2:32 UTC (permalink / raw)

Darren New wrote:
> Yes. But when you chop up a string, what do you do with the result? You
> can't just declare an array in Ada and say "I'm not sure how much this will
> hold" or "add one more element to the end of this array" like you can with
> Ada.Strings.Unbounded.

Yes you can. You just can't assign a new value into the array that is a 
different size than the old value. That's no huge hardship with declare 
blocks, and no problem at all if you program everything functionally.

My problem with a lot of this discussion is that it seems to be "Ada 
doesn't support X idiom that Y programmers like to use", rather than 
"Ada can't perform X task nearly as easily as language Y can." There are 
probably examples of the latter that need to be addressed, so this 
discussion is important. But the former is just someone's ignorance of 
the language.

> I've been working in scripting languages for the last few years, and I see a
> lack in Ada of basic simple data structures, like variable sized arrays,
> content-addressable arrays, and a few other things like that. I can see how
> someone coming from Perl could miss all that. Once you've written programs

You are quite correct there. All that stuff is basically equivalent to
Unbounded Lists and Maps, which is a known deficiency that will 
(hopefully) be addressed in the next version of the language.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  2:32                 ` Ted Dennison
@ 2002-08-03  2:47                   ` Darren New
  2002-08-03 12:41                     ` Ted Dennison
                                       ` (2 more replies)
  2002-08-03  5:07                   ` achrist
  1 sibling, 3 replies; 86+ messages in thread
From: Darren New @ 2002-08-03  2:47 UTC (permalink / raw)

Ted Dennison wrote:
> 
> Darren New wrote:
> > Yes. But when you chop up a string, what do you do with the result? You
> > can't just declare an array in Ada and say "I'm not sure how much this will
> > hold" or "add one more element to the end of this array" like you can with
> > Ada.Strings.Unbounded.
> 
> Yes you can. 

Tell me how you declare a variable for an array whose bounds you don't know
until after you're past the declaration? Tell me how you add more elements
to the end of an array?

You can't do something like 
  X := X & Y
as far as *I* understand.

> You just can't assign a new value into the array that is a
> different size than the old value.

Well, that's the problem. It means you can't pass me an in out array and let
me make it bigger.

> That's no huge hardship with declare
> blocks, and no problem at all if you program everything functionally.

Yes, actually, it can be a huge hardship, if that's how you think about
things. If you've got a variable of global lifetime (or whatever Ada calls
something declared at the package level so its value is valid for the
duration of the program) and you want to change how big it is, you can't.
You have to have a pointer to an array, like Ada.Strings.Unbounded does.
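
A rough sketch of that pointer-to-array workaround, for illustration
only. The names String_Array, String_Array_Access and Append are made
up here; this is not how Ada.Strings.Unbounded is specified, just the
general shape of the idea: a long-lived access value whose designated
array is replaced by a longer copy on each append.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Unchecked_Deallocation;

procedure Grow_Demo is

   type String_Array is array (Positive range <>) of Unbounded_String;
   type String_Array_Access is access String_Array;

   procedure Free is
      new Ada.Unchecked_Deallocation (String_Array, String_Array_Access);

   --  Long-lived variable; starts out designating an empty array.
   Lines : String_Array_Access := new String_Array (1 .. 0);

   --  Hypothetical helper: replace List.all by a copy one element longer.
   procedure Append (List : in out String_Array_Access; Item : String) is
      Old : String_Array_Access := List;
   begin
      List := new String_Array'(Old.all & To_Unbounded_String (Item));
      Free (Old);  --  release the shorter array
   end Append;

begin
   Append (Lines, "one");
   Append (Lines, "two");
   Append (Lines, "three");
   for I in Lines'Range loop
      Put_Line (To_String (Lines (I)));
   end loop;
   Free (Lines);
end Grow_Demo;
```

Every Append copies the whole array, and the caller has to manage the
storage by hand, which is the kind of bookkeeping being complained
about here.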

Sure, you can work around it. But that's true in any language. It's no huge
hardship to only have 1-D arrays - you can always keep track of the bounds
and multiply your first index by the second bound and add in the second
index. But it's the kind of thing that people go "I can't believe I can't
find this in the manual..."

> My problem with a lot of this discussion is that it seems to be "Ada
> doesn't support X idiom that Y programmers like to use", rather than
> "Ada can't perform X task nearly as easily as language Y can." There are
> probably examples of the latter that need to be addressed, so this
> discussion is important. But the former is just someone's ignorance of
> the language.

Well, X idiom is kind of what we're talking about here. Of course Ada
*could* do all this, if you write the code for it. But people coming from
scripting languages just kind of expect something that's supposedly as
powerful as Ada to be able to handle variable length arrays at least as
easily as it handles variable-length strings. And it doesn't.

> You are quite correct there. All that stuff is basically equivalent to
> Unbounded Lists and Maps, which is a known deficiency that will
> (hopefully) be addressed in the next version of the language.

That's good news!  My concern is that while you *can* implement it, if it
doesn't come with the language, you're likely to wind up having incompatible
implementations between packages. Kind of like trying to mix modules out of
some unrelated but complex C packages, each of which wants to do memory
allocation and GC its own way so as to avoid the inevitable memory leaks you
get with big C programs. :-) You can do it, but you wind up learning a whole
new memory manager for each application. It shouldn't be that way. 

Dale Stanbrough wrote:
> Ada -can- (sort of :-) have unbounded arrays, and it's not that hard to
> implement. e.g.
> 
>    type Unbounded_Array is array (Positive range <>) of Unbounded_String;
> 
> and then...
> 
>    declare
>       fields : Unbounded_Array := Split ("one two three");
>    begin
>       ...

But this works poorly in many places where, for example, you want to
accumulate a bunch of results into an array. Like, I want to read lines from
the terminal into an array of strings until I get a blank line.

> Alternatively you return a pointer to the object, allowing it to be more
> long lived.

But the main problem is everyone's going to have a different type for doing
this. The XML parser is going to return a different kind of array than the
MIME parser.

Of course, you can do stuff based on the GNAT unbounded strings code, which
I did, but you really shouldn't have to.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  2:32                 ` Ted Dennison
  2002-08-03  2:47                   ` Darren New
@ 2002-08-03  5:07                   ` achrist
  2002-08-03 12:52                     ` Ted Dennison
  1 sibling, 1 reply; 86+ messages in thread
From: achrist @ 2002-08-03  5:07 UTC (permalink / raw)

Ted Dennison wrote:
> 
> no problem at all if you program everything functionally.
> 

Just a few weeks ago I wrote an Ada program that does lots of string
splitting and concatenation, and I wrote it functionally, and it
runs very slowly.  Things like extracting the nth item in a string 
by extracting the 1st item n times and all that, all recursively. 

I'm guessing that some of the performance problem is related to Ada
not being a functional language and that if you want to write 
functional programs, Ada is not the best choice.  I'm not sure, 
just guessing.  

Is that true?  Does a typical Ada compiler, for example GNAT 3.14
public edition that I used, have optimizations for tail-recursive
functions?  Or will I get into trouble if I over-abuse recursion
in Ada?  For example, if I was recursively splitting a string 
of 100 kb into 1,000 pieces, might I eat up 100 Meg of stack or
even crash?  As I understand it, compilers for functional languages
generate code so that programs that are written functionally with 
proper tail recursion run pretty much as efficiently as they would 
if they were written loopishly.   That won't happen with Ada, will 
it?

Al

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  2:47                   ` Darren New
@ 2002-08-03 12:41                     ` Ted Dennison
  2002-08-03 16:53                       ` Darren New
  2002-08-05  9:56                     ` Lutz Donnerhacke
  2002-08-05 13:29                     ` Stephen Leake
  2 siblings, 1 reply; 86+ messages in thread
From: Ted Dennison @ 2002-08-03 12:41 UTC (permalink / raw)

Darren New wrote:
> Tell me how you declare a variable for an array whose bounds you don't know
> until after you're past the declaration? 

That's not what you said before. It's also a nearly irrelevant point,
since you can almost always place the declaration at the point where you
*do* know the length.

> Tell me how you add more elements to the end of an array?

In Ada the idiom isn't to add elements to the end of an existing array; 
it's to build a new one with the two old arrays catenated. If this needs
to be done progressively, one uses recursion.

Of course that can be real slow. But if you care that much about speed, 
typically you'd want to be using your own bounded-style algorithms 
anyway. It might be nice to have some middle-speed-ground in there using 
dynamic allocation too. Hopefully the new list support in Ada0X will 
help out there.

> You can't do something like 
>   X := X & Y

No, but you can easily do:

declare
    New_X : constant String := X & Y;
begin
    null;  -- work with New_X here
end;

> 
> Yes, actually, it can be a huge hardship, if that's how you think about
> things. If you've got a variable of global lifetime (or whatever Ada calls

We're back to my earlier point: Perhaps you should change how you think 
about things, rather than demand we "move the mountain to Mohammed". You
can't expect every language to support every idiom every other language 
supports. If you go into Lisp trying to program it like Perl, you won't 
have much fun either.

> Well, X idiom is kind of what we're talking about here. Of course Ada

It shouldn't be. If you can do the same tasks Idiom X is used for in Ada 
with idiom Y, and idiom Y isn't a royal pain comparatively, then I don't
see a problem (except perhaps with training).

> But this works poorly in many places where, for example, you want to
> accumulate a bunch of results into an array. Like, I want to read lines from
> the terminal into an array of strings until I get a blank line.

Only if you refuse to use recursion. Note that the original list 
processing language (LISP) has variable length arrays (lists) as its 
basic data type, and it still encouraged using recursion for these kinds 
of tasks.

But as we said, that is something that will (hopefully) be corrected in 
the next Ada revision.
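
The recursive approach to "read lines until a blank one" can be
sketched as follows. This is an illustration under stated assumptions:
Line_Array and Read_Lines are made-up names, and the function form of
Ada.Text_IO.Get_Line used here only arrived with Ada 2005, so an Ada 95
compiler would need a hand-written equivalent.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Read_Demo is

   type Line_Array is array (Positive range <>) of Unbounded_String;

   --  Hypothetical helper: reads lines from standard input until an
   --  empty line is seen, and returns them as an exactly sized array.
   function Read_Lines return Line_Array is
      Input : constant String := Get_Line;  --  Ada 2005 function form
   begin
      if Input'Length = 0 then
         return (1 .. 0 => Null_Unbounded_String);
      else
         --  The catenation happens after the recursive call returns,
         --  so this is not a tail call.
         return To_Unbounded_String (Input) & Read_Lines;
      end if;
   end Read_Lines;

   Lines : constant Line_Array := Read_Lines;

begin
   Put_Line (Natural'Image (Lines'Length) & " line(s) read");
end Read_Demo;
```

Note that End_Error propagates if input ends before a blank line, and
that recursion depth grows with the number of lines read.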

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  5:07                   ` achrist
@ 2002-08-03 12:52                     ` Ted Dennison
  2002-08-05 15:34                       ` Ted Dennison
  0 siblings, 1 reply; 86+ messages in thread
From: Ted Dennison @ 2002-08-03 12:52 UTC (permalink / raw)

achrist@easystreet.com wrote:
> Ted Dennison wrote:
> 
>>no problem at all if you program everything functionally.
>>
> 
> 
> Just a few weeks ago I wrote an Ada program that does lots of string
> splitting and concatenation, and I wrote it functionally, and it
> runs very slowly.  Things like extracting the nth item in a string 
> by extracting the 1st item n times and all that, all recursively. 

> Is that true?  Does a typical Ada compiler, for example GNAT 3.14
> public edition that I used, have optimizations for tail-recursive
> functions?  Or will I get into trouble if I over-abuse recursion

Why don't you check the docs? I just reinstalled my OS here, so I can't 
do it for you. But I seem to remember it doing tail-recursion 
optimizations at the higher optimization levels. At the default it is 
purposely extra slow (perhaps "deliberate" would be a better term), so 
as to not confuse debuggers.

There are some things you can do to help out though. For example, don't 
declare anything locally that you don't need a separate copy of in each 
recursive call. The same goes for passing parameters. This goes against 
the usual rule of not using globals, but so be it. You can mitigate the
maintainability pain by declaring the recursive routine inside of
another routine, which contains the declarations for all the recursive
"globals". Also, try to code so that your algorithm is indeed
tail-recursive (and use the optimization).

But you will still have to expect a certain amount of slowness, compared
to just declaring a big honking string and using a "Last_Index" 
variable. That's why the example of functional Text_IO Get_Line on 
Adapower used both approaches (a big honking string with a Last_Index, 
and a recursive call if that wasn't big enough).
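
For readers who have not seen it, the combined approach looks roughly
like this. It is a sketch written from memory of the general idiom, not
a copy of the AdaPower code; the name Get_Whole_Line and the buffer
size of 256 are arbitrary choices.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Get_Line_Demo is

   --  Hypothetical helper: returns one complete input line of any length.
   function Get_Whole_Line return String is
      Buffer : String (1 .. 256);
      Last   : Natural;
   begin
      Get_Line (Buffer, Last);
      if Last < Buffer'Last then
         --  The whole (rest of the) line fitted into the buffer.
         return Buffer (1 .. Last);
      else
         --  Buffer is full: keep it and recurse for the remainder.
         return Buffer & Get_Whole_Line;
      end if;
   end Get_Whole_Line;

begin
   declare
      Line : constant String := Get_Whole_Line;
   begin
      Put_Line ("Read" & Natural'Image (Line'Length) & " characters");
   end;
end Get_Line_Demo;
```

Short lines cost a single Get_Line call into the fixed buffer; only
lines longer than the buffer pay for recursion and extra copying.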

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:42                 ` Ted Dennison
@ 2002-08-03 13:51                   ` Robert A Duff
  2002-08-03 16:43                   ` Darren New
  2002-08-05 13:37                   ` Stephen Leake
  2 siblings, 0 replies; 86+ messages in thread
From: Robert A Duff @ 2002-08-03 13:51 UTC (permalink / raw)

Ted Dennison <dennison@telepath.com> writes:

> Generally, I've found it best to rewrite any shell scripts that get 
> beyond a screen or two in Ada, using the "System" call (or its 
> equivalent on that OS) to execute commands. Maintaining large TCL or sh
> scripts just isn't worth the hassle (a little lesson from the school of 
> hard knocks here).

I've been writing most "scripty" stuff in Ada lately, too.
I believe Robert Dewar also advocated that approach recently.

It beats Perl, sh, awk, .bat files, etc,
but it's far from a perfect solution.

You mention "system" call.  Fine, but in many cases (if I invoke the
same external program twice, or think I might later), I tend to
encapsulate that in some sort of abstraction, if possible.

- Bob

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:42                 ` Ted Dennison
  2002-08-03 13:51                   ` Robert A Duff
@ 2002-08-03 16:43                   ` Darren New
  2002-08-05 13:37                   ` Stephen Leake
  2 siblings, 0 replies; 86+ messages in thread
From: Darren New @ 2002-08-03 16:43 UTC (permalink / raw)

Ted Dennison wrote:
> Maintaining large TCL or sh
> scripts just isn't worth the hassle (a little lesson from the school of
> hard knocks here).

From my experience with Tcl, I expect you're making the same mistake as many
others. You are probably trying to write Tcl like you write Ada, just as
lots of people are trying to write Ada like they write Perl. ;-)

If your Tcl script is large, you're likely not using enough introspection
and code generation. I've done entire multi-user multimedia
web/mail/ftp-accessible ecommerce content management systems in Tcl, and it
didn't get to be what I'd call a "large" program. 

But if you don't take advantage of the ability to do things like having
procedures looking at their own code, iterating over all the global
variables in a particular package, etc., then yeah, I can see where you'd
wind up with code several times the size. 

Personally, I can't imagine Ada doing a better job of substituting for shell
scripts than Tcl does, but I suppose it's possible I'm missing something
obvious.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03 12:41                     ` Ted Dennison
@ 2002-08-03 16:53                       ` Darren New
  2002-08-04  1:08                         ` Ted Dennison
  0 siblings, 1 reply; 86+ messages in thread
From: Darren New @ 2002-08-03 16:53 UTC (permalink / raw)

Ted Dennison wrote:
> 
> Darren New wrote:
> > Tell me how you declare a variable for an array whose bounds you don't know
> > until after you're past the declaration?
> 
> That's not what you said before. It's also a nearly irrelevant point,
> since you can almost always place the declaration at the point where you
> *do* know the length.

Unless it's global to a package. I mean, if you never really need
variable-length arrays, why did Ada95 include unbounded strings? Clearly
there *is* a need for it.

> > Tell me how you add more elements to the end of an array?
> 
> In Ada the idiom isn't to add elements to the end of an existing array;
> it's to build a new one with the two old arrays catenated. If this needs
> to be done progressively, one uses recursion.

OK, so when I said "you can't add one more element to the end of this array"
and you said "yes you can", then you were mistaken.

> > You can't do something like
> >   X := X & Y
> 
> No, but you can easily do:
> 
> declare
>     New_X : constant String := X & Y;

Which doesn't help if the point is to change X.

> > Yes, actually, it can be a huge hardship, if that's how you think about
> > things. If you've got a variable of global lifetime (or whatever Ada calls
> 
> We're back to my earlier point: Perhaps you should change how you think
> about things, rather than demand we "move the mountain to Mohammed".

Well, I'm not demanding anything. I'm quite happy working within the bounds
of Ada when I need to use something Ada is strong at. I'm just trying to
point out that saying "it's not there because you don't need it" is probably
not going to get many people used to more convenient languages interested in
Ada.

> > Well, X idiom is kind of what we're talking about here. Of course Ada
> It shouldn't be.

In the sense that it was the OP's original question. Surely you can't mean
"don't ask that question."

> If you can do the same tasks Idiom X is used for in Ada
> with idiom Y, and idiom Y isn't a royal pain comparatively, then I don't
> see a problem (except perhaps with training).

I think how painful it is depends on what you're trying to do and how you're
trying to do it. Personally, I see *no* relationship between
  X := X & A
and
  declare new_X : blah := X & A

I mean, wouldn't the same argument hold if someone said "Why doesn't
language Blah have a while loop?" and you gave the answer "Well, the idiom
in Blah is to use conditional gotos, which is not a royal pain."

> > But this works poorly in many places where, for example, you want to
> > accumulate a bunch of results into an array. Like, I want to read lines from
> > the terminal into an array of strings until I get a blank line.
> 
> Only if you refuse to use recursion. 

Recursion is not always possible either, if other tasks need to be handled
coherently, if (say) other tasks need to look at the value of the list while
you're waiting for more input to come in, if you want to update a GUI based
on the lines of text, if you're filling in the array based on the processing
of a state machine (like a parser), etc etc. I.e., if you have something
that's difficult to program in a functional style.

> But as we said, that is something that will (hopefully) be corrected in
> the next Ada revision.

Yep. That would be nice.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03 16:53                       ` Darren New
@ 2002-08-04  1:08                         ` Ted Dennison
  2002-08-04 16:23                           ` Darren New
  0 siblings, 1 reply; 86+ messages in thread
From: Ted Dennison @ 2002-08-04  1:08 UTC (permalink / raw)

Darren New wrote:
> In the sense that it was the OP's original question. Surely you can't mean
> "don't ask that question."

No. The OP asked about general string processing routines in Ada, 
listing several examples from Perl. If the question is "where are the 
string routines" the answer is "Ada.Strings.*, and the string attributes
(and built in slicing)". If the question is, "In general, how am I 
supposed to deal with strings in Ada", then the answer is unfortunately 
rather long. But either way, whether Ada has routines exactly matching 
all the Perl routines is definitely not the point.

> I mean, wouldn't the same argument hold if someone said "Why doesn't
> language Blah have a while loop?" and you gave the answer "Well, the idiom
> in Blah is to use conditional gotos, which is not a royal pain."

Actually, Lisp originally did not have built in looping (or so I've been 
told). The idea behind the language was that you were supposed to do 
iteration with recursion instead. Sound familiar? I would imagine the 
Lisp folks had discussions very much like the one we're having now,
before they broke down and added procedural looping constructs to the 
language. :-)

> Recursion is not always possible either, if other tasks need to be handled
> coherently, if (say) other tasks need to look at the value of the list while
> you're waiting for more input to come in, if you want to update a GUI based

That would be completely unsafe, unless you lock the data structure 
somehow. In order to do that you'd probably want to make the list a 
protected object, which I don't think anyone is seriously talking about 
doing right now.

> on the lines of text, if you're filling in the array based on the processing
> of a state machine (like a parser), etc etc. I.e., if you have something
> that's difficult to program in a functional style.

I think you mean a lexical analyzer, not a parser. Lexical analyzers are 
usually some kind of state machine (unless they are very simple). 
Parsers aren't always state machines; in fact most hand-written ones are 
probably recursive-descent.

I've done functional lexical analyzers. You just give the
"tokenize_once" function its input string, and have it return the slice
of it that contains a token.

However, it's usually a lot quicker to just work out of a big buffer, and
use indices to designate tokens within the buffer. That way you aren't 
copying strings around until you need to.
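
A small sketch of the buffer-plus-indices style, assuming
blank-separated tokens. Next_Token and its parameter names are invented
for this illustration; the point is only that a token is described by a
pair of indices into the buffer and no string is copied until a slice
is actually needed.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Token_Demo is

   Buffer : constant String := "one two  three";

   --  Hypothetical helper: starting at Pos, skip blanks and set
   --  First .. Last to the bounds of the next token in Source.
   --  Last < First on return means no further token was found.
   procedure Next_Token
     (Source : in     String;
      Pos    : in     Positive;
      First  :    out Positive;
      Last   :    out Natural)
   is
   begin
      First := Pos;
      while First <= Source'Last and then Source (First) = ' ' loop
         First := First + 1;
      end loop;
      Last := First - 1;
      while Last < Source'Last and then Source (Last + 1) /= ' ' loop
         Last := Last + 1;
      end loop;
   end Next_Token;

   Pos   : Positive := Buffer'First;
   First : Positive;
   Last  : Natural;

begin
   loop
      Next_Token (Buffer, Pos, First, Last);
      exit when Last < First;
      --  The slice is only taken here, when the token is actually used.
      Put_Line (Buffer (First .. Last));
      Pos := Last + 1;
   end loop;
end Token_Demo;
```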

The only really good example of something that's tough to program 
functionally in Ada that I can think of is stuff involving 
Text_IO.Get_Line, and that's only because you are forced to use buffers 
and data lengths to use that routine. A functional version of Get_Line 
(and one exists on AdaPower) would have solved this problem.

There *are* times where functional use of String is way too slow. For 
those instances you can use buffers and lengths. There are also times 
where you could deal with the overhead of dynamic allocations, but you 
just can't deal with all the extra data copying that functional use of
String would cause. For those instances, it's nice to have
Ada.Strings.Unbounded (and hopefully one day Ada.Lists or whatever).
But Ada programmers should teach themselves to work with perfectly sized
constant strings where feasible, as that's the language's native idiom.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-04  1:08                         ` Ted Dennison
@ 2002-08-04 16:23                           ` Darren New
  2002-08-05  2:16                             ` Robert Dewar
  0 siblings, 1 reply; 86+ messages in thread
From: Darren New @ 2002-08-04 16:23 UTC (permalink / raw)

Ted Dennison wrote:
> 
> Darren New wrote:
> > In the sense that it was the OP's original question. Surely you can't mean
> > "don't ask that question."
> 
> No. The OP asked about general string processing routines in Ada,

Yes. But I believe I understood what he was asking better than the Ada
folks. He was asking about the string parsing routines, but the *reason*
they're different in Ada is a lack of a general variable-sized array. 

> If the question is, "In general, how am I
> supposed to deal with strings in Ada", then the answer is unfortunately
> rather long.

Perhaps I was reading more into it, but I believe that's what he was asking.
As one poster said, "surely the lack of this one routine is not the
problem."

> Lisp folks had discussions very much like the one we're having now,
> before they broke down and added procedural looping constructs to the
> language. :-)

Yep. That's my point. :-) LISP certainly isn't what I'd call a powerhouse of
popularity even *with* the procedural constructs. The answer of "change how
you do things to fit our compiler" isn't a good answer, is all I'm trying to
say.

> That would be completely unsafe, unless you lock the data structure
> somehow.

Actually, I meant to say "concurrently" rather than "coherently". It has
nothing to do with tasks. You can be doing several different tasks (in the
sense of "things to do" rather than "concurrent program counters with
independent stacks") at the same time without using Ada tasks. You can (for
example) be reading input, parsing it, generating a parse tree, generating
code off the parse tree, doing optimization, and writing out object code
concurrently, without using tasks. 

It has to do with the fact that it's an inherently iterative process, not a
recursive one. I don't think you want to recurse a million times to read a
million lines of text from the terminal. I really suspect you don't want 1.5
million lines of text on the stack, either.

If you're building up one array, it's conceivable you could do it
recursively. 

> > on the lines of text, if you're filling in the array based on the processing
> > of a state machine (like a parser), etc etc. I.e., if you have something
> > that's difficult to program in a functional style.
> 
> I think you mean a lexical analyzer, not a parser. Lexical analyzers are
> usually some kind of state machine (unless they are very simple).

A lexical analyzer is a simple parser. (I'm a bit of a formalist, ya see, so
I tend to mean the technical definitions when I use the technical words.)

> However, it's usually a lot quicker to just work out of a big buffer, and
> use indices to designate tokens within the buffer. That way you aren't
> copying strings around until you need to.

That's a good point, and something that's hard to do in many other
languages. However, it's really not enough. For example, implement the
Ada.Strings.Unbounded package using only recursion to allocate record sizes.
I haven't tried, but I don't think it would be easy, even if possible.

> But Ada programmers should teach themselves to work with perfectly sized
> constant strings where feasible, as that's the language's native idiom.

Agreed. And it's easier to do in Ada than it is in C. But that doesn't make
Ada a good string-processing language compared to something like Perl or
Tcl. :-) It's clearly a difficult idiom, given the number of newbie
questions about how to do it I see here. Once you're used to doing it that
way, yes, it can be done that way. I'm not disputing that.

Anyway, this horse is a greasy spot on the sidewalk. You can have the last
word. :-)

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-04 16:23                           ` Darren New
@ 2002-08-05  2:16                             ` Robert Dewar
  2002-08-05  3:45                               ` Darren New
  0 siblings, 1 reply; 86+ messages in thread
From: Robert Dewar @ 2002-08-05  2:16 UTC (permalink / raw)


Darren New <dnew@san.rr.com> wrote in message news:<3D4D551F.190A19C0@san.rr.com>...
> I don't think you want to recurse a million times to read a
> million lines of text from the terminal. I really suspect
> you don't want 1.5 million lines of text on the stack,
> either.

A very strange statement indeed. Using recursion to read
a million lines of text is perfectly natural, and with
a decent compiler that does tail recursion elimination,
you will get identical code for loops and for simple
recursion (recursive languages like Haskell absolutely
require this optimization of course). It is simply a
matter of taste whether you like the syntactic sugar
of a looping construct. That's all a loop is, syntactic sugar for tail
recursion.
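
To make the correspondence concrete, here is the same computation
written both ways. This is only an illustrative sketch; the function
names are made up, and whether a given Ada compiler actually turns the
recursive form into a loop depends on the compiler and its
optimization settings.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Tail_Demo is

   --  Loop version: the running total lives in a local variable.
   function Count_Blanks_Loop (S : String) return Natural is
      Count : Natural := 0;
   begin
      for I in S'Range loop
         if S (I) = ' ' then
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end Count_Blanks_Loop;

   --  Tail-recursive version: the recursive call is the last action,
   --  and the running total travels in the Count parameter.
   function Count_Blanks_Rec
     (S : String; Pos : Positive; Count : Natural) return Natural is
   begin
      if Pos > S'Last then
         return Count;
      elsif S (Pos) = ' ' then
         return Count_Blanks_Rec (S, Pos + 1, Count + 1);
      else
         return Count_Blanks_Rec (S, Pos + 1, Count);
      end if;
   end Count_Blanks_Rec;

   Text : constant String := "one two three";

begin
   Put_Line (Natural'Image (Count_Blanks_Loop (Text)));
   Put_Line (Natural'Image (Count_Blanks_Rec (Text, Text'First, 0)));
end Tail_Demo;
```

Both calls compute the same value; the loop index and the local Count
of the first version simply become the Pos and Count parameters of the
second.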

>  I.e., if you have something
> > > that's difficult to program in a functional style.

None of the examples you give are even vaguely in this
category (unless you have difficulty programming in a functional
style, in which case the statement is sort of
vacuously true :-)

> A lexical analyzer is a simple parser. (I'm a bit of a 
> formalist, ya see, so
> I tend to mean the technical definitions when I use the 
> technical words.)

Well then unless you intend to emulate Humpty-Dumpty in
Alice, please use technical definitions the same way as
the rest of the world.

It is absolutely standard in the world of compilers to
draw a sharp distinction between lexical analysis, typically based on
type 3 (regular) languages, and parsing, typically based on type 2
(context free) languages.

It is *distinctly* unhelpful to use the term parsing for
the former, and will simply confuse other people. If that
is your goal, fine you will succeed, but if you are a "bit
of a formalist" then you will want to use formal terms in
their accepted meaning if you want to be understood.

> That's a good point, and something that's hard to do in 
> many other languages. However, it's really not enough. 
> For example, implement the Ada.Strings.Unbounded package 
> using only recursion to allocate record sizes. I haven't 
> tried, but I don't think it would be easy, even if 
> possible.

Of course it's possible, and easy if you have a reasonable
familiarity with writing in a recursive style. I always
like to give my students the task of writing programs in
a procedural language avoiding the use of assignments and
loops. It is definitely useful to learn familiarity with
recursion as a style of programming. I am not in favor of
forcing everyone to use ONLY this style, but I do find that
people overuse imperative constructs.

In particular, people very much overuse variables, and do
not make sufficient use of constants that are computed 
once, a powerful feature in Ada. 

(digression. I prefer Algol 68 to Ada in this respect. In
Algol-68, it is easier to write a constant declaration

    int a = expression;

than to write an initialized variable

    ref int a = loc int := expression;

and even in the shortened form:

    int a := expression;

is one character longer than the constant. In Ada, we have
to write an extra keyword to make things constant, so the
lazy path is to make an unnecessary variable)

Back on topic: the latest version of GNAT at least warns
if you write

   X : type := expression;

and never modify X (the compiler warns that you could
make this declaration into a constant declaration).

But it does not go so far as warning that if you write

   X : type;
   ..
   X := expression; 

where there are no other assignments to X, that X could
be made into a constant initialized with the expression
(the conditions for that are harder to figure out)


>  
> > But Ada programmers should teach themselves to work with perfectly sized
> > constant strings where feasible, as that's the language's native idiom.
> 
> Agreed. And it's easier to do in Ada than it is in C. But that doesn't make
> Ada a good string-processing language compared to something like Perl or
> Tcl. :-) It's clearly a difficult idiom, given the number of newbie
> questions about how to do it I see here. Once you're used to doing it that
> way, yes, it can be done that way. I'm not disputing that.
> 
> Anyway, this horse is a greasy spot on the sidewalk. You can have the last
> word. :-)



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05  2:16                             ` Robert Dewar
@ 2002-08-05  3:45                               ` Darren New
  0 siblings, 0 replies; 86+ messages in thread
From: Darren New @ 2002-08-05  3:45 UTC (permalink / raw)

Robert Dewar wrote:
> That's all a loop is, syntactic sugar for tail
> recursion.

This is only true if it's actually tail recursion. However, ensuring you
have only tail recursion can be difficult in a procedural language like Ada.
If I have a local variable that is controlled, or an in-out variable of the
procedure, or the array or its components are controlled, I don't think Ada
can safely eliminate the tail recursion. So if, for example, it's an array
of unbounded_string as implemented in GNAT, I expect the compiler would have
a hard time eliminating the tail recursion, would it not?

I'm not sure I'd want to write algorithms that only work right when the
compiler eliminates tail recursion, also, since the compiler doing that
isn't specified in the ARM, is it? Or am I wrong on that?
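To make the shape under discussion concrete, here is a small sketch (all names are hypothetical) of a tail-recursive function next to the loop it corresponds to. Whether a given compiler turns the recursive call into a jump is an implementation matter; the sketch only shows the two forms, not what any compiler does with them.

```ada
with Ada.Text_IO;

procedure Tail_Demo is
   type Int_Array is array (Positive range <>) of Integer;

   --  Tail-recursive form: the recursive call is the last action,
   --  and the running total travels in the Acc parameter.
   function Sum_Rec (A : Int_Array; Acc : Integer := 0) return Integer is
   begin
      if A'Length = 0 then
         return Acc;
      else
         return Sum_Rec (A (A'First + 1 .. A'Last), Acc + A (A'First));
      end if;
   end Sum_Rec;

   --  Loop form: the same computation with an explicit variable.
   function Sum_Loop (A : Int_Array) return Integer is
      Acc : Integer := 0;
   begin
      for I in A'Range loop
         Acc := Acc + A (I);
      end loop;
      return Acc;
   end Sum_Loop;

   Data : constant Int_Array := (1, 2, 3, 4);
begin
   Ada.Text_IO.Put_Line (Integer'Image (Sum_Rec (Data)));
   Ada.Text_IO.Put_Line (Integer'Image (Sum_Loop (Data)));
end Tail_Demo;
```

If the call is not eliminated, Sum_Rec uses one stack frame per element, which is the concern raised above.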

> >  I.e., if you have something
> > > > that's difficult to program in a functional style.
> 
> None of the examples you give are even vaguely in this
> category (unless you have difficulty programming in a functional
> style,

No, simple problems that can't be done recursively are difficult to find.
Difficult problems that can't easily be done recursively are difficult to
post to a newsgroup.

> Well then unless you intend to emulate Humpty-Dumpty in
> Alice, please use technical defintions the same way as
> the rest of the world.

http://www.netlingo.com/lookup.cfm?term=parse

"Parsing is often divided into lexical analysis and semantic parsing..."

> > For example, implement the Ada.Strings.Unbounded package
> > using only recursion to allocate record sizes. 

> Of course it's possible, and easy if you have a reasonable
> familiarity with writing in a recursive style.

Can you show me how to implement
procedure Ada.Strings.Unbounded.Append(source:in out Unbounded_String; 
   New_Item : in Unbounded_String)
using recursion? That is, how would you write this if Unbounded_String
didn't use an access value?
How about procedure insert(source:in out unbounded_string;
  before : in Positive; new_item : in String)?

Seriously, I'd like to see how it's done in Ada.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02 14:05         ` Ted Dennison
  2002-08-02 16:11           ` Darren New
@ 2002-08-05  7:18           ` Oleg Goodyckov
  1 sibling, 0 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-05  7:18 UTC (permalink / raw)


On Fri, Aug 02, 2002 at 07:05:33AM -0700, Ted Dennison wrote:
> Oleg Goodyckov <og@videoproject.kiev.ua> wrote in message news:<20020801194720.Q1080@videoproject.kiev.ua>...
> > Saing onestly, I'm very surprized, that so basic and simple operation as
> > splitting/joining of string is not present in Ada natively. And most
> 
> Well, stated that way, it *is* present. Splitting strings (or any
> other array) is done with slices, and joining them (any array) is done
> with the "&" operator. Those are indeed the basic operations available
> to arrays (along with indexing, and bounds attributes like 'first). As
> I understand it, your problem isn't with splitting and joining, its
> with figuring out where to do the splitting and joining. Correct?

Of course, not with splitting/joining itself.
Look at a drop of water in which the whole problem is mirrored. Below is an
excerpt from the GNAT.Spitbol.Patterns package (g-spipat.ads):

 --    For an example of a recursive pattern, let's define a pattern
 --    that is like the built in Bal, but the string matched is balanced
 --    with respect to square brackets or curly brackets.
 --    The language for such strings might be defined in extended BNF as
 --      ELEMENT ::= <any character other than [] or {}>
 --                  | '[' BALANCED_STRING ']'
 --                  | '{' BALANCED_STRING '}'
 --      BALANCED_STRING ::= ELEMENT {ELEMENT}
 --    Here we use {} to indicate zero or more occurrences of a term, as
 --    is common practice in extended BNF. Now we can translate the above
 --    BNF into recursive patterns as follows:
 --      Element, Balanced_String : aliased Pattern;
 --      .
 --      .
 --      .
 --      Element := NotAny ("[]{}")
 --                   or
 --                 ('[' & (+Balanced_String) & ']')
 --                   or
 --                 ('{' & (+Balanced_String) & '}');
 --      Balanced_String := Element & Arbno (Element);
 --    Note the important use of + here to refer to a pattern not yet
 --    defined. Note also that we use assignments precisely because we
 --    cannot refer to as yet undeclared variables in initializations.
 --    Now that this pattern is constructed, we can use it as though it
 --    were a new primitive pattern element, and for example, the match:
 --      Match ("xy[ab{cd}]", Balanced_String * Current_Output & Fail);
 --    will generate the output:
 --       x
 --       xy
 --       xy[ab{cd}]
 --       y
 --       y[ab{cd}]
 --       [ab{cd}]
 --       a
 --       ab
 --       ab{cd}
 --       b
 --       b{cd}
 --       {cd}
 --       c
 --       cd
 --       d

Good! But where can I find this last output? Somewhere in "Current_Output".
Where is that? In the current output file? Why there?
My guess: because Ada lacks dynamic infrastructure, and an external file is
the most convenient substitute for it. It is like pushing a problem out of
mind: the data is pushed out of the program, and the problem is pushed out
of mind. No data, no problem.

But I would be glad to find such a convenient dynamic infrastructure. With
it, it would be very simple to build any variation of split/join for strings.
My guess: the large amount of code needed to implement a "split" function
is really a hidden implementation of the dynamic infrastructure needed for
this purpose.
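As an illustration of what such infrastructure can look like, here is a sketch of split and join built on Ada.Containers.Indefinite_Vectors. Note the assumptions: that container package belongs to later revisions of the language standard (Ada 2005 onward), not the Ada 95 discussed in this thread, and the names Split, Join, String_Vectors and Split_Demo are invented for the example.

```ada
with Ada.Containers.Indefinite_Vectors;
with Ada.Strings.Fixed;
with Ada.Text_IO;

procedure Split_Demo is
   package String_Vectors is
     new Ada.Containers.Indefinite_Vectors
       (Index_Type => Positive, Element_Type => String);

   --  Split Source at every occurrence of Delimiter.
   --  Delimiter must not be empty (Index raises Pattern_Error).
   function Split (Source, Delimiter : String) return String_Vectors.Vector is
      Result : String_Vectors.Vector;
      First  : Positive := Source'First;
      Pos    : Natural;
   begin
      loop
         Pos := Ada.Strings.Fixed.Index
                  (Source (First .. Source'Last), Delimiter);
         exit when Pos = 0;
         Result.Append (Source (First .. Pos - 1));
         First := Pos + Delimiter'Length;
      end loop;
      Result.Append (Source (First .. Source'Last));
      return Result;
   end Split;

   --  Join the items back together, recursing over the index.
   function Join (Items : String_Vectors.Vector;
                  Delimiter : String) return String is
      function Join_From (I : Positive) return String is
      begin
         if I > Items.Last_Index then
            return "";
         elsif I = Items.Last_Index then
            return Items.Element (I);
         else
            return Items.Element (I) & Delimiter & Join_From (I + 1);
         end if;
      end Join_From;
   begin
      return Join_From (Items.First_Index);
   end Join;

   Fields : constant String_Vectors.Vector := Split ("a:b::c", ":");
begin
   for I in Fields.First_Index .. Fields.Last_Index loop
      Ada.Text_IO.Put_Line ("[" & Fields.Element (I) & "]");
   end loop;
   Ada.Text_IO.Put_Line (Join (Fields, ","));
end Split_Demo;
```

The same vector type covers the push/pop style operations asked about at the start of the thread through its Append, Last_Element and Delete_Last operations.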



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  2:47                   ` Darren New
  2002-08-03 12:41                     ` Ted Dennison
@ 2002-08-05  9:56                     ` Lutz Donnerhacke
  2002-08-05 16:02                       ` Darren New
                                         ` (2 more replies)
  2002-08-05 13:29                     ` Stephen Leake
  2 siblings, 3 replies; 86+ messages in thread
From: Lutz Donnerhacke @ 2002-08-05  9:56 UTC (permalink / raw)


* Darren New wrote:
>Tell me how you declare a variable for an array whose bounds you don't know
>until after you're past the declaration? Tell me how you add more elements
>to the end of an array?
>
>You can't do something like
>  X := X & Y
>as far as *I* understand.

--test_unbound.ads--
with Unbound;

package Test_Unbound is
   type Int is range 0 .. 1000;
   type Ind is range 1 .. 20;
   package UI is new Unbound (Int, Ind);
   subtype Int_Array is UI.Unbounded_Array;

   procedure Test;
end Test_Unbound;
--test_unbound.adb--

package body Test_Unbound is
   procedure Test is
      use UI;
      a, b : Int_Array;
   begin
      a := To_Unbounded ((1, 2, 3, 4));
      b := To_Unbounded ((5, 6, 7, 8));
      a := a & b;
      Add_Back (b, To_Unbounded ((9, 10)));
   end Test;
end Test_Unbound;
--unbound.ads--
with Ada.Finalization;

generic
   type Item is private;
   type Index is range <>;
package Unbound is
   type Unbounded_Array is private;
   Empty_Array : constant Unbounded_Array;      
   procedure Swap (a, b : in out Unbounded_Array);      
   function "&" (a, b : Unbounded_Array) return Unbounded_Array;
   procedure Add_Back (a : in out Unbounded_Array; b : Unbounded_Array);
   type Bounded_Array is array (Index range <>) of Item;
   function To_Unbounded (a : Bounded_Array) return Unbounded_Array;
   function To_Bounded (a : Unbounded_Array) return Bounded_Array;
private
   type Array_Access is access Bounded_Array;
   type Unbounded_Array is new Ada.Finalization.Controlled with record
      array_p : array_access := null;
   end record;
   procedure Initialize (a : in out Unbounded_Array);
   procedure Adjust     (a : in out Unbounded_Array);
   procedure Finalize   (a : in out Unbounded_Array);
   Empty_Array : constant Unbounded_Array :=
     (Ada.Finalization.Controlled with array_p => null);
end Unbound;
--unbound.adb--
with Unchecked_Deallocation;

package body Unbound is
   ----------
   -- Swap --
   ----------
   
   procedure Swap (a, b : in out Unbounded_Array) is
      temp : constant Array_Access := a.array_p;
   begin
      a.array_p := b.array_p;
      b.array_p := temp;
   end Swap;

   ---------
   -- "&" --
   ---------

   function "&" (a, b : Unbounded_Array) return Unbounded_Array is
      res : Unbounded_Array;
   begin
      if a.array_p = null then
         if b.array_p /= null then
            res.array_p := new Bounded_Array'(b.array_p.all);
         end if;
      else
         if b.array_p /= null then
            res.array_p := new Bounded_Array'(a.array_p.all & b.array_p.all);
         else
            res.array_p := new Bounded_Array'(a.array_p.all);
         end if;
      end if;
      return res;
   end "&";

   --------------
   -- Add_Back --
   --------------

   procedure Add_Back (a : in out Unbounded_Array; b : Unbounded_Array) is
      new_a : Unbounded_Array := a & b;
   begin
      Swap (a, new_a);
   end Add_Back;

   ------------
   -- Adjust --
   ------------

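   procedure Adjust (a : in out Unbounded_Array) is
   begin
      --  After an assignment, a.array_p still designates the source's
      --  array; give this object its own copy (deep copy).
      if a.array_p /= null then
         a.array_p := new Bounded_Array'(a.array_p.all);
      end if;
   end Adjust;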

   --------------
   -- Finalize --
   --------------

   procedure Finalize (a : in out Unbounded_Array) is
      procedure Free is
         new Unchecked_Deallocation (Bounded_Array, Array_Access);
   begin
      if a.array_p /= null then
         free (a.array_p);
      end if;
   end Finalize;

   ----------------
   -- Initialize --
   ----------------

   procedure Initialize (a : in out Unbounded_Array) is
   begin
      null;
   end Initialize;

   ------------------
   -- To_Unbounded --
   ------------------
   function To_Unbounded (a : Bounded_Array) return Unbounded_Array is
   begin
      return (Ada.Finalization.Controlled with
        array_p => new Bounded_Array'(a));
   end To_Unbounded;

   ----------------
   -- To_Bounded --
   ----------------
   function To_Bounded (a : Unbounded_Array) return Bounded_Array is
   begin
      return a.array_p.all;
   end To_Bounded;
end Unbound;



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-02 16:35               ` Oleg Goodyckov
@ 2002-08-05 11:50                 ` Dmitry A. Kazakov
  2002-08-05 14:29                   ` Larry Kilgallen
                                     ` (2 more replies)
  0 siblings, 3 replies; 86+ messages in thread
From: Dmitry A. Kazakov @ 2002-08-05 11:50 UTC (permalink / raw)


On Fri, 2 Aug 2002 19:35:35 +0300, Oleg Goodyckov
<og@videoproject.kiev.ua> wrote:

>On Sat, Aug 03, 2002 at 01:29:23AM +0200, Dmitry A.Kazakov wrote:
>> 
>> My implementation (for parsing unit expressions) is about 0.5K lines long. 
>> Is that much?
>
>500 bytes?

How big is the run-time library then?

>It is not right (as for me) to process EVERY error in input data. As for
>me it is more effectively to process only correct data (which are reliably
>recognized) and any other simply to drop nuffig.

Ah, that practice, which makes HTML a disaster because browsers
silently ignore what they do not understand. The results are known.

>> > Difference is like difference between RANDOM and SEQUENTIAL acceses to
>> > data.
>> 
>> This is a good point. There is also a technical term for that. There are 
>> global and local methods of processing texts, images etc. Global methods 
>> (split is one) are working good for only small anount of data.
>
>What here global and local methodes are for? For making conclusion "global
>methods are working good almost never", so they are nuffig need not?

The problem of all global methods is that the parameters they need
cannot be optimal in a  large context. Split is an example. It
requires a separator and a notion of a token which may vary from point
to point, making the approach useless.

>Config files of applications - are they small amount of data? Yes. But it
>exists in every application. And to parse it splitting of string to
>several independent fields is much more effective and convinient way than
>make some sequential syntactical analyzing.

I remember a project with a config file of ~2MBytes big. (it was a
Windows registry folder). I wonder how much time it would take to
parse it using split technique.

>> that as the complexity of syntax increases it becomes almost impossible at 
>> some point to write a correct pattern and prove that it is correct.
>
>Which nuffig "complexity of syntax"? Syntax is - no more simplest: fields
>with separators (of one type) between of them.

It is not a real syntax.

>Take record, split it by separators and enjoy.

Well, how long a record is allowed to be?

>No! Give me a syntax...

An argument in a call of a subroutine in C++.

>> First, the example is not realistic but illustrative. A real-life example 
>> would take into accout different spellings, typo errors, proper nouns, 
>> multi-word tokens etc. It would probably work with a data base, it would 
>> surely avoid unbounded strings (heap allocation) and so on and so far. I 
>> doubt that a Perl implementation of all that would be simplier or shorter 
>> than in Ada.
>
>Really? Empty words. Try and show me. In skipped example I've seen one
>attempt. Show me another - better.
>Task solved in skipped example has name - building hystorgram of words
>implementation. Why you name this task not realistic?

Because histogram is also a global method (used for I suppose sort of
clustering) which also has great limitations and is by no means an end
product of the program.

>> Second, the 80% of the example code is dealing with s/w components like 
>> containers etc. This has nothing to do with text processing. What is really 
>> dedicated to parsing is quite short and transparent.
>
>So, if that 80% of code throw out, then program will work? Or they are
>necessary though?

Not for text processing. I supposed that it does something more than
only that.

Generally, if you have a problem to solve you must first decompose it
into subproblems. You should do it properly. Surely one could use
eigenvalues and vectors to invert a matrix but this would be a *bad*
idea. To decompose some text analysis problem into a bunch of split
operations is also a *bad* idea. This is my point.

>> You might argue that Ada should have standard components standard (:-)), it 
>> is questionable, but as you see (Ada Standard Component Library) there is a 
>> work going in the direction of having that components, though maybe not as 
>> a part of the standard.
>
>So, my words have sence? Why then you argue?

Because I doubt that split should be a part of any standard library.
As I said, I consider it useless.

---
Regards,
Dmitry Kazakov
www.dmitry-kazakov.de



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:58               ` Darren New
  2002-08-03  2:04                 ` Dale Stanbrough
  2002-08-03  2:32                 ` Ted Dennison
@ 2002-08-05 13:24                 ` Stephen Leake
  2002-08-05 16:02                   ` Darren New
  2 siblings, 1 reply; 86+ messages in thread
From: Stephen Leake @ 2002-08-05 13:24 UTC (permalink / raw)


Darren New <dnew@san.rr.com> writes:

> Sure. I think the problem is that there's a host of low-efficiency
> operations in Perl that take advantage of built-in data structures. That Ada
> offers fixed strings, bounded strings, and unbounded strings indicates that
> it has a focus on efficiency that something like Perl doesn't. If Ada didn't
> have unbounded strings, people would have to keep reimplementing it. Ada
> doesn't have unbounded arrays, and people have to keep reimplementing that
> (when they need it). The assign-to-a-local-in-a-declaration doesn't really
> work well when you have long-lived arrays. 

Sounds like a good component for Grace.

> I've been working in scripting languages for the last few years, and
> I see a lack in Ada of basic simple data structures, like variable
> sized arrays, content-addressable arrays, and a few other things
> like that. I can see how someone coming from Perl could miss all
> that. Once you've written programs using built-in hashtables,
> arrays, etc, it's difficult to look at a language that doesn't use
> such things and see how to do simple things. And that it isn't built
> in means it's not going to get used everywhere it should. Even if
> you build a library for UnboundedArrays, the (pulls example out of
> left ear) MIME-parsing library isn't going to return an
> UnboundedArray compatible with the one that goes into the XML
> parser. The MIME library's output strings might be
> Ada.Strings.Unbounded, and the XML parser's input strings might be
> Ada.Strings.Unbounded, but if you want to pass the array of lines
> that's the body of the message into the array of lines that's the
> XML parser's input, you're going to need to do conversions.

Unless they all use Grace components. That's the point; Perl has a
_standard_ library for doing unbounded arrays. Ada needs one. Let's
write it!

> Yes, you *could* build all that. But from a "newbie" point of view,
> having multitasking with extensive typing and all that, but lacking
> something as simple as a variable-length array, really slows down
> learning the language, because you're constantly stumbling when
> you're trying to do *simple* stuff.

I think we need to distinguish between the "language" and the
"library". I realize languages like Perl and Java deliberately try to
confuse the two, but we don't have to buy into that.

> Of course, Ada has excellent numeric support, type support,
> multithreading, etc etc etc. It also looks like the support for
> large-scale programming is excellent, altho I haven't had a chance
> to test that out.
> 
> -- 
> Darren New 
> San Diego, CA, USA (PST). Cryptokeys on demand.
>    ** http://images.fbrtech.com/dnew/ **
> 
> They looked up at me like I was a slab of beef
>   walking into an all-you-can-eat seafood buffet.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  2:47                   ` Darren New
  2002-08-03 12:41                     ` Ted Dennison
  2002-08-05  9:56                     ` Lutz Donnerhacke
@ 2002-08-05 13:29                     ` Stephen Leake
  2 siblings, 0 replies; 86+ messages in thread
From: Stephen Leake @ 2002-08-05 13:29 UTC (permalink / raw)


Darren New <dnew@san.rr.com> writes:

> Tell me how you declare a variable for an array whose bounds you don't know
> until after you're past the declaration? Tell me how you add more elements
> to the end of an array?

Use SAL.Poly.Unbounded_Array. As has been said before, you just need
the appropriate library.

> You can't do something like X := X & Y as far as *I* understand.

Hmm. SAL.Poly.Unbounded_Array does not currently provide this operator
(no one has asked for it :). But it clearly can.

> > You are quite correct there. All that stuff is basicly equivalent to
> > Unbounded Lists and Maps, which is a known deficiency that will
> > (hopefully) be addressed in the next version of the language.
> 
> That's good news!  My concern is that while you *can* implement it, if it
> doesn't come with the language, you're likely to wind up having incompatible
> implementations between packages. 

Exactly. But the only way to ensure that it gets into the next version
of the language is to start implementing it now, so we have a solid
proposal.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03  0:42                 ` Ted Dennison
  2002-08-03 13:51                   ` Robert A Duff
  2002-08-03 16:43                   ` Darren New
@ 2002-08-05 13:37                   ` Stephen Leake
  2 siblings, 0 replies; 86+ messages in thread
From: Stephen Leake @ 2002-08-05 13:37 UTC (permalink / raw)

Ted Dennison <dennison@telepath.com> writes:

> It would be nice to have a strongly-typed "make" language though. I
> can't really figure out a good way to do rule-based systems like
> rebuilding tools in Ada. So I have to learn all the gnarly dark
> corners in Make. Yech.

I agree; Gnu make is "gnarly". It is also extremely powerful; I could
not work without it. How people that only use MS VC++ manage, I don't
know :).

Maybe we should look at the internals of 'gnatmake'; it does some of
the same kinds of thing. Maybe we could write Ada scripts using the
"gnatmake library" that would replace Gnu make.

Someday, when I have lots of time ...

-- 
-- Stephe

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 11:50                 ` Dmitry A. Kazakov
@ 2002-08-05 14:29                   ` Larry Kilgallen
  2002-08-05 14:57                     ` Dmitry A. Kazakov
  2002-08-05 15:12                   ` Oleg Goodyckov
  2002-08-05 16:20                   ` Darren New
  2 siblings, 1 reply; 86+ messages in thread
From: Larry Kilgallen @ 2002-08-05 14:29 UTC (permalink / raw)


In article <b0osku0tktsihgp0hoih183250hq3pjhq5@4ax.com>, Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> writes:
> On Fri, 2 Aug 2002 19:35:35 +0300, Oleg Goodyckov
> <og@videoproject.kiev.ua> wrote:
> 
>>On Sat, Aug 03, 2002 at 01:29:23AM +0200, Dmitry A.Kazakov wrote:

>>It is not right (as for me) to process EVERY error in input data. As for
>>me it is more effectively to process only correct data (which are reliably
>>recognized) and any other simply to drop nuffig.
> 
> Ah, that practice, which makes HTML a disaster because browsers
> silently ignore what they do not understand. The results are known.

http://validator.w3.org/



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 14:29                   ` Larry Kilgallen
@ 2002-08-05 14:57                     ` Dmitry A. Kazakov
  0 siblings, 0 replies; 86+ messages in thread
From: Dmitry A. Kazakov @ 2002-08-05 14:57 UTC (permalink / raw)


On 5 Aug 2002 08:29:07 -0600, Kilgallen@SpamCop.net (Larry Kilgallen)
wrote:

>In article <b0osku0tktsihgp0hoih183250hq3pjhq5@4ax.com>, Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> writes:
>> Ah, that practice, which makes HTML a disaster because browsers
>> silently ignore what they do not understand. The results are known.
>
>http://validator.w3.org/

Thank you for the link!

---
Regards,
Dmitry Kazakov
www.dmitry-kazakov.de



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 11:50                 ` Dmitry A. Kazakov
  2002-08-05 14:29                   ` Larry Kilgallen
@ 2002-08-05 15:12                   ` Oleg Goodyckov
  2002-08-05 16:20                   ` Darren New
  2 siblings, 0 replies; 86+ messages in thread
From: Oleg Goodyckov @ 2002-08-05 15:12 UTC (permalink / raw)


On Mon, Aug 05, 2002 at 01:50:38PM +0200, Dmitry A. Kazakov wrote:
> >me it is more effectively to process only correct data (which are reliably
> >recognized) and any other simply to drop nuffig.
> 
> Ah, that practice, which makes HTML a disaster because browsers
> silently ignore what they do not understand. The results are known.

It seems you don't like this "known" result. Why?

> The problem of all global methods is that the parameters they need
> cannot be optimal in a  large context. Split is an example. It
> requires a separator and a notion of a token which may vary from point
> to point, making the approach useless.

Baseless assertions. Again.

> I remember a project with a config file of ~2MBytes big. (it was a
> Windows registry folder). I wonder how much time it would take to
> parse it using split technique.

Why did you pick such a nasty example?

> >> that as the complexity of syntax increases it becomes almost impossible at 
> >> some point to write a correct pattern and prove that it is correct.
> >
> >Which nuffig "complexity of syntax"? Syntax is - no more simplest: fields
> >with separators (of one type) between of them.
> 
> It is not a real syntax.

That is what I am trying to say.

> >Take record, split it by separators and enjoy.
> 
> Well, how long a record is allowed to be?

There is no need for such a constraint. Any length.

> >Really? Empty words. Try and show me. In skipped example I've seen one
> >attempt. Show me another - better.
> >Task solved in skipped example has name - building hystorgram of words
> >implementation. Why you name this task not realistic?
> 
> Because histogram is also a global method (used for I suppose sort of
> clustering) which also has great limitations and is by no means an end
> product of the program.

OK. That answers my second question (not very impressively, BTW). Now how
about the first one: a better implementation of the task?

> >So, if that 80% of code throw out, then program will work? Or they are
> >necessary though?
> 
> Not for text processing. I supposed that it does something more than
> only that.
> 
> Generally, if you have a problem to solve you must first decompose it
> into subproblems. You should do it properly. Surely one could use
> eigenvalues and vectors to invert a matrix but this would be a *bad*
> idea. To decompose some text analysis problem into a bunch of split
> operations is also a *bad* idea. This is my point.

Baseless point. Exorcisms.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-03 12:52                     ` Ted Dennison
@ 2002-08-05 15:34                       ` Ted Dennison
  0 siblings, 0 replies; 86+ messages in thread
From: Ted Dennison @ 2002-08-05 15:34 UTC (permalink / raw)


Ted Dennison <dennison@telepath.com> wrote in message news:<3D4BD17A.5000004@telepath.com>...
> There are some things you can do to help out though. For example, don't 
> declare anything locally that you don't need a separate copy of in each 
> recursive call. The same goes for passing parameters. This goes against 
> the usual rule of not using globals, but so be it. You can mitiagate the 
> maintainability pain by declaring the recursive routine inside of 
> another routine, which  contains the declarations for all the recursive 
> "globals". Also, try to code so that your algorithim is indeed 
> tail-recursive (and use the optimization).

Just to give an example, the following is a rewrite of the algorithm I
gave in this post -
http://groups.google.com/groups?q=g:thl3084553553d&dq=&lr=&ie=UTF-8&selm=4519e058.0208020559.63f58040%40posting.google.com

earlier in this thread, but transformed a bit to not pass unneeded
data and to be (hopefully) tail-recursive. Again, this is uncompiled
and untested, so treat with a grain of salt.

-----------------------------------------------------------------------
-- Return a transformation of the Source string in such a way that all
-- (non-overlapping) occurrences of From_Pattern are replaced by 
-- To_Pattern
function Transform (Source       : String; 
                    From_Pattern : String; 
                    To_Pattern   : String) return String is

   ------------------------------------------------------------------
   -- Return a string which is So_Far, with a transformation of Rest
   -- (based on the From_Pattern and To_Pattern) appended to it
   ------------------------------------------------------------------
   function Transform_Rest (So_Far : String;
                            Rest   : String) return String is

      -- Find the location of the source pattern
      Pattern_Start : constant Natural := 
         Ada.Strings.Fixed.Index (Source  => Rest, 
                                  Pattern => From_Pattern);
   begin
      if Pattern_Start = 0 then
         return So_Far & Rest;
      else  
         -- Append the stuff before the From_Pattern, along with the
         -- To_Pattern, onto our So_Far transformed string, then have
         -- Transform_Rest transform the rest of it.
         return 
            Transform_Rest
              (So_Far => So_Far & 
                         Rest (Rest'First..Pattern_Start - 1) & 
                         To_Pattern,
               Rest   => Rest (Pattern_Start + From_Pattern'length ..
                               Rest'last)
              );
      end if;
   end Transform_Rest;
begin
   return Transform_Rest 
            (So_Far => "",
             Rest   => Source
            );
end Transform;

Note the following:

  1) We are no longer putting identical copies of the source and
target pattern on the stack with each recursive call. They are instead
global to Transform_Rest (but local to Transform). You'd be surprised
how much this can save you.

  2) We now accumulate the result string as we go, rather than tack it
onto the end of the return string.

  3) Because of 2, the recursive call no longer has any work to do
after it makes its own recursive call, other than return the result.
Therefore it is now tail-recursive.
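For comparison, here is a sketch of the loop that the tail-recursive Transform_Rest corresponds to, for the case where one does not want to rely on the compiler eliminating the call. It is an illustration under assumptions, not a drop-in replacement: it accumulates So_Far in an Ada.Strings.Unbounded.Unbounded_String, and the demo procedure name and sample strings are invented.

```ada
with Ada.Strings.Fixed;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Transform_Loop_Demo is

   --  Same contract as the recursive Transform: replace every
   --  (non-overlapping) occurrence of From_Pattern by To_Pattern.
   --  From_Pattern must not be empty (Index raises Pattern_Error).
   function Transform (Source       : String;
                       From_Pattern : String;
                       To_Pattern   : String) return String is
      So_Far        : Unbounded_String;               -- result so far
      First         : Positive := Source'First;       -- start of "Rest"
      Pattern_Start : Natural;
   begin
      loop
         Pattern_Start := Ada.Strings.Fixed.Index
           (Source  => Source (First .. Source'Last),
            Pattern => From_Pattern);
         exit when Pattern_Start = 0;

         --  Append the text before the match, then the replacement.
         Append (So_Far, Source (First .. Pattern_Start - 1) & To_Pattern);
         First := Pattern_Start + From_Pattern'Length;
      end loop;

      return To_String (So_Far) & Source (First .. Source'Last);
   end Transform;

begin
   Ada.Text_IO.Put_Line (Transform ("one fish two fish", "fish", "cat"));
end Transform_Loop_Demo;
```

Each loop iteration plays the role of one recursive call: First stands in for the Rest parameter and So_Far for the accumulated parameter of the same name.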



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 13:24                 ` Stephen Leake
@ 2002-08-05 16:02                   ` Darren New
  0 siblings, 0 replies; 86+ messages in thread
From: Darren New @ 2002-08-05 16:02 UTC (permalink / raw)

Stephen Leake wrote:
> Sounds like a good component for Grace.

Yes.

> Unless they all use Grace components.

Yep.

> That's the point; Perl has a
> _standard_ library for doing unbounded arrays.

Well, it actually *is* part of the language, not the library, but yes the
point is that it's available everywhere Perl is. I don't worry too much about
the distinction between what's part of the language and what's part of the
library available for no additional cost for every implementation of the
language. :-)

> I think we need to distinguish between the "language" and the
> "library". I realize languages like Perl and Java deliberately try to
> confuse the two, but we don't have to buy into that.

That might be worthwhile, but why? I think as long as there's a good quality
library freely available that's sufficiently good and well-known and widely
used that nobody hesitates to use its types in their own work, that's
sufficient. But arguing over whether Ada.Strings.Unbounded is part of the
language or part of the library seems unproductive. :-) If it were in Grace,
that would be great.

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05  9:56                     ` Lutz Donnerhacke
@ 2002-08-05 16:02                       ` Darren New
  2002-08-14  0:42                         ` Randy Brukardt
       [not found]                         ` <jb1vkustkugeutalhvrhv1n0k9hqn2fpip@4ax.com>
  2002-08-14  1:05                       ` Robert A Duff
       [not found]                       ` <3D4EA1AC.80D17170@s <wccofc6b66u.fsf@shell01.TheWorld.com>
  2 siblings, 2 replies; 86+ messages in thread
From: Darren New @ 2002-08-05 16:02 UTC (permalink / raw)

Lutz Donnerhacke wrote:
> 
> * Darren New wrote:
> >Tell me how you declare a variable for an array whose bounds you don't know
> >until after you're past the declaration? 

>    type Unbounded_Array is new Ada.Finalization.Controlled with record
>       array_p : array_access := null;

And at which point did you declare a and b to be an array?

>       a, b : Int_Array;

This is declaring a and b to be an access type.

Yes, you can program around the lack of unbounded arrays. The point remains
that a lot of people are used to languages where you don't have to do that.
When they ask why Ada doesn't come with that functionality, the right answer
isn't "you don't really need that." The right answer, at worst, is "take a
look at adahome.com/library/hither/yon for a package that implements
unbounded arrays; that's the right idiom for extending Ada." Saying the new
idiom is the "right" one is only effective when the new idiom really is
trivial, rather than a change in how you think about solving the problem.

And yes, I've already implemented my own unbounded arrays, thanks. :-)
They're not compatible with your unbounded arrays. *That* is the problem. I
can't easily take your library and make it work with my library, because the
fundamental underlying types, even if implemented identically, are not
compatible.
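
For readers who have not seen the idiom being argued over, here is a minimal
sketch of an array whose bounds are chosen only after the declaration. The
names (Deferred_Bounds, Int_Array, Int_Array_Access) and the sizes are
illustrative assumptions, not taken from any package posted in this thread:

```ada
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Deferred_Bounds is
   type Int_Array is array (Positive range <>) of Integer;
   type Int_Array_Access is access Int_Array;

   procedure Free is
     new Ada.Unchecked_Deallocation (Int_Array, Int_Array_Access);

   N : Natural := 0;
   A : Int_Array_Access;   --  no bounds yet; initially null
begin
   --  Hypothetical stand-in for a size that is only known at run time.
   N := 5;

   --  The bounds are chosen here, after the declaration.
   A := new Int_Array (1 .. N);
   for I in A'Range loop
      A (I) := I * I;
   end loop;

   --  "Growing" means allocate, copy, free: the cost is visible.
   declare
      Bigger : constant Int_Array_Access :=
        new Int_Array'(1 .. 2 * N => 0);
   begin
      Bigger (A'Range) := A.all;
      Free (A);
      A := Bigger;
   end;

   Ada.Text_IO.Put_Line ("Length is now" & Integer'Image (A'Length));
   Free (A);
end Deferred_Bounds;
```

Two programs that each declare their own Int_Array this way end up with
distinct, incompatible types, which is the interoperability problem
described above.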

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 11:50                 ` Dmitry A. Kazakov
  2002-08-05 14:29                   ` Larry Kilgallen
  2002-08-05 15:12                   ` Oleg Goodyckov
@ 2002-08-05 16:20                   ` Darren New
  2002-08-05 17:01                     ` Georg Bauhaus
       [not found]                     ` <slrnakv3q9.p2.lutz@taranis.iks-jena.de>
  2 siblings, 2 replies; 86+ messages in thread
From: Darren New @ 2002-08-05 16:20 UTC (permalink / raw)


"Dmitry A. Kazakov" wrote:
> I remember a project with a config file of ~2MBytes big. (it was a
> Windows registry folder). I wonder how much time it would take to
> parse it using split technique.

This is a pentium III 733MHz running Win2K:

D:\Documents and Settings\DNew\Desktop>tclsh83
% proc x {} {
set f [open saved-registry.reg]
global d
set d [read $f]
close $f
}
% time x
1943000 microseconds per iteration
% time {set y [split $d =]}
1653000 microseconds per iteration
% time {set y [split $d ,]}
3626000 microseconds per iteration
% file size saved-registry.reg
17999301
%

So you have an 18meg text version dump of the registry. Reading it took 1.9
seconds, splitting it on all the = signs took 1.6 seconds, and splitting it
on all the commas took 3.6 seconds. 

How long did it take you to write the unbounded array package? To compile
it? :-)

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 16:20                   ` Darren New
@ 2002-08-05 17:01                     ` Georg Bauhaus
  2002-08-05 17:48                       ` Darren New
       [not found]                     ` <slrnakv3q9.p2.lutz@taranis.iks-jena.de>
  1 sibling, 1 reply; 86+ messages in thread
From: Georg Bauhaus @ 2002-08-05 17:01 UTC (permalink / raw)


Darren New <dnew@san.rr.com> wrote:
: So you have an 18meg text version dump of the registry. Reading it took 1.9
: seconds, splitting it on all the = signs took 1.6 seconds, and splitting it
: on all the commas took 3.6 seconds. 

This says something about fast processors and a nice Tcl library
working with a lot of memory. Now the point is, I think, what
do you intend to do with a DAT-tape sized text file? When is
split useful?

The result of a split in your example is, I presume, a list in each
case (correct?). This list has a lot of entries some of which might
be of interest. Which one do you need?

In this case you would either have to know the correct index or
use a table, and then use that value, maybe named.
The values in the table are, well, lists...

And so on.

Now if your software happens to be storing matrix rows as foo-separated
lines in a text file of known-to-be-intact content then the builtin split
operation might save you some I/O programming time, o.K.

Splitting is not parsing :-)

: How long did it take you to write the unbounded array package? To compile
: it? :-)

How long will it take to write the code that actually works with
the values split into a list?

In how many cases can you assume these values are homogenous enough
to justify the omission of separate treatment?

-- Georg



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 17:01                     ` Georg Bauhaus
@ 2002-08-05 17:48                       ` Darren New
  2002-08-05 19:06                         ` tmoran
  0 siblings, 1 reply; 86+ messages in thread
From: Darren New @ 2002-08-05 17:48 UTC (permalink / raw)

Georg Bauhaus wrote:
> This says something about fast processors and a nice Tcl library
> working with a lot of memory. 

Why do you think it would be exceptionally slower to do it one line at a
time? Indeed, it might very well be faster, with fewer calls to realloc()
internally.

The question was "Yeah, but what if you had two meg of data?" I answered
that. It's not a problem. 

> Now the point is, I think, what
> do you intend to do with a DAT-tape sized text file?

while {-1 != [gets $dat line]} {
  set x [split $line =]
  set y [split [lindex $x 1] ,]
}

Same as you would in Ada. So? What's the point of asking the question?

You seem to be making the statement "your method is not useful because there
are some situations where you need something more sophisticated."  But that
doesn't make it useless, any more than arrays in Ada are useless just
because they have to fit in memory, or integers are useless because GNAT
can't declare one ranging from 0..2**360. 

> When is split useful?

Asking such a thing is like asking if arrays are useful, given that
sometimes you have DAT tapes that won't fit in memory. It's useful whenever
you have a string and you want to break it into substrings based on
separator characters. Happens all the time. 
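
To make the comparison concrete, a split on a single separator character
can be sketched in Ada 95 with nothing beyond Ada.Text_IO. This is only an
illustration: the names Split_Demo, For_Each_Field and Show are
hypothetical, and the sketch hands each field to a callback instead of
returning a list, since there is no standard unbounded array to return:

```ada
with Ada.Text_IO;

procedure Split_Demo is

   --  Calls Process once per field of Source, in order.
   generic
      with procedure Process (Field : in String);
   procedure For_Each_Field (Source : in String; Separator : in Character);

   procedure For_Each_Field (Source : in String; Separator : in Character) is
      First : Integer := Source'First;   --  start of the current field
   begin
      for I in Source'Range loop
         if Source (I) = Separator then
            Process (Source (First .. I - 1));   --  may be an empty slice
            First := I + 1;
         end if;
      end loop;
      --  The last field runs to the end of the string.
      Process (Source (First .. Source'Last));
   end For_Each_Field;

   procedure Show (Field : in String) is
   begin
      Ada.Text_IO.Put_Line ("[" & Field & "]");
   end Show;

   procedure Show_Fields is new For_Each_Field (Show);

begin
   Show_Fields ("key=a,b,c", '=');
   Show_Fields ("a,,b", ',');
end Split_Demo;
```

The second call passes an empty middle field to Show, which matches what a
split on every separator would produce. Collecting the fields into one
object is the step that needs an unbounded container.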

> The result of a split in your example is, I presume, a list in each
> case (correct?).

Yes.

> This list has a lot of entries some of which might
> be of interest. Which one do you need?

Perhaps all of them. Perhaps only some of them. So? If I only needed a few
specific values, I'd probably already be storing them in a built-in map
(hashtable) structure. :-)

"Your result is an array. Which element do you need?" It's a pointless
question. That "split" doesn't implement your entire program's semantics
shouldn't be surprising. Hashtables don't tell you what value you should be
using to look up data of interest either. That doesn't make hashtables
useless.

I'm really confused. First someone argues that you don't need variable-sized
arrays because you can use recursion to build up the array you want,
pointing out that recursion is natural in LISP. Then I introduce something
like split, and the argument is "to work with it, you might need something
like map()", which is natural in LISP. Huh?

> In this case you would either have to know the correct index or
> use a table,

Or if you're iterating over all of them for some reason, you just iterate
over all of them. How is the lack of a "split" function superior to having a
split function, if you don't know what you're looking for in either case?

> Splitting is not parsing :-)

Not very sophisticated parsing, no. So?

Note that if you expect (for example) exactly one = in the list, you do
things like

set x [split $y =]
if {2 != [llength $x]} {error "Need exactly one ="}
.... use $x ....

> : How long did it take you to write the unbounded array package? To compile
> : it? :-)
> 
> How long will it take to write the code that actually works with
> the values split into a list?

Depends what you want to do with them. Just like in Ada. So? "How long will
it take you to write the code that indexes into an array and uses the
value?"  What kind of answer are you expecting?

Part of my point in asking that was "say it takes me 2 seconds to split the
file, and your Ada can do it in 1 second, but you spend 10,000 seconds
writing the code to support that". How often are you going to run your code?

The other part of my point in asking is that you still have to write that
code, and I don't. :-) Of course, once you no longer need to write that code
because it's widely available in a widely-used library, then Ada becomes
easy for newbies to get used to again. :-)

> In how many cases can you assume these values are homogenous enough
> to justify the omission of separate treatment?

Almost always that I'm not reading from a file. When I *am* reading from a
file, it's no harder to check it's accurate when I'm using split than when
I'm using something else. 

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 17:48                       ` Darren New
@ 2002-08-05 19:06                         ` tmoran
  2002-08-05 20:08                           ` Darren New
  0 siblings, 1 reply; 86+ messages in thread
From: tmoran @ 2002-08-05 19:06 UTC (permalink / raw)


> and your Ada can do it in 1 second, but you spend 10,000 seconds writing
> the code to support that".  How often are you going to run your code?
  Will the 1 second difference substantially cut your web server's
capacity, or cause your trading system to sell those million shares after
they've had a chance to drop another 10 cents apiece?  Will it let the
rocket crash, instead of soft landing on Mars?  Is there room in your
embedded system's memory for a Perl system?  Has the Perl system passed the
kind of examination and testing you would want before trusting it with
your rocket?  How long does it take to integrate Perl and that little
program with the rest of the system?
  If it's separable, run rarely, not safety or time critical, and
there's plenty of RAM & disk, then Perl certainly seems a better choice.



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 19:06                         ` tmoran
@ 2002-08-05 20:08                           ` Darren New
  0 siblings, 0 replies; 86+ messages in thread
From: Darren New @ 2002-08-05 20:08 UTC (permalink / raw)

tmoran@acm.org wrote:
>   If it's separable, run rarely, not safety or time critical, and
> there's plenty of RAM & disk, then Perl certainly seems a better choice.

I think this describes a great deal of software. (I'd also add something
about maintainability, especially talking about Perl but with all scripting
languages to some extent.)

On the other hand, it seems clear to me that Ada wins hands down when you
need something embedded, real-time, safe, etc.  

If you could add the convenient parts of Perl and Tcl and etc to Ada without
ruining the Ada parts, you'd be in even better shape *and* probably attract
more newbie programmers to the language.  I think Grace would help that a
bunch. :-)  {Altho nothing beats generating code on the fly inside a loop
and running it ;-}

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

They looked up at me like I was a T-bone steak
  walking into an all-you-can-eat seafood buffet.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
       [not found]                       ` <3D4FEFCB.3B74F5E5@san.rr.com>
@ 2002-08-14  0:07                         ` Randy Brukardt
  0 siblings, 0 replies; 86+ messages in thread
From: Randy Brukardt @ 2002-08-14  0:07 UTC (permalink / raw)


Darren New wrote in message <3D4FEFCB.3B74F5E5@san.rr.com>...
>For that matter, will the Ada compiler even *run* in 16 meg?


The MS-DOS version of Janus/Ada 83 runs in 640K (indeed, it can run in
about 480K of RAM).

Janus/Ada 95 on Windows never uses more than about 3 Meg of RAM (as it
was based on the aforementioned Janus/Ada 83).

               Randy.






^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05 16:02                       ` Darren New
@ 2002-08-14  0:42                         ` Randy Brukardt
  2002-08-14  1:45                           ` Darren New
  2002-08-14 20:22                           ` Stephen Leake
       [not found]                         ` <jb1vkustkugeutalhvrhv1n0k9hqn2fpip@4ax.com>
  1 sibling, 2 replies; 86+ messages in thread
From: Randy Brukardt @ 2002-08-14  0:42 UTC (permalink / raw)


Darren New wrote in message <3D4EA1AC.80D17170@san.rr.com>...
>Yes, you can program around the lack of unbounded arrays.
>...

I think you're missing the Ada philosophy that something expensive should
look expensive. An unbounded array (as you put it) is going to be
expensive, and that expense shouldn't be covered up in glossy syntax.

Whether that philosophy is still appropriate is an interesting question.

In any case, as with many other "containers" issues, I don't see the
point. There is no advantage to even having an "unbounded arrays"
package, as an access to an array works fine, and there is little
advantage to the package (only an avoidance of memory leaks, easy to
avoid in this case).

I prefer to have packages that actually do something that makes it
worthwhile to learn their interfaces, and things like "unbounded arrays"
and "lists" just don't measure up. I'd rather build a tailored data
structure for each purpose, because then I can control the efficiency
(and the effort to write it is not that different). But I realize that
many other people feel differently (perhaps people aren't learning how
to create data structures anymore, just use them??)

>And yes, I've already implemented my own unbounded arrays, thanks. :-)
>They're not compatible with your unbounded arrays. *That* is the problem.

I agree. These things should be packaged in the first place; they're
part of a larger abstraction in your program -- and *that* is what
should be packaged. (Breaking programs into too small chunks is just as
bad as not decomposing enough.)

But I suppose I am getting to be an old fuddy duddy in this way. I
recall thinking how annoying old programmers were when I first started
working on Janus/Ada; now (nearly 22 years later), I sound like them.
Sigh. Probably should go bag groceries. :-)

            Randy.






^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
       [not found]                           ` <3D4FF351.8F4A6C0A@san.rr.com>
@ 2002-08-14  1:03                             ` Randy Brukardt
  0 siblings, 0 replies; 86+ messages in thread
From: Randy Brukardt @ 2002-08-14  1:03 UTC (permalink / raw)


Darren New wrote in message <3D4FF351.8F4A6C0A@san.rr.com>...

I should stop commenting on week-old messages, but I've been on vacation
and can't resist...

>No. Let me ask you this. Do you think Unbounded_String should not be in the
>standard libraries that come with every Ada compiler? Do you think that it
>shouldn't be used, or that everyone should reprogram it themselves?
>
>If someone comes along and says "I found unbounded strings, but I can't find
>the same thing for arrays of integers instead of arrays of characters", do
>you *really* think the right answer is "well, you never need to do that in
>Ada, and if you think you need to do that, you just don't know Ada well
>enough". If that's what you think the answer is, can you explain why
>unbounded arrays of characters are standard and unbounded arrays of integers
>are not?


No, that's a silly answer. The correct answer is that "you need to use
access types for that. Ada allows only a limited amount of automatic
dynamic allocation because it is concerned about making inefficient
constructs visible" followed by an example like the one given earlier in
this thread. If you want to mention packages as well, that's fine, but
it is making an easy problem harder IMHO.

The original poster, who seems to be more concerned about how easy the
code is to write than about its performance or its maintainability, is
simply looking at the wrong tool for the job. Ada is not about writing
code quickly or easily. It is about writing code correctly for the long
haul (e.g. the Janus/Ada compiler, which will turn 22 in early October,
or even Claw, which is about 6 1/2 years old now). The extent to which
writing Ada code for other purposes is easy is a pleasant side-effect,
not the purpose or reason.

                Randy.







^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-05  9:56                     ` Lutz Donnerhacke
  2002-08-05 16:02                       ` Darren New
@ 2002-08-14  1:05                       ` Robert A Duff
       [not found]                       ` <3D4EA1AC.80D17170@s <wccofc6b66u.fsf@shell01.TheWorld.com>
  2 siblings, 0 replies; 86+ messages in thread
From: Robert A Duff @ 2002-08-14  1:05 UTC (permalink / raw)

"Randy Brukardt" <randy@rrsoftware.com> writes:

> I think you're missing the Ada philosophy that something expensive should
> look expensive. An unbounded array (as you put it) is going to be
> expensive, and that expense shouldn't be covered up in glossy syntax.

IMHO, that philosophy is exactly the opposite of what high-level
languages are all about.  The parts of Ada that disobey this philosophy
are the better parts.

Here's an example: In Ada, one can write "A := B;".  If they are
integers, it's fast.  If they are gigantic arrays, it could be a million
times slower, or more.  That's good -- assignment uses a single notation
no matter how fast or slow.  Contrast with C, where copying arrays uses
a different syntax from copying ints.
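
A small sketch of that point, under assumed names and sizes (Assignment_Cost,
Big_Array and the element count are illustrative only): the same ":="
copies one integer in one statement and a million in the next.

```ada
procedure Assignment_Cost is
   type Big_Array is array (1 .. 1_000_000) of Integer;
   type Big_Access is access Big_Array;

   I, J : Integer := 0;

   --  Heap-allocated so the sketch does not assume a large stack.
   A : constant Big_Access := new Big_Array'(others => 0);
   B : constant Big_Access := new Big_Array'(others => 1);
begin
   I := J;           --  copies one integer
   A.all := B.all;   --  same notation, copies 1_000_000 integers
end Assignment_Cost;
```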

- Bob

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-14  0:42                         ` Randy Brukardt
@ 2002-08-14  1:45                           ` Darren New
  2002-08-14 19:37                             ` Randy Brukardt
  2002-08-14 20:22                           ` Stephen Leake
  1 sibling, 1 reply; 86+ messages in thread
From: Darren New @ 2002-08-14  1:45 UTC (permalink / raw)


Randy Brukardt wrote:
> I think you're missing the Ada philosophy that something expensive should
> look expensive. An unbounded array (as you put it) is going to be
> expensive, and that expense shouldn't be covered up in glossy syntax.

So you feel that including Ada.Strings.Unbounded as a standard part of the
language was unwise? That everyone should rewrite that library so everyone
knows how difficult and inefficient it is?

-- 
Darren New 
San Diego, CA, USA (PST). Cryptokeys on demand.
   ** http://images.fbrtech.com/dnew/ **

Humility? Why would I need to show some humility?



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-14  1:45                           ` Darren New
@ 2002-08-14 19:37                             ` Randy Brukardt
  2002-08-14 20:25                               ` Stephen Leake
  0 siblings, 1 reply; 86+ messages in thread
From: Randy Brukardt @ 2002-08-14 19:37 UTC (permalink / raw)

Darren New wrote in message <3D59B62F.CB30AA51@san.rr.com>...
>Randy Brukardt wrote:
>> I think you're missing the Ada philosophy that something expensive should
>> look expensive. An unbounded array (as you put it) is going to be
>> expensive, and that expense shouldn't be covered up in glossy syntax.
>
>So you feel that including Ada.Strings.Unbounded as a standard part of the
>language was unwise? That everyone should rewrite that library so everyone
>knows how difficult and inefficient it is?

No, to both questions. Ada.Strings.Unbounded is useful because it
provides a lot of useful functionality beyond "unboundedness". The
"unboundedness" alone is (IMHO) insufficient reason for a library. That
is especially true as other types of arrays are much less likely to be
modified repeatedly than strings. And the cost of access (which requires
copying) is usually not an issue with strings (copying characters is
cheap). Moreover, other types of arrays are rarely an abstraction unto
themselves as strings are, so I believe that the entire thing (including
the unbounded array) should be encapsulated in that larger abstraction.

No, writing unbounded arrays is not difficult in Ada. The idiom for
making "unbounded" array objects in Ada is so fundamental that it should
be learned early by Ada programmers. And it is so simple that no one
should need a crutch (a library) to do it. It is inefficient (because of
all of the calls to New and Free), and writing it explicitly makes that
obvious -- and also encourages the programmer to minimize these costs
(certainly Perl's syntax does not do that!). The only advantage of a
library is improved memory management (can't leak memory), but it makes
indexing and slicing harder and more expensive (as the data must be copied
for each operation - which may be quite expensive). It is rare (other
than in quick and dirty programs, not what Ada is for) that the
performance hit in access can be justified for the ease of memory
management.

I don't object to a library (it looks expensive enough), but I wouldn't
use it. Doesn't stop you from using it. And I don't think it would
satisfy anyone that wants Perl-like conciseness - for that you would
have to have native syntax. But native syntax would definitely violate
the philosophy in this case. So, I don't see the value - it's not going
to attract scripting language users to Ada. But this isn't the only
supposedly important area of Ada that I don't see the value of;
certainly everyone's priorities differ, and I certainly wouldn't oppose
adding such a thing to the standard if a decent design was proposed.

                      Randy Brukardt.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-14  0:42                         ` Randy Brukardt
  2002-08-14  1:45                           ` Darren New
@ 2002-08-14 20:22                           ` Stephen Leake
  2002-08-15 19:24                             ` Randy Brukardt
  1 sibling, 1 reply; 86+ messages in thread
From: Stephen Leake @ 2002-08-14 20:22 UTC (permalink / raw)

"Randy Brukardt" <randy@rrsoftware.com> writes:

> In any case, as with many other "containers" issues, I don't see the
> point. There is no advantage to even having an "unbounded arrays"
> package, as an access to an array works fine, and there is little
> advantage to the package (only an avoidance of memory leaks, easy to
> avoid in this case).

I disagree. There are many little details about containers that you
have to get right, and I prefer to just get them right once, and test
them just once. 

Since I wrote the SAL containers, I have reused them many times, and
been very pleased that I did not have to rewrite and retest them.
There are 852 lines (comments and whitespace included) in
SAL.Poly.Lists.Double; that's big enough to be worth saving and
reusing.

However, there is significant overhead in the general-purpose
containers, so I agree that if I am truly concerned about memory or
speed efficiency, I will re-implement a specific container. But even
then, it's nice to have SAL available for a prototype, until I figure
out exactly what the final data structure should be. And the SAL test
code is a first draft of the specific test code; also very useful.

> I prefer to have packages that actually do something that makes it
> worthwhile to learn their interfaces, and things like "unbounded
> arrays" and "lists" just don't measure up. I'd rather build a
> tailored data structure for each purpose, because then I can control
> the efficiency (and the effort to write it is not that different).
> But I realize that many other people feel differently (perhaps
> people aren't learning how to create data structures anymore, just
> use them??)

People should be aware of data structures, but I feel they should not
have to recreate them; they should be able to reuse them.

> >And yes, I've already implemented my own unbounded arrays, thanks.
> >:-) They're not compatible with your unbounded arrays. *That* is
> >the problem.
> 
> I agree. These things should be packaged in the first place; they're
> part of a larger abstraction in your program -- and *that* is what
> should be packaged. (Breaking programs into too small chunks is just as
> bad as not decomposing enough.)

There certainly is a trade-off here.

> But I suppose I am getting to be an old fuddy duddy in this way. I
> recall thinking how annoying old programmers were when I first
> started working on Janus/Ada; now (nearly 22 years later), I sound
> like them. Sigh. Probably should go bag groceries. :-)

No need to go that far. Just allow us to disagree :).

-- 
-- Stephe

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-14 19:37                             ` Randy Brukardt
@ 2002-08-14 20:25                               ` Stephen Leake
  0 siblings, 0 replies; 86+ messages in thread
From: Stephen Leake @ 2002-08-14 20:25 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:

> <snip>
> ... The only advantage of a
> library is improved memory management (can't leak memory), 

Not necessarily true; libraries will of course provide other useful
functions (like iterating, concatenating).

> but it makes indexing and slicing harder and more expensive (as the
> data must be copied for each operation - which may be quite expensive).
> It is rare (other than in quick and dirty programs, not what Ada is
> for) that the performance hit in access can be justified for the ease
> of memory management.

That may be true for the applications you have written; it is _not_
true for _all_ applications.

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
       [not found]                       ` <3D4EA1AC.80D17170@s <wccofc6b66u.fsf@shell01.TheWorld.com>
@ 2002-08-14 20:29                         ` Stephen Leake
  2002-08-26 17:53                           ` Robert A Duff
  0 siblings, 1 reply; 86+ messages in thread
From: Stephen Leake @ 2002-08-14 20:29 UTC (permalink / raw)


Robert A Duff <bobduff@shell01.TheWorld.com> writes:

> "Randy Brukardt" <randy@rrsoftware.com> writes:
> 
> > I think you're missing the Ada philosophy that something expensive should
> > look expensive. An unbounded array (as you put it) is going to be
> > expensive, and that expense shouldn't be covered up in glossy syntax.
> 
> IMHO, that philosophy is exactly the opposite of what high-level
> languages are all about.  The parts of Ada that disobey this philosophy
> are the better parts.
> 
> Here's an example: In Ada, one can write "A := B;".  If they are
> integers, it's fast.  If they are gigantic arrays, it could be a million
> times slower, or more.  That's good -- assignment uses a single notation
> no matter how fast or slow.  Contrast with C, where copying arrays uses
> a different syntax from copying ints.

That's not a fair example. For assignment, the cost is clearly
proportional to the size of the data, which is easily visible in the
source.

Hidden memory allocation is not clearly visible (because it's hidden
:), but is expensive.

A more valid example of hidden expense for assignment is a Controlled
type; Finalize and Adjust could be expensive. But even there, the
definition of the type clearly says "Controlled", so you are warned
that it might be expensive.
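
As a sketch of that last point (the package and type names here are
illustrative assumptions, not code from this thread), a Controlled type
that owns heap data makes every assignment run Adjust, which performs a
hidden allocation and copy:

```ada
with Ada.Finalization;

package Buffers is
   type Int_Array is array (Positive range <>) of Integer;
   type Int_Array_Access is access Int_Array;

   --  The word "Controlled" is the only visible warning of the cost.
   type Buffer is new Ada.Finalization.Controlled with record
      Data : Int_Array_Access;
   end record;

   procedure Adjust   (Object : in out Buffer);
   procedure Finalize (Object : in out Buffer);
end Buffers;

with Ada.Unchecked_Deallocation;

package body Buffers is
   procedure Free is
     new Ada.Unchecked_Deallocation (Int_Array, Int_Array_Access);

   --  Runs after the component-wise copy in "Target := Source;".
   --  Replaces the shared pointer with a private deep copy.
   procedure Adjust (Object : in out Buffer) is
   begin
      if Object.Data /= null then
         Object.Data := new Int_Array'(Object.Data.all);
      end if;
   end Adjust;

   --  Runs when an object is overwritten or goes out of scope.
   --  Free leaves Data null, so a repeated call is harmless.
   procedure Finalize (Object : in out Buffer) is
   begin
      Free (Object.Data);
   end Finalize;
end Buffers;
```

With this in place, "B2 := B1;" for two Buffer objects looks exactly like
an integer assignment at the call site, yet it finalizes the old B2,
copies the record, and then allocates and copies the whole array.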

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-14 20:22                           ` Stephen Leake
@ 2002-08-15 19:24                             ` Randy Brukardt
  0 siblings, 0 replies; 86+ messages in thread
From: Randy Brukardt @ 2002-08-15 19:24 UTC (permalink / raw)

Stephen Leake wrote in message ...
>"Randy Brukardt" <randy@rrsoftware.com> writes:
>
>> In any case, as with many other "containers" issues, I don't see the
>> point. There is no advantage to even having an "unbounded arrays"
>> package, as an access to an array works fine, and there is little
>> advantage to the package (only an avoidance of memory leaks, easy to
>> avoid in this case).
>
>I disagree. There are many little details about containers that you
>have to get right, and I prefer to just get them right once, and test
>them just once.

This is certainly true, but in many cases, it occurs because you are
building a general purpose library rather than something for a specific
use. That is certainly true in Claw; I'd say about 1/3 of the code is
"protection against user errors" which is simply not included in code
created for a specific use (and need not be).

>Since I wrote the SAL containers, I have reused them many times, and
>been very pleased that I did not have to rewrite and retest them.
>There are 852 lines (comments and whitespace included) in
>SAL.Poly.Lists.Double; that's big enough to be worth saving and
>reusing.

But how much of that do you actually use in your apps? My uses of lists
generally use only a handful of operations: Declare the types (four
lines specifically for the list); Iterate (one line plus loop),
Allocate/insert at head (one line), Deallocate all (five lines), and
(more rarely) Insert at specific location or tail (seven lines). Only
the last is at all complicated. (Note that access to the items is not
included, because .all is automatically inserted by Ada -- so it
includes no code whatsoever.) I doubt that any list package is going to
save any effort here, particularly because it complicates the access to
the items so severely.

I want to make it clear that this does *not* apply to more complicated
containers like Maps, which I would probably use if I had them.

>People should be aware of data structures, but I feel they should not
>have to recreate them; they should be able to reuse them.

I disagree. This is the same principle as knowing how to divide numbers
(it is *not* pushing the divide key on your calculator) and knowing
about assembler/machine language (so that programmers have an
understanding of what is expensive and what is cheap). Everyone needs to
know how to recreate them; it's important to know how they are implemented
in order to choose the best one for the job.

...
>> But I suppose I am getting to be an old fuddy duddy in this way. I
>> recall thinking how annoying old programmers were when I first
>> started working on Janus/Ada; now (nearly 22 years later), I sound
>> like them. Sigh. Probably should go bag groceries. :-)
>
>No need to go that far. Just allow us to disagree :).

I might allow that. :-)

               Randy.

^ permalink raw reply	[flat|nested] 86+ messages in thread

* Re: FAQ and string functions
  2002-08-14 20:29                         ` Stephen Leake
@ 2002-08-26 17:53                           ` Robert A Duff
  2002-08-26 18:40                             ` Chad R. Meiners
  0 siblings, 1 reply; 86+ messages in thread
From: Robert A Duff @ 2002-08-26 17:53 UTC (permalink / raw)

Stephen Leake <stephen.a.leake.1@gsfc.nasa.gov> writes:
> Robert A Duff <bobduff@shell01.TheWorld.com> writes:
> 
> > "Randy Brukardt" <randy@rrsoftware.com> writes:
> > 
> > > I think you're missing the Ada philosophy that something expensive should
> > > look expensive. An unbounded array (as you put it) is going to be
> > > expensive, and that expense shouldn't be covered up in glossy syntax.
> > 
> > IMHO, that philosophy is exactly the opposite of what high-level
> > languages are all about.  The parts of Ada that disobey this philosophy
> > are the better parts.
> > 
> > Here's an example: In Ada, one can write "A := B;".  If they are
> > integers, it's fast.  If they are gigantic arrays, it could be a million
> > times slower, or more.  That's good -- assignment uses a single notation
> > no matter how fast or slow.  Contrast with C, where copying arrays uses
> > a different syntax from copying ints.
> 
> That's not a fair example. For assignment, the cost is clearly
> proportional to the size of the data, which is easily visible in the
> source.

Not necessarily:

    procedure P(X: String) is
        Y: String_Ptr := new String(1..X'Length);
    begin
        Y.all := X; -- How fast is this?

The amount of data copied is probably a run-time calculated value (not
"easily visible in the source"), and is different for each call to P.
In fact, one call to P might take a million times longer than another.
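
A self-contained sketch of this example, assuming String_Ptr is declared
as "access String" (the post does not show its declaration). The main
procedure name, the test strings, and the 100_000-character length are
hypothetical. The same ":=" appears on an Integer and on a String whose
length is known only at run time.

```ada
with Ada.Text_IO;

procedure Assign_Cost is
   type String_Ptr is access String;   -- assumed declaration

   procedure P (X : String) is
      -- Allocate a string with the same length as X.
      -- (Never freed here; acceptable only in a short sketch.)
      Y : String_Ptr := new String (1 .. X'Length);
   begin
      Y.all := X;   -- copies X'Length characters, a run-time value
      Ada.Text_IO.Put_Line
        ("copied" & Integer'Image (Y'Length) & " characters");
   end P;

   A   : Integer := 0;
   B   : Integer := 42;
   Big : constant String (1 .. 100_000) := (others => 'x');
begin
   A := B;          -- same notation, copies a single Integer
   Ada.Text_IO.Put_Line ("A =" & Integer'Image (A));

   P ("short");     -- small copy
   P (Big);         -- much larger copy through the same ":="
end Assign_Cost;
```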

But anyway, what do you think is a fair example?  My claim is that the
philosophy Randy mentioned above is *not* a good language-design
philosophy, and that Ada doesn't follow it very much.  The cases where
Ada follows that philosophy tend to be ugly, and the cases where that
philosophy is disobeyed tend to be nice and clean.

In fact, the only languages that truly obey the philosophy are assembly
languages.  To get the benefits of a high level language, I claim that
you must give up the obvious correspondence between source code and
efficiency that comes with low level languages.

> Hidden memory allocation is not clearly visible (because it's hidden
> :), but is expensive.

I think the issue of memory allocation is not efficiency, but
predictability of efficiency -- heap allocation/deallocation tends to be
less predictable than stack allocation/deallocation.  So it does make
sense to avoid hidden memory allocation in a language for real-time
systems.

> A more valid example of hidden expense for assignment is a Controlled
> type; Finalize and Adjust could be expensive. But even there, the
> definition of the type clearly says "Controlled", so you are warned
> that it might be expensive.

I suppose, although you have to look into private parts to find that
information, and at all the component types.  And you have to know how
your compiler does it -- compilers differ wildly in the cost of
finalization.  That's unfortunate, but I don't see how to avoid that
problem in a high-level language.
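
A rough sketch of the Controlled case, assuming an Ada 2005 or later
compiler so that the type can be declared inside a procedure (Ada 95
requires controlled types at library level). The type Tracked and its Id
component are hypothetical. The point is only that a plain ":=" ends up
calling user-written Finalize and Adjust, whose cost is not visible at
the assignment.

```ada
with Ada.Finalization;
with Ada.Text_IO;

procedure Controlled_Sketch is
   -- Hypothetical controlled type that reports each hidden call.
   type Tracked is new Ada.Finalization.Controlled with record
      Id : Integer := 0;
   end record;

   overriding procedure Adjust   (Obj : in out Tracked);
   overriding procedure Finalize (Obj : in out Tracked);

   procedure Adjust (Obj : in out Tracked) is
   begin
      -- Could be arbitrarily expensive, e.g. a deep copy.
      Ada.Text_IO.Put_Line ("Adjust" & Integer'Image (Obj.Id));
   end Adjust;

   procedure Finalize (Obj : in out Tracked) is
   begin
      -- Could be arbitrarily expensive, e.g. releasing storage.
      Ada.Text_IO.Put_Line ("Finalize" & Integer'Image (Obj.Id));
   end Finalize;

   A : Tracked := (Ada.Finalization.Controlled with Id => 1);
   B : Tracked := (Ada.Finalization.Controlled with Id => 2);
begin
   -- Looks like an ordinary assignment, but the old value of A is
   -- finalized and the copied value is adjusted.
   A := B;
end Controlled_Sketch;
```

How many calls appear, and in what order, can vary between compilers,
which is the point made above about finalization cost.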

- Bob

* Re: FAQ and string functions
  2002-08-26 17:53                           ` Robert A Duff
@ 2002-08-26 18:40                             ` Chad R. Meiners
  2002-08-26 18:52                               ` Robert A Duff
  0 siblings, 1 reply; 86+ messages in thread
From: Chad R. Meiners @ 2002-08-26 18:40 UTC (permalink / raw)



"Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message
news:wccd6s5y09r.fsf@shell01.TheWorld.com...
> Not necessarily:
>
>     procedure P(X: String) is
>         Y: String_Ptr := new String(1..X'Length);
>     begin
>         Y.all := X; -- How fast is this?
>
> The amount of data copied is probably a run-time calculated value (not
> "easily visible in the source"), and is different for each call to P.
> In fact, one call to P might take a million times longer than another.

Yes, but the time bound on the assignment is blatantly visible; thus your
"Not necessarily" doesn't hold for this example.

* Re: FAQ and string functions
  2002-08-26 18:40                             ` Chad R. Meiners
@ 2002-08-26 18:52                               ` Robert A Duff
  2002-08-26 21:46                                 ` Chad R. Meiners
  0 siblings, 1 reply; 86+ messages in thread
From: Robert A Duff @ 2002-08-26 18:52 UTC (permalink / raw)


"Chad R. Meiners" <crmeiners@hotmail.com> writes:

> "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message
> news:wccd6s5y09r.fsf@shell01.TheWorld.com...
> > Not necessarily:
> >
> >     procedure P(X: String) is
> >         Y: String_Ptr := new String(1..X'Length);
> >     begin
> >         Y.all := X; -- How fast is this?
> >
> > The amount of data copied is probably a run-time calculated value (not
> > "easily visible in the source"), and is different for each call to P.
> > In fact, one call to P might take a million times longer than another.
> 
> Yes, but the time bound on the assignment is blatantly visible; thus your
> "Not necessarily" doesn't hold for this example.

It's not blatantly visible to me.  Please explain.

Unless you mean that it copies at most Integer'Last bytes, which is not
a very *useful* bound.  More to the point, this time bound is very
different from the time bound for ":=" on, say, Integers.  Yet both use
the same ":=" notation.

- Bob

* Re: FAQ and string functions
  2002-08-26 18:52                               ` Robert A Duff
@ 2002-08-26 21:46                                 ` Chad R. Meiners
  0 siblings, 0 replies; 86+ messages in thread
From: Chad R. Meiners @ 2002-08-26 21:46 UTC (permalink / raw)



"Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message
news:wcc4rdhxxj8.fsf@shell01.TheWorld.com...
> It's not blatantly visible to me.  Please explain.

Well, given an input of size X'Length, the assignment takes c * X'Length
time.

> Unless you mean that it copies at most Integer'Last bytes, which is not
> a very *useful* bound.  More to the point, this time bound is very
> different from the time bound for ":=" on, say, Integers.  Yet both use
> the same ":=" notation.

Well, I meant that it copies at most X'Length bytes.  This is useful since we
can see that the complexity of the assignment is linear in the size of P's
input.  Note that with the following procedure

procedure Z(X : Integer) is
    Y : Integer;
begin
    Y := X;
    ...

the assignment is also linear in the size of Z's input.  So they actually
have the same time bound (linear).  The difference lies in the fact that P's
input size is variable while Z's is not.

-CRM

Thread overview: 86+ messages
2002-07-30  6:32 FAQ and string functions Oleg Goodyckov
2002-07-30  8:52 ` Colin Paul Gloster
2002-07-30 13:48 ` Ted Dennison
2002-07-31  4:52   ` Brian May
2002-08-01 16:09     ` Ted Dennison
2002-08-02  0:21       ` Brian May
2002-08-02  1:56         ` tmoran
2002-08-02 13:59         ` Ted Dennison
2002-07-31  7:46   ` Oleg Goodyckov
2002-07-31  9:04     ` Lutz Donnerhacke
2002-07-31  9:39       ` Pascal Obry
2002-07-31 15:06         ` Oleg Goodyckov
2002-07-31 16:50       ` Oleg Goodyckov
2002-07-31 20:16     ` Simon Wright
2002-07-31 20:56       ` Robert A Duff
2002-08-01  0:11         ` Darren New
2002-08-01  1:08           ` tmoran
2002-08-01  9:25           ` Brian May
2002-08-01 11:20           ` Oleg Goodyckov
2002-08-01 15:43             ` Darren New
2002-08-01 21:37               ` Robert A Duff
2002-08-03  0:42                 ` Ted Dennison
2002-08-03 13:51                   ` Robert A Duff
2002-08-03 16:43                   ` Darren New
2002-08-05 13:37                   ` Stephen Leake
2002-08-02  8:01               ` Oleg Goodyckov
2002-08-02 16:09                 ` Darren New
2002-08-01 11:09         ` Oleg Goodyckov
2002-08-01 14:08           ` Frank J. Lhota
2002-08-01 15:06             ` Robert A Duff
2002-08-01 16:05             ` Oleg Goodyckov
2002-08-01 14:57         ` Georg Bauhaus
2002-07-31 22:04     ` Dmitry A.Kazakov
2002-07-31 15:23       ` Oleg Goodyckov
2002-08-01 21:57         ` Dmitry A.Kazakov
2002-08-01 13:10           ` Oleg Goodyckov
2002-08-02 23:29             ` Dmitry A.Kazakov
2002-08-02 16:35               ` Oleg Goodyckov
2002-08-05 11:50                 ` Dmitry A. Kazakov
2002-08-05 14:29                   ` Larry Kilgallen
2002-08-05 14:57                     ` Dmitry A. Kazakov
2002-08-05 15:12                   ` Oleg Goodyckov
2002-08-05 16:20                   ` Darren New
2002-08-05 17:01                     ` Georg Bauhaus
2002-08-05 17:48                       ` Darren New
2002-08-05 19:06                         ` tmoran
2002-08-05 20:08                           ` Darren New
     [not found]                     ` <slrnakv3q9.p2.lutz@taranis.iks-jena.de>
     [not found]                       ` <3D4FEFCB.3B74F5E5@san.rr.com>
2002-08-14  0:07                         ` Randy Brukardt
2002-08-01 14:29     ` Ted Dennison
2002-08-01 16:47       ` Oleg Goodyckov
2002-08-02 14:05         ` Ted Dennison
2002-08-02 16:11           ` Darren New
2002-08-03  0:30             ` Ted Dennison
2002-08-03  0:58               ` Darren New
2002-08-03  2:04                 ` Dale Stanbrough
2002-08-03  2:32                 ` Ted Dennison
2002-08-03  2:47                   ` Darren New
2002-08-03 12:41                     ` Ted Dennison
2002-08-03 16:53                       ` Darren New
2002-08-04  1:08                         ` Ted Dennison
2002-08-04 16:23                           ` Darren New
2002-08-05  2:16                             ` Robert Dewar
2002-08-05  3:45                               ` Darren New
2002-08-05  9:56                     ` Lutz Donnerhacke
2002-08-05 16:02                       ` Darren New
2002-08-14  0:42                         ` Randy Brukardt
2002-08-14  1:45                           ` Darren New
2002-08-14 19:37                             ` Randy Brukardt
2002-08-14 20:25                               ` Stephen Leake
2002-08-14 20:22                           ` Stephen Leake
2002-08-15 19:24                             ` Randy Brukardt
     [not found]                         ` <jb1vkustkugeutalhvrhv1n0k9hqn2fpip@4ax.com>
     [not found]                           ` <3D4FF351.8F4A6C0A@san.rr.com>
2002-08-14  1:03                             ` Randy Brukardt
2002-08-14  1:05                       ` Robert A Duff
     [not found]                       ` <3D4EA1AC.80D17170@s <wccofc6b66u.fsf@shell01.TheWorld.com>
2002-08-14 20:29                         ` Stephen Leake
2002-08-26 17:53                           ` Robert A Duff
2002-08-26 18:40                             ` Chad R. Meiners
2002-08-26 18:52                               ` Robert A Duff
2002-08-26 21:46                                 ` Chad R. Meiners
2002-08-05 13:29                     ` Stephen Leake
2002-08-03  5:07                   ` achrist
2002-08-03 12:52                     ` Ted Dennison
2002-08-05 15:34                       ` Ted Dennison
2002-08-05 13:24                 ` Stephen Leake
2002-08-05 16:02                   ` Darren New
2002-08-05  7:18           ` Oleg Goodyckov
2002-08-02  1:04     ` tmoran
