comp.lang.ada
From: mheaney@on2.com (Matthew Heaney)
Subject: Re: getting words from file
Date: 8 Mar 2003 09:40:18 -0800
Message-ID: <1ec946d1.0303080940.3fac2464@posting.google.com>
In-Reply-To: %8oaa.11967$F1.92@sccrnsc04

Mark Biggar <mark.a.biggar@attbi.com> wrote in message news:<%8oaa.11967$F1.92@sccrnsc04>...
> 
> First you probably want an array of words, not singleton variables.

But an array has fixed size.  You'll need a data structure that
expands as necessary, such as a vector or list.

The Charles library has both of these data structures:

http://home.earthlink.net/~matthewjheaney/charles/index.html

For example:

with Charles.Lists.Unbounded;
with Charles.Strings.Unbounded;
package Word_Lists is
   new Charles.Lists.Unbounded
    (Charles.Strings.Unbounded.Container_Type);

Now you can say something like:

declare
   Words : Word_Lists.Container_Type;
begin
   loop
      <consume whitespace>

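      --  Append a new, empty word to the list ...
      Push_Back
        (Container => Words,
         Item => Charles.Strings.Unbounded.To_Container (Length => 0));

      declare
         --  ... and fill it in place through a renaming of the last element.
         Word : Charles.Strings.Unbounded.Container_Type renames
            To_Access (Last (Words)).all;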

         C : Character;
      begin
         loop
            Get (C);
             
            if <C is whitespace> then
               exit;
            end if;

            Push_Back (Word, C);
         end loop;
      end;
   end loop;
end;


> The first is that as the words in the file are variable length,
> besides the actual characters of each word you will also need to
> store its length (or equivalent) somewhere.

You need an unbounded string.  You can either use
Ada.Strings.Unbounded or Charles.Strings.Unbounded.
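
To illustrate the point with the standard package only, here is a small sketch. The procedure name Unbounded_Demo is made up for the example; everything else is Ada.Strings.Unbounded as defined in the language standard. The string starts out empty, grows one character at a time, and keeps track of its own length, so you don't have to store it separately.

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Unbounded_Demo is
   Word : Unbounded_String;  --  default-initialized to the empty string
begin
   --  Grow the string one character at a time; no size is declared up front.
   for C in Character range 'a' .. 'e' loop
      Append (Word, C);
   end loop;

   --  Length reports the current number of characters.
   Ada.Text_IO.Put_Line
     (To_String (Word) & Natural'Image (Length (Word)));
end Unbounded_Demo;
```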



> The second is related to the first: how are you going to handle a word
> that is larger than you expect?

Doesn't matter if you use an unbounded string.


> The last time I had to solve a similar problem I didn't store the
> words in the variables at all.  I read the whole file into one big
> string and then kept an array of records that recorded the start and end
> position of each word into that large string.  I then used slicing
> to extract the actual characters as needed.

Suppose your file is big?  How do you even know how big to allocate
the array?

This problem calls for unbounded data structures, not fixed-length
arrays.

Note that to use Get_Token, you'll probably have to use Get_Line to
read in an entire line, and then extract words from the line.
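
A rough sketch of that line-at-a-time approach follows. It assumes the standard Ada.Strings.Fixed.Find_Token as the tokenizer, that only space and horizontal tab count as whitespace, and that no line is longer than 1024 characters (a longer line would be read in pieces, which could split a word). The procedure name Split_Lines is made up for the example.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;
with Ada.Characters.Latin_1;

procedure Split_Lines is
   use Ada.Text_IO;

   --  Assumption: whitespace means space or horizontal tab only.
   Whitespace : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (" " & Ada.Characters.Latin_1.HT);

   Line  : String (1 .. 1024);  --  assumed maximum line length
   Last  : Natural;             --  index of the last character read
   From  : Positive;            --  where to resume scanning in Line
   First : Positive;            --  bounds of the word just found
   Stop  : Natural;
begin
   while not End_Of_File loop
      Get_Line (Line, Last);
      From := 1;

      while From <= Last loop
         --  Find the next run of characters outside the whitespace set.
         Ada.Strings.Fixed.Find_Token
           (Source => Line (From .. Last),
            Set    => Whitespace,
            Test   => Ada.Strings.Outside,
            First  => First,
            Last   => Stop);

         exit when Stop = 0;  --  no more words on this line

         Put_Line (Line (First .. Stop));
         From := Stop + 1;
      end loop;
   end loop;
end Split_Lines;
```

This prints each word from standard input on its own line. The fixed-size buffer is exactly the kind of limit an unbounded string avoids, so treat it as a shortcut for the sketch.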

It would be nice to be able to say "read a whitespace-delimited
lexeme" from the file; that's what the string extractor for an istream
does in C++:

string s;

while (fin >> s)
   <do something with s>

I should add something like this to Charles.
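
In the meantime, something along these lines could be written with the standard library alone. This is only a sketch: Get_Word and Print_Words are hypothetical names, not part of Charles or of Ada.Text_IO, and whitespace is assumed to mean space, horizontal tab, or a line end.

```ada
with Ada.Text_IO;            use Ada.Text_IO;
with Ada.Strings.Unbounded;  use Ada.Strings.Unbounded;
with Ada.Characters.Latin_1;

procedure Print_Words is

   --  Hypothetical helper: read the next whitespace-delimited word.
   --  Found is False when the input runs out before a word starts.
   procedure Get_Word
     (File  : in     File_Type;
      Word  :    out Unbounded_String;
      Found :    out Boolean)
   is
      C : Character;
   begin
      Word  := Null_Unbounded_String;
      Found := False;

      loop
         exit when End_Of_File (File);

         if End_Of_Line (File) then
            exit when Found;   --  a line end terminates the word
            Skip_Line (File);  --  otherwise skip it as leading whitespace
         else
            Get (File, C);
            if C = ' ' or C = Ada.Characters.Latin_1.HT then
               exit when Found;  --  trailing whitespace ends the word
            else
               Append (Word, C);
               Found := True;
            end if;
         end if;
      end loop;
   end Get_Word;

   Word  : Unbounded_String;
   Found : Boolean;
begin
   --  Rough analogue of the C++ loop: while (fin >> s) ...
   loop
      Get_Word (Standard_Input, Word, Found);
      exit when not Found;
      Put_Line (To_String (Word));
   end loop;
end Print_Words;
```

Because the word is an unbounded string, there is no limit on word length, and the caller never has to guess a buffer size.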


