comp.lang.ada
* getting words from file
@ 2003-03-08  3:58 cookie
  2003-03-08  4:36 ` David C. Hoos, Sr.
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: cookie @ 2003-03-08  3:58 UTC (permalink / raw)


Does anyone know how I would store a word in a specific variable after
a space occurs inside a text file? (Say there was a line in the
text file like this: one two three four five... I'd want it to get
these words and store them in already defined vars like varOne, varTwo
and so on.)

This is the idea I was playing with to get the actual word (scans
through the characters until it hits a space and tries to merge all of
those characters into a word): http://www.guff.org/ada.txt ..or am I
way off?




* Re: getting words from file
  2003-03-08  3:58 cookie
@ 2003-03-08  4:36 ` David C. Hoos, Sr.
  2003-03-08 15:38   ` cookie
  2003-03-08 15:39 ` Mark Biggar
  2003-03-08 19:16 ` Jeffrey Carter
  2 siblings, 1 reply; 10+ messages in thread
From: David C. Hoos, Sr. @ 2003-03-08  4:36 UTC (permalink / raw)
  To: comp.lang.ada mail to news gateway

Look at the procedure Ada.Strings.Fixed.Find_Token.

Tokens are groups of characters delimited by certain other characters.

For example if you specify whitespace characters (e.g., space, tab, etc.)
plus punctuation marks as delimiters, then the found tokens will be words,
if the file is ordinary text.

Find_Token will tell you the index of the first and last characters
of the next token after some starting index.  Thus, if after finding a
token you start looking at the character following the last character of
the token just found, you'll find the next token, and so on.
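
A minimal sketch of that loop, assuming space and tab are the only
delimiters and using a hard-coded line in place of file input (the name
Show_Words and the sample text are illustrative only).  The Ada 95 form of
Find_Token takes no starting index, so the sketch passes it the slice that
begins just past the previous token:

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;

procedure Show_Words is
   --  Delimiters assumed for this sketch: space and horizontal tab.
   --  Add punctuation marks here if they should also separate words.
   Delimiters : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (" " & Ada.Characters.Latin_1.HT);

   Line  : constant String := "one two three four five";
   Start : Positive := Line'First;
   First : Positive;
   Last  : Natural;
begin
   while Start <= Line'Last loop
      --  Find the next run of characters that are Outside the delimiter set.
      Ada.Strings.Fixed.Find_Token
        (Source => Line (Start .. Line'Last),
         Set    => Delimiters,
         Test   => Ada.Strings.Outside,
         First  => First,
         Last   => Last);

      exit when Last = 0;  --  Last = 0 means no further token was found

      Ada.Text_IO.Put_Line (Line (First .. Last));
      Start := Last + 1;   --  resume just past the token just found
   end loop;
end Show_Words;
```

Each word is printed on its own line; storing the slice instead of printing
it is the part that depends on how you choose to hold the words.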


* Re: getting words from file
  2003-03-08  4:36 ` David C. Hoos, Sr.
@ 2003-03-08 15:38   ` cookie
  2003-03-08 19:51     ` Pascal Obry
  2003-03-08 19:52     ` Pascal Obry
  0 siblings, 2 replies; 10+ messages in thread
From: cookie @ 2003-03-08 15:38 UTC (permalink / raw)


Sorry.. where can I find the Ada.Strings.Fixed.Find_Token procedure?

Cheers


* Re: getting words from file
  2003-03-08  3:58 cookie
  2003-03-08  4:36 ` David C. Hoos, Sr.
@ 2003-03-08 15:39 ` Mark Biggar
  2003-03-08 17:40   ` Matthew Heaney
  2003-03-08 19:16 ` Jeffrey Carter
  2 siblings, 1 reply; 10+ messages in thread
From: Mark Biggar @ 2003-03-08 15:39 UTC (permalink / raw)


cookie wrote:
> Does anyone know how I would store a word in a specific variable after
> a space occurs inside a text file? (Say there was a line in the
> text file like this: one two three four five... I'd want it to get
> these words and store them in already defined vars like varOne, varTwo
> and so on.)
> 
> This is the idea I was playing with to get the actual word (scans
> through the characters until it hits a space and tries to merge all of
> those characters into a word): http://www.guff.org/ada.txt ..or am I
> way off?

First, you probably want an array of words, not singleton variables.

There are two other main considerations needed for this problem.

The first is that as the words in the file are variable length,
besides the actual characters of each word you will also need to
store its length (or equivalent) somewhere.

The second is related to the first: how are you going to handle a word
that is larger than you expect?

The last time I had to solve a similar problem I didn't store the
words in the variables at all.  I read the whole file into one big
string and then kept an array of records that recorded the start and end
position of each word into that large string.  I then used slicing
to extract the actual characters as needed.
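
The bookkeeping described above might look something like this.  It is a
sketch only: the "file" is replaced by a constant string, words are assumed
to be separated by single spaces, and the upper bound of 100 words is an
arbitrary assumption (Word_Index, Word_Bounds and Max_Words are made-up
names):

```ada
with Ada.Text_IO;
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;

procedure Word_Index is
   --  Start and end position of one word within the big string.
   type Word_Bounds is record
      First : Positive;
      Last  : Natural;
   end record;

   Max_Words : constant := 100;  --  assumed upper bound for this sketch
   type Bounds_Array is array (1 .. Max_Words) of Word_Bounds;

   --  Stands in for the contents of the whole file.
   Text   : constant String := "one two three four five";
   Blanks : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (' ');

   Words     : Bounds_Array;
   Count     : Natural  := 0;
   Start     : Positive := Text'First;
   Tok_First : Positive;
   Tok_Last  : Natural;
begin
   --  Pass 1: record where each word starts and ends.
   while Start <= Text'Last and Count < Max_Words loop
      Ada.Strings.Fixed.Find_Token
        (Source => Text (Start .. Text'Last),
         Set    => Blanks,
         Test   => Ada.Strings.Outside,
         First  => Tok_First,
         Last   => Tok_Last);
      exit when Tok_Last = 0;  --  no more words

      Count := Count + 1;
      Words (Count) := (First => Tok_First, Last => Tok_Last);
      Start := Tok_Last + 1;
   end loop;

   --  Pass 2: slice the characters out only when they are needed.
   for I in 1 .. Count loop
      Ada.Text_IO.Put_Line (Text (Words (I).First .. Words (I).Last));
   end loop;
end Word_Index;
```

No word is ever copied, so word length stops being a concern; the fixed
bound on the number of words is the remaining limitation.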

-- 
Mark Biggar
mark.a.biggar@attbi.com





* Re: getting words from file
  2003-03-08 15:39 ` Mark Biggar
@ 2003-03-08 17:40   ` Matthew Heaney
  0 siblings, 0 replies; 10+ messages in thread
From: Matthew Heaney @ 2003-03-08 17:40 UTC (permalink / raw)


Mark Biggar <mark.a.biggar@attbi.com> wrote in message news:<%8oaa.11967$F1.92@sccrnsc04>...
> 
> First, you probably want an array of words, not singleton variables.

But an array has fixed size.  You'll need a data structure that
expands as necessary, such as a vector or list.

The Charles library has both of these data structures:

http://home.earthlink.net/~matthewjheaney/charles/index.html

For example:

with Charles.Lists.Unbounded;
with Charles.Strings.Unbounded;
package Word_Lists is
   new Charles.Lists.Unbounded
    (Charles.Strings.Unbounded.Container_Type);

Now you can say something like:

declare
   Words : Word_Lists.Container_Type;
begin
   loop
      <consume whitespace>

      Push_Back 
        (Container => Words, 
         Item => Charles.Strings.To_Container (Length => 0));

      declare
         Word : Charles.Strings.Container_Type renames 
            To_Access (Last (Words)).all;

         C : Character;
      begin
         loop
            Get (C);
             
            if <C is whitespace> then
               exit;
            end if;

            Push_Back (Word, C);
         end loop;
      end;
   end loop;
end;


> The first is that as the words in the file are variable length,
> besides the actual characters of each word you will also need to
> store its length (or equivalent) somewhere.

You need an unbounded string.  You can either use
Ada.Strings.Unbounded or Charles.Strings.Unbounded.



> The second is related to the first: how are you going to handle a word
> that is larger than you expect?

Doesn't matter if you use an unbounded string.


> The last time I had to solve a similar problem I didn't store the
> words in the variables at all.  I read the whole file into one big
> string and then kept an array of records that recorded the start and end
> position of each word into that large string.  I then used slicing
> to extract the actual characters as needed.

Suppose your file is big?  How do you even know how big to allocate
the array?

This problem calls for unbounded data structures, not fixed-length
arrays.

Note that to use Find_Token, you'll probably have to use Get_Line to
read in an entire line, and then extract words from the line.
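
Using only the standard library, that combination could be sketched as
below.  Two assumptions to flag: the function form of Get_Line and the
package Ada.Containers.Vectors both come from the later Ada 2005 standard,
and words are taken to be separated by spaces only.  Collect_Words and
Word_Vectors are names invented for the sketch:

```ada
with Ada.Text_IO;
with Ada.Strings;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Containers.Vectors;

procedure Collect_Words is
   --  A growable sequence of unbounded strings: neither the number of
   --  words nor the length of any word has to be known in advance.
   package Word_Vectors is new Ada.Containers.Vectors
     (Index_Type   => Positive,
      Element_Type => Unbounded_String);

   Blanks : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (' ');
   Words  : Word_Vectors.Vector;
begin
   --  Read standard input one line at a time until end of file.
   while not Ada.Text_IO.End_Of_File loop
      declare
         Line  : constant String := Ada.Text_IO.Get_Line;
         Start : Positive := Line'First;
         First : Positive;
         Last  : Natural;
      begin
         while Start <= Line'Last loop
            Ada.Strings.Fixed.Find_Token
              (Source => Line (Start .. Line'Last),
               Set    => Blanks,
               Test   => Ada.Strings.Outside,
               First  => First,
               Last   => Last);
            exit when Last = 0;  --  no more words on this line

            Words.Append (To_Unbounded_String (Line (First .. Last)));
            Start := Last + 1;
         end loop;
      end;
   end loop;

   --  Print the collected words, one per line.
   for I in Words.First_Index .. Words.Last_Index loop
      Ada.Text_IO.Put_Line (To_String (Words.Element (I)));
   end loop;
end Collect_Words;
```

Run it with a text file redirected to standard input; opening a named file
with Ada.Text_IO.Open would work the same way with the File parameter added
to End_Of_File and Get_Line.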

It would be nice to be able to say "read a whitespace-delimited
lexeme" from the file; that's what the string extractor for an istream
does in C++:

string s;

while (fin >> s)
   <do something with s>

I should add something like this to Charles.




* Re: getting words from file
@ 2003-03-08 18:42 David C. Hoos, Sr.
  0 siblings, 0 replies; 10+ messages in thread
From: David C. Hoos, Sr. @ 2003-03-08 18:42 UTC (permalink / raw)
  To: comp.lang.ada mail to news gateway; +Cc: ggroups


----- Original Message -----
From: "cookie" <ggroups@guff.org>
Newsgroups: comp.lang.ada
To: <comp.lang.ada@ada.eu.org>
Sent: March 08, 2003 9:38 AM
Subject: Re: getting words from file


> Sorry.. where can I find the Ada.Strings.Fixed.Find_Token procedure?

It comes with your Ada compiler.


* Re: getting words from file
  2003-03-08  3:58 cookie
  2003-03-08  4:36 ` David C. Hoos, Sr.
  2003-03-08 15:39 ` Mark Biggar
@ 2003-03-08 19:16 ` Jeffrey Carter
  2 siblings, 0 replies; 10+ messages in thread
From: Jeffrey Carter @ 2003-03-08 19:16 UTC (permalink / raw)


cookie wrote:
 > Does anyone know how I would store a word in a specific variable after
 > a space occurs inside a text file? (Say there was a line in the
 > text file like this: one two three four five... I'd want it to get
 > these words and store them in already defined vars like varOne, varTwo
 > and so on.)

If you define words as separated by whitespace, then look at
PragmARC.Word_Input, available from

http://home.earthlink.net/~jrcarter010/pragmarc.htm

which does just that.

-- 
Jeff Carter
"Son of a window-dresser."
Monty Python & the Holy Grail





* Re: getting words from file
  2003-03-08 15:38   ` cookie
@ 2003-03-08 19:51     ` Pascal Obry
       [not found]       ` <canqj-ff3.ln1@beastie.ix.netcom.com>
  2003-03-08 19:52     ` Pascal Obry
  1 sibling, 1 reply; 10+ messages in thread
From: Pascal Obry @ 2003-03-08 19:51 UTC (permalink / raw)



ggroups@guff.org (cookie) writes:

> Sorry.. where can I find the Ada.Strings.Fixed.Find_Token procedure?

In Ada.Strings.Fixed :)

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|         http://perso.wanadoo.fr/pascal.obry
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595




* Re: getting words from file
  2003-03-08 15:38   ` cookie
  2003-03-08 19:51     ` Pascal Obry
@ 2003-03-08 19:52     ` Pascal Obry
  1 sibling, 0 replies; 10+ messages in thread
From: Pascal Obry @ 2003-03-08 19:52 UTC (permalink / raw)



ggroups@guff.org (cookie) writes:

> Sorry.. where can I find the Ada.Strings.Fixed.Find_Token procedure?

Maybe you should start reading some books, or at least Annex A of the
Reference Manual!

Pascal.





* Re: getting words from file
       [not found]       ` <canqj-ff3.ln1@beastie.ix.netcom.com>
@ 2003-03-09 13:42         ` Marin David Condic
  0 siblings, 0 replies; 10+ messages in thread
From: Marin David Condic @ 2003-03-09 13:42 UTC (permalink / raw)


It might be helpful to observe that Ada's string handling capabilities are
documented in section A.4 (part of Annex A) of the Ada Reference Manual.
Also, if the OP is using GNAT he almost certainly has a copy on-line, but if
one is needed, it can be obtained from www.AdaPower.com.

It's always a good idea to send the newbies to the ARM since they stand a
good chance of getting accurate answers there and will likely find lots of
other cool stuff they can use in the process.

MDC
--
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jsf.mil/

Send Replies To: m c o n d i c @ a c m . o r g

    "Going cold turkey isn't as delicious as it sounds."
        -- H. Simpson
======================================================================

Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote in message news:canqj-ff3.ln1@beastie.ix.netcom.com...
>         Which will, of course, be found as a branch off of Ada.Strings, said
> being a branch off Ada -- the last being the standard "library"
> supplied with the compiler.
>





