From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.3 required=5.0 tests=BAYES_00,MAILING_LIST_MULTI,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,b34ecb04700058dd
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-11-14 06:57:03 PST
Path: 
 archiver1.google.com!news1.google.com!newsfeed.stanford.edu!skynet.be!skynet.be!freenix!enst.fr!not-for-mail
From: "David C. Hoos" <david.c.hoos.sr@ada95.com>
Newsgroups: comp.lang.ada
Subject: Re: how to parse words from a string
Date: Thu, 14 Nov 2002 08:56:25 -0600
Organization: ENST, France
Sender: comp.lang.ada-admin@ada.eu.org
Message-ID: <mailman.1037285822.25493.comp.lang.ada@ada.eu.org>
References: <a04a773e.0211121133.2f662b3e@posting.google.com>
 <aqv0q2$ic1$1@msunews.cl.msu.edu> <L2EA9.21326$nB.2140@sccrnsc03>
 <mailman.1037243581.1145.comp.lang.ada@ada.eu.org>
 <recvqa.c84.ln@beastie.ix.netcom.com>
 <a04a773e.0211140540.5bcaa2ad@posting.google.com>
Reply-To: comp.lang.ada@ada.eu.org
NNTP-Posting-Host: marvin.enst.fr
Mime-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Trace: avanie.enst.fr 1037285822 11293 137.194.161.2 (14 Nov 2002 14:57:02
 GMT)
X-Complaints-To: usenet@enst.fr
NNTP-Posting-Date: Thu, 14 Nov 2002 14:57:02 +0000 (UTC)
Return-Path: <david.c.hoos.sr@ada95.com>
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2720.3000
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
Errors-To: comp.lang.ada-admin@ada.eu.org
X-BeenThere: comp.lang.ada@ada.eu.org
X-Mailman-Version: 2.0.13
Precedence: bulk
List-Unsubscribe: <http://ada.eu.org/mailman/listinfo/comp.lang.ada>,
	<mailto:comp.lang.ada-request@ada.eu.org?subject=unsubscribe>
List-Id: comp.lang.ada mail<->news gateway <comp.lang.ada.ada.eu.org>
List-Post: <mailto:comp.lang.ada@ada.eu.org>
List-Help: <mailto:comp.lang.ada-request@ada.eu.org?subject=help>
List-Subscribe: <http://ada.eu.org/mailman/listinfo/comp.lang.ada>,
	<mailto:comp.lang.ada-request@ada.eu.org?subject=subscribe>
X-Original-Cc: mabes180@aol.com
Xref: archiver1.google.com comp.lang.ada:30874
Date: 2002-11-14T08:56:25-06:00


----- Original Message -----
From: "Sarah Thomas" <mabes180@aol.com>
Newsgroups: comp.lang.ada
To: <comp.lang.ada@ada.eu.org>
Sent: Thursday, November 14, 2002 7:40 AM
Subject: Re: how to parse words from a string


> Interesting follow ups! thanks for the input and help !
> I have succesfully extracted words from a string.
> I read each line in from the file
> and used find_token, followed by slice, followed by deleting the word
> from the string..and then stored them in a fixed array for now..
> this is just an outline of how i did it.....
>
> loop
> Find_Token(Temp, Ada.Strings.Maps.To_Set(Ada.Characters.Latin_1.HT),
> Ada.Strings.Outside, From, To);
> exit when To = 0;
>
> My_word := To_Unbounded_String(Slice(Temp, From, To));
> Put_line(Output_File,To_String(My_word));
>
> IF (Length(temp) /= To ) then
> Delete(Temp, 1, To + 1);
> else
> Delete(Temp, 1, To);
> end if;
>
> Store_data(Line_Number,Word_Number) := (My_Word);
>
> end loop;

You could have saved the work of repeatedly deleting from the
Temp string by initially setting To := 0 before the loop.

Then if you make your call to Find_Token like this:

Find_Token(Slice (Temp, To + 1, Length (Temp)),
Ada.Strings.Maps.To_Set(Ada.Characters.Latin_1.HT),
Ada.Strings.Outside, From, To);

you just specify the unprocessed slice of Temp each time you look for a new
token.
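
For what it's worth, here is a minimal, self-contained sketch of that loop
with no Delete calls. The procedure name, the sample line and the Tab_Set
constant are made up for illustration. It assumes To starts at 0, that the
String version of Find_Token from Ada.Strings.Fixed is the one being called
(Slice returns a String), and that the String returned by Slice keeps the
bounds Low .. High, so From and To remain valid indices into Temp.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;
with Ada.Strings.Unbounded;

procedure Split_On_Tabs is
   use Ada.Strings.Unbounded;

   HT : constant Character := Ada.Characters.Latin_1.HT;

   --  Hypothetical input line, standing in for a line read from the file.
   Temp : constant Unbounded_String :=
     To_Unbounded_String ("alpha" & HT & "beta" & HT & HT & "gamma");

   Tab_Set : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (HT);

   From : Positive;
   To   : Natural := 0;  --  nothing has been processed yet
begin
   loop
      --  Search only the unprocessed tail of Temp.  The slice is assumed
      --  to keep Temp's index values, so From and To index into Temp.
      Ada.Strings.Fixed.Find_Token
        (Source => Slice (Temp, To + 1, Length (Temp)),
         Set    => Tab_Set,
         Test   => Ada.Strings.Outside,
         First  => From,
         Last   => To);

      --  Find_Token sets Last to 0 when no token is left.
      exit when To = 0;

      Ada.Text_IO.Put_Line (Slice (Temp, From, To));
   end loop;
end Split_On_Tabs;
```

With the sample line above this should print alpha, beta and gamma on
separate lines; the doubled tab is skipped because Find_Token looks for the
next run of characters outside the set.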

I'm sure you realize you could have declared a String of word delimiters to
include spaces, punctuation marks and other white space characters in the
set of possible word delimiters, and used that String to initialize a
Word_Delimiter_Set -- e.g.:

Word_Delimiter_Set : constant Ada.Strings.Maps.Character_Set :=
Ada.Strings.Maps.To_Set
(" ,.;:/!&()" & Ada.Characters.Latin_1.HT & Ada.Characters.Latin_1.FF);
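
As a rough sketch of how such a set might be used, the following splits a
plain String on that delimiter set. The procedure name and the sample line
are assumptions for illustration only; the same loop should work on an
Unbounded_String by taking slices of it instead.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Strings.Fixed;
with Ada.Strings.Maps;

procedure Split_Words is
   Word_Delimiter_Set : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set
       (" ,.;:/!&()" & Ada.Characters.Latin_1.HT
        & Ada.Characters.Latin_1.FF);

   --  Hypothetical input line mixing several kinds of delimiter.
   Line : constant String :=
     "Hello, world! (tabs" & Ada.Characters.Latin_1.HT & "too)";

   From : Positive;
   To   : Natural := Line'First - 1;  --  nothing has been processed yet
begin
   loop
      --  Look for the next run of non-delimiter characters in the
      --  unprocessed part of Line.
      Ada.Strings.Fixed.Find_Token
        (Source => Line (To + 1 .. Line'Last),
         Set    => Word_Delimiter_Set,
         Test   => Ada.Strings.Outside,
         First  => From,
         Last   => To);

      --  Last comes back as 0 when only delimiters (or nothing) remain.
      exit when To = 0;

      Ada.Text_IO.Put_Line (Line (From .. To));
   end loop;
end Split_Words;
```

For the sample line this should print Hello, world, tabs and too, one per
line.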