comp.lang.ada
From: "David C. Hoos" <david.c.hoos.sr@ada95.com>
To: "wave" <mutilation@bonbon.net>,
	"comp.lang.ada@ada.eu.org" <comp.lang.ada@ada-france.org>
Subject: Re: Word counting
Date: Thu, 11 Dec 2003 16:45:05 -0600
Message-ID: <mailman.100.1071182728.31149.comp.lang.ada@ada-france.org>
In-Reply-To: 4d01ad29.0312111401.32ec5297@posting.google.com

Here is some code I originally posted on March 8, 2003, which
does the word parsing using the facilities of the Ada standard
libraries.

The function "Words" returns an array with one element for
each word in the line.  Each array element contains the first
and last indices of one word, so the length of a word is
simply Last - First + 1.

package Word_Parser
is

   type Word_Boundaries is record
      First : Positive;
      Last  : Natural;
   end record;

   type Word_Boundaries_Array is
     array (Positive range <>) of Word_Boundaries;

   -- Limitation: No more than 1024 words per text string.
   function Words (Text : String) return Word_Boundaries_Array;

end Word_Parser;

with Ada.Strings.Fixed;
with Ada.Strings.Maps;
package body Word_Parser is

   Whitespace : constant String := ' ' &
     ASCII.Ht & ASCII.Cr & ASCII.LF;
   Punctuation : constant String := ",./?<>:;'""[]{}!@#$%^&*()_+|-=\~~";
   Delimiters : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (Whitespace & Punctuation);

   -----------
   -- Words --
   -----------

   function Words (Text : String) return Word_Boundaries_Array
   is
      Word_Boundaries_List : Word_Boundaries_Array (1 .. 1024);
      Word_Count : Natural := 0;
      First : Positive := Text'First;
   begin
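      loop
         -- Stop once the fixed-size list is full; indexing past its
         -- end would raise Constraint_Error.  Further words are ignored.
         exit when Word_Count = Word_Boundaries_List'Last;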
         Ada.Strings.Fixed.Find_Token
           (Source => Text (First .. Text'Last),
            Set    => Delimiters,
            Test   => Ada.Strings.Outside,
            First  => Word_Boundaries_List (Word_Count + 1).First,
            Last   => Word_Boundaries_List (Word_Count + 1).Last);
         exit when Word_Boundaries_List (Word_Count + 1).Last = 0;
         First := Word_Boundaries_List (Word_Count + 1).Last + 1;
         Word_Count := Word_Count + 1;
      end loop;
      return Word_Boundaries_List (1 .. Word_Count);
   end Words;

end Word_Parser;

with Ada.Command_Line;
with Ada.Text_IO;
with Word_Parser;
procedure Test_Word_Parser
is
   File : Ada.Text_IO.File_Type;
   Line : String (1 .. 10240);
   Last : Natural;
   use type Ada.Text_IO.Count;
begin
   if Ada.Command_Line.Argument_Count /= 1 then
      Ada.Text_IO.Put_Line
        (Ada.Text_IO.Standard_Error,
         "USAGE: " & Ada.Command_Line.Command_Name &
         " <text-file-name>");
      -- A usage error should not report success to the caller.
      Ada.Command_Line.Set_Exit_Status (Ada.Command_Line.Failure);
      return;
   end if;
   Ada.Text_IO.Open
     (File => File,
      Name => Ada.Command_Line.Argument (1),
      Mode => Ada.Text_IO.In_File);
   while not Ada.Text_IO.End_Of_File (File) loop
      Ada.Text_IO.Get_Line
        (Item => Line,
         File => File,
         Last => Last);
      declare
         Word_Boundary_List :
           constant Word_Parser.Word_Boundaries_Array :=
           Word_Parser.Words (Line (Line'First .. Last));
      begin
         Ada.Text_IO.Put_Line
           ("Words in line" &
            Ada.Text_IO.Count'Image (Ada.Text_IO.Line (File) - 1));
         for W in Word_Boundary_List'Range loop
            Ada.Text_IO.Put_Line
              ("""" & Line
               (Word_Boundary_List (W).First ..
                Word_Boundary_List (W).Last) & """");
         end loop;
      end;
   end loop;
   Ada.Text_IO.Close (File);
end Test_Word_Parser;
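
The word-length computation mentioned above can be sketched as
follows.  This is only an illustration, not part of the code as
originally posted: the procedure name Word_Lengths and the sample
text are made up, and it assumes the Word_Parser package above is
compiled along with it.

```ada
with Ada.Text_IO;
with Word_Parser;
procedure Word_Lengths
is
   -- Hypothetical sample input; any string would do.
   Text   : constant String := "Hello, world: Ada 95!";
   Bounds : constant Word_Parser.Word_Boundaries_Array :=
     Word_Parser.Words (Text);
begin
   -- The number of words is simply the length of the returned array.
   Ada.Text_IO.Put_Line
     ("Word count:" & Natural'Image (Bounds'Length));
   for W in Bounds'Range loop
      -- Each element holds the indices of one word within Text,
      -- so its length is Last - First + 1.
      Ada.Text_IO.Put_Line
        ("""" & Text (Bounds (W).First .. Bounds (W).Last) & """" &
         " has length" &
         Natural'Image (Bounds (W).Last - Bounds (W).First + 1));
   end loop;
end Word_Lengths;
```

Since the indices refer to the string passed to Words, the same
string (or slice) must be used when extracting the words.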

----- Original Message ----- 
From: "wave" <mutilation@bonbon.net>
Newsgroups: comp.lang.ada
To: <comp.lang.ada@ada-france.org>
Sent: Thursday, December 11, 2003 4:01 PM
Subject: Word counting


<snip>