From: Marius Amado Alves
Newsgroups: comp.lang.ada
Subject: Re: Ada bench : count words
Date: Tue, 22 Mar 2005 11:57:22 +0000

>> ... To implement buffering, I have resorted to
>> Ada.Direct_IO, which I think cannot apply to standard input.
>
> Is Text_IO that bad?

No, if you can solve the Get_Line puzzle :-)

>>    procedure Process (S : in String) is
>>    begin
>>       Lines := Lines + Ada.Strings.Fixed.Count (S, EOL);
>
> Isn't it an extra pass? I think you should do parsing using FSM. Character
> classes are: EOL, delimiter, letter. It is either two character map tests
> or one case statement. I don't know what is faster. Probably you should
> test both.
>
>>       for I in S'Range loop
>>          if Is_Separator (S (I)) then
>>             if In_Word then Finish_Word; end if;
>>          else
>>             if not In_Word then Start_Word; end if;
>>          end if;
>>       end loop;
>>    end;

Note that EOL there is not a character but a string, because in some
environments the line terminator is a combination of two characters.
Making EOL a single character removes the extra pass and improves speed
(but still only to half the speed of C):

      for I in S'Range loop
         if S (I) = EOL then Lines := Lines + 1; end if;
         if Is_Separator (S (I)) then
            if In_Word then Finish_Word; end if;
         else
            if not In_Word then Start_Word; end if;
         end if;
      end loop;

I have not tried string matching (if that's what you mean by "FSM")
because the iteration was already there, and I doubt the standard
library implements it more efficiently than that.
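
For comparison, here is a minimal sketch of the "one case statement" variant suggested in the quoted text, with the three character classes (EOL, delimiter, letter) handled in a single pass. It is only an illustration under assumptions: the names Count_Sketch and Sample are hypothetical, the input is a fixed string rather than the buffered file blocks, EOL is taken to be a single LF, and the delimiter set (space and tab) is a guess at what Is_Separator accepts. No claim is made about its speed.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Count_Sketch is
   package L1 renames Ada.Characters.Latin_1;

   --  Hypothetical sample input; the real program reads the file in blocks.
   Sample : constant String :=
     "one two" & L1.LF & "three" & L1.HT & "four" & L1.LF;

   Lines   : Natural := 0;
   Words   : Natural := 0;
   In_Word : Boolean := False;
begin
   for I in Sample'Range loop
      case Sample (I) is
         when L1.LF =>
            --  EOL class: counts a line and ends any word in progress.
            Lines   := Lines + 1;
            In_Word := False;
         when ' ' | L1.HT =>
            --  Delimiter class (assumed set): ends any word in progress.
            In_Word := False;
         when others =>
            --  Letter class: the first such character starts a new word.
            if not In_Word then
               Words   := Words + 1;
               In_Word := True;
            end if;
      end case;
   end loop;

   Ada.Text_IO.Put_Line
     ("lines:" & Natural'Image (Lines) & " words:" & Natural'Image (Words));
end Count_Sketch;
```

For the sample string this counts 2 lines and 4 words. Whether the case statement beats two character-map tests would have to be measured, as the quoted text says.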