From: David Trudgett
Newsgroups: comp.lang.ada
Subject: Re: String filtering
Organization: Very little?
Date: Wed, 28 Sep 2005 10:06:44 +1000

"Dmitry A. Kazakov" writes:

> On Tue, 27 Sep 2005 14:42:02 +0100, Martin Dowie wrote:
>> 2nd sentence of ARM 95 A.4.5 (76) reads:
>>
>> "The function To_Unbounded_String(Length : in Natural)
>> returns an Unbounded_String that represents an uninitialized
>> String whose length is Length."
>
> Ah, now I see what you meant!

Yep, that's what I meant, too.

> Right, though my point was that one should use either
>
>    New_Str : String (1...Count (...));
>
> or
>
>    New_Str : Unbounded_String; -- "Uninitialized"

I've made some revisions based on various comments, and this is what I
have at the moment (incorporating both a String and an Unbounded_String
version):

   Space_Char  : constant Character_Range := (' ', ' ');
   Lower_Chars : constant Character_Range := ('a', 'z');
   Upper_Chars : constant Character_Range := ('A', 'Z');
   Numer_Chars : constant Character_Range := ('0', '9');

   Alpha_Num_Space : constant Character_Ranges :=
     (Space_Char, Lower_Chars, Upper_Chars, Numer_Chars);

   Alpha_Num_Space_Set : constant Character_Set :=
     To_Set(Alpha_Num_Space);


   function Strip_Non_Alphanumeric
     (Str : in Unbounded_String) return Unbounded_String
   is
      --  Number of characters that survive the filter.
      Dest_Size : Natural := Count(Str, Alpha_Num_Space_Set);
      New_Str   : Unbounded_String := Null_Unbounded_String;
      Dest_Char : Natural := 0;
   begin
      if Dest_Size > 0 then
         --  Preallocate the result, then fill it in place.
         New_Str := To_Unbounded_String(Dest_Size);
         for Src_Char in 1 .. Length(Str) loop
            if Is_In(Element(Str, Src_Char), Alpha_Num_Space_Set) then
               Dest_Char := Dest_Char + 1;
               Replace_Element
                 (New_Str, Dest_Char, Element(Str, Src_Char));
            end if;
         end loop;
      end if;
      return New_Str;
   end Strip_Non_Alphanumeric;


   function Strip_Non_Alphanumeric
     (Str : in String) return String
   is
      --  The result is sized exactly by counting the kept characters.
      New_Str   : String(1 .. Count(Str, Alpha_Num_Space_Set));
      Dest_Char : Natural := 0;
   begin
      if New_Str'Last > 0 then
         for Src_Char in Str'Range loop
            if Is_In(Str(Src_Char), Alpha_Num_Space_Set) then
               Dest_Char := Dest_Char + 1;
               New_Str(Dest_Char) := Str(Src_Char);
            end if;
         end loop;
      else
         New_Str := "";
      end if;
      return New_Str;
   end Strip_Non_Alphanumeric;


In the unbounded version, I decided to use Replace_Element instead of
Append. Append needs a prior assignment of "", which might perhaps
deallocate memory (depending on the implementation??) and so
potentially undo the purpose of the preallocation. This version might
gain a bit in efficiency, but code-wise it is a bit more complex.
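
For comparison, here is a minimal sketch of the simpler Append-based
alternative mentioned above, with no preallocation at all. The names
Strip_Demo, Keep and Strip_With_Append are made up for illustration,
and the character set is rebuilt locally only so that the sketch
compiles on its own. Whether it is actually slower than the
Replace_Element version depends on the implementation and has not been
measured here.

```ada
with Ada.Strings.Maps;      use Ada.Strings.Maps;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Strip_Demo is

   --  Same contents as Alpha_Num_Space_Set in the post; declared
   --  here (hypothetical name Keep) to keep the sketch standalone.
   Keep : constant Character_Set :=
     To_Set (Character_Ranges'
               ((' ', ' '), ('a', 'z'), ('A', 'Z'), ('0', '9')));

   --  Hypothetical Append-based variant: no To_Unbounded_String
   --  (Length) call, so there is no preallocated buffer to lose.
   function Strip_With_Append
     (Str : in Unbounded_String) return Unbounded_String
   is
      --  A default-initialized Unbounded_String is already equal to
      --  Null_Unbounded_String, so no assignment of "" is needed.
      Result : Unbounded_String;
   begin
      for I in 1 .. Length (Str) loop
         if Is_In (Element (Str, I), Keep) then
            Append (Result, Element (Str, I));
         end if;
      end loop;
      return Result;
   end Strip_With_Append;

begin
   --  Prints "Hello World 42": the comma and '!' are filtered out.
   Ada.Text_IO.Put_Line
     (To_String
        (Strip_With_Append (To_Unbounded_String ("Hello, World! 42"))));
end Strip_Demo;
```

The trade-off is the one discussed in this thread: this form is
shorter, but each Append may have to grow the underlying storage,
whereas the Replace_Element form asks for the final length up front.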

The String version also seems to work under testing. Profiling would
show no difference in performance between the two for my current
purposes, but in a different situation (involving large amounts of
data, for instance) the fixed String version would no doubt be faster.
Space-wise, the unbounded strings would probably win out in many
situations.

> So my comment about empty string concerned the latter case.
>
> If To_Unbounded_String (Count) is ever used then of course
> Replace_Element should be in place of assignment + Append.

This is what I have done. Have I mucked up anything else while doing
so?

> Because, assignment might reclaim the memory allocated by
> To_Unbounded_String (Count).

I assume this is left up to the compiler implementation?

David

-- 
David Trudgett
http://www.zeta.org.au/~wpower/

We come here upon what, in a large proportion of cases, forms the
source of the grossest errors of mankind. Men on a lower level of
understanding, when brought into contact with phenomena of a higher
order, instead of making efforts to understand them, to raise
themselves up to the point of view from which they must look at the
subject, judge it from their lower standpoint, and the less they
understand what they are talking about, the more confidently and
unhesitatingly they pass judgment on it.

    -- Leo Tolstoy, "The Kingdom of God is Within You"