* GNAT: Performance of String functions @ 1997-07-18 0:00 Jakob Heinemann 1997-07-18 0:00 ` Robert Dewar 0 siblings, 1 reply; 36+ messages in thread From: Jakob Heinemann @ 1997-07-18 0:00 UTC (permalink / raw) Hello Gnaters. I wonder if there exist some data concerning speed issuses on different types of string handling. To be specific what type och string mechanism is considered fastest and how big is the difference between strings, bounded strings an unbounded strings? Speed measurement in comparing two strings, passing as a parameter, passing as a parameter to procedure in another package. I'm running GNAT 3.09 on a Sun Sparc Ultra 1 with Solaris 2.5. I'm using unbounded strings and I'm comparing a lot. In fact I have an equvalent program running the same algoritm in VAX Pascal, and on the VAX the program runs about ten times faster. Can I expect a dramatic speed improvement if I use another string concept or is GNAT just slow? Hope someone can help me with this question /Jakob PS: Please maka a CC: to me. mailto:Jakob.Heinemann@ericsson.com ESB Linkoping, Sweden ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-18 0:00 GNAT: Performance of String functions Jakob Heinemann @ 1997-07-18 0:00 ` Robert Dewar 1997-07-19 0:00 ` Tarjei T. Jensen 1997-07-22 0:00 ` Jakob Heinemann 0 siblings, 2 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-18 0:00 UTC (permalink / raw) Jakob asks <<I wonder if there exist some data concerning speed issuses on different types of string handling. To be specific what type och string mechanism is considered fastest and how big is the difference between strings, bounded strings an unbounded strings? Speed measurement in comparing two strings, passing as a parameter, passing as a parameter to procedure in another package. I'm running GNAT 3.09 on a Sun Sparc Ultra 1 with Solaris 2.5. I'm using unbounded strings and I'm comparing a lot. In fact I have an equvalent program running the same algoritm in VAX Pascal, and on the VAX the program runs about ten times faster. Can I expect a dramatic speed improvement if I use another string concept or is GNAT just slow? >> A couple of notes. First all strings are passed by reference in all cases, so that is definitely not an issue. I doubt you could have an equivalent algorithm in VAX Pascal, although I don't know VAX Pascal that well. In particular, what does the VAX Pascal use -- I bet it either uses the equivalent of bounded strings, or it does not have automatic finalization. The unboundedness you certainly pay a price for, and you also pay a price for the automatic finalization. And also, it would be interesting to see a distilled benchmark. When we look at such code, we often find it is apples vs oranges comparisons, and that in particular if some foreign language code has been simplistically translated to Ada, it has ended up as non-idiomatic Ada which can be expected to run slower. A factor of 10x smacks of that kind of phenomenon. The GNAT code for unbounded strings is relatively straightforward (you can look at it easily!) There may be places it can be speeded up, but there is nothing like a 10x factor of inefficiency in the coding that I know of ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-18 0:00 ` Robert Dewar @ 1997-07-19 0:00 ` Tarjei T. Jensen 1997-07-19 0:00 ` Matthew Heaney ` (2 more replies) 1997-07-22 0:00 ` Jakob Heinemann 1 sibling, 3 replies; 36+ messages in thread From: Tarjei T. Jensen @ 1997-07-19 0:00 UTC (permalink / raw) > Jakob asks: > > I'm running GNAT 3.09 on a Sun Sparc Ultra 1 with Solaris 2.5. > I'm using unbounded strings and I'm comparing a lot. In fact > I have an equvalent program running the same algoritm in VAX > Pascal, and on the VAX the program runs about ten times faster. > Can I expect a dramatic speed improvement if I use another > string concept or is GNAT just slow? >Robert Dewar wrote: > > A couple of notes. First all strings are passed by reference in all cases, > so that is definitely not an issue. > > I doubt you could have an equivalent algorithm in VAX Pascal, although I don't > know VAX Pascal that well. In particular, what does the VAX Pascal use -- I > bet it either uses the equivalent of bounded strings, or it does not have > automatic finalization. The unboundedness you certainly pay a price for, > and you also pay a price for the automatic finalization. > VAX Pascal passes strings by descriptors. The string pointed to by descriptor is of type counted string. A counted string has a maximum size, an actual size and the string itself. I'm not sure anymore if the descriptor contains the maximum size of the string or if the programmer have to keep track of this. This means that comparing Ada bounded strings and VAX Pascal strings compares two different ways of doing things. Each time you change the size of an Ada bounded string there can be serious overhead. The apparent overhead of bounded strings is an strong argument for providing a standard counted string package in Ada. Greetings, -- // Tarjei T. Jensen // tarjei@online.no || voice +47 51 62 85 58 // Support you local rescue centre: GET LOST! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-19 0:00 ` Tarjei T. Jensen @ 1997-07-19 0:00 ` Matthew Heaney 1997-07-20 0:00 ` Robert Dewar 1997-07-20 0:00 ` Tarjei T. Jensen 2 siblings, 0 replies; 36+ messages in thread From: Matthew Heaney @ 1997-07-19 0:00 UTC (permalink / raw) In article <33D11806.6C59@online.no>, tarjei@online.no wrote: >VAX Pascal passes strings by descriptors. The string pointed to by >descriptor is of type counted string. A counted string has a maximum >size, an actual size and the string itself. That's all an Ada bounded string is: a maximum size (implicit in the size of the array used to implement Bounded_String), and another component to record the logical length of the string. So I'm confused by your comment, because it sounds like they're very much similar. >This means that comparing Ada bounded strings and VAX Pascal strings >compares two different ways of doing things. This appears not the be the case; perhaps you can elaborate. >Each time you change the >size of an Ada bounded string there can be serious overhead. What size do you mean: the physical length or the logical length. I assume logical length, because the physical size can't be changed. But where's the "serious overhead" when changing the (logical) length of a bounded string? For example, if I have this (a hypothetical implementation - I don't have GNAT source handy): type Bounded_String is record Items : String (1 .. Max); Length : Natural := 0; end record; and I do an append: procedure Append (Bounded : in out Bounded_String; Item : in String) is Length : constant Natural := Bounded.Length + Item'Length; Items : String renames Bounded.Items; begin if Length <= Max then Items (Bounded.Length + 1 .. Length) := Item; Bounded.Length := Length; elsif ... end Append; Where's the overhead, exactly? >The apparent overhead of bounded strings is an strong argument for >providing a standard counted string package in Ada. I think that that package is already included in Ada: it's called Ada.Strings.Bounded. Are you sure you aren't thinking of Unbounded_String? That abstraction definately is very likely to have greater overhead than any form of bounded string. -------------------------------------------------------------------- Matthew Heaney Software Development Consultant <mailto:matthew_heaney@acm.org> (818) 985-1271 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-19 0:00 ` Tarjei T. Jensen 1997-07-19 0:00 ` Matthew Heaney @ 1997-07-20 0:00 ` Robert Dewar 1997-07-20 0:00 ` Tarjei T. Jensen 1997-07-20 0:00 ` Tarjei T. Jensen 2 siblings, 1 reply; 36+ messages in thread From: Robert Dewar @ 1997-07-20 0:00 UTC (permalink / raw) Tarjei said <<VAX Pascal passes strings by descriptors. The string pointed to by descriptor is of type counted string. A counted string has a maximum size, an actual size and the string itself. I'm not sure anymore if the descriptor contains the maximum size of the string or if the programmer have to keep track of this. This means that comparing Ada bounded strings and VAX Pascal strings compares two different ways of doing things. Each time you change the size of an Ada bounded string there can be serious overhead. >> So, clearly Vax Pascal strings are to be compared with bounded strings, not with unbounded strings, and presumably there is no automatic finalization involved. I am not sure what your "serious" overhead message is about, makes no sense to me, but perhaps it is just not clear. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-20 0:00 ` Robert Dewar @ 1997-07-20 0:00 ` Tarjei T. Jensen 1997-07-20 0:00 ` Robert Dewar 0 siblings, 1 reply; 36+ messages in thread From: Tarjei T. Jensen @ 1997-07-20 0:00 UTC (permalink / raw) Robert Dewar wrote: > > Tarjei said > > <<VAX Pascal passes strings by descriptors. The string pointed to by > descriptor is of type counted string. A counted string has a maximum > size, an actual size and the string itself. I'm not sure anymore if the > descriptor contains the maximum size of the string or if the programmer > have to keep track of this. > > This means that comparing Ada bounded strings and VAX Pascal strings > compares two different ways of doing things. Each time you change the > size of an Ada bounded string there can be serious overhead. > >> > > So, clearly Vax Pascal strings are to be compared with bounded strings, > not with unbounded strings, and presumably there is no automatic > finalization involved. > > I am not sure what your "serious" overhead message is about, makes no > sense to me, but perhaps it is just not clear. I wrote bounded string when I meant unbounded string. Silly me. As unbounded strings are allocated on the heap there is a potential serious overhead in allocating and deallocating space for these strings as they change size. The serious overhead can include system calls to allocate or free space. I expect that it also include wholesale copying of the contents of the strings. I have not checked the GNAT source for how unbounded strings are handled. Greetings, -- // Tarjei T. Jensen // tarjei@online.no || voice +47 51 62 85 58 // Support you local rescue centre: GET LOST! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-20 0:00 ` Tarjei T. Jensen @ 1997-07-20 0:00 ` Robert Dewar 1997-07-21 0:00 ` Tucker Taft 0 siblings, 1 reply; 36+ messages in thread From: Robert Dewar @ 1997-07-20 0:00 UTC (permalink / raw) Tarjei said <<As unbounded strings are allocated on the heap there is a potential serious overhead in allocating and deallocating space for these strings as they change size. The serious overhead can include system calls to allocate or free space. I expect that it also include wholesale copying of the contents of the strings. >> Right, that is generally true, we have considered, but not yet implemented, a version of unbounded strings that would allocate a little growth space in each string value so that, for example, if you are adding one character at a time, you do not copy on every addition. I think this would be a valuable improvement. Your message sure does become clearer when we put the "un" into it at the appropriate points, and what you say is quite right. In any case the bottom line here is that the original posters comparison was inappropriate. A comparison with VAX strings should use bounded srtings, not unbounded strings ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-20 0:00 ` Robert Dewar @ 1997-07-21 0:00 ` Tucker Taft 1997-07-21 0:00 ` Tarjei Jensen 1997-07-21 0:00 ` Robert Dewar 0 siblings, 2 replies; 36+ messages in thread From: Tucker Taft @ 1997-07-21 0:00 UTC (permalink / raw) Robert Dewar (dewar@merv.cs.nyu.edu) wrote: : Tarjei said : <<As unbounded strings are allocated on the heap there is a potential : serious overhead in allocating and deallocating space for these strings : as they change size. The serious overhead can include system calls to : allocate or free space. I expect that it also include wholesale copying : of the contents of the strings. : >> : Right, that is generally true, we have considered, but not yet implemented, : a version of unbounded strings that would allocate a little growth space : in each string value so that, for example, if you are adding one character : at a time, you do not copy on every addition. I think this would be a : valuable improvement. For what it is worth, our latest AdaMagic runtime provides for this growth, and it also avoids the heap completely for "short" unbounded strings. -- -Tucker Taft stt@inmet.com http://www.inmet.com/~stt/ Intermetrics, Inc. Burlington, MA USA ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-21 0:00 ` Tucker Taft @ 1997-07-21 0:00 ` Tarjei Jensen 1997-07-21 0:00 ` Matthew Heaney 1997-07-22 0:00 ` Robert Dewar 1997-07-21 0:00 ` Robert Dewar 1 sibling, 2 replies; 36+ messages in thread From: Tarjei Jensen @ 1997-07-21 0:00 UTC (permalink / raw) > Robert Dewar wrote: > : Right, that is generally true, we have considered, but not yet implemented, > : a version of unbounded strings that would allocate a little growth space > : in each string value so that, for example, if you are adding one character > : at a time, you do not copy on every addition. I think this would be a > : valuable improvement. >Tucker Taft writes: > For what it is worth, our latest AdaMagic runtime provides for > this growth, and it also avoids the heap completely for "short" > unbounded strings. It looks to me that you are fixing something that is not particularly broken in the first place. The real problem as pointed out before, is that there is no proper counted string in the Ada standard library. With counted strings you get the performance you loose when using unbounded strings improperly. If one really should put the blame on anything it is bounded string which I think is close to useless. Greetings, -- // Tarjei T. Jensen // tarjeij@ulrik.uio.no || fax +47 51 85 87 01 || voice +47 51 85 87 39 // Support you local rescue centre: GET LOST! // Working, but not speaking for the Norwegian Hydrographic Service. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-21 0:00 ` Tarjei Jensen @ 1997-07-21 0:00 ` Matthew Heaney 1997-07-22 0:00 ` Robert Dewar 1 sibling, 0 replies; 36+ messages in thread From: Matthew Heaney @ 1997-07-21 0:00 UTC (permalink / raw) In article <5qvdbn$pno$1@ratatosk.uio.no>, tarjeij@ulrik.uio.no (Tarjei Jensen) wrote: >It looks to me that you are fixing something that is not particularly broken in >the first place. The real problem as pointed out before, is that there is no >proper counted string in the Ada standard library. With counted strings you get >the performance you loose when using unbounded strings improperly. Again, I am unclear about your position. Ada already _does_ have a "counted string," called Ada.Strings.Bounded. This type does _not_ allocate heap, and _is_ more efficient then Ada.Strings.Unbounded. Why doesn't this type satisfy your needs? -------------------------------------------------------------------- Matthew Heaney Software Development Consultant <mailto:matthew_heaney@acm.org> (818) 985-1271 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-21 0:00 ` Tarjei Jensen 1997-07-21 0:00 ` Matthew Heaney @ 1997-07-22 0:00 ` Robert Dewar 1997-07-22 0:00 ` Tarjei Jensen 1 sibling, 1 reply; 36+ messages in thread From: Robert Dewar @ 1997-07-22 0:00 UTC (permalink / raw) , can you say what you mean by a counted string, it is not a standard term, and you need to explain so we know what you are talking about. P.S. I find bounded strings most useful ... ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Robert Dewar @ 1997-07-22 0:00 ` Tarjei Jensen 1997-07-22 0:00 ` Robert Dewar 1997-07-22 0:00 ` Larry Kilgallen 0 siblings, 2 replies; 36+ messages in thread From: Tarjei Jensen @ 1997-07-22 0:00 UTC (permalink / raw) A counted string would be equivalent to a bounded string with individual maximum string length. All would be assignment compatible if the length of the assigned string was equal or less to the maximum size of the destinatation. I suspect bounded strings will perform badly against a counted string implementation because assignment of strings involves copying the entire array whether it contains valid data or not. This makes Ada less than wonderful in applications that involve a lot of bounded string assignments. Greetings, -- // Tarjei T. Jensen // tarjeij@ulrik.uio.no || fax +47 51 85 87 01 || voice +47 51 85 87 39 // Support you local rescue centre: GET LOST! // Working, but not speaking for the Norwegian Hydrographic Service. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Tarjei Jensen @ 1997-07-22 0:00 ` Robert Dewar 1997-07-22 0:00 ` Larry Kilgallen 1 sibling, 0 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-22 0:00 UTC (permalink / raw) iTarjei said <<A counted string would be equivalent to a bounded string with individual maximum string length. All would be assignment compatible if the length of the assigned string was equal or less to the maximum size of the destinatation. I suspect bounded strings will perform badly against a counted string implementation because assignment of strings involves copying the entire array whether it contains valid data or not. This makes Ada less than wonderful in applications that involve a lot of bounded string assignments. >> There is no reason that the implementation has to copy the entire string, it may, but that is an implementation issue, not a language issue. In fact what makes you think you know *anything* about the implementation of bounded string from the RM, it is a private type. You can make the above comment for some particular implementation of Ada, but your statement above, which phrases it as a comment on the language is just plain incorrect. You may be right that if there are aplications that involve a lot of bounded string assignments (I have not seen any yet), and if the implementation does full copies all the time, THEN for that implementation, there may be a problemantic inefficiency. Actually it makes perfectly sense to implement bounded strings in terms of what you call counted strings. That's certainly the implementation plan for GNAT. A side benefit of this approach is that then far less of the implementation is generic. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Tarjei Jensen 1997-07-22 0:00 ` Robert Dewar @ 1997-07-22 0:00 ` Larry Kilgallen 1997-07-22 0:00 ` Tarjei T. Jensen 1 sibling, 1 reply; 36+ messages in thread From: Larry Kilgallen @ 1997-07-22 0:00 UTC (permalink / raw) In article <5r1l6e$e0h$1@ratatosk.uio.no>, tarjeij@ulrik.uio.no (Tarjei Jensen) writes: > I suspect bounded strings will perform badly against a counted string > implementation because assignment of strings involves copying the entire array > whether it contains valid data or not. This makes Ada less than wonderful in > applications that involve a lot of bounded string assignments. No, it might affect particular _implementations_ of Ada, but I have read no indications that counted strings would not be a valid way of implementing bounded strings. If the Varying String datatype made quite public by DEC Pascal were used in some Ada implementation to implement bounded strings (and for all I know, it or something quite close to it is) execution should be quite efficient for strings below a certain size (ultimately one gets into page faults, cache misses and other stack-vs-heap issues). Larry Kilgallen ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Larry Kilgallen @ 1997-07-22 0:00 ` Tarjei T. Jensen 1997-07-23 0:00 ` Larry Kilgallen 1997-07-23 0:00 ` Robert Dewar 0 siblings, 2 replies; 36+ messages in thread From: Tarjei T. Jensen @ 1997-07-22 0:00 UTC (permalink / raw) >> >>Tarjei T. Jensen writes: >> >> I suspect bounded strings will perform badly against a counted string >> implementation because assignment of strings involves copying the entire array >> whether it contains valid data or not. This makes Ada less than wonderful in >> applications that involve a lot of bounded string assignments. > >Larry Kilgallen wrote: > No, it might affect particular _implementations_ of Ada, but I have > read no indications that counted strings would not be a valid way of > implementing bounded strings. If the Varying String datatype made > quite public by DEC Pascal were used in some Ada implementation to > implement bounded strings (and for all I know, it or something quite > close to it is) execution should be quite efficient for strings > below a certain size (ultimately one gets into page faults, cache > misses and other stack-vs-heap issues). > > Larry Kilgallen That would mean that the implementation supply a default maximum size. If the implementation should be more effective than the usual bounded string implementation the compiler whould have to know what the real length of a string is for assignments. Otherwise the code generated spends a lot of time copying junk data. A "smart" compiler is paramount since we cannot do anything about the assignment operator in Ada. We can write a custom procedure or function to handle assignment, but it is not the same as ":=". I would be delighted if someone could show that a smart compiler is not neccessary to optimize counted or bounded string operations. Greetings, -- // Tarjei T. Jensen // tarjei@online.no || voice +47 51 62 85 58 // Support you local rescue centre: GET LOST! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Tarjei T. Jensen @ 1997-07-23 0:00 ` Larry Kilgallen 1997-07-23 0:00 ` Robert Dewar 1 sibling, 0 replies; 36+ messages in thread From: Larry Kilgallen @ 1997-07-23 0:00 UTC (permalink / raw) Sorry for the extensive quote, but enough context may shorten this discussion. In article <33D4F30F.5299@online.no>, "Tarjei T. Jensen" <tarjei@online.no> writes: >>> >>>Tarjei T. Jensen writes: >>> >>> I suspect bounded strings will perform badly against a counted string >>> implementation because assignment of strings involves copying the entire array >>> whether it contains valid data or not. This makes Ada less than wonderful in >>> applications that involve a lot of bounded string assignments. >> >>Larry Kilgallen wrote: >> No, it might affect particular _implementations_ of Ada, but I have >> read no indications that counted strings would not be a valid way of >> implementing bounded strings. If the Varying String datatype made >> quite public by DEC Pascal were used in some Ada implementation to >> implement bounded strings (and for all I know, it or something quite >> close to it is) execution should be quite efficient for strings >> below a certain size (ultimately one gets into page faults, cache >> misses and other stack-vs-heap issues). >> >> Larry Kilgallen > > That would mean that the implementation supply a default maximum size. > If the implementation should be more effective than the usual bounded > string implementation the compiler whould have to know what the real > length of a string is for assignments. Otherwise the code generated > spends a lot of time copying junk data. I guess I don't know enough about the bounded string type. I presumed the programmer could specify the maximum size (bound?). In DEC Pascal one does that per variable, and for an Assignment the generated code copies only as many bytes as are actually used in the source string. The runtime image in memory certainly carries an indication of the current length. Larry Kilgallen ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Tarjei T. Jensen 1997-07-23 0:00 ` Larry Kilgallen @ 1997-07-23 0:00 ` Robert Dewar 1997-07-23 0:00 ` Tarjei Jensen 1 sibling, 1 reply; 36+ messages in thread From: Robert Dewar @ 1997-07-23 0:00 UTC (permalink / raw) Tarjei said <<That would mean that the implementation supply a default maximum size. If the implementation should be more effective than the usual bounded string implementation the compiler whould have to know what the real length of a string is for assignments. Otherwise the code generated spends a lot of time copying junk data. A "smart" compiler is paramount since we cannot do anything about the assignment operator in Ada. We can write a custom procedure or function to handle assignment, but it is not the same as ":=". >> You are still making unwarranted assumptions about the data structure used for bounded strings. If the data structure is what you call a "counted " string, then no special cleverness is required on the compiler's part. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-23 0:00 ` Robert Dewar @ 1997-07-23 0:00 ` Tarjei Jensen 1997-07-23 0:00 ` Samuel Mize 1997-07-24 0:00 ` Mats Weber 0 siblings, 2 replies; 36+ messages in thread From: Tarjei Jensen @ 1997-07-23 0:00 UTC (permalink / raw) >Robert Dewar writes: > You are still making unwarranted assumptions about the data structure > used for bounded strings. If the data structure is what you call a > "counted > " string, then no special cleverness is required on the compiler's part. As you have supplied no name of any existing compiler that implements bounded strings any other way than GNAT does it, the I assume that my assumption is warranted. I'd be delighted if it is possible to write a portable bounded string package that does not copy blindly on assignment. So far everything I have seen leads me to believe that this is not possible. Perhaps if bounded strings were implemented with pointers to the actual strings, but that may just complicate matters. Greetings, -- // Tarjei T. Jensen // tarjeij@ulrik.uio.no || fax +47 51 85 87 01 || voice +47 51 85 87 39 // Support you local rescue centre: GET LOST! // Working, but not speaking for the Norwegian Hydrographic Service. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-23 0:00 ` Tarjei Jensen @ 1997-07-23 0:00 ` Samuel Mize 1997-07-23 0:00 ` W. Wesley Groleau x4923 1997-07-24 0:00 ` Robert A Duff 1997-07-24 0:00 ` Mats Weber 1 sibling, 2 replies; 36+ messages in thread From: Samuel Mize @ 1997-07-23 0:00 UTC (permalink / raw) Tarjei Jensen wrote: > I'd be delighted if it is possible to write a portable bounded > string package > that does not copy blindly on assignment. Here's what I understand you want: 1 Each string has its own individual maximum length, and a currently-used length. 2 Each string allocates ONLY its maximum length, not a max length for all strings of this subtype. 3 The user can assign A to B so long as the "used" length of A is less than the maximum length of B. 4 Normal assignment is available. 5 Unused bytes are not copied on assignment. You can build a type to do what you want (except you have to use a "copy" procedure), but Bounded_Type doesn't quite do what you want. You can get all but 4 (normal assignment) with a limited variant record with a discriminant (max length). To assign the value, you provide a "copy" routine. Don't provide a default value for the discriminant, so the user will have to explicitly give a max size for each variable of this type. (If you provide a default value, the implementation may allocate the subtype's max size for each record.) Since Ada's "user-defined" assignment is actually user code adjusting and finalizing values, you can't get 4 (unused bytes not copied) with normal assignment. (You might get it by using a variant record, but then the compiler can allocate the subtype's max size for each record.) Sam Mize ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-23 0:00 ` Samuel Mize @ 1997-07-23 0:00 ` W. Wesley Groleau x4923 1997-07-24 0:00 ` Robert A Duff 1 sibling, 0 replies; 36+ messages in thread From: W. Wesley Groleau x4923 @ 1997-07-23 0:00 UTC (permalink / raw) Samuel Mize wrote: > > Tarjei Jensen wrote: > > > I'd be delighted if it is possible to write a portable bounded > > string package > > that does not copy blindly on assignment. > > Here's what I understand you want: > > 1 .... to 5 .... Might also want "=" to work. > You can get all but 4 (normal assignment) with a limited variant > record with a discriminant (max length). If there are "unused bytes" that are not determined by a discriminant, then "=" may also have a problem (unless it is limited, in which case there is no ":=" either. -- ---------------------------------------------------------------------- Wes Groleau, Hughes Defense Communications, Fort Wayne, IN USA Senior Software Engineer - AFATDS Tool-smith Wanna-be Don't send advertisements to this domain unless asked! All disk space on fw.hac.com hosts belongs to either Hughes Defense Communications or the United States government. Using email to store YOUR advertising on them is trespassing! ---------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-23 0:00 ` Samuel Mize 1997-07-23 0:00 ` W. Wesley Groleau x4923 @ 1997-07-24 0:00 ` Robert A Duff 1 sibling, 0 replies; 36+ messages in thread From: Robert A Duff @ 1997-07-24 0:00 UTC (permalink / raw) In article <33D66563.3F86@link.com>, Samuel Mize <smize@link.com> wrote: >You can get all but 4 (normal assignment) with a limited variant >record with a discriminant (max length). To assign the value, >you provide a "copy" routine. Don't provide a default value >for the discriminant, so the user will have to explicitly give >a max size for each variable of this type. (If you provide a >default value, the implementation may allocate the subtype's >max size for each record.) No, the compiler should not allocate the max. The thing is limited, therefore its size can't change, therefore whatever size corresponds to the default should be allocated (when the default is used). - Bob ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-23 0:00 ` Tarjei Jensen 1997-07-23 0:00 ` Samuel Mize @ 1997-07-24 0:00 ` Mats Weber 1997-07-24 0:00 ` Matthew Heaney ` (2 more replies) 1 sibling, 3 replies; 36+ messages in thread From: Mats Weber @ 1997-07-24 0:00 UTC (permalink / raw) > I'd be delighted if it is possible to write a portable bounded string > package > that does not copy blindly on assignment. So far everything I have > seen leads > me to believe that this is not possible. Perhaps if bounded strings > were > implemented with pointers to the actual strings, but that may just > complicate > matters. In GNAT 3.09, bounded strings are implemented as subtype Length_Range is Natural range 0 .. Max_Length; type Bounded_String is record Length : Length_Range := 0; Data : String (1 .. Max_Length); end record; an alternative implementation is subtype Length_Range is Natural range 0 .. Max_Length; type Bounded_String (Length : Length_Range := 0) is record Data : String (1 .. Length); end record; both have their advantages and disadvantages: GNAT's choice implies copying the whole record on assignment, whereas the other approoach will probably only copy the storage occupied by the value. So yes, bounded strings can be implemented without blind copying on assignment. On the other hand, operations such as appending a single character to a string can be done directly in the GNAT implementation, but involves re-assigning the whole value in the alternative implementation, and can turn O(N) algorithms into O(N**2). Also, in the GNAT case, if you get into a situation where predefined "=" re-emerges, then your program is broken because you will be comparing the whole records instead of just the slice 1 .. Length. This problem does not occur with the alternative approach. I think an AI is about to be voted that will require predefined "=" to behave correctly for these types, which will almost force implementors to use the alternative approach. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-24 0:00 ` Mats Weber @ 1997-07-24 0:00 ` Matthew Heaney 1997-07-24 0:00 ` Tarjei Jensen 1997-07-24 0:00 ` Robert Dewar 2 siblings, 0 replies; 36+ messages in thread From: Matthew Heaney @ 1997-07-24 0:00 UTC (permalink / raw) In article <33D74581.DEC93419@elca-matrix.ch>, Mats.Weber@elca-matrix.ch wrote: >Also, in the GNAT case, if you get into a situation where predefined "=" >re-emerges, then your program is broken because you will be comparing the >whole records instead of just the slice 1 .. Length. This problem does not >occur with the alternative approach. I think an AI is about to be voted that >will require predefined "=" to behave correctly for these types, which will >almost force implementors to use the alternative approach. I don't understand why composing equality will require the "alternative approach" using a discriminant record. Just use a pad character, so that the entire array has defined values at all times. -------------------------------------------------------------------- Matthew Heaney Software Development Consultant <mailto:matthew_heaney@acm.org> (818) 985-1271 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-24 0:00 ` Mats Weber 1997-07-24 0:00 ` Matthew Heaney @ 1997-07-24 0:00 ` Tarjei Jensen 1997-07-24 0:00 ` Robert Dewar 1997-07-24 0:00 ` Matthew Heaney 1997-07-24 0:00 ` Robert Dewar 2 siblings, 2 replies; 36+ messages in thread From: Tarjei Jensen @ 1997-07-24 0:00 UTC (permalink / raw) > Mats Weber <Mats.Weber@elca-matrix.ch> writes: > > subtype Length_Range is Natural range 0 .. Max_Length; > > type Bounded_String (Length : Length_Range := 0) is record > Data : String (1 .. Length); > end record; > As father Jones informes his son: Our situation has not improved. The alternative bounded string is just an unbounded string. That means that there is a potential copy of the string each time the size changes. This may be worse than the original GNAT version of bounded string wich only performs the copy on assignment. I don't think the real time people would be happy with the above solution. I don't think they like to think of what it might do to the heap. I'm not particulary happy with the solution either. Greetings, -- // Tarjei T. Jensen // tarjeij@ulrik.uio.no || fax +47 51 85 87 01 || voice +47 51 85 87 39 // Support you local rescue centre: GET LOST! // Working, but not speaking for the Norwegian Hydrographic Service. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-24 0:00 ` Tarjei Jensen @ 1997-07-24 0:00 ` Robert Dewar 1997-07-24 0:00 ` Matthew Heaney 1 sibling, 0 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-24 0:00 UTC (permalink / raw) Tarjei says <<> Mats Weber <Mats.Weber@elca-matrix.ch> writes: > > subtype Length_Range is Natural range 0 .. Max_Length; > > type Bounded_String (Length : Length_Range := 0) is record > Data : String (1 .. Length); > end record; > As father Jones informes his son: Our situation has not improved. The alternative bounded string is just an unbounded string. That means that there is a potential copy of the string each time the size changes. This may be worse than the original GNAT version of bounded string wich only performs the copy on assignment. I don't think the real time people would be happy with the above solution. I don't think they like to think of what it might do to the heap. I'm not particulary happy with the solution either. >> I agree that this is a dubious choice. The trouble is that with this representation you CANNOT change the length of an existing value, you can ONLY assign to change the length. With this representation, I would be stronger than Tarjei and say "there is a copy" not "there is a potential copy", since it is really hard to imagine a compiler that avoids the copy on the full record assignment needed to change the length. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-24 0:00 ` Tarjei Jensen 1997-07-24 0:00 ` Robert Dewar @ 1997-07-24 0:00 ` Matthew Heaney 1 sibling, 0 replies; 36+ messages in thread From: Matthew Heaney @ 1997-07-24 0:00 UTC (permalink / raw) In article <5r7n83$b37$1@ratatosk.uio.no>, tarjeij@ulrik.uio.no (Tarjei Jensen) wrote: >The alternative bounded string is just an unbounded string. That means that >there is a potential copy of the string each time the size changes. This may >be worse than the original GNAT version of bounded string wich only performs >the copy on assignment. You are confused. The alternative is _not_ the same as an "unbounded string," nor does it require any use of heap. -------------------------------------------------------------------- Matthew Heaney Software Development Consultant <mailto:matthew_heaney@acm.org> (818) 985-1271 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-24 0:00 ` Mats Weber 1997-07-24 0:00 ` Matthew Heaney 1997-07-24 0:00 ` Tarjei Jensen @ 1997-07-24 0:00 ` Robert Dewar 1997-07-28 0:00 ` Mats Weber 2 siblings, 1 reply; 36+ messages in thread From: Robert Dewar @ 1997-07-24 0:00 UTC (permalink / raw) <<Also, in the GNAT case, if you get into a situation where predefined "=" re-emerges, then your program is broken because you will be comparing the whole records instead of just the slice 1 .. Length. This problem does not occur with the alternative approach. I think an AI is about to be voted that will require predefined "=" to behave correctly for these types, which will almost force implementors to use the alternative approach. >> There is no justification at all for the claim at the end of this paragraph. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-24 0:00 ` Robert Dewar @ 1997-07-28 0:00 ` Mats Weber 1997-07-28 0:00 ` Matthew Heaney 1997-07-28 0:00 ` Robert Dewar 0 siblings, 2 replies; 36+ messages in thread From: Mats Weber @ 1997-07-28 0:00 UTC (permalink / raw) In article <dewar.869803083@merv>, dewar@merv.cs.nyu.edu (Robert Dewar) wrote: ><<Also, in the GNAT case, if you get into a situation where predefined "=" >re-emerges, then your program is broken because you will be comparing the >whole records instead of just the slice 1 .. Length. This problem does not >occur with the alternative approach. I think an AI is about to be voted that >will require predefined "=" to behave correctly for these types, which will >almost force implementors to use the alternative approach. >>> > > >There is no justification at all for the claim at the end of this >paragraph. So how would you do it ? Make the compiler treat that type in a special way ? Pad the unused characters with a constant character ? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-28 0:00 ` Mats Weber @ 1997-07-28 0:00 ` Matthew Heaney 1997-07-29 0:00 ` Robert Dewar 1997-07-28 0:00 ` Robert Dewar 1 sibling, 1 reply; 36+ messages in thread From: Matthew Heaney @ 1997-07-28 0:00 UTC (permalink / raw) In article <Mats.Weber-2807971207240001@mlma27.elca-matrix.ch>, Mats.Weber@elca-matrix.ch (Mats Weber) wrote: >I think an AI is about to be voted that >>will require predefined "=" to behave correctly for these types, which will >>almost force implementors to use the alternative approach. >>>> >>There is no justification at all for the claim at the end of this >>paragraph. > >So how would you do it ? Make the compiler treat that type in a special >way ? Pad the unused characters with a constant character ? What's wrong with pad characters? On my currect project, which uses Ada 83, we're using the GNAT bounded strings package, modified to pad the unused portion of the string buffer (with ASCII.NUL). Equality works, even though it's a bitwise comparison. Is there any reason why you'd be averse to pad characters as a solution? -------------------------------------------------------------------- Matthew Heaney Software Development Consultant <mailto:matthew_heaney@acm.org> (818) 985-1271 ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-28 0:00 ` Matthew Heaney @ 1997-07-29 0:00 ` Robert Dewar 0 siblings, 0 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-29 0:00 UTC (permalink / raw) Matthew said <<Is there any reason why you'd be averse to pad characters as a solution?>> Of course, an obvious reason: you really do not want to waste time comparing pad characters! The "=" defined in the GNAT bounded strings package of course does not compare unused characters. So this inefficiency arises only if the predefined equality is used. One approach we were considering was a pragma pragma Equality_Composes (type); with the obvious meaning. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-28 0:00 ` Mats Weber 1997-07-28 0:00 ` Matthew Heaney @ 1997-07-28 0:00 ` Robert Dewar 1997-07-28 0:00 ` Robert Dewar 1 sibling, 1 reply; 36+ messages in thread From: Robert Dewar @ 1997-07-28 0:00 UTC (permalink / raw) Mats Weber says <<So how would you do it ? Make the compiler treat that type in a special way ? Pad the unused characters with a constant character ? >> That is two possible ways out of many! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-28 0:00 ` Robert Dewar @ 1997-07-28 0:00 ` Robert Dewar 0 siblings, 0 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-28 0:00 UTC (permalink / raw) <<That is two possible ways out of many!>> Note that another obvious technique is to make the type in question a tagged type. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-21 0:00 ` Tucker Taft 1997-07-21 0:00 ` Tarjei Jensen @ 1997-07-21 0:00 ` Robert Dewar 1 sibling, 0 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-21 0:00 UTC (permalink / raw) Tucker says <<For what it is worth, our latest AdaMagic runtime provides for this growth, and it also avoids the heap completely for "short" unbounded strings. >> This seems a perfectly reasonable space/time tradeoff (obviously the above approach can end up using more space, but can save considerable time). Possibly a good alternative is to provide user control over this tradeoff. Another good idea for unbounded strings is to implement automatic garbage collection for just this type. That's also somehting we have looked at in some detail. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-19 0:00 ` Tarjei T. Jensen 1997-07-19 0:00 ` Matthew Heaney 1997-07-20 0:00 ` Robert Dewar @ 1997-07-20 0:00 ` Tarjei T. Jensen 2 siblings, 0 replies; 36+ messages in thread From: Tarjei T. Jensen @ 1997-07-20 0:00 UTC (permalink / raw) Please read unbounded strings where I have written bounded strings. Sigh, -- // Tarjei T. Jensen // tarjei@online.no || voice +47 51 62 85 58 // Support you local rescue centre: GET LOST! ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-18 0:00 ` Robert Dewar 1997-07-19 0:00 ` Tarjei T. Jensen @ 1997-07-22 0:00 ` Jakob Heinemann 1997-07-23 0:00 ` Robert Dewar 1 sibling, 1 reply; 36+ messages in thread From: Jakob Heinemann @ 1997-07-22 0:00 UTC (permalink / raw) Hello again! Thanks for yor discussion on my question about string speed. Your answers got me looking in the right direction and it no longer looks like a = string-comparing performance problem. = But first I will comment mr Dewar's letter, and at the end you will get presented the actual problem and its solution. Robert Dewar wrote: > = > A couple of notes. First all strings are passed by reference in all cas= es, > so that is definitely not an issue. Good. = > I doubt you could have an equivalent algorithm in VAX Pascal, although = I don't > know VAX Pascal that well. = Well, your doubts fails you. It is the SAME algoritm, it has been translated mechanically, and modified in minimum way to use ada functions were applicable. But the overall algoritm of the translated program is the same. The VAX-pascal = is also modular so things are wrapped in packages the same way as in the old code. Certain Pascal concepts have been exchanged with ADA equivs, systems calls are of course new. One of these is strings and strings comparison. One mistake, perhaps = is that vax-strings have been replaced with unbounded strings instedad of bounded s'. > In particular, what does the VAX Pascal use -- I > bet it either uses the equivalent of bounded strings, or it does not ha= ve > automatic finalization. = The VAX-pascal strings are really the same as bounded strings, yes. = > The unboundedness you certainly pay a price for, > and you also pay a price for the automatic finalization. But HOW MUCH penalty is there for this finalization? = ! Actually really ALOT! > And also, it would be interesting to see a distilled benchmark. = > When we look at such code, we often find it is apples vs oranges compar= isons, > and that in particular if some foreign language code has been simplisti= cally > translated to Ada, it has ended up as non-idiomatic Ada which can be ex= pected > to run slower. A factor of 10x smacks of that kind of phenomenon. Well, I agree to som point that simplistical translation sometimes generates rude and inefficient code. But dont=B4t forget, the pascal language is no= t that far away from Ada, of course a lot of nifty stuff is missing in Pascal, so when translating to Ada, nothing really has to change, neither code structure or the data structures have to change. Well, at least not in my case, since there is a stright off line, single process, program that is really a compiler/linker system. But in our project, of porting this particular code, what you describe is = what happened. A piece of code, very small, got converted, *and* fixed in a wrong way. = The problem was: In a little routine called to extract a few strings from a list (array of pointers) there was a sorting algorithm. It was bubble sort, not very famous for great = speed. I found out that the routine for getting the sorted list was called 6 times each execution and ity consumed 95% execution time. = Here is the code fragment for doing really SLOW sorting: SHIFT :=3D true; while shift loop shift :=3D false; for i in 1..nr_of_strings_to_sort loop if vax_fixed_gts(tab(i-1).all, tab(i).all) then tmp :=3D tab(i-1); tab(i-1):=3D tab(i); tab(i):=3Dtmp; shift :=3D true; end if; end loop, end loop; The table is an array of pointers to unbounded strings. Most time was spent = somewhere in this loop, looking closer to the gts (greater string) function revealed out of date patcheds for making vax-comparing speedy. There vere unneccesary padding of the strings to equal length. This must have caused a lot = of cleaning up-operations on the heap. Solution: = I threw out that sorting routine and made a new one, a qsort and one of the sortings went from 16.5 s to 0.017 s. The overall timing went from 40 minutes to 3 min. So I've qut blaming the unbounded strings for comparative slowlyness. = > = > The GNAT code for unbounded strings is relatively straightforward (you > can look at it easily!) There may be places it can be speeded up, but t= here > is nothing like a 10x factor of inefficiency in the coding that I know = of Yes, I've had a look and it doesn't seem to be that complex. Well, thanks for your attention, /Jakob :) ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: GNAT: Performance of String functions 1997-07-22 0:00 ` Jakob Heinemann @ 1997-07-23 0:00 ` Robert Dewar 0 siblings, 0 replies; 36+ messages in thread From: Robert Dewar @ 1997-07-23 0:00 UTC (permalink / raw) Jakob says <<Well, your doubts fails you. It is the SAME algoritm, it has been translated mechanically, and modified in minimum way to use ada functions were mechanically, and modified in minimum way to use ada functions were applicable. But the overall algoritm of the translated program is the same. The VAX-pascal = is also modular so things are wrapped in packages the same way as in the old code. Certain Pascal concepts have been exchanged with ADA equivs, systems calls are of course new. One of these is strings and strings comparison. One mistake, perhaps = is that vax-strings have been replaced with unbounded strings instedad of bounded s'. >> You missed my point, yes of course you can translate the high le vel structure of the algorithm, but what interests us here is the algorithms and data structures used for string handling, and what I meant when I said that I doubted you could do the same thing in Pascal, was that I doubted you could duplicate the algorithms and data structures of unbounded strings. And indeed that is the case, the proper translation is to bounded strings. ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~1997-07-29 0:00 UTC | newest] Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 1997-07-18 0:00 GNAT: Performance of String functions Jakob Heinemann 1997-07-18 0:00 ` Robert Dewar 1997-07-19 0:00 ` Tarjei T. Jensen 1997-07-19 0:00 ` Matthew Heaney 1997-07-20 0:00 ` Robert Dewar 1997-07-20 0:00 ` Tarjei T. Jensen 1997-07-20 0:00 ` Robert Dewar 1997-07-21 0:00 ` Tucker Taft 1997-07-21 0:00 ` Tarjei Jensen 1997-07-21 0:00 ` Matthew Heaney 1997-07-22 0:00 ` Robert Dewar 1997-07-22 0:00 ` Tarjei Jensen 1997-07-22 0:00 ` Robert Dewar 1997-07-22 0:00 ` Larry Kilgallen 1997-07-22 0:00 ` Tarjei T. Jensen 1997-07-23 0:00 ` Larry Kilgallen 1997-07-23 0:00 ` Robert Dewar 1997-07-23 0:00 ` Tarjei Jensen 1997-07-23 0:00 ` Samuel Mize 1997-07-23 0:00 ` W. Wesley Groleau x4923 1997-07-24 0:00 ` Robert A Duff 1997-07-24 0:00 ` Mats Weber 1997-07-24 0:00 ` Matthew Heaney 1997-07-24 0:00 ` Tarjei Jensen 1997-07-24 0:00 ` Robert Dewar 1997-07-24 0:00 ` Matthew Heaney 1997-07-24 0:00 ` Robert Dewar 1997-07-28 0:00 ` Mats Weber 1997-07-28 0:00 ` Matthew Heaney 1997-07-29 0:00 ` Robert Dewar 1997-07-28 0:00 ` Robert Dewar 1997-07-28 0:00 ` Robert Dewar 1997-07-21 0:00 ` Robert Dewar 1997-07-20 0:00 ` Tarjei T. Jensen 1997-07-22 0:00 ` Jakob Heinemann 1997-07-23 0:00 ` Robert Dewar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox