* Reserve_Capacity for Unbounded_String? @ 2007-07-22 19:54 Maciej Sobczak 2007-07-22 21:32 ` Robert A Duff ` (2 more replies) 0 siblings, 3 replies; 35+ messages in thread From: Maciej Sobczak @ 2007-07-22 19:54 UTC (permalink / raw) Why there is no Reserve_Capacity for Unbounded_String? Sounds like a natural idea to me. How to emulate it? The following seems to work fine: declare S : Unbounded_String; My_Capacity : constant := 1000; begin S := To_Unbounded_String (My_Capacity); Delete (Source => S, From => 1, Through => My_Capacity); -- ... end; "Seems to work fine" means that after the above two operations the string is logically empty, but the subsequent appends run faster (which is actually the original motivation). Is this a reasonable approach if performance is of concern? -- Maciej Sobczak http://www.msobczak.com/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-22 19:54 Reserve_Capacity for Unbounded_String? Maciej Sobczak @ 2007-07-22 21:32 ` Robert A Duff 2007-07-23 19:29 ` Maciej Sobczak 2007-07-23 4:28 ` Jeffrey R. Carter 2007-07-23 15:07 ` Adam Beneschan 2 siblings, 1 reply; 35+ messages in thread From: Robert A Duff @ 2007-07-22 21:32 UTC (permalink / raw) Maciej Sobczak <see.my.homepage@gmail.com> writes: > Why there is no Reserve_Capacity for Unbounded_String? Probably for historical reasons. > Sounds like a natural idea to me. Yes. > How to emulate it? > > The following seems to work fine: > > declare > S : Unbounded_String; > My_Capacity : constant := 1000; > begin > S := To_Unbounded_String (My_Capacity); I would write To_Unbounded_String (Length => My_Capacity); to make it clear which To_Unbounded_String you're calling. > Delete (Source => S, From => 1, Through => My_Capacity); > -- ... > end; > > "Seems to work fine" means that after the above two operations the > string is logically empty, but the subsequent appends run faster > (which is actually the original motivation). How much faster (I'm curious)? > Is this a reasonable approach if performance is of concern? I think so. - Bob ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-22 21:32 ` Robert A Duff @ 2007-07-23 19:29 ` Maciej Sobczak 2007-07-23 20:30 ` Robert A Duff 0 siblings, 1 reply; 35+ messages in thread From: Maciej Sobczak @ 2007-07-23 19:29 UTC (permalink / raw) On 22 Lip, 23:32, Robert A Duff <bobd...@shell01.TheWorld.com> wrote: > > Why there is no Reserve_Capacity for Unbounded_String? > > Probably for historical reasons. Would introducing it break any existing code? > > declare > > S : Unbounded_String; > > My_Capacity : constant := 1000; > > begin > > S := To_Unbounded_String (My_Capacity); > > I would write > > To_Unbounded_String (Length => My_Capacity); > > to make it clear which To_Unbounded_String you're calling. Yes. I try to follow the "rule" that names associations should be used for any subprogram call with more than one parameter. I've never considered hardening this rule for overloaded subprograms. > > "Seems to work fine" means that after the above two operations the > > string is logically empty, but the subsequent appends run faster > > (which is actually the original motivation). > > How much faster (I'm curious)? This of course depends on what is the ration between allocations and reservations in a given test. Here: with Ada.Strings.Unbounded; procedure String_Test is begin for I in 1 .. 10000 loop declare S : Ada.Strings.Unbounded.Unbounded_String; use Ada.Strings.Unbounded; begin -- S := Ada.Strings.Unbounded.To_Unbounded_String (10000); -- Ada.Strings.Unbounded.Delete (S, 1, 10000); for J in 1 .. 10000 loop Ada.Strings.Unbounded.Append (S, 'A'); end loop; end; end loop; end String_Test; On my machine and with option -O2 uncommenting the two lines makes a difference between 1.86s and 1.43s, which is about 20% gain. Depending on a situation it might or might not be worth the hassle (the hassle is in determining the correct reservation size - and if we can know it fully in advance, then maybe there is no point in using unbounded at all). -- Maciej Sobczak http://www.msobczak.com/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-23 19:29 ` Maciej Sobczak @ 2007-07-23 20:30 ` Robert A Duff 0 siblings, 0 replies; 35+ messages in thread From: Robert A Duff @ 2007-07-23 20:30 UTC (permalink / raw) Maciej Sobczak <see.my.homepage@gmail.com> writes: > On 22 Lip, 23:32, Robert A Duff <bobd...@shell01.TheWorld.com> wrote: > >> > Why there is no Reserve_Capacity for Unbounded_String? >> >> Probably for historical reasons. > > Would introducing it break any existing code? Only in rare cases. For ex., if you had a type called Reserve_Capacity in an unrelated package, and you had "use" on that one and also on unbounded strings, the code would now be illegal. >> > declare >> > S : Unbounded_String; >> > My_Capacity : constant := 1000; >> > begin >> > S := To_Unbounded_String (My_Capacity); >> >> I would write >> >> To_Unbounded_String (Length => My_Capacity); >> >> to make it clear which To_Unbounded_String you're calling. > > Yes. I try to follow the "rule" that names associations should be used > for any subprogram call with more than one parameter. Well, that's close, but not quite the right rule. For example, "+" on integers has two parameters, but I would say X+Y rather than "+"(Left => X, Right => Y). I don't have a well-defined rule, but I like to use named notation often. I think perhaps the designer of the subprogram should decide whether named notation is appropriate, and all callers have to obey. But that's a language design issue, not an Ada issue. >...I've never > considered hardening this rule for overloaded subprograms. > >> > "Seems to work fine" means that after the above two operations the >> > string is logically empty, but the subsequent appends run faster >> > (which is actually the original motivation). >> >> How much faster (I'm curious)? > > This of course depends on what is the ration between allocations and > reservations in a given test. > > Here: > > with Ada.Strings.Unbounded; > procedure String_Test is > begin > for I in 1 .. 10000 loop > declare > S : Ada.Strings.Unbounded.Unbounded_String; > use Ada.Strings.Unbounded; > begin > -- S := Ada.Strings.Unbounded.To_Unbounded_String (10000); > -- Ada.Strings.Unbounded.Delete (S, 1, 10000); > for J in 1 .. 10000 loop > Ada.Strings.Unbounded.Append (S, 'A'); > end loop; > end; > end loop; > end String_Test; > > On my machine and with option -O2 uncommenting the two lines makes a > difference between 1.86s and 1.43s, which is about 20% gain. Interesting. Thanks for posting the measurements. >...Depending > on a situation it might or might not be worth the hassle (the hassle > is in determining the correct reservation size - and if we can know it > fully in advance, then maybe there is no point in using unbounded at > all). Good point. If you know the max size, the Bounded (or just plain String) is appropriate. But if you know the max "typical" size, but that you still want it to work for atypical sizes, then this sort of thing seems reasonable. As others have pointed out, it depends on the implementation. But that's always true about efficiency -- if you care about efficiency, then the abstraction leaks. - Bob ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-22 19:54 Reserve_Capacity for Unbounded_String? Maciej Sobczak 2007-07-22 21:32 ` Robert A Duff @ 2007-07-23 4:28 ` Jeffrey R. Carter 2007-07-23 15:07 ` Adam Beneschan 2 siblings, 0 replies; 35+ messages in thread From: Jeffrey R. Carter @ 2007-07-23 4:28 UTC (permalink / raw) Maciej Sobczak wrote: > Is this a reasonable approach if performance is of concern? No. If performance is a concern, you should probably not be using Ada.Strings.Unbounded, or any other general-purpose components. -- Jeff Carter "I'm a kike, a yid, a heebie, a hook nose! I'm Kosher, Mum! I'm a Red Sea pedestrian, and proud of it!" Monty Python's Life of Brian 77 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-22 19:54 Reserve_Capacity for Unbounded_String? Maciej Sobczak 2007-07-22 21:32 ` Robert A Duff 2007-07-23 4:28 ` Jeffrey R. Carter @ 2007-07-23 15:07 ` Adam Beneschan 2007-07-24 1:01 ` Randy Brukardt 2 siblings, 1 reply; 35+ messages in thread From: Adam Beneschan @ 2007-07-23 15:07 UTC (permalink / raw) On Jul 22, 12:54 pm, Maciej Sobczak <see.my.homep...@gmail.com> wrote: > Why there is no Reserve_Capacity for Unbounded_String? > Sounds like a natural idea to me. > > How to emulate it? > > The following seems to work fine: > > declare > S : Unbounded_String; > My_Capacity : constant := 1000; > begin > S := To_Unbounded_String (My_Capacity); > Delete (Source => S, From => 1, Through => My_Capacity); > -- ... > end; > > "Seems to work fine" means that after the above two operations the > string is logically empty, but the subsequent appends run faster > (which is actually the original motivation). > > Is this a reasonable approach if performance is of concern? I sure wouldn't count on this. I don't know what compiler you're using or how it implements Unbounded_String; but just based on what I know about how this package *might* be implemented and about the suggested implementation in the Rationale, the performance improvements you're seeing could be just an accident. (One test: try doing something like "Intp := new Integer;" after each append. If the Delete actually deallocates memory, and a piece of that memory could be reused by a "new Integer" allocation, you may get very different performance results than if the deallocated memory is never reused except by new appends. Here, though, I'm just guessing, since as I said I have no idea how Unbounded_Strings is implemented.) I'm with Jeff here: you may need an Unbounded_Strings package tailored to your specific needs. It just isn't possible to write a package that will achieve the best performance for all programs that might want to use it; even a simple idea like "Reserve_Capacity" could introduce some implementation overhead that might be unwelcome in programs that wouldn't benefit from that feature. -- Adam ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-23 15:07 ` Adam Beneschan @ 2007-07-24 1:01 ` Randy Brukardt 2007-07-24 7:57 ` Pascal Obry 2007-07-24 23:41 ` Robert A Duff 0 siblings, 2 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-24 1:01 UTC (permalink / raw) "Adam Beneschan" <adam@irvine.com> wrote in message news:1185203238.701948.307410@m37g2000prh.googlegroups.com... ... > > Is this a reasonable approach if performance is of concern? > > I sure wouldn't count on this. I second (or is that "third") Adam and Jeff's comments. For Janus/Ada, Unbounded_String is just a controlled wrapper around "access String"; in particular, the length is the length of the item allocated. The code you have would just cause an extra allocation/deallocation without any corresponding benefit (and it might even cause additional fragmentation and slow the program down). In general, you cannot depend on the performance of the predefined packages; you can presume that implementations have done a decent job for their target, but that is about all. (The containers are a bit of an exception, in that there are performance requirements on them; but those requirements are vague enough to be pretty meaningless for average-sized containers.) I usually recommend to use the predefined packages as necessary in your application, and avoid worrying about performance of them. If it turns out that there is a performance problem, and that problem is with the one of the predefined packages, then it makes sense to replace the offending package with something custom-written to match the problem at hand. (In this case, just using type String and doing your own storage management would seem to be appropriate.) For things like Unbounded_Strings (and the Containers), the main value is to let the package do the memory management (meaning that you have well-tested memory management that won't leak or break). Trying to take over some of that is suspicious (that's true for Reserve_Capacity in the containers as well), because it is easy to get it wrong and cause *worse* performance. In any case, implementing Reserve_Capacity to do anything (as opposed to being a no-op) in Janus/Ada would require junking the entire existing implementation and starting over. It surely would add overhead (because you'd continually need to check the allocated length against the "nominal" length). As such, this is something I would fight strongly. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 1:01 ` Randy Brukardt @ 2007-07-24 7:57 ` Pascal Obry 2007-07-24 18:58 ` Randy Brukardt 2007-07-24 23:41 ` Robert A Duff 1 sibling, 1 reply; 35+ messages in thread From: Pascal Obry @ 2007-07-24 7:57 UTC (permalink / raw) To: Randy Brukardt Randy Brukardt a �crit : > In any case, implementing Reserve_Capacity to do anything (as opposed to > being a no-op) in Janus/Ada would require junking the entire existing > implementation and starting over. It surely would add overhead (because > you'd continually need to check the allocated length against the "nominal" > length). As such, this is something I would fight strongly. Well this is already done in GNAT for speed performances. If you are adding lot of chunk of string into an Unbounded_String it is far too costly to reallocate the Unbounded_String buffer each time. This was a huge speed penalty for the templates_parser for example. The current GNAT implementation is just what users would expect it to be. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 7:57 ` Pascal Obry @ 2007-07-24 18:58 ` Randy Brukardt 2007-07-24 23:50 ` Robert A Duff 2007-07-24 23:54 ` Pascal Obry 0 siblings, 2 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-24 18:58 UTC (permalink / raw) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 2019 bytes --] "Pascal Obry" <pascal@obry.net> wrote in message news:46A5B0FE.3060008@obry.net... > Randy Brukardt a �crit : > > In any case, implementing Reserve_Capacity to do anything (as opposed to > > being a no-op) in Janus/Ada would require junking the entire existing > > implementation and starting over. It surely would add overhead (because > > you'd continually need to check the allocated length against the "nominal" > > length). As such, this is something I would fight strongly. > > Well this is already done in GNAT for speed performances. If you are > adding lot of chunk of string into an Unbounded_String it is far too > costly to reallocate the Unbounded_String buffer each time. This was a > huge speed penalty for the templates_parser for example. The current > GNAT implementation is just what users would expect it to be. Maybe some uninformed users, but not anyone who doesn't add non-existent requirements to the specs. Unbounded_String is best used for storage (not calculation) of values. It is a horrible design for calculation, no matter what schemes you use for the implementation. Moreover, I've yet to see an example where the performance of Unbounded_Strings (allocation-wise) matters. Our spam filter was written solely using Unbounded_Strings, it does a lot of calculation (normalization of e-mail) and still the allocation overhead is insignificant. (The bottleneck is in Index, which does no allocation at all.) I can imagine that it would be possible to write a program where the allocation overhead would matter, but such a program would be helped a lot more by avoiding Unbounded_Strings (and *all* of their allocation overhead) than by tweaking the implementation of Unbounded_Strings to reduce that overhead in a few rarely used operations. So I stand by my original statement. Anyone that depends on the performance characteristics of a particular implementation (of anything!) is an idiot and will be burned in the future. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 18:58 ` Randy Brukardt @ 2007-07-24 23:50 ` Robert A Duff 2007-07-25 0:00 ` Randy Brukardt 2007-07-24 23:54 ` Pascal Obry 1 sibling, 1 reply; 35+ messages in thread From: Robert A Duff @ 2007-07-24 23:50 UTC (permalink / raw) "Randy Brukardt" <randy@rrsoftware.com> writes: > So I stand by my original statement. Anyone that depends on the performance > characteristics of a particular implementation (of anything!) is an idiot > and will be burned in the future. I really don't think you mean that literally! I mean, the Ada language standard does not require that "X := X + 1" take less than 100 years to execute. Programmers _must_ trust that language implementers make reasonably efficient choices. In fact, in Ada, I write things like "X := X + 1" all the time, trusting that it compiles into a single machine instruction, or close to that. - Bob ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 23:50 ` Robert A Duff @ 2007-07-25 0:00 ` Randy Brukardt 0 siblings, 0 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-25 0:00 UTC (permalink / raw) "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message news:wccmyxl5plh.fsf@shell01.TheWorld.com... > "Randy Brukardt" <randy@rrsoftware.com> writes: > > > So I stand by my original statement. Anyone that depends on the performance > > characteristics of a particular implementation (of anything!) is an idiot > > and will be burned in the future. > > I really don't think you mean that literally! > > I mean, the Ada language standard does not require that "X := X + 1" > take less than 100 years to execute. Programmers _must_ trust that > language implementers make reasonably efficient choices. In fact, > in Ada, I write things like "X := X + 1" all the time, trusting that it > compiles into a single machine instruction, or close to that. Perhaps; I was thinking of predefined libraries more than basic stuff like "X := X + 1;". But I think you really do have to test whether something is fast enough; you can never *assume* that it will be. Vendors don't intentionally make things slow, but what you think is slow might not be for a given implementation's model. Also see my answer to your other note. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 18:58 ` Randy Brukardt 2007-07-24 23:50 ` Robert A Duff @ 2007-07-24 23:54 ` Pascal Obry 2007-07-25 0:52 ` Randy Brukardt 2007-07-25 1:28 ` Randy Brukardt 1 sibling, 2 replies; 35+ messages in thread From: Pascal Obry @ 2007-07-24 23:54 UTC (permalink / raw) To: Randy Brukardt Randy Brukardt a �crit : > Maybe some uninformed users, but not anyone who doesn't add non-existent > requirements to the specs. > > Unbounded_String is best used for storage (not calculation) of values. It is > a horrible design for calculation, no matter what schemes you use for the > implementation. > > Moreover, I've yet to see an example where the performance of > Unbounded_Strings (allocation-wise) matters. Our spam filter was written > solely using Unbounded_Strings, it does a lot of calculation (normalization > of e-mail) and still the allocation overhead is insignificant. (The > bottleneck is in Index, which does no allocation at all.) > > I can imagine that it would be possible to write a program where the > allocation overhead would matter, but such a program would be helped a lot > more by avoiding Unbounded_Strings (and *all* of their allocation overhead) > than by tweaking the implementation of Unbounded_Strings to reduce that > overhead in a few rarely used operations. > > So I stand by my original statement. Anyone that depends on the performance > characteristics of a particular implementation (of anything!) is an idiot > and will be burned in the future. Well Randy, its all too easy to dismiss points like this by saying that you have not seen a problem with the current implementation and if somebody show you something wrong in this area then he has to use something else that Unbounded_String... At least that's the way I'm reading your message. Just try this on your implementation: C : constant Character := 'x'; U : Unbounded_String; for k in 1 .. 5_000_000 loop U := U & C; end loop; And then test it with GNAT. Of course that is not a real example but very close to something done in my templates engine. You were speaking about performance about checking the preallocated buffer. My point is that it is not that costly compared to the price of reallocating a whole string thousands of time. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 23:54 ` Pascal Obry @ 2007-07-25 0:52 ` Randy Brukardt 2007-07-25 1:28 ` Randy Brukardt 1 sibling, 0 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-25 0:52 UTC (permalink / raw) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 2559 bytes --] "Pascal Obry" <pascal@obry.net> wrote in message news:46A69136.3010803@obry.net... > Randy Brukardt a �crit : > > Maybe some uninformed users, but not anyone who doesn't add non-existent > > requirements to the specs. > > > > Unbounded_String is best used for storage (not calculation) of values. It is > > a horrible design for calculation, no matter what schemes you use for the > > implementation. > > > > Moreover, I've yet to see an example where the performance of > > Unbounded_Strings (allocation-wise) matters. Our spam filter was written > > solely using Unbounded_Strings, it does a lot of calculation (normalization > > of e-mail) and still the allocation overhead is insignificant. (The > > bottleneck is in Index, which does no allocation at all.) > > > > I can imagine that it would be possible to write a program where the > > allocation overhead would matter, but such a program would be helped a lot > > more by avoiding Unbounded_Strings (and *all* of their allocation overhead) > > than by tweaking the implementation of Unbounded_Strings to reduce that > > overhead in a few rarely used operations. > > > > So I stand by my original statement. Anyone that depends on the performance > > characteristics of a particular implementation (of anything!) is an idiot > > and will be burned in the future. > > Well Randy, its all too easy to dismiss points like this by saying that > you have not seen a problem with the current implementation and if > somebody show you something wrong in this area then he has to use > something else that Unbounded_String... At least that's the way I'm > reading your message. > > Just try this on your implementation: > > C : constant Character := 'x'; > > U : Unbounded_String; > > for k in 1 .. 5_000_000 loop > U := U & C; > end loop; > > And then test it with GNAT. Of course that is not a real example but > very close to something done in my templates engine. > > You were speaking about performance about checking the preallocated > buffer. My point is that it is not that costly compared to the price of > reallocating a whole string thousands of time. > > Pascal. > > -- > > --|------------------------------------------------------ > --| Pascal Obry Team-Ada Member > --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE > --|------------------------------------------------------ > --| http://www.obry.net > --| "The best way to travel is by means of imagination" > --| > --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 23:54 ` Pascal Obry 2007-07-25 0:52 ` Randy Brukardt @ 2007-07-25 1:28 ` Randy Brukardt 2007-07-25 7:48 ` Pascal Obry 2007-07-25 8:50 ` Martin Krischik 1 sibling, 2 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-25 1:28 UTC (permalink / raw) (I managed to press Send on this without actually writing a reply...sorry) "Pascal Obry" <pascal@obry.net> wrote in message news:46A69136.3010803@obry.net... ... > > So I stand by my original statement. Anyone that depends on the performance > > characteristics of a particular implementation (of anything!) is an idiot > > and will be burned in the future. > > Well Randy, its all too easy to dismiss points like this by saying that > you have not seen a problem with the current implementation and if > somebody show you something wrong in this area then he has to use > something else that Unbounded_String... At least that's the way I'm > reading your message. I want to see a real program compiled with using Janus/Ada that shows that (a) the performance is not good enough currently and (b) that tweaking Ada.Strings.Unbounded would make the performance good enough. Otherwise, time spent on Ada.Strings.Unbounded is a waste of time (and potentially a compatibility problem, such that existing programs might run out of memory where they do not currently). If I spent much time on things where (a) and (b) were not true, then I'd be the idiot, by not making the best use of my time to improve the compile/runtime/etc. > Just try this on your implementation: > > C : constant Character := 'x'; > > U : Unbounded_String; > > for k in 1 .. 5_000_000 loop > U := U & C; > end loop; > > And then test it with GNAT. Of course that is not a real example but > very close to something done in my templates engine. (Aside: I hope not. It's always bad practice to append small things to large things, be it strings, I/O, or anything else. It always leads to performance issues, more so the more abstract the interface is. I realize that you have to balance understandability with performance, and sometimes occassional appends of small things make sense. But doing it in an inner loop like this sets off my "this is a very bad idea" detector.) The old version of GNAT I have installed allocates for each call to '&' (I just checked the source code), so I doubt testing it would prove anything whatsoever. (Indeed, looking at the code, the Janus/Ada version is likely to be more efficient. But, based on your assertions, I have to assume that the GNAT code is out of date, so there is no point in looking at it further.) But even if it did, so what? Using any version of "&" (be it for String, Unbounded_String, or any other type) in a tight loop is silly. The amount of code needed to implement "&" is huge, and virtually any alternative way of doing the operation would be better. Moreover, the only way to save allocations in an increasing length string is to overallocate the memory; that is, to waste memory. (If you had a way to find out the blocking factor of the underlying storage pool, you could possibly avoid that by rounding allocations up, but that would only help in pathological programs that are adding single characters to end of strings thousands of times.) Janus/Ada is all about minimum memory usage; speed is a secondary consideration and in general we will not write runtime packages such that they waste memory. As such, even if I was to rewrite this, I would not allocate more than the memory needed by the current operation (modulo a very small blocking factor), and as such this program would *still* need a new allocation nearly every iteration (because the string is continuously getting longer). I'd expect a program that *deleted* parts of strings to show much more improvement than one that added strings, because in that case there wouldn't be a need to free anything (just adjust the length). But such programs are far rarer than the append everything ones like this example. > You were speaking about performance about checking the preallocated > buffer. My point is that it is not that costly compared to the price of > reallocating a whole string thousands of time. I'm sure that's true, but my point also was that programs that do that are generally pathologies (surely, that's the only examples that I've been shown so far). And the real programs I've looked at (like our spam filter), this overhead is not a significant part of the entire program. Even though our spam filter does something like the above when decoding BASE64 encoded e-mails. And I don't think that the performance improvement of a tweaked Unbounded_String would be a tenth of simply not using it at all for calculations -- which would make far more difference in a misbehaving application. After all, the first rule of performance improvement is to find a better algorithm; tweaking an existing one to make it faster can only provide incremental improvements (and that rarely helps anything). Surely that applies here, too. And now I've wasted far too much time on this worthless topic. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 1:28 ` Randy Brukardt @ 2007-07-25 7:48 ` Pascal Obry 2007-07-25 9:55 ` Georg Bauhaus 2007-07-25 18:58 ` Randy Brukardt 2007-07-25 8:50 ` Martin Krischik 1 sibling, 2 replies; 35+ messages in thread From: Pascal Obry @ 2007-07-25 7:48 UTC (permalink / raw) To: Randy Brukardt Randy Brukardt a �crit : > And now I've wasted far too much time on this worthless topic. If you think so! GNAT has implemented that as in a real application the code was running more than 10 times faster. If you think you have wasted far too much time, no problem. I'm not a Janus/Ada customer anyway! Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 7:48 ` Pascal Obry @ 2007-07-25 9:55 ` Georg Bauhaus 2007-07-25 10:02 ` Georg Bauhaus 2007-07-25 18:58 ` Randy Brukardt 1 sibling, 1 reply; 35+ messages in thread From: Georg Bauhaus @ 2007-07-25 9:55 UTC (permalink / raw) On Wed, 2007-07-25 at 09:48 +0200, Pascal Obry wrote: > Randy Brukardt a écrit : > > And now I've wasted far too much time on this worthless topic. > > If you think so! GNAT has implemented that as in a real application the > code was running more than 10 times faster. If you think you have wasted > far too much time, no problem. I'm not a Janus/Ada customer anyway! In fact, there are many special string-related subprograms in GNAT that arguably demonstrate the influences of effective marketing on technical properties of compilers. E.g. there are variations of function Str_Concat_3 (S1, S2, S3 : String) return String; -- Concatenate three strings and return resulting string hidden in System.String_Ops_Concat_3 and used in algorithmically (!) optimizing two "&"s in the compiler. Likewise, "... use of & removed for efficiency reasons" in package body Ada.Strings.Fixed. (See also Ada.Strings.Unbounded.Aux.Get_String for the unbounded case.) Two conclusion seem to follow from this: 1) that GNAT authors agree with Randy when he says that you have to be careful with "&"-centric algorithms, but 2) that AdaCore has better salespeople because customers are more satisfied when the vendor has already done the "right" thing for them. ;-) Try this: procedure Strtest is S: constant String := "A" & Argument(1) & "B"; -- occurences of "&" --> GNAT special case begin Text_IO.put_line(S); end Strtest; Then, using GNAT, you get $ nm strtest.o ... 00000019 T _ada_strtest ... U system__string_ops_concat_3__str_concat_3 <-- $ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 9:55 ` Georg Bauhaus @ 2007-07-25 10:02 ` Georg Bauhaus 0 siblings, 0 replies; 35+ messages in thread From: Georg Bauhaus @ 2007-07-25 10:02 UTC (permalink / raw) On Wed, 2007-07-25 at 11:55 +0200, Georg Bauhaus wrote: > function Str_Concat_3 (S1, S2, S3 : String) return String; > -- Concatenate three strings and return resulting string > > hidden in System.String_Ops_Concat_3 and used in algorithmically (!) > optimizing two "&"s in the compiler. I mean, the compiler optimizes occurences of a small number of "&" in a row in user programs. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 7:48 ` Pascal Obry 2007-07-25 9:55 ` Georg Bauhaus @ 2007-07-25 18:58 ` Randy Brukardt 1 sibling, 0 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-25 18:58 UTC (permalink / raw) [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain, Size: 1202 bytes --] "Pascal Obry" <pascal@obry.net> wrote in message news:46A70064.8090407@obry.net... > Randy Brukardt a �crit : > > And now I've wasted far too much time on this worthless topic. > > If you think so! GNAT has implemented that as in a real application the > code was running more than 10 times faster. I don't think that much of a speed-up would be possible with Janus/Ada even with a specially tuned version of Ada.Strings.Unbounded. The overhead of allocation is only a portion of the total overhead, after all; call overhead, checking overhead, and copying overhead are all significant. The only way to get close would be to preallocate strings in large blocks, but that would only make sense for applications with very few unbounded strings and that do a lot of appends and '&'s. The performance would be worse on programs that use tens of thousands of Unbounded_Strings (like my spam filter and most other real apps that I've seen) because of increased paging overhead and reduced capacity. > If you think you have wasted > far too much time, no problem. I'm not a Janus/Ada customer anyway! Exactly. My response would be different to a good customer... Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 1:28 ` Randy Brukardt 2007-07-25 7:48 ` Pascal Obry @ 2007-07-25 8:50 ` Martin Krischik 2007-07-25 9:26 ` AW: " Grein, Christoph (Fa. ESG) 1 sibling, 1 reply; 35+ messages in thread From: Martin Krischik @ 2007-07-25 8:50 UTC (permalink / raw) Randy Brukardt schrieb: > But, based on your assertions, I have to assume that the > GNAT code is out of date, so there is no point in looking at it further.) No need to guess, have a look at the current source here: http://gnuada.sourceforge.net/source-gcc-4.1.2/index.htm Martin -- mailto://krischik@users.sourceforge.net Ada programming at: http://ada.krischik.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* AW: Reserve_Capacity for Unbounded_String? 2007-07-25 8:50 ` Martin Krischik @ 2007-07-25 9:26 ` Grein, Christoph (Fa. ESG) 2007-07-25 15:32 ` Martin Krischik 2007-07-25 15:39 ` Martin Krischik 0 siblings, 2 replies; 35+ messages in thread From: Grein, Christoph (Fa. ESG) @ 2007-07-25 9:26 UTC (permalink / raw) To: comp.lang.ada > No need to guess, have a look at the current source here: > > http://gnuada.sourceforge.net/source-gcc-4.1.2/index.htm There is only a-strunb.ads, the link to a-strunb.adb fails. Eurocopter Deutschland GmbH Sitz der Gesellschaft/Registered Office: Donauwoerth Registergericht/Registration Court: Amtsgericht Augsburg HRB 16508 Vorsitzender des Aufsichtsrates/Chairman of the Supervisory Board: Dr. Lutz Bertling Geschaeftsfuehrung/Board of Management: Dr. Wolfgang Schoder, Vorsitzender/CEO; Friedrich-Wilhelm Hormel; Ralf Barnscheidt CONFIDENTIALITY NOTICE This communication and the information it contains is intended for the addressee(s) named above and for no other persons or organizations. It is confidential and may be legally privileged and protected by law. The unauthorized use, copying or disclosure of this communication or any part of it is prohibited and may be unlawful. If you have received this communication in error, kindly notify us by return e-mail and discard and/or delete the communication. Thank you very much. It is possible for e-mails to be intercepted or affected by viruses. Whilst we maintain virus checks on our e-mails, we accept no liability for viruses or other material which might be introduced with this message. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: AW: Reserve_Capacity for Unbounded_String? 2007-07-25 9:26 ` AW: " Grein, Christoph (Fa. ESG) @ 2007-07-25 15:32 ` Martin Krischik 2007-07-25 15:39 ` Martin Krischik 1 sibling, 0 replies; 35+ messages in thread From: Martin Krischik @ 2007-07-25 15:32 UTC (permalink / raw) Grein, Christoph (Fa. ESG) wrote: >> No need to guess, have a look at the current source here: >> >> http://gnuada.sourceforge.net/source-gcc-4.1.2/index.htm > > There is only a-strunb.ads, the link to a-strunb.adb fails. Indeed. I have to check why this is so. Martin -- mailto://krischik@users.sourceforge.net Ada programming at: http://ada.krischik.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: AW: Reserve_Capacity for Unbounded_String? 2007-07-25 9:26 ` AW: " Grein, Christoph (Fa. ESG) 2007-07-25 15:32 ` Martin Krischik @ 2007-07-25 15:39 ` Martin Krischik 1 sibling, 0 replies; 35+ messages in thread From: Martin Krischik @ 2007-07-25 15:39 UTC (permalink / raw) Grein, Christoph (Fa. ESG) wrote: >> No need to guess, have a look at the current source here: >> >> http://gnuada.sourceforge.net/source-gcc-4.1.2/index.htm > > There is only a-strunb.ads, the link to a-strunb.adb fails. The source is here: http://gnuada.sourceforge.net/source-gcc-4.1.2/____a-strunb__adb.htm the link is missing the 4 _'s. And only AdaCore will know why! Martin -- mailto://krischik@users.sourceforge.net Ada programming at: http://ada.krischik.com ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 1:01 ` Randy Brukardt 2007-07-24 7:57 ` Pascal Obry @ 2007-07-24 23:41 ` Robert A Duff 2007-07-25 0:16 ` Randy Brukardt 1 sibling, 1 reply; 35+ messages in thread From: Robert A Duff @ 2007-07-24 23:41 UTC (permalink / raw) "Randy Brukardt" <randy@rrsoftware.com> writes: > "Adam Beneschan" <adam@irvine.com> wrote in message > news:1185203238.701948.307410@m37g2000prh.googlegroups.com... > ... >> > Is this a reasonable approach if performance is of concern? >> >> I sure wouldn't count on this. > > I second (or is that "third") Adam and Jeff's comments. I said the opposite, but I guess I'm convinced by your (and Adam and Jeff's) arguments that one should not depend on this particular "efficiency hack". But... > For Janus/Ada, Unbounded_String is just a controlled wrapper around "access > String"; in particular, the length is the length of the item > allocated. Are you saying that: for I in 1..N loop Append (Some_Unbounded_String, 'x'); causes a free and a "new" and a copy every time around the loop? - Bob ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-24 23:41 ` Robert A Duff @ 2007-07-25 0:16 ` Randy Brukardt 2007-07-25 2:25 ` Robert A Duff 0 siblings, 1 reply; 35+ messages in thread From: Randy Brukardt @ 2007-07-25 0:16 UTC (permalink / raw) "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message news:wccr6mx5pz0.fsf@shell01.TheWorld.com... > "Randy Brukardt" <randy@rrsoftware.com> writes: ... > > For Janus/Ada, Unbounded_String is just a controlled wrapper around "access > > String"; in particular, the length is the length of the item > > allocated. > > Are you saying that: > > for I in 1..N loop > Append (Some_Unbounded_String, 'x'); > > causes a free and a "new" and a copy every time around the loop? Yup; I just checked the code to make sure. Moreover, I have yet to see an real example where the performance difference would matter that much (Free and new aren't that expensive relative to the alternatives). Our spam filter does operations like this, and as I said earlier, the performance bottleneck is with Index and with the file I/O operations and not here. (Nor are there any fragmentation problems that I can detect; the program runs for many days without issues.) In any case, I would expect this operation (of making the string longer) to do a lot of allocations anyway, because the string will need to be expanded a lot. And any "over"-allocation will be wasteful of memory; for the spam filter, that would tend to reduce the maximum size of messages that could be filtered.(Janus/Ada always emphasizes smaller memory usages versus wasting memory to get better performance.) Besides, in Janus/Ada, most the alternatives would also do a (under the covers) "new" and "free". We do try to optimize those out, but by default Var : String := Some_Value & 'x'; would generate three allocations and deallocations. The optimizer probably would get rid of the descriptors or allocate them statically, but the object Var probably would end up in the heap. If performance is a real issue, you'd have to write your own code to do the actual operations you want; for instance, in your example To_Unbounded_String (Some_Unbounded_String, (1..N => 'x')); would be a lot faster on any compiler (far fewer calls to a library; even if there is no allocation, there still will be overhead checks for bounds and the like). The examples where an alternative implementation of Unbounded_Strings would make a significant difference such that the application was fast enough with it and not fast enough without it have to be extremely rare. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 0:16 ` Randy Brukardt @ 2007-07-25 2:25 ` Robert A Duff 2007-07-25 6:07 ` Simon Wright 2007-07-25 19:08 ` Randy Brukardt 0 siblings, 2 replies; 35+ messages in thread From: Robert A Duff @ 2007-07-25 2:25 UTC (permalink / raw) "Randy Brukardt" <randy@rrsoftware.com> writes: > "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message > news:wccr6mx5pz0.fsf@shell01.TheWorld.com... >> "Randy Brukardt" <randy@rrsoftware.com> writes: > ... >> > For Janus/Ada, Unbounded_String is just a controlled wrapper around > "access >> > String"; in particular, the length is the length of the item >> > allocated. >> >> Are you saying that: >> >> for I in 1..N loop >> Append (Some_Unbounded_String, 'x'); >> >> causes a free and a "new" and a copy every time around the loop? > > Yup; I just checked the code to make sure. Moreover, I have yet to see an > real example where the performance difference would matter that much (Free > and new aren't that expensive relative to the alternatives). It's the copy I'm wondering about, not the new and free so much. To append N characters to an unb string requires quadratic copying (copying a char N**2 / 2 times). If you double the allocation every time, it's amortized linear. Yes, you waste some space that way. > If performance is a real issue, you'd have to write your own code to do the > actual operations you want; for instance, in your example > To_Unbounded_String (Some_Unbounded_String, (1..N => 'x')); would be a lot > faster ... Yes, but that's just an example. The real example is where you're reading characters one-by-one from somewhere, and you don't know how many there will be, and you want to append some of them to a string. >...on any compiler (far fewer calls to a library; even if there is no > allocation, there still will be overhead checks for bounds and the like). > The examples where an alternative implementation of Unbounded_Strings would > make a significant difference such that the application was fast enough with > it and not fast enough without it have to be extremely rare. Fair enough. Real-world measurements beat theoretical reasoning. ;-) - Bob ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 2:25 ` Robert A Duff @ 2007-07-25 6:07 ` Simon Wright 2007-07-25 19:08 ` Randy Brukardt 1 sibling, 0 replies; 35+ messages in thread From: Simon Wright @ 2007-07-25 6:07 UTC (permalink / raw) Robert A Duff <bobduff@shell01.TheWorld.com> writes: > It's the copy I'm wondering about, not the new and free so much. > To append N characters to an unb string requires quadratic copying > (copying a char N**2 / 2 times). > > If you double the allocation every time, it's amortized linear. > Yes, you waste some space that way. I made a limited Unbounded_String type (with an Append operation) that works this way when faced with a similar problem (I am stuck on a rather old GNAT which allocates every time). 30% performance improvement overall, as I recall. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 2:25 ` Robert A Duff 2007-07-25 6:07 ` Simon Wright @ 2007-07-25 19:08 ` Randy Brukardt 2007-07-25 20:37 ` Maciej Sobczak 1 sibling, 1 reply; 35+ messages in thread From: Randy Brukardt @ 2007-07-25 19:08 UTC (permalink / raw) "Robert A Duff" <bobduff@shell01.TheWorld.com> wrote in message news:wcctzrtrzhz.fsf@shell01.TheWorld.com... > "Randy Brukardt" <randy@rrsoftware.com> writes: ... > > Yup; I just checked the code to make sure. Moreover, I have yet to see an > > real example where the performance difference would matter that much (Free > > and new aren't that expensive relative to the alternatives). > > It's the copy I'm wondering about, not the new and free so much. > To append N characters to an unb string requires quadratic copying > (copying a char N**2 / 2 times). True enough, but copying is extremely cheap relative to other operations. It could only matter if N is extremely large, and that won't work in Janus/Ada anyway (because Ada.Strings.Unbounded is broken in that it depends on type Integer, which is 16-bit in Janus/Ada for historical reasons; if you want really big strings in Janus/Ada you have to declare them yourself. I don't much like this situation, but it is not practically fixable.) > If you double the allocation every time, it's amortized linear. > Yes, you waste some space that way. Quadratic allocation maximizes heap fragmentation (a major issue for small heaps, and a performance trap for all), and I no longer use it anywhere for that reason. It also wastes huge amounts of space for long strings. I could see using a scheme where you rounded up to the nearest 16 (or some similar number) bytes (as the storage pool is likely to do something like that anyway); that would catch the long-hanging fruit (i.e. appending lots of single characters) without too much impact in terms of memory. But anything further would have nasty memory use effects on programs with a lot of unbounded strings. And doing that would cost extra overhead on every string operation; most of them for which there would be no gain. I still don't think it is worth it, but obviously a customer example could change my mind. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 19:08 ` Randy Brukardt @ 2007-07-25 20:37 ` Maciej Sobczak 2007-07-25 22:06 ` Georg Bauhaus 2007-07-26 22:11 ` Randy Brukardt 0 siblings, 2 replies; 35+ messages in thread From: Maciej Sobczak @ 2007-07-25 20:37 UTC (permalink / raw) On 25 Lip, 21:08, "Randy Brukardt" <ra...@rrsoftware.com> wrote: > > If you double the allocation every time, it's amortized linear. > > Yes, you waste some space that way. > > Quadratic allocation maximizes heap fragmentation Then instead of doubling the allocation use any multiplier that is smaller than 1.618 (yes, the golden ratio). 1.5 seems to be easy choice. Building a string by appending small chunks to the end seems to be a common practice. Optimizing the library for this case is a wise implementation strategy. > I still don't think it is worth it, but obviously a customer example could > change my mind. What about a customer who is choosing the competing product? Doesn't sound convincing enough? -- Maciej Sobczak http://www.msobczak.com/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 20:37 ` Maciej Sobczak @ 2007-07-25 22:06 ` Georg Bauhaus 2007-07-26 6:24 ` Maciej Sobczak 2007-07-26 22:11 ` Randy Brukardt 1 sibling, 1 reply; 35+ messages in thread From: Georg Bauhaus @ 2007-07-25 22:06 UTC (permalink / raw) Maciej Sobczak wrote: > Building a string by appending small chunks to the end seems to be a > common practice. Optimizing the library for this case is a wise > implementation strategy. Is there some material on this? I'm wondering whether concatenating strings is more common in languages where strings are lists, or at least not plain arrays. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 22:06 ` Georg Bauhaus @ 2007-07-26 6:24 ` Maciej Sobczak 2007-07-26 8:09 ` Dmitry A. Kazakov 2007-07-26 8:35 ` Georg Bauhaus 0 siblings, 2 replies; 35+ messages in thread From: Maciej Sobczak @ 2007-07-26 6:24 UTC (permalink / raw) On 26 Lip, 00:06, Georg Bauhaus <bauhaus.rm.t...@maps.futureapps.de> wrote: > > Building a string by appending small chunks to the end seems to be a > > common practice. Optimizing the library for this case is a wise > > implementation strategy. > > Is there some material on this? I'm wondering whether concatenating > strings is more common in languages where strings are lists, or at > least not plain arrays. Why should that matter? Do you think that the implementation details like this one can influence the way people *think* about strings in their programs? We read text from beginning to the end, acquiring information. It seems obvious that, conversely, adding information to the string is best achieved by adding more characters to the end, at least when human-readable content is involved. This is what I'm actually doing when typing this post. I append characters to what is already there (let's forget correcting typos - automated procedures don't have to do this). Coming back to your question about different languages - it is not the programmer who should adapt his way of building the strings according to how the string is implemented internally in a given language (note that many languages don't even specify it). It is the language that should provide the implementation that is best fitted to how programmers express their algorithms. Why do you think Java added StringBuilder to its library? Because the immutable String didn't quite cut it. -- Maciej Sobczak http://www.msobczak.com/ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-26 6:24 ` Maciej Sobczak @ 2007-07-26 8:09 ` Dmitry A. Kazakov 2007-07-26 8:20 ` Pascal Obry 2007-07-26 8:35 ` Georg Bauhaus 1 sibling, 1 reply; 35+ messages in thread From: Dmitry A. Kazakov @ 2007-07-26 8:09 UTC (permalink / raw) On Wed, 25 Jul 2007 23:24:03 -0700, Maciej Sobczak wrote: > This is what I'm actually doing when typing this post. I append > characters to what is already there (let's forget correcting typos - > automated procedures don't have to do this). You append characters to words, words to sentences and sentences to paragraphs. String is not the best possible representation for a text. > It is the language that > should provide the implementation that is best fitted to how > programmers express their algorithms. Why do you think Java added > StringBuilder to its library? Because the immutable String didn't > quite cut it. Maybe, but Unbounded_String is not the best way to express algorithms. As for me I avoid using Unbounded_String. I don't like the design of. This does not mean that other containers should not have "accumulation"/"stream" mode. For them I tend to specify the initial size and the multiplier factor. Perhaps, one could introduce something like that for Unbounded_String: X : Unbounded_String := ...; for X'Initial_Size use 1024*8; -- 1K at least for X'Increment_Factor use 1.5; -- Multiply by 1.5 upon expanding -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-26 8:09 ` Dmitry A. Kazakov @ 2007-07-26 8:20 ` Pascal Obry 2007-07-26 9:59 ` Dmitry A. Kazakov 0 siblings, 1 reply; 35+ messages in thread From: Pascal Obry @ 2007-07-26 8:20 UTC (permalink / raw) To: mailbox Dmitry A. Kazakov a �crit : > Perhaps, one could introduce something like that for Unbounded_String: > > X : Unbounded_String := ...; > for X'Initial_Size use 1024*8; -- 1K at least > for X'Increment_Factor use 1.5; -- Multiply by 1.5 upon expanding I really don't see the need for rep attributes here as Unbounded_String is not intrinsic. So Initial_Size (X, 1024*8); Increment_Factor (X, 1.5); looks more appropriate in this case. And mind Randy I do not expect to have to use such artifices in my code. I do expect a compiler to deliver the "expected" best implementation for every supported features. I'm not in the embedded domain and frankly all those low-level details are distractions, I prefer to concentrate on my application and not fighting the compiler :) The main point for Ada to me is the high level abstraction. Pascal. -- --|------------------------------------------------------ --| Pascal Obry Team-Ada Member --| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE --|------------------------------------------------------ --| http://www.obry.net --| "The best way to travel is by means of imagination" --| --| gpg --keyserver wwwkeys.pgp.net --recv-key C1082595 ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-26 8:20 ` Pascal Obry @ 2007-07-26 9:59 ` Dmitry A. Kazakov 0 siblings, 0 replies; 35+ messages in thread From: Dmitry A. Kazakov @ 2007-07-26 9:59 UTC (permalink / raw) On Thu, 26 Jul 2007 10:20:49 +0200, Pascal Obry wrote: > And mind Randy I do not expect to have to use such artifices in my code. > I do expect a compiler to deliver the "expected" best implementation for > every supported features. I'm not in the embedded domain and frankly all > those low-level details are distractions, I prefer to concentrate on my > application and not fighting the compiler :) Wholeheartedly agree > The main point for Ada to me is the high level abstraction. Yes -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-26 6:24 ` Maciej Sobczak 2007-07-26 8:09 ` Dmitry A. Kazakov @ 2007-07-26 8:35 ` Georg Bauhaus 1 sibling, 0 replies; 35+ messages in thread From: Georg Bauhaus @ 2007-07-26 8:35 UTC (permalink / raw) On Wed, 2007-07-25 at 23:24 -0700, Maciej Sobczak wrote: > On 26 Lip, 00:06, Georg Bauhaus <bauhaus.rm.t...@maps.futureapps.de> > wrote: > > > > Building a string by appending small chunks to the end seems to be a > > > common practice. Optimizing the library for this case is a wise > > > implementation strategy. > > > > Is there some material on this? I'm wondering whether concatenating > > strings is more common in languages where strings are lists, or at > > least not plain arrays. > > Why should that matter? Do you think that the implementation details > like this one can influence the way people *think* about strings in > their programs? Precisely; in fact, I think that implementation is what drives most programmers. Also crazy, with obvious economical consequences, but also with quality consequences. To some extent, this is inevitable. Notably when a program is written to control a machine because then you must concentrate of what the computer does in some detail. This is opposed to the vague ideas of what is supposed to happen when a programmer thinks in terms of abstractions of appending characters to some abstraction of String (which is an array!). I'd rather not think about the programs that use strcat() and malloc() and free() a lot for string handling, but I'm almost certain that these functions "influence the way many people *think* about strings in their programs." > We read text from beginning to the end, acquiring information. It > seems obvious that, conversely, adding information to the string is > best achieved by adding more characters to the end, at least when > human-readable content is involved. This might be true for paper strings, but programming a text reader is very different from reading a program (or other text) in my view. Similar reasoning applies to typing. > Coming back to your question about different languages It should have been more about different implementation techniques for strings. When a string is implemented as a doubly linked list and not an array, the performance characteristics will likely be different for a number of operations. When you want to _replace_ substrings with strings of different sizes, then arrays are not at all a good choice made by an implementation/language. When you want to _override_ characters with other characters, arrays may be a good choice. I see the latter choice present in Ada systems. > - it is not the > programmer who should adapt his way of building the strings according > to how the string is implemented internally in a given language (note > that many languages don't even specify it). I think that type String is underspecified (in the heads of programmers) when it comes to performance characteristics. Maybe we are spoiled by the fact that text seems so natural that we think we needn't worry about its implementation as we do for other types. Ada's non-character containers are different in this regard, as already mentioned in the thread. BTW, I have found Ada.Containers.Vectors to be a efficient for some string processing tasks. > It is the language that > should provide the implementation that is best fitted to how > programmers express their algorithms. Without knowing what drives the design of a string-related algorithm it is difficult to judge whether a language's string types have been made for the algorithm. See replace/overwrite above. > Why do you think Java added > StringBuilder to its library? And StringBuffer, right from the start! > Because the immutable String didn't > quite cut it. Yes, this is what I was thinking. Ada does not provide a StringBuilder in the Ada.Strings hierarchy. Unbounded_String is not made for most efficient text composition, as I think we can learn from this thread. Maybe an Unbounded_String is very much like a Matrix. An implementer of Matrix might choose arrays, or lists. How could he/she know whether the objects are going to be sparse matrices? ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: Reserve_Capacity for Unbounded_String? 2007-07-25 20:37 ` Maciej Sobczak 2007-07-25 22:06 ` Georg Bauhaus @ 2007-07-26 22:11 ` Randy Brukardt 1 sibling, 0 replies; 35+ messages in thread From: Randy Brukardt @ 2007-07-26 22:11 UTC (permalink / raw) "Maciej Sobczak" <see.my.homepage@gmail.com> wrote in message news:1185395844.104043.194340@o61g2000hsh.googlegroups.com... ... > Building a string by appending small chunks to the end seems to be a > common practice. Optimizing the library for this case is a wise > implementation strategy. Yes, but not if performance matters. If performance matters, doing a lot of small operations is the worst possible way to do it, irrespective of the implementation of those small operations. We chose > > I still don't think it is worth it, but obviously a customer example could > > change my mind. > > What about a customer who is choosing the competing product? Doesn't > sound convincing enough? No, because customers don't chose products because of the performance of one rarely used library call. They chose products because of price, features, ease of use, reputation of vendor, and quality of support. And it is also the case that they rarely tell the truth about why they chose something (partially because they often don't really have a clear understanding of that either). I'd rather spend our limited efforts on these factors (especially ease of use, where I don't think Janus/Ada really measures up very well) than chasing single customers with single changes, especially ones that will negatively impact other usage patterns. After all, my experience with chasing customers has uniformly been bad; they always have a new reason for not buying once you eliminate the last one. Randy. ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2007-07-26 22:11 UTC | newest] Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2007-07-22 19:54 Reserve_Capacity for Unbounded_String? Maciej Sobczak 2007-07-22 21:32 ` Robert A Duff 2007-07-23 19:29 ` Maciej Sobczak 2007-07-23 20:30 ` Robert A Duff 2007-07-23 4:28 ` Jeffrey R. Carter 2007-07-23 15:07 ` Adam Beneschan 2007-07-24 1:01 ` Randy Brukardt 2007-07-24 7:57 ` Pascal Obry 2007-07-24 18:58 ` Randy Brukardt 2007-07-24 23:50 ` Robert A Duff 2007-07-25 0:00 ` Randy Brukardt 2007-07-24 23:54 ` Pascal Obry 2007-07-25 0:52 ` Randy Brukardt 2007-07-25 1:28 ` Randy Brukardt 2007-07-25 7:48 ` Pascal Obry 2007-07-25 9:55 ` Georg Bauhaus 2007-07-25 10:02 ` Georg Bauhaus 2007-07-25 18:58 ` Randy Brukardt 2007-07-25 8:50 ` Martin Krischik 2007-07-25 9:26 ` AW: " Grein, Christoph (Fa. ESG) 2007-07-25 15:32 ` Martin Krischik 2007-07-25 15:39 ` Martin Krischik 2007-07-24 23:41 ` Robert A Duff 2007-07-25 0:16 ` Randy Brukardt 2007-07-25 2:25 ` Robert A Duff 2007-07-25 6:07 ` Simon Wright 2007-07-25 19:08 ` Randy Brukardt 2007-07-25 20:37 ` Maciej Sobczak 2007-07-25 22:06 ` Georg Bauhaus 2007-07-26 6:24 ` Maciej Sobczak 2007-07-26 8:09 ` Dmitry A. Kazakov 2007-07-26 8:20 ` Pascal Obry 2007-07-26 9:59 ` Dmitry A. Kazakov 2007-07-26 8:35 ` Georg Bauhaus 2007-07-26 22:11 ` Randy Brukardt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox