Newsgroups: comp.lang.ada
Subject: Re: Reserve_Capacity for Unbounded_String?
From: Georg Bauhaus
Date: Thu, 26 Jul 2007 10:35:40 +0200
Organization: Arcor
Message-ID: <1185438940.28126.31.camel@kartoffel>
In-Reply-To: <1185431043.649372.223760@r34g2000hsd.googlegroups.com>

On Wed, 2007-07-25 at 23:24 -0700, Maciej Sobczak wrote:
> On 26 Jul, 00:06, Georg Bauhaus wrote:
>
> > > Building a string by appending small chunks to the end seems to be a
> > > common practice. Optimizing the library for this case is a wise
> > > implementation strategy.
> >
> > Is there some material on this? I'm wondering whether concatenating
> > strings is more common in languages where strings are lists, or at
> > least not plain arrays.
>
> Why should that matter?
> Do you think that the implementation details
> like this one can influence the way people *think* about strings in
> their programs?

Precisely; in fact, I think that implementation is what drives most
programmers. This is also crazy, with obvious economic consequences,
but also with quality consequences. To some extent, this is inevitable,
notably when a program is written to control a machine, because then
you must concentrate on what the computer does in some detail. This is
opposed to the vague ideas of what is supposed to happen when a
programmer thinks in terms of abstractions of appending characters to
some abstraction of String (which is an array!).

I'd rather not think about the programs that use strcat() and malloc()
and free() a lot for string handling, but I'm almost certain that these
functions "influence the way many people *think* about strings in
their programs."

> We read text from beginning to the end, acquiring information. It
> seems obvious that, conversely, adding information to the string is
> best achieved by adding more characters to the end, at least when
> human-readable content is involved.

This might be true for paper strings, but programming a text reader is
very different from reading a program (or other text), in my view.
Similar reasoning applies to typing.

> Coming back to your question about different languages

It should have been more about different implementation techniques for
strings. When a string is implemented as a doubly linked list and not
an array, the performance characteristics will likely be different for
a number of operations. When you want to _replace_ substrings with
strings of different sizes, then arrays are not at all a good choice
for an implementation/language to make. When you want to _overwrite_
characters with other characters, arrays may be a good choice. I see
the latter choice present in Ada systems.
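
To make the difference between replacing a substring and overriding
(overwriting) characters concrete, here is a minimal sketch. It uses
Java's array-backed StringBuilder purely as an illustration and is not
a statement about any Ada implementation; the class and method names
(ReplaceVsOverwrite, overwrite, replace) are made up for this example.

```java
// Illustrative sketch only: an array-backed character buffer
// (java.lang.StringBuilder), not the Ada string types discussed here.
class ReplaceVsOverwrite {
    // Overwrite: each character is swapped in place. The length never
    // changes and no other character has to move.
    static String overwrite(String text, int start, String patch) {
        StringBuilder sb = new StringBuilder(text);
        for (int i = 0; i < patch.length(); i++) {
            sb.setCharAt(start + i, patch.charAt(i));
        }
        return sb.toString();
    }

    // Replace: the slice [start, end) is swapped for a string that may
    // have a different size, so the tail of the backing array must be
    // shifted (and the array possibly regrown).
    static String replace(String text, int start, int end, String patch) {
        StringBuilder sb = new StringBuilder(text);
        sb.replace(start, end, patch);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(overwrite("Hello, world", 7, "WORLD"));
        System.out.println(replace("Hello, world", 7, 12, "comp.lang.ada"));
    }
}
```

With an array, the overwrite touches only the patched positions, while
the replace costs time proportional to the length of the tail that has
to move. A linked representation would likely reverse that trade-off.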
> - it is not the
> programmer who should adapt his way of building the strings according
> to how the string is implemented internally in a given language (note
> that many languages don't even specify it).

I think that type String is underspecified (in the heads of
programmers) when it comes to performance characteristics. Maybe we are
spoiled by the fact that text seems so natural that we think we needn't
worry about its implementation as we do for other types. Ada's
non-character containers are different in this regard, as already
mentioned in the thread. BTW, I have found Ada.Containers.Vectors to be
efficient for some string processing tasks.

> It is the language that
> should provide the implementation that is best fitted to how
> programmers express their algorithms.

Without knowing what drives the design of a string-related algorithm,
it is difficult to judge whether a language's string types have been
made for the algorithm. See replace/overwrite above.

> Why do you think Java added
> StringBuilder to its library?

And StringBuffer, right from the start!

> Because the immutable String didn't
> quite cut it.

Yes, this is what I was thinking. Ada does not provide a StringBuilder
in the Ada.Strings hierarchy. Unbounded_String is not made for the most
efficient text composition, as I think we can learn from this thread.

Maybe an Unbounded_String is very much like a Matrix. An implementer of
Matrix might choose arrays, or lists. How could he/she know whether the
objects are going to be sparse matrices?
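
For readers who have not used it, the StringBuilder style of
composition referred to above looks roughly like the following sketch.
It reserves the backing array once and then appends small chunks, which
is the usage pattern this thread is about. The class and method names
(AppendChunks, join) are hypothetical; only the StringBuilder calls are
part of the Java library.

```java
// Illustrative sketch only: composing text from small chunks with a
// buffer whose capacity is reserved up front.
class AppendChunks {
    static String join(String[] chunks) {
        // Compute the final length so the buffer is allocated once.
        int total = 0;
        for (String chunk : chunks) {
            total += chunk.length();
        }
        // The int constructor sets the initial capacity, not the length.
        StringBuilder sb = new StringBuilder(total);
        for (String chunk : chunks) {
            sb.append(chunk); // should not need to regrow the buffer
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"Reserve", "_", "Capacity"}));
    }
}
```

Whether reserving capacity pays off depends on how well the final size
can be estimated; without an estimate, the buffer simply grows on
demand.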