* When to use Bounded_String?
From: Victor Porton @ 2017-11-19 2:19 UTC

Is it worth using Bounded_String for short strings (expected to be, say, 12 characters at most, such as a program version string)? Or is Unbounded_String fast enough, making this a premature optimization?

Also, not using Bounded_String at all may shorten the program code, right?

What is the main purpose of Bounded_String?

--
Victor Porton - http://portonvictor.org
* Re: When to use Bounded_String?
From: Niklas Holsti @ 2017-11-19 9:55 UTC

On 17-11-19 04:19, Victor Porton wrote:
> Is it worth using Bounded_String for short strings (expected to be, say, 12 characters at most, such as a program version string)? Or is Unbounded_String fast enough, making this a premature optimization?

What is "fast enough" depends on your application and your Ada implementation. If you worry about it, make some measurements.

> Also, not using Bounded_String at all may shorten the program code, right?

By leaving out the code for the Bounded_String instances, you mean? Yes, but for any significant application the reduction in code size is probably a very small fraction, unless you make very many different instances of Bounded_Strings (and your Ada implementation does not share code between instances).

> What is the main purpose of Bounded_String?

As I understand it, the purpose is to let a program use string variables of dynamically varying length, without using dynamically allocated heap memory. The penalty is a fixed upper bound on the length, and perhaps more copying of characters from one variable to another (depending on the implementation).

However, the avoidance of heap is only Implementation Advice (RM A.4.4(106)), so you should check what your Ada implementation does, if avoiding heap is important to you.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
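[Editorial sketch] The trade-off described above (variable length, fixed upper bound) can be illustrated with a small instance of the standard generic. The instance name Version_Strings, the bound of 12 (taken from the original question), and the sample version text are assumptions made for illustration only; whether the object avoids the heap still depends on the implementation, as noted above.

```ada
with Ada.Strings.Bounded;
with Ada.Text_IO;

procedure Version_Demo is
   --  Hypothetical instance: 12 is the maximum length from the question.
   package Version_Strings is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 12);
   use Version_Strings;

   --  The current length may vary between 0 and Max.
   Version : Bounded_String := To_Bounded_String ("1.4.2");
begin
   Append (Version, "-rc1");
   Ada.Text_IO.Put_Line (To_String (Version));
   Ada.Text_IO.Put_Line (Natural'Image (Length (Version)));

   --  With the default Drop => Error, exceeding Max raises Length_Error.
   begin
      Append (Version, "-too-long");
   exception
      when Ada.Strings.Length_Error =>
         Ada.Text_IO.Put_Line ("Length_Error: bound of 12 exceeded");
   end;
end Version_Demo;
```

The second Append would produce 18 characters, so it is rejected rather than silently truncated; passing Drop => Ada.Strings.Right instead would truncate.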
* Re: When to use Bounded_String?
From: J-P. Rosen @ 2017-11-20 5:38 UTC

On 19/11/2017 at 10:55, Niklas Holsti wrote:
> On 17-11-19 04:19, Victor Porton wrote:
>> What is the main purpose of Bounded_String?
>
> As I understand it, the purpose is to let a program use string variables of dynamically varying length, without using dynamically allocated heap memory. The penalty is a fixed upper bound on the length, and perhaps more copying of characters from one variable to another (depending on the implementation).
>

No, it's not just a matter of implementation. Bounded_Strings are a good fit for data types implemented as strings. A typical example is name, address, etc. from a person's data. These are represented as strings, you need variable length, and there is generally a maximum length (coming, e.g., from the declaration in the underlying database).

Note that each instantiation provides a different type, so you cannot assign a name to an address.

--
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr
* Re: When to use Bounded_String?
From: Niklas Holsti @ 2017-11-20 7:32 UTC

On 17-11-20 07:38, J-P. Rosen wrote:
> On 19/11/2017 at 10:55, Niklas Holsti wrote:
>> On 17-11-19 04:19, Victor Porton wrote:
>>> What is the main purpose of Bounded_String?
>>
>> As I understand it, the purpose is to let a program use string variables of dynamically varying length, without using dynamically allocated heap memory. The penalty is a fixed upper bound on the length, and perhaps more copying of characters from one variable to another (depending on the implementation).
>>
> No, it's not just a matter of implementation. Bounded_Strings are a good fit for data types implemented as strings. A typical example is name, address, etc. from a person's data. These are represented as strings, you need variable length,

Yes, but that could be done with Unbounded_String.

> and there is generally a maximum length (coming, e.g., from the declaration in the underlying database).

It's my impression that fixed length bounds on database fields are (happily) going out of fashion, and that modern database systems support (practically) unbounded lengths, as do modern GUI systems.

Anyway, I would not build a DB-imposed field-length limitation into the basic data types of the application, but would let the DB/API apply the length check. Then, if the DB length bound turns out to be too small in practical operation, it is enough to correct the DB definition; the application does not need to change.

A limit on the length of a DB entry is only one of many checks that may be needed on the entry, such as the allowed character set or other lexical and syntactic constraints.

> Note that each instantiation provides a different type, so you cannot assign a name to an address.

A side effect of genericity. Deriving new types from Unbounded_String has the same effect, if needed.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
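[Editorial sketch] Both ways of getting distinct string types mentioned in this exchange can be put side by side. All names below (Names, Addresses, Name_Text, Address_Text), the bounds 40 and 120, and the sample data are hypothetical and chosen only for illustration.

```ada
with Ada.Strings.Bounded;
with Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Distinct_Types_Demo is
   --  Approach 1: each instantiation yields its own Bounded_String type.
   package Names is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 40);
   package Addresses is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 120);

   --  Approach 2: derived types are distinct from their parent and from
   --  each other, and inherit the parent's primitive operations.
   type Name_Text is new Ada.Strings.Unbounded.Unbounded_String;
   type Address_Text is new Ada.Strings.Unbounded.Unbounded_String;

   N : constant Names.Bounded_String :=
     Names.To_Bounded_String ("Ada Lovelace");
   A : Addresses.Bounded_String;

   UN : constant Name_Text := To_Unbounded_String ("Ada Lovelace");
   UA : Address_Text;
begin
   --  A  := N;   --  rejected at compile time: different types
   --  UA := UN;  --  rejected at compile time: different types

   --  Mixing them has to be spelled out explicitly.
   A  := Addresses.To_Bounded_String (Names.To_String (N));
   UA := Address_Text (UN);

   Ada.Text_IO.Put_Line (Addresses.To_String (A));
   Ada.Text_IO.Put_Line (To_String (UA));
end Distinct_Types_Demo;
```

In both cases an accidental assignment of a name to an address is a compile-time error; the difference is only whether a length bound comes with the type.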
* Re: When to use Bounded_String?
From: briot.emmanuel @ 2017-11-23 10:04 UTC

On Sunday, November 19, 2017 at 3:19:39 AM UTC+1, Victor Porton wrote:
> Is it worth using Bounded_String for short strings (expected to be, say, 12 characters at most, such as a program version string)? Or is Unbounded_String fast enough, making this a premature optimization?

You could use GNATCOLL.Strings, which provides the short-string optimization: when a string is shorter than 19 or 23 characters (on 32-bit and 64-bit systems, respectively), no allocation takes place.

These strings also provide a much larger number of operations than standard strings or Unbounded_Strings, are task safe, and handle Unicode.
* Re: When to use Bounded_String?
From: Vincent DIEMUNSCH @ 2017-12-28 11:46 UTC

On Thursday, November 23, 2017 at 11:04:12 UTC+1, briot.e...@gmail.com wrote:
> You could use GNATCOLL.Strings, which provide the short-string-optimization: when a string is shorter than 19 or 23 characters (32 bit and 64 bit systems) then no allocation takes place.
> They also provide a much larger number of operations than standard strings or unbounded_strings, are task safe, and handle unicode.

Yes, they are really a great improvement. But they would be perfect if:
1. they handled UTF-8 as the de facto standard encoding for strings.
2. they could see strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).
* Re: When to use Bounded_String?
From: Dmitry A. Kazakov @ 2017-12-28 12:00 UTC

On 2017-12-28 12:46, Vincent DIEMUNSCH wrote:
> On Thursday, November 23, 2017 at 11:04:12 UTC+1, briot.e...@gmail.com wrote:
>
>> You could use GNATCOLL.Strings, which provide the short-string-optimization: when a string is shorter than 19 or 23 characters (32 bit and 64 bit systems) then no allocation takes place.
>> They also provide a much larger number of operations than standard strings or unbounded_strings, are task safe, and handle unicode.
>
> Yes, they are really a great improvement. But they would be perfect if:
> 1. they handled UTF-8 as the de facto standard encoding for strings.

You can ignore encoding and use them as if they were UTF-8.

> 2. they could see strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).

23 / 4 = 5 characters

P.S. Just never copy strings if you have performance concerns (even if you have none). Nothing to optimize then. Use string slices, pass string + an index to start at, do everything in a single pass, there is no reason to waste CPU time, memory and brain cells on "tokenizing".

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
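[Editorial sketch] The "string + an index to start at" style from the P.S. above might look roughly like this. The helper Get_Number, the sample text "1.12.3", and the assumption that parts are separated by a single character are all hypothetical; the point is only that the source string is scanned once and no substring is copied.

```ada
with Ada.Text_IO;

procedure Scan_Demo is
   --  Hypothetical helper: reads a decimal number from Source starting at
   --  Pointer and advances Pointer past the digits it consumed.
   procedure Get_Number
     (Source  : String;
      Pointer : in out Positive;
      Value   : out Natural) is
   begin
      Value := 0;
      while Pointer <= Source'Last
        and then Source (Pointer) in '0' .. '9'
      loop
         Value := Value * 10
           + (Character'Pos (Source (Pointer)) - Character'Pos ('0'));
         Pointer := Pointer + 1;
      end loop;
   end Get_Number;

   Text    : constant String := "1.12.3";
   Pointer : Positive := Text'First;
   Part    : Natural;
begin
   --  Single pass over Text: one number per iteration.
   loop
      Get_Number (Text, Pointer, Part);
      Ada.Text_IO.Put_Line (Natural'Image (Part));
      exit when Pointer > Text'Last;
      Pointer := Pointer + 1;  --  skip the separator character
   end loop;
end Scan_Demo;
```

The same helper works unchanged on a slice such as Text (3 .. 6), because it uses Source'Last and the caller's index rather than assuming the string starts at 1.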
* Re: When to use Bounded_String?
From: Mehdi Saada @ 2017-12-28 12:29 UTC

Why hasn't GNATCOLL become part of the standard, since it's made by AdaCore (correct me if I'm wrong), is allegedly so much better than Ada.Strings, and since GNAT is now the de facto only fully certified Ada 2012 compiler?
* Re: When to use Bounded_String?
From: Randy Brukardt @ 2017-12-29 0:42 UTC

"Mehdi Saada" <00120260a@gmail.com> wrote in message news:158d76ca-7061-400e-8077-222bd4e390d2@googlegroups.com...
> Why hasn't GNATCOLL become part of the standard, since it's made by AdaCore (correct me if I'm wrong), is allegedly so much better than Ada.Strings, and since GNAT is now the de facto only fully certified Ada 2012 compiler?

(1) AdaCore has nothing (directly) to do with the Standard, other than financial support of ARG members.

(2) No one has volunteered to do the work of converting the packages into the form of the Standard. This is a LOT of effort (and I know, having done that with what became Ada.Directories and Ada.Calendar.Arithmetic -- which started out life as Claw packages).

(3) Parts of GNATColl are very much dependent on GNAT, while the Standard remains independent of any particular implementation. (We don't want GNAT to be the only Ada 2012 compiler forever!)

(4) It's debatable if GNATColl is really better than Ada.Strings.

(5) In general, we've not wanted many different libraries in the Standard that do the same thing. If a replacement for Ada.Strings was to be adopted, it would have to provide much better functionality than the existing libraries, not just a small performance improvement.

Randy.
* Re: When to use Bounded_String?
From: Simon Wright @ 2017-12-29 9:11 UTC

Mehdi Saada <00120260a@gmail.com> writes:

> GNAT is now the de facto only fully certified Ada 2012 compiler

GNAT implements a lot of Ada 2012, but definitely not all of the optional parts; for instance, in both GNAT GPL 2017 and FSF GCC <= 8, Ada.Directories.Hierarchical_File_Names is unimplemented. (Ada.Directories.Name_Case_Equivalence is also missing, which I think is a mistake[1]).

Even the fully-supported versions available only to AdaCore customers will have bugs/features, which may be OS-dependent; out of the box, Linux & macOS systems have problems around ceiling locking (don't know about Windows).

I don't know what form of certification is available/provided to AdaCore customers, but it's likely to say something like "conforms to the ARM with the following exceptions: <list of issues>".

None of this is intended as a criticism of AdaCore, who do an amazing job!

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80869
* Re: When to use Bounded_String?
From: vincent.diemunsch @ 2017-12-28 14:28 UTC

On Thursday, December 28, 2017 at 13:00:46 UTC+1, Dmitry A. Kazakov wrote:
> > Yes, they are really a great improvement. But they would be perfect if:
> > 1. they handled UTF-8 as the de facto standard encoding for strings.
>
> You can ignore encoding and use them as if they were UTF-8
>

Sure. That's what is done, at least on Unixes (Linux and OSX).

> > 2. they could see strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).
>
> 23 / 4 = 5 characters

No. At least 5 characters if they are very complicated. But 23 ASCII characters. The idea here is to decode the UTF-8 string to extract a character and give it in Unicode in the most common format for integers: 32 bits. The only limitation is that you would have sequential access to the string, not random access as with the usual array of characters. But I really don't see the point of having random access to the characters in a string!

> P.S. Just never copy strings if you have performance concerns (even if you have none). Nothing to optimize then. Use string slices, pass string + an index to start at, do everything in a single pass, there is no reason to waste CPU time, memory and brain cells on "tokenizing".

True. Except for storing the identifiers in a symbol table...

Kind regards,

Vincent
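[Editorial sketch] The sequential-access idea above (keep the bytes as UTF-8, hand out one 32-bit code point at a time) can be sketched as follows. The procedure name Next_Code_Point is hypothetical, and the code assumes well-formed UTF-8 input with no validation at all; Ada 2012 also offers Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode, which converts a whole string at once rather than stepping through it.

```ada
with Ada.Text_IO;

procedure UTF8_Walk_Demo is
   --  Hypothetical helper: decodes the code point that starts at Pointer
   --  and advances Pointer to the next one. Assumes well-formed UTF-8.
   procedure Next_Code_Point
     (Source  : String;
      Pointer : in out Positive;
      Code    : out Wide_Wide_Character)
   is
      Lead  : constant Natural := Character'Pos (Source (Pointer));
      Extra : Natural;  --  number of continuation bytes
      Acc   : Natural;  --  accumulated code point value
   begin
      if Lead < 16#80# then
         Extra := 0;  Acc := Lead;               --  0xxxxxxx
      elsif Lead < 16#E0# then
         Extra := 1;  Acc := Lead mod 16#20#;    --  110xxxxx
      elsif Lead < 16#F0# then
         Extra := 2;  Acc := Lead mod 16#10#;    --  1110xxxx
      else
         Extra := 3;  Acc := Lead mod 16#08#;    --  11110xxx
      end if;

      --  Each continuation byte (10xxxxxx) contributes six bits.
      for I in 1 .. Extra loop
         Acc := Acc * 64
           + (Character'Pos (Source (Pointer + I)) mod 16#40#);
      end loop;

      Pointer := Pointer + Extra + 1;
      Code    := Wide_Wide_Character'Val (Acc);
   end Next_Code_Point;

   --  'n' followed by U+00E9 (two bytes, C3 A9, in UTF-8).
   Text    : constant String :=
     'n' & Character'Val (16#C3#) & Character'Val (16#A9#);
   Pointer : Positive := Text'First;
   Code    : Wide_Wide_Character;
begin
   while Pointer <= Text'Last loop
      Next_Code_Point (Text, Pointer, Code);
      Ada.Text_IO.Put_Line (Natural'Image (Wide_Wide_Character'Pos (Code)));
   end loop;
end UTF8_Walk_Demo;
```

Storage stays at one byte per ASCII character, which is the point made above about 23 ASCII characters fitting in the short-string buffer; only the character being examined is widened to 32 bits.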
* Re: When to use Bounded_String?
From: Randy Brukardt @ 2017-12-29 0:36 UTC

<vincent.diemunsch@gmail.com> wrote in message news:37c30172-9386-45fb-86d0-a10998fcade8@googlegroups.com...
On Thursday, December 28, 2017 at 13:00:46 UTC+1, Dmitry A. Kazakov wrote:
...
>> P.S. Just never copy strings if you have performance concerns (even if you have none). Nothing to optimize then. Use string slices, pass string + an index to start at, do everything in a single pass, there is no reason to waste CPU time, memory and brain cells on "tokenizing".
>
> True. Except for storing the identifiers in a symbol table...

It's probably good advice in general, but as always it depends on the problem in question. If the problem can be solved better by a sequence of transformations rather than something monolithic, then copying is inevitable. For instance, in my spam filter, I have a transformed version of the message that contains just the text (eliminating the markup, line ends, and the like), to be used for phrase matching. Otherwise, the spammer could easily hide bad phrases by including (invisible) markup or line endings. That requires a copy of the string.

Randy.
* Re: When to use Bounded_String?
From: Dmitry A. Kazakov @ 2017-12-29 8:48 UTC

On 2017-12-29 01:36, Randy Brukardt wrote:

> For instance, in my spam filter, I have a transformed version of the message that contains just the text (eliminating the markup, line ends, and the like), to be used for phrase matching. Otherwise, the spammer could easily hide bad phrases by including (invisible) markup or line endings. That requires a copy of the string.

Though this is rather translation into another string [pattern]. So copying is reasonable and justified here.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de