comp.lang.ada
* When to use Bounded_String?
@ 2017-11-19  2:19 Victor Porton
  2017-11-19  9:55 ` Niklas Holsti
  2017-11-23 10:04 ` briot.emmanuel
  0 siblings, 2 replies; 13+ messages in thread
From: Victor Porton @ 2017-11-19  2:19 UTC


Is it worth using Bounded_String for short strings (expected to be, say,
12 characters at most, such as a program version string)? Or is
Unbounded_String fast enough, making this a premature optimization?

Also not using Bounded_String at all may shorten the program code, right?

What is the main purpose of Bounded_String?

-- 
Victor Porton - http://portonvictor.org

* Re: When to use Bounded_String?
  2017-11-19  2:19 When to use Bounded_String? Victor Porton
@ 2017-11-19  9:55 ` Niklas Holsti
  2017-11-20  5:38   ` J-P. Rosen
  2017-11-23 10:04 ` briot.emmanuel
  1 sibling, 1 reply; 13+ messages in thread
From: Niklas Holsti @ 2017-11-19  9:55 UTC


On 17-11-19 04:19 , Victor Porton wrote:
> Is it worth using Bounded_String for short strings (expected to be, say,
> 12 characters at most, such as a program version string)? Or is
> Unbounded_String fast enough, making this a premature optimization?

What is "fast enough" depends on your application and your Ada 
implementation. If you worry about it, make some measurements.

> Also not using Bounded_String at all may shorten the program code, right?

By leaving out the code for the Bounded_String instances, you mean? Yes, 
but for any significant application the reduction in code size is 
probably a very small fraction, unless you make very many different 
instances of Bounded_Strings (and your Ada implementation does not share 
code between instances).

> What is the main purpose of Bounded_String?

As I understand it, the purpose is to let a program use string variables 
of dynamically varying length, without using dynamically allocated heap 
memory. The penalty is a fixed upper bound on the length, and perhaps 
more copying of characters from one variable to another (depending on 
the implementation).

However, the avoidance of heap is only Implementation Advice (RM 
A.4.4(106)) so you should check what your Ada implementation does, if 
avoiding heap is important to you.
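
A minimal sketch of the trade-off described above, assuming the bound of
12 characters from the original question. The names Version_Demo and
Version_Strings are hypothetical; the calls are the standard ones from
Ada.Strings.Bounded.Generic_Bounded_Length. Whether the object really
avoids the heap still depends on the implementation.

```ada
with Ada.Strings.Bounded;
with Ada.Text_IO;

procedure Version_Demo is
   --  Hypothetical instantiation; Max => 12 is taken from the question.
   package Version_Strings is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 12);
   use Version_Strings;

   Version : Bounded_String := To_Bounded_String ("1.2.3");
begin
   --  The length varies at run time, up to the fixed bound.
   Append (Version, "-beta");
   Ada.Text_IO.Put_Line (To_String (Version));

   --  With the default Drop => Ada.Strings.Error, exceeding the bound
   --  raises Ada.Strings.Length_Error.
   begin
      Append (Version, "-rc1");
   exception
      when Ada.Strings.Length_Error =>
         Ada.Text_IO.Put_Line ("bound of 12 characters exceeded");
   end;
end Version_Demo;
```

The fixed upper bound shows up as the Length_Error branch; passing
Drop => Ada.Strings.Right to Append would truncate instead of raising.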

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .


* Re: When to use Bounded_String?
  2017-11-19  9:55 ` Niklas Holsti
@ 2017-11-20  5:38   ` J-P. Rosen
  2017-11-20  7:32     ` Niklas Holsti
  0 siblings, 1 reply; 13+ messages in thread
From: J-P. Rosen @ 2017-11-20  5:38 UTC


On 19/11/2017 at 10:55, Niklas Holsti wrote:
> On 17-11-19 04:19 , Victor Porton wrote:
>> What is the main purpose of Bounded_String?
> 
> As I understand it, the purpose is to let a program use string variables
> of dynamically varying length, without using dynamically allocated heap
> memory. The penalty is a fixed upper bound on the length, and perhaps
> more copying of characters from one variable to another (depending on
> the implementation).
> 
No, it's not just a matter of implementation. Bounded_Strings are a good
fit for data types implemented as strings. A typical example is the name,
address, etc. in a person's data. These are represented as strings,
you need variable length, and there is generally a maximum length
(coming, for example, from the declaration in the underlying database).

Note that each instantiation provides a different type, so you cannot
assign a name to an address.
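
A short sketch of that point, with hypothetical package names and
bounds (40 and 120 are illustrative, not taken from any real schema):

```ada
with Ada.Strings.Bounded;

procedure Person_Demo is
   --  Two instantiations yield two distinct Bounded_String types.
   package Names is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 40);
   package Addresses is
     new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 120);

   Name    : constant Names.Bounded_String :=
     Names.To_Bounded_String ("Ada Lovelace");
   Address : Addresses.Bounded_String;
begin
   --  Address := Name;   --  rejected at compile time: different types

   --  Crossing from one type to the other has to be explicit,
   --  here by going through String.
   Address := Addresses.To_Bounded_String (Names.To_String (Name));
end Person_Demo;
```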
-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr


* Re: When to use Bounded_String?
  2017-11-20  5:38   ` J-P. Rosen
@ 2017-11-20  7:32     ` Niklas Holsti
  0 siblings, 0 replies; 13+ messages in thread
From: Niklas Holsti @ 2017-11-20  7:32 UTC


On 17-11-20 07:38 , J-P. Rosen wrote:
> On 19/11/2017 at 10:55, Niklas Holsti wrote:
>> On 17-11-19 04:19 , Victor Porton wrote:
>>> What is the main purpose of Bounded_String?
>>
>> As I understand it, the purpose is to let a program use string variables
>> of dynamically varying length, without using dynamically allocated heap
>> memory. The penalty is a fixed upper bound on the length, and perhaps
>> more copying of characters from one variable to another (depending on
>> the implementation).
>>
> No, it's not just a matter of implementation. Bounded_Strings are a good
> fit for data types implemented as strings. A typical example is the name,
> address, etc. in a person's data. These are represented as strings,
> you need variable length,

Yes, but that could be done with Unbounded_String.

> and there is generally a maximum length
> (coming, for example, from the declaration in the underlying database).

It's my impression that fixed length bounds on database fields are 
(happily) going out of fashion, and that the modern database systems 
support (practically) unbounded lengths, as do modern GUI systems.

Anyway, I would not build a DB-imposed field-length limitation into the 
basic data types of the application, but would let the DB/API apply the 
length check. Then, if the DB length bound turns out to be too small in 
practical operation, it is enough to correct the DB definition; the 
application does not need to change.

A limit on the length of a DB entry is only one of many checks that may 
be needed on the entry, such as the allowed character set or other 
lexical and syntactic constraints.

> Note that each instantiation provides a different type, so you cannot
> assign a name to an address.

A side-effect of genericity. Deriving new types from Unbounded_String 
has the same effect, if needed.
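
A sketch of that alternative, with hypothetical type names. Each derived
type inherits its own copies of To_Unbounded_String, To_String and the
other primitive operations, so no bound has to be chosen:

```ada
with Ada.Strings.Unbounded;

procedure Derived_Demo is
   --  Two distinct types derived from the same parent.
   type Person_Name    is new Ada.Strings.Unbounded.Unbounded_String;
   type Street_Address is new Ada.Strings.Unbounded.Unbounded_String;

   Name    : constant Person_Name := To_Unbounded_String ("Ada Lovelace");
   Address : Street_Address;
begin
   --  Address := Name;   --  rejected at compile time: different types

   --  The inherited operations are resolved by the expected type.
   Address := To_Unbounded_String (To_String (Name));
end Derived_Demo;
```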

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
       .      @       .


* Re: When to use Bounded_String?
  2017-11-19  2:19 When to use Bounded_String? Victor Porton
  2017-11-19  9:55 ` Niklas Holsti
@ 2017-11-23 10:04 ` briot.emmanuel
  2017-12-28 11:46   ` Vincent DIEMUNSCH
  1 sibling, 1 reply; 13+ messages in thread
From: briot.emmanuel @ 2017-11-23 10:04 UTC


On Sunday, November 19, 2017 at 3:19:39 AM UTC+1, Victor Porton wrote:
> Is it worth using Bounded_String for short strings (expected to be, say,
> 12 characters at most, such as a program version string)? Or is
> Unbounded_String fast enough, making this a premature optimization?


You could use GNATCOLL.Strings, which provides the short-string optimization:
when a string is shorter than 19 or 23 characters (on 32-bit and 64-bit systems,
respectively), no allocation takes place.
They also provide a much larger number of operations than standard strings or
Unbounded_Strings, are task safe, and handle Unicode.


* Re: When to use Bounded_String?
  2017-11-23 10:04 ` briot.emmanuel
@ 2017-12-28 11:46   ` Vincent DIEMUNSCH
  2017-12-28 12:00     ` Dmitry A. Kazakov
  0 siblings, 1 reply; 13+ messages in thread
From: Vincent DIEMUNSCH @ 2017-12-28 11:46 UTC


On Thursday, 23 November 2017 at 11:04:12 UTC+1, briot.e...@gmail.com wrote:

> You could use GNATCOLL.Strings, which provides the short-string optimization:
> when a string is shorter than 19 or 23 characters (on 32-bit and 64-bit systems,
> respectively), no allocation takes place.
> They also provide a much larger number of operations than standard strings or
> Unbounded_Strings, are task safe, and handle Unicode.

Yes, they are really a great improvement. But they would be perfect if:
1. they handled UTF-8 as the de facto standard encoding for strings.
2. they could treat strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).


* Re: When to use Bounded_String?
  2017-12-28 11:46   ` Vincent DIEMUNSCH
@ 2017-12-28 12:00     ` Dmitry A. Kazakov
  2017-12-28 12:29       ` Mehdi Saada
  2017-12-28 14:28       ` vincent.diemunsch
  0 siblings, 2 replies; 13+ messages in thread
From: Dmitry A. Kazakov @ 2017-12-28 12:00 UTC


On 2017-12-28 12:46, Vincent DIEMUNSCH wrote:
> On Thursday, 23 November 2017 at 11:04:12 UTC+1, briot.e...@gmail.com wrote:
> 
>> You could use GNATCOLL.Strings, which provides the short-string optimization:
>> when a string is shorter than 19 or 23 characters (on 32-bit and 64-bit systems,
>> respectively), no allocation takes place.
>> They also provide a much larger number of operations than standard strings or
>> Unbounded_Strings, are task safe, and handle Unicode.
> 
> Yes, they are really a great improvement. But they would be perfect if:
> 1. they handled UTF-8 as the de facto standard encoding for strings.

You can ignore encoding and use them as if they were UTF-8

> 2. they could treat strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).

23 / 4 = 5 characters

P.S. Just never copy strings if you have performance concerns (even if 
you have none). Nothing to optimize then. Use string slices, pass string 
+ an index to start at, do everything in a single pass, there is no 
reason to waste CPU time, memory and brain cells on "tokenizing".
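
A minimal sketch of the "string plus start index" style, assuming
blank-delimited words as the units being scanned. Scan_Demo and
Next_Word are hypothetical names; the point is that the scanner returns
index bounds into the original string and the caller works on a slice.

```ada
with Ada.Text_IO;

procedure Scan_Demo is
   Line : constant String := "alpha beta  gamma";

   --  Locate the next blank-delimited word at or after From.
   --  First > Last on return means that no word remains.
   procedure Next_Word
     (Source : String;
      From   : Positive;
      First  : out Positive;
      Last   : out Natural)
   is
   begin
      --  Skip leading blanks.
      First := From;
      while First <= Source'Last and then Source (First) = ' ' loop
         First := First + 1;
      end loop;

      --  Extend Last to the end of the word, if there is one.
      Last := First - 1;
      while Last < Source'Last and then Source (Last + 1) /= ' ' loop
         Last := Last + 1;
      end loop;
   end Next_Word;

   Pos   : Positive := Line'First;
   First : Positive;
   Last  : Natural;
begin
   loop
      Next_Word (Line, Pos, First, Last);
      exit when First > Last;
      --  The word is used as a slice of Line, not stored separately.
      Ada.Text_IO.Put_Line (Line (First .. Last));
      Pos := Last + 1;
   end loop;
end Scan_Demo;
```

The whole input is traversed once, and no intermediate token strings
are built.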

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


* Re: When to use Bounded_String?
  2017-12-28 12:00     ` Dmitry A. Kazakov
@ 2017-12-28 12:29       ` Mehdi Saada
  2017-12-29  0:42         ` Randy Brukardt
  2017-12-29  9:11         ` Simon Wright
  2017-12-28 14:28       ` vincent.diemunsch
  1 sibling, 2 replies; 13+ messages in thread
From: Mehdi Saada @ 2017-12-28 12:29 UTC


Why hasn't GNATCOLL become part of the standard, since it's made by AdaCore (correct me if I'm wrong), is allegedly so much better than Ada.Strings, and since GNAT is now the de facto only fully certified Ada 2012 compiler?


* Re: When to use Bounded_String?
  2017-12-28 12:00     ` Dmitry A. Kazakov
  2017-12-28 12:29       ` Mehdi Saada
@ 2017-12-28 14:28       ` vincent.diemunsch
  2017-12-29  0:36         ` Randy Brukardt
  1 sibling, 1 reply; 13+ messages in thread
From: vincent.diemunsch @ 2017-12-28 14:28 UTC


On Thursday, 28 December 2017 at 13:00:46 UTC+1, Dmitry A. Kazakov wrote:
> > Yes, they are really a great improvement. But they would be perfect if:
> > 1. they handled UTF-8 as the de facto standard encoding for strings.
> 
> You can ignore encoding and use them as if they were UTF-8
> 
Sure. That's what is done, at least on Unixes (Linux and OSX).


> > 2. they could treat strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).
> 
> 23 / 4 = 5 characters

No. At least 5 characters if they are very complicated, but up to 23 ASCII characters.
The idea here is to decode the UTF-8 string to extract a character and return it as a Unicode code point in the most common format for integers: 32 bits.

The only limitation is that you would have sequential access to the string, not random access as with the usual array of characters. But I really don't see the
point of having random access to the characters in a string!
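
For illustration only: the Ada 2012 standard library can already decode
a UTF-8 encoded String into 32-bit code points, although it converts the
whole string at once, which is not the sequential access described
above. A minimal sketch, with Decode_Demo and the sample text as
assumptions:

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Decode_Demo is
   package UTF renames Ada.Strings.UTF_Encoding;

   --  "décembre" as UTF-8 bytes; the e-acute takes two bytes (C3 A9).
   Encoded : constant UTF.UTF_8_String :=
     "d" & Character'Val (16#C3#) & Character'Val (16#A9#) & "cembre";

   --  One Wide_Wide_Character per Unicode code point.
   Decoded : constant Wide_Wide_String :=
     UTF.Wide_Wide_Strings.Decode (Encoded);
begin
   Ada.Text_IO.Put_Line ("bytes:" & Integer'Image (Encoded'Length));
   Ada.Text_IO.Put_Line ("code points:" & Integer'Image (Decoded'Length));
end Decode_Demo;
```

The byte count is larger than the code point count, which is the
difference between storage length and character count at issue here.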

> P.S. Just never copy strings if you have performance concerns (even if 
> you have none). Nothing to optimize then. Use string slices, pass string 
> + an index to start at, do everything in a single pass, there is no 
> reason to waste CPU time, memory and brain cells on "tokenizing".

True. Except for storing the identifiers in a symbol table...

Kind regards,

Vincent

* Re: When to use Bounded_String?
  2017-12-28 14:28       ` vincent.diemunsch
@ 2017-12-29  0:36         ` Randy Brukardt
  2017-12-29  8:48           ` Dmitry A. Kazakov
  0 siblings, 1 reply; 13+ messages in thread
From: Randy Brukardt @ 2017-12-29  0:36 UTC


<vincent.diemunsch@gmail.com> wrote in message 
news:37c30172-9386-45fb-86d0-a10998fcade8@googlegroups.com...
On Thursday, 28 December 2017 at 13:00:46 UTC+1, Dmitry A. Kazakov wrote:
...
>> P.S. Just never copy strings if you have performance concerns (even if
>> you have none). Nothing to optimize then. Use string slices, pass string
>> + an index to start at, do everything in a single pass, there is no
>> reason to waste CPU time, memory and brain cells on "tokenizing".
>
>True. Except for storing the identifiers in a symbol table...

It's probably good advice in general, but as always it depends on the 
problem in question. If the problem can be solved better by a sequence of 
transformations rather than something monolithic, then copying is 
inevitable. For instance, in my spam filter, I have a transformed version of 
the message that contains just the text (eliminating the markup, line ends, 
and the like), to be used for phrase matching. Otherwise, the spammer could 
easily hide bad phrases by including (invisible) markup or line endings. 
That requires a copy of the string.

                           Randy.



* Re: When to use Bounded_String?
  2017-12-28 12:29       ` Mehdi Saada
@ 2017-12-29  0:42         ` Randy Brukardt
  2017-12-29  9:11         ` Simon Wright
  1 sibling, 0 replies; 13+ messages in thread
From: Randy Brukardt @ 2017-12-29  0:42 UTC


"Mehdi Saada" <00120260a@gmail.com> wrote in message 
news:158d76ca-7061-400e-8077-222bd4e390d2@googlegroups.com...
> Why hasn't GNATCOLL become part of the standard, since it's made
> by AdaCore (correct me if I'm wrong), is allegedly so much better than
> Ada.Strings, and since GNAT is now the de facto only fully certified
> Ada 2012 compiler?

(1) AdaCore has nothing (directly) to do with the Standard, other than 
financial support of ARG members.
(2) No one has volunteered to do the work of converting the packages into 
the form of the Standard. This is a LOT of effort (and I know, having done 
that with what became Ada.Directories and Ada.Calendar.Arithmetic -- which 
started out life as Claw packages).
(3) Parts of GNATColl are very much dependent on GNAT, while the Standard 
remains independent of any particular implementation. (We don't want GNAT to 
be the only Ada 2012 compiler forever!)
(4) It's debatable if GNATColl is really better than Ada.Strings.
(5) In general, we've not wanted many different libraries in the Standard 
that do the same thing. If a replacement for Ada.Strings were to be adopted, 
it would have to provide much better functionality than the existing 
libraries, not just a small performance improvement.

                                        Randy.



* Re: When to use Bounded_String?
  2017-12-29  0:36         ` Randy Brukardt
@ 2017-12-29  8:48           ` Dmitry A. Kazakov
  0 siblings, 0 replies; 13+ messages in thread
From: Dmitry A. Kazakov @ 2017-12-29  8:48 UTC


On 2017-12-29 01:36, Randy Brukardt wrote:
> For instance, in my spam filter, I have a transformed version of
> the message that contains just the text (eliminating the markup, line ends,
> and the like), to be used for phrase matching. Otherwise, the spammer could
> easily hide bad phrases by including (invisible) markup or line endings.
> That requires a copy of the string.

Though this is rather translation into another string [pattern]. So 
copying is reasonable and justified here.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

* Re: When to use Bounded_String?
  2017-12-28 12:29       ` Mehdi Saada
  2017-12-29  0:42         ` Randy Brukardt
@ 2017-12-29  9:11         ` Simon Wright
  1 sibling, 0 replies; 13+ messages in thread
From: Simon Wright @ 2017-12-29  9:11 UTC


Mehdi Saada <00120260a@gmail.com> writes:

> GNAT is now the de facto only fully certified Ada 2012 compiler

GNAT implements a lot of Ada 2012, but definitely not all of the
optional parts; for instance, in both GNAT GPL 2017 and FSF GCC <= 8
Ada.Directories.Hierarchical_File_Names is
unimplemented. (Ada.Directories.Name_Case_Equivalence is also missing,
which I think is a mistake[1]).

Even the fully-supported versions available only to AdaCore customers
will have bugs/features, which may be OS-dependent; out of the box,
Linux & macOS systems have problems around ceiling locking (don't know
about Windows).

I don't know what form of certification is available/provided to AdaCore
customers, but it's likely to say something like "conforms to the ARM
with the following exceptions: <list of issues>".

None of this is intended as a criticism of AdaCore, who do an amazing
job!

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80869

