From: Jacob Sparre Andersen
Newsgroups: comp.lang.ada
Subject: Re: Strange crash on custom iterator
Date: Wed, 04 Jul 2018 19:51:19 +0200
Organization: JSA Research & Innovation
Message-ID: <87efginb3c.fsf@adaheads.home>

J-P. Rosen writes:

> !!!! I, and many others, often need to search substrings within a
> string; actually, I would have a hard time finding an example of
> string manipulation without indexing...

When you search for a substring within a string, you're typically
treating it in a very sequential manner. Maintaining a "cursor"
pointing at the octet position in the UTF-8 encoded string would be
just as practical in most (all?) of the string processing I can
remember doing.

Counting the number of code points(?) in a string takes longer with
UTF-8, but if you want the actual number of graphemes in the string,
Wide_Wide_Character is practically just as slow as a UTF-8 encoded
string, because a grapheme may consist of several code points in
either representation.

> So, you want different types, plus a typing system that would allow to
> mix the types and make them compatible... You might as well put
> everything in the same type!

It would be nice if the encoding and character set of a string were
"implementation details". I'm not sure how to do it, but I think it
is worth trying to find a solution for Ada. (I think I was introduced
to how the KDE library does it once, but IIRC only the encoding was
abstracted away.)

Greetings,

Jacob
-- 
»Saving keystrokes is the job of the text editor,
 not the programming language.« -- Preben Randhol
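
To illustrate the octet-cursor idea from the post, here is a minimal sketch. It is in Python rather than Ada only to keep it short, and the names find_all and count_code_points are made up for this example. It assumes the input is valid UTF-8. The substring search works directly on the encoded octets and returns octet offsets that can be reused as cursors, whereas counting code points needs a pass over the whole string.

```python
def find_all(haystack: bytes, needle: bytes) -> list:
    """Return the octet offsets of all non-overlapping matches of needle."""
    if not needle:
        raise ValueError("needle must not be empty")
    offsets = []
    cursor = 0  # octet position in the UTF-8 encoded haystack
    while True:
        cursor = haystack.find(needle, cursor)
        if cursor < 0:
            return offsets
        offsets.append(cursor)
        cursor += len(needle)  # continue right after the match


def count_code_points(data: bytes) -> int:
    """Count code points by skipping UTF-8 continuation octets (10xxxxxx)."""
    return sum(1 for octet in data if (octet & 0xC0) != 0x80)


text = "blåbærgrød og rødgrød".encode("utf-8")
needle = "grød".encode("utf-8")

offsets = find_all(text, needle)
print(offsets)                   # [8, 21]  (octet positions, not code points)
print(len(text))                 # 26 octets
print(count_code_points(text))   # 21 code points
```

Because the offsets are octet positions of a match, a slice such as text[offsets[1]:] resumes at a character boundary without decoding anything before it. Counting graphemes would need more than either function does here, since one grapheme can be made of several code points.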