From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Out_File , excess line
Date: Mon, 1 Feb 2016 09:32:46 +0100
References: <334e0b7a-114c-4bdd-a511-506479f8e572@googlegroups.com>

On 01/02/2016 07:22, comicfanzine@gmail.com wrote:
> With Dimitry's method , there is no more the excess line .
>
> However , there is neither no control for handling text by Unbounded_Strings .
>
> Maybe it can be solved by a type conversion betwen :
> Stream_Element <--> Unbounded_String .
>
> It can't be done with Explicit type conversion .
>
>
> How to do this ?

You cannot do that without knowing the encoding, obviously.

Consider:

1. ASCII, Latin-1 or UTF-8 packed into an array of Character
2. A byte = 8-bit machine (except for some DSPs, that is almost any machine)

Then a Stream_Element can be converted to a Character this way:

   Character'Val (Buffer (Index))

Or, let's take:

1. UCS-2 or UTF-16 packed into an array of Wide_Character
2. byte = 8-bit, again
3. Big-endian code points

   Wide_Character'Val
   (  Integer (Buffer (Index)) * 256
   +  Integer (Buffer (Index + 1))
   )

For text processing you must consider how you keep Unicode text. There are basically two reasonable approaches:

1. String with the text stored in UTF-8
2. Wide_Wide_String or an array of code points declared as an integer type.

The second method is rarely used because it wastes memory and performance, while the advantage of having directly indexed characters is negligible for text processing. The first method requires re-coding if the source is not UTF-8.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
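
P.S. For illustration, the two conversions above could be wrapped into small functions roughly like this. It is only a sketch under the stated assumptions (8-bit Stream_Element, big-endian 16-bit units). The names Decode_Demo, To_Unbounded and To_Wide are made up for the example and are not part of any library. Note that UTF-16 surrogate pairs are not combined here; each 16-bit unit simply becomes one Wide_Character.

```ada
with Ada.Streams;           use Ada.Streams;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Decode_Demo is

   --  Case 1: ASCII, Latin-1 or UTF-8, one Stream_Element per Character.
   --  Assumes Stream_Element is an 8-bit byte.  UTF-8 is kept encoded;
   --  no re-coding is attempted.
   function To_Unbounded
     (Buffer : Stream_Element_Array) return Unbounded_String
   is
      Result : String (1 .. Buffer'Length);
      Pos    : Positive := 1;
   begin
      for Src in Buffer'Range loop
         Result (Pos) := Character'Val (Buffer (Src));
         Pos := Pos + 1;
      end loop;
      return To_Unbounded_String (Result);
   end To_Unbounded;

   --  Case 2: UCS-2 or UTF-16, big-endian, two Stream_Elements per
   --  Wide_Character.  A trailing odd byte, if any, is ignored.
   function To_Wide (Buffer : Stream_Element_Array) return Wide_String is
      Result : Wide_String (1 .. Buffer'Length / 2);
      Src    : Stream_Element_Offset := Buffer'First;
   begin
      for Pos in Result'Range loop
         Result (Pos) :=
           Wide_Character'Val
             (Integer (Buffer (Src)) * 256 + Integer (Buffer (Src + 1)));
         Src := Src + 2;
      end loop;
      return Result;
   end To_Wide;

   --  Sample data, invented for the example.
   Bytes : constant Stream_Element_Array (1 .. 3) :=
     (16#41#, 16#64#, 16#61#);
   Units : constant Stream_Element_Array (1 .. 4) :=
     (16#00#, 16#41#, 16#04#, 16#14#);

begin
   Ada.Text_IO.Put_Line (To_String (To_Unbounded (Bytes)));
   Ada.Text_IO.Put_Line
     ("Wide characters decoded:" & Integer'Image (To_Wide (Units)'Length));
end Decode_Demo;
```

If the source is in some other encoding, the first function would have to be followed by a re-coding step to UTF-8, which is not shown.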