comp.lang.ada
From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Subject: Re: unicode and wide_text_io
Date: Sat, 30 Dec 2017 10:33:26 -0500
Message-ID: <19cf4dhtoec32ti6nnnduqrgatdj27phvm@4ax.com>
In-Reply-To: p2822e$7eh$1@dont-email.me

On Sat, 30 Dec 2017 13:50:37 +0100, Björn Lundin <b.f.lundin@gmail.com>
declaimed the following:

>On 2017-12-28 23:36, Mehdi Saada wrote:
>> Myself:
>>> Are there positions where Wide_Character'Val(X) doesn't correspond to the Xth character in the Unicode standard?
>> Of course: Character'Val(156) to 'Val(255) are one byte long, whereas in UTF-8 the corresponding code points are encoded with two bytes. Did I understand the lesson?
>
>Yes - if it fits into 2 bytes. If not, UTF-8 uses 3 or 4 bytes instead.
>So UTF-8 can use codepoints up to 32 bits (ca 4 billion)
>
>codepoint between
>1     -> 2**8  -1 = 1 byte

	Isn't that 0 .. 2**7 - 1? Any byte with the MSB set is part of a multibyte
sequence (and the number of leading 1 bits before the first 0 bit in the lead
byte indicates how many bytes the sequence uses).
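
	A rough sketch of that lead-byte rule, in Python only for brevity. The
function name utf8_sequence_length is made up for illustration, and the
check is deliberately loose: it counts leading 1 bits and does not reject
every byte that strict UTF-8 forbids as a lead byte.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with lead_byte.

    Hypothetical helper: it only inspects the leading bits and does not
    reject every lead byte that strict UTF-8 disallows (e.g. 0xC0, 0xC1).
    """
    if not 0 <= lead_byte <= 0xFF:
        raise ValueError("not a single byte value")

    # MSB clear: plain 7-bit ASCII, one byte.
    if (lead_byte & 0x80) == 0:
        return 1

    # Count the 1 bits from the MSB down to the first 0 bit.
    ones = 0
    mask = 0x80
    while lead_byte & mask:
        ones += 1
        mask >>= 1

    if ones == 1:
        # 10xxxxxx is a continuation byte, never the start of a sequence.
        raise ValueError("continuation byte, not a lead byte")
    if ones > 4:
        raise ValueError("no UTF-8 sequence is longer than 4 bytes")
    return ones


# The first byte of each encoded character tells how long the sequence is.
for ch in ("A", "é", "€"):
    encoded = ch.encode("utf-8")
    assert utf8_sequence_length(encoded[0]) == len(encoded)
```
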

>2**8  -> 2**16 -1 = 2 bytes
>2**16 -> 2**24 -1 = 3 bytes
>2**24 -> 2**32 -1 = 4 bytes
>
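
	For comparison, the boundaries as I understand the current UTF-8
definition (RFC 3629) fall at 2**7, 2**11 and 2**16, and the encoding stops
at U+10FFFF rather than at 2**32 - 1. A small Python sketch, with the
made-up name utf8_encoded_length, that can be checked against Python's own
encoder:

```python
def utf8_encoded_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for a code point (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not encoded in UTF-8")

    if code_point < 0x80:      # 0       .. 2**7  - 1
        return 1
    if code_point < 0x800:     # 2**7    .. 2**11 - 1
        return 2
    if code_point < 0x10000:   # 2**11   .. 2**16 - 1
        return 3
    return 4                   # 2**16   .. 16#10FFFF#


# Cross-check the boundaries against the built-in encoder.
for cp in (0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF):
    assert utf8_encoded_length(cp) == len(chr(cp).encode("utf-8"))
```

	So Character'Val(128) .. Character'Val(255), one byte each in Latin-1,
all land in the two-byte range.
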
>-- 
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
    wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
