* Re: Checking to see if a string is a letter
From: Dmitry A. Kazakov @ 2012-04-03 13:46 UTC
On Tue, 03 Apr 2012 09:26:40 +0100, Simon Wright wrote:
> * use the standard library, Ada.Characters.Handling.Is_Letter (probably
> the easiest for you!)
Ada.Wide_Wide_Characters.Handling.Is_Letter for Unicode.
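
A minimal sketch of both checks, assuming an Ada 2012 compiler. The
names Letter_Check, Is_Single_Letter and Is_Single_Wide_Letter are
made up for this illustration; only the two Is_Letter functions come
from the standard library.

```ada
with Ada.Characters.Handling;
with Ada.Text_IO;
with Ada.Wide_Wide_Characters.Handling;

procedure Letter_Check is

   --  True if S is exactly one Latin-1 letter (hypothetical helper).
   function Is_Single_Letter (S : String) return Boolean is
     (S'Length = 1
      and then Ada.Characters.Handling.Is_Letter (S (S'First)));

   --  Same test for one Unicode code point (hypothetical helper).
   function Is_Single_Wide_Letter (S : Wide_Wide_String) return Boolean is
     (S'Length = 1
      and then Ada.Wide_Wide_Characters.Handling.Is_Letter (S (S'First)));

   --  U+3042 HIRAGANA LETTER A, built from its code point so that the
   --  source file encoding does not matter.
   Hiragana_A : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#3042#));

begin
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Single_Letter ("a")));   --  TRUE
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Single_Letter ("ab")));  --  FALSE
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Single_Letter ("7")));   --  FALSE
   Ada.Text_IO.Put_Line
     (Boolean'Image (Is_Single_Wide_Letter (Hiragana_A)));
end Letter_Check;
```

Note that the Wide_Wide version tests one code point, not one displayed
letter.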
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
* Re: Why no Ada.Wide_Directories?
From: ytomino @ 2011-10-18 22:54 UTC
On Oct 19, 12:02 am, Adam Beneschan <a...@irvine.com> wrote:
> I think we have a terminology problem.
OK, sorry that my argument was not well organized.
Let me confirm the terms.
> Latin-1 is a set of characters (a subset of the full Unicode character set).
Yes.
And it is also used as the name of an encoding (ISO 8859-1, as Yannick
calls it).
> So I get
> confused when people talk about Latin-1 versus UTF-8 strings as if
> they were mutually exclusive. They're not, the way I understand the
> terms. You can have a string composed of Latin-1 characters that's
> represented using UTF-8 encoding; and the bits in that string would be
> different from a string of the same Latin-1 characters using the
> "regular" encoding, if any character in the string is in the 16#80#..
> 16#FF# range.
Yes.
"Latin-1 as a character set" is not mutually exclusive with Unicode
(UCS-2 or UCS-4).
"Latin-1 as an encoding" is mutually exclusive with UTF-8.
And I was (we were?) talking about "Latin-1 as an encoding".
> On the other hand, I was confused by your statement
> "Ada.Character.Handling.To_Upper breaks UTF-8". I don't even see a
> way for this to make sense. Ada.Characters.Handling works on
> character types, and a character type is an enumeration type; but a
> UTF-8 "character" can't be an enumeration type at all, since it's a
> variable-length sequence of 8-bit bytes. I'm not quite sure what you
> meant here.
Ada.Characters and Ada.Strings are defined to treat the String type as
"Latin-1 as an encoding".
If we call some of their subprograms (like To_Upper) on a String
holding UTF-8, they replace upper-half characters (16#80#..16#FF#)
with meaningless values. (Equal_Case_Insensitive does not replace
characters, but it returns a meaningless result if the parameters
contain upper-half characters encoded as UTF-8.)
Of course, Ada.Wide_Wide_Characters.Handling.To_Upper
(UTF_Encoding.Wide_Wide_Strings.Decode (any UTF-8 encoded string))
works fine.
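
A minimal sketch of the difference, assuming an Ada 2012 compiler.
Upper_Demo and Put_Bytes are names invented for this illustration, and
the sample text (U+3042, three UTF-8 bytes) is an arbitrary choice.

```ada
with Ada.Characters.Handling;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;
with Ada.Wide_Wide_Characters.Handling;

procedure Upper_Demo is
   package UTF renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
   subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

   --  Print each byte of S as a decimal value (hypothetical helper).
   procedure Put_Bytes (Label : String; S : String) is
   begin
      Ada.Text_IO.Put (Label);
      for C of S loop
         Ada.Text_IO.Put (Integer'Image (Character'Pos (C)));
      end loop;
      Ada.Text_IO.New_Line;
   end Put_Bytes;

   --  U+3042 (HIRAGANA LETTER A) encoded as UTF-8: 16#E3# 16#81# 16#82#.
   Source : constant UTF_8_String :=
     Character'Val (16#E3#) & Character'Val (16#81#) & Character'Val (16#82#);

   --  Byte-wise mapping: 16#E3# is a lower-case letter in Latin-1, so it
   --  is mapped to 16#C3# and the UTF-8 sequence is no longer valid.
   Bytewise : constant String := Ada.Characters.Handling.To_Upper (Source);

   --  Code-point mapping: decode, map case, encode again.
   Mapped : constant UTF_8_String :=
     UTF.Encode
       (Ada.Wide_Wide_Characters.Handling.To_Upper (UTF.Decode (Source)));
begin
   Put_Bytes ("source  :", Source);    --  227 129 130
   Put_Bytes ("bytewise:", Bytewise);  --  195 129 130
   Put_Bytes ("mapped  :", Mapped);    --  227 129 130 (no case, unchanged)
end Upper_Demo;
```

The byte-wise call changes the lead byte of a character that has no
upper-case form at all, which is the kind of damage described above.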
> As to having utilities such as versions of Ada.Strings.Unbounded or
> Ada.Strings.Fixed that work directly on UTF-8-encoded strings (and
> versions of Ada.Characters that operate on single UTF-8-encoded
> characters): it's certainly possible to write a package like that, and
> anyone is free to do so, but I just don't think they'd be widely used
> enough to add to the Standard. I could be wrong.
When I read about the UTF-8 mode of the Form parameter that Randy
mentioned, I thought the standard library was going to separate UTF-8
from Latin-1.
I do not usually work with Latin-1, so I have wanted UTF-8 versions
of Ada.Characters. Sorry for mixing in my personal wish.
But it is certain that the standard library lacks some support for
handling non-ASCII file names.
By the way...
I will probably confuse you more :-)
Do you know that a single code point is NOT always a single displayed
letter? Unicode has "composed characters": there are cases where
several code points represent a single real letter.
(refer http://www.unicode.org/reports/tr15/tr15-33.html)
In addition, Unicode has "variation selectors". A variation selector
is a decorator for the preceding letter (and it can be mixed with
composed characters).
(refer http://www.unicode.org/Public/UNIDATA/StandardizedVariants.html)
Therefore, handling Wide_Wide_String is in fact about as difficult as
handling an encoded (UTF-8 or other format) string.
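
A small sketch of the code point versus letter distinction, assuming
any Ada 2005 or later compiler. Combining_Demo is a name invented for
this illustration; the two spellings of "e with acute accent" are a
standard example from Unicode normalization.

```ada
with Ada.Text_IO;

procedure Combining_Demo is
   --  U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE
   --  ACCENT: two code points, displayed as one letter.
   Decomposed : constant Wide_Wide_String :=
     Wide_Wide_Character'Val (16#65#) & Wide_Wide_Character'Val (16#301#);

   --  U+00E9 LATIN SMALL LETTER E WITH ACUTE: the same letter as a
   --  single code point.
   Precomposed : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#E9#));
begin
   Ada.Text_IO.Put_Line (Natural'Image (Decomposed'Length));   --   2
   Ada.Text_IO.Put_Line (Natural'Image (Precomposed'Length));  --   1

   --  Plain array comparison does not know the two are equivalent.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Decomposed = Precomposed));               --  FALSE
end Combining_Demo;
```

So 'Length, indexing and "=" on Wide_Wide_String still work on code
points, not on displayed letters; treating the two spellings as equal
would need a normalization step, which this sketch does not attempt.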
2011-10-14 6:58 Why no Ada.Wide_Directories? Michael Rohan
2011-10-15 1:06 ` ytomino
2011-10-17 21:33 ` Randy Brukardt
2011-10-17 23:47 ` ytomino
2011-10-18 1:10 ` Adam Beneschan
2011-10-18 2:32 ` ytomino
2011-10-18 15:02 ` Adam Beneschan
2011-10-18 22:54 6% ` ytomino
2012-04-03 2:11 Checking to see is a string is a letter deuteros
2012-04-03 4:18 ` Leo Brewin
2012-04-03 4:52 ` Checking to see if " deuteros
2012-04-03 5:15 ` Jeffrey Carter
2012-04-03 6:07 ` deuteros
2012-04-03 8:26 ` Simon Wright
2012-04-03 13:46 7% ` Dmitry A. Kazakov