* Strange crash on custom iterator @ 2018-06-30 10:48 Lucretia 2018-06-30 11:32 ` Simon Wright 0 siblings, 1 reply; 73+ messages in thread From: Lucretia @ 2018-06-30 10:48 UTC (permalink / raw) I finally got around to getting back to my iterator and on a first test implementation, i.e. to just iterate over each element of the array, the thing crashes in the Element function when accessing the array through the cursor. The source is https://bpaste.net/show/6c5fca4c0ffd and the gdb session is https://bpaste.net/show/5b0cf9d2be79 Any idea how it could be doing this? I'm wondering if this is because I'm not using a tagged type. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 10:48 Strange crash on custom iterator Lucretia @ 2018-06-30 11:32 ` Simon Wright 2018-06-30 12:02 ` Lucretia 0 siblings, 1 reply; 73+ messages in thread From: Simon Wright @ 2018-06-30 11:32 UTC (permalink / raw) Lucretia <laguest9000@googlemail.com> writes: > I finally got around to getting back to my iterator and on a first > test implementation, i.e. to just iterate over each element of the > array, the thing crashes in the Element function when accessing the > array through the cursor. > > The source is https://bpaste.net/show/6c5fca4c0ffd and the gdb session > is https://bpaste.net/show/5b0cf9d2be79 UCA.Encoding missing? ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 11:32 ` Simon Wright @ 2018-06-30 12:02 ` Lucretia 2018-06-30 14:25 ` Simon Wright 0 siblings, 1 reply; 73+ messages in thread From: Lucretia @ 2018-06-30 12:02 UTC (permalink / raw) On Saturday, 30 June 2018 12:32:14 UTC+1, Simon Wright wrote: > Lucretia <> writes: > > > I finally got around to getting back to my iterator and on a first > > test implementation, i.e. to just iterate over each element of the > > array, the thing crashes in the Element function when accessing the > > array through the cursor. > > > > The source is https://bpaste.net/show/6c5fca4c0ffd and the gdb session > > is https://bpaste.net/show/5b0cf9d2be79 > > UCA.Encoding missing? Balls! https://bpaste.net/show/a0d108820ce6 ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 12:02 ` Lucretia @ 2018-06-30 14:25 ` Simon Wright 2018-06-30 14:33 ` Lucretia 2018-06-30 14:34 ` Lucretia 0 siblings, 2 replies; 73+ messages in thread From: Simon Wright @ 2018-06-30 14:25 UTC (permalink / raw)

Lucretia <laguest9000@googlemail.com> writes:

> On Saturday, 30 June 2018 12:32:14 UTC+1, Simon Wright wrote:
>> Lucretia <> writes:
>>
>> > I finally got around to getting back to my iterator and on a first
>> > test implementation, i.e. to just iterate over each element of the
>> > array, the thing crashes in the Element function when accessing the
>> > array through the cursor.
>> >
>> > The source is https://bpaste.net/show/6c5fca4c0ffd and the gdb session
>> > is https://bpaste.net/show/5b0cf9d2be79
>>
>> UCA.Encoding missing?
>
> Balls! https://bpaste.net/show/a0d108820ce6

First, I think Has_Element should probably be

   function Has_Element (Position : in Cursor) return Boolean is
   begin
      return Position.Index in Position.Data'Range;
   end Has_Element;

Second, there's something odd about the Address_To_Access_Conversions: in Iterate, the address of the passed Container (which is on the stack!!!) appears in I.Data, but I.Data's length is 0.

I got it to work (at first glance) with

   type Cursor is record
      Data  : Unicode_String_Access := null;
      Index : Positive := Positive'Last;
   end record;

   type Code_Point_Iterator is new Limited_Controlled and
     Code_Point_Iterators.Forward_Iterator with record
      Data : Unicode_String_Access := null;
   end record;

and in Iterate

   return I : Code_Point_Iterator :=
     (Limited_Controlled with
        Data => new Unicode_String'(Container)) do

but of course you probably don't want the copy.

^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 14:25 ` Simon Wright @ 2018-06-30 14:33 ` Lucretia 2018-06-30 19:25 ` Simon Wright 2018-06-30 14:34 ` Lucretia 1 sibling, 1 reply; 73+ messages in thread From: Lucretia @ 2018-06-30 14:33 UTC (permalink / raw) On Saturday, 30 June 2018 15:25:39 UTC+1, Simon Wright wrote: > but of course you probably don't want the copy. Exactly! :( ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 14:33 ` Lucretia @ 2018-06-30 19:25 ` Simon Wright 2018-06-30 19:36 ` Luke A. Guest 0 siblings, 1 reply; 73+ messages in thread From: Simon Wright @ 2018-06-30 19:25 UTC (permalink / raw) Lucretia <laguest9000@googlemail.com> writes: > On Saturday, 30 June 2018 15:25:39 UTC+1, Simon Wright wrote: > >> but of course you probably don't want the copy. > > Exactly! :( I suspect that Unicode_String would need to be by-reference. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 19:25 ` Simon Wright @ 2018-06-30 19:36 ` Luke A. Guest 2018-07-01 18:06 ` Jacob Sparre Andersen 0 siblings, 1 reply; 73+ messages in thread From: Luke A. Guest @ 2018-06-30 19:36 UTC (permalink / raw) Simon Wright <simon@pushface.org> wrote: > Lucretia < > >> On Saturday, 30 June 2018 15:25:39 UTC+1, Simon Wright wrote: >> >>> but of course you probably don't want the copy. >> >> Exactly! :( > > I suspect that Unicode_String would need to be by-reference. > Yeah I think I’m going to have to make it tagged. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 19:36 ` Luke A. Guest @ 2018-07-01 18:06 ` Jacob Sparre Andersen 2018-07-01 19:59 ` Simon Wright 2018-07-02 8:31 ` Lucretia 0 siblings, 2 replies; 73+ messages in thread From: Jacob Sparre Andersen @ 2018-07-01 18:06 UTC (permalink / raw) Luke A. Guest wrote: > Simon Wright <simon@pushface.org> wrote: >> I suspect that Unicode_String would need to be by-reference. > > Yeah I think I’m going to have to make it tagged. You don't need to make it tagged, to pass it by reference. It is enough to make the formal parameter aliased. Greetings, Jacob -- What the Iron Maiden was to stupid tyrants, the committee was to Lord Vetinari; it was only slightly more expensive, far less messy, considerably more efficient and, best of all, you had to *force* people to climb inside the Iron Maiden. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-01 18:06 ` Jacob Sparre Andersen @ 2018-07-01 19:59 ` Simon Wright 2018-07-02 17:43 ` Luke A. Guest 2018-07-02 8:31 ` Lucretia 1 sibling, 1 reply; 73+ messages in thread From: Simon Wright @ 2018-07-01 19:59 UTC (permalink / raw) Jacob Sparre Andersen <jacob@jacob-sparre.dk> writes: > Luke A. Guest wrote: >> Simon Wright <simon@pushface.org> wrote: > >>> I suspect that Unicode_String would need to be by-reference. >> >> Yeah I think I’m going to have to make it tagged. > > You don't need to make it tagged, to pass it by reference. It is enough > to make the formal parameter aliased. Yes, that works (except you have to make the container you're iterating over aliased too). ^ permalink raw reply [flat|nested] 73+ messages in thread
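A minimal sketch of the approach Jacob and Simon describe: the formal parameter is explicitly aliased, so the iterator can hold an access value that designates the caller's array rather than a copy. The names Iterator_Sketch, Octet_Strings and Octet_Array are hypothetical and are not the thread's UCA packages. Using 'Unchecked_Access is an assumption made here to sidestep the accessibility check; it leaves the caller responsible for keeping the array alive while iterating.

```ada
with Ada.Iterator_Interfaces;
with Ada.Text_IO; use Ada.Text_IO;

procedure Iterator_Sketch is

   --  Hypothetical stand-in for the thread's UCA / UCA.Iterators packages.
   package Octet_Strings is
      type Octets is mod 2 ** 8;
      type Octet_Array is array (Positive range <>) of Octets;

      type Cursor is private;
      function Has_Element (Position : Cursor) return Boolean;
      function Element (Position : Cursor) return Octets;

      package Iterators is
        new Ada.Iterator_Interfaces (Cursor, Has_Element);

      --  "aliased" forces pass-by-reference, so no copy is made.
      function Iterate (Container : aliased Octet_Array)
        return Iterators.Forward_Iterator'Class;
   private
      type Octet_Array_Access is access constant Octet_Array;

      type Cursor is record
         Data  : Octet_Array_Access := null;
         Index : Positive := Positive'Last;
      end record;

      type Iterator is new Iterators.Forward_Iterator with record
         Data : Octet_Array_Access := null;
      end record;

      overriding function First (Object : Iterator) return Cursor;
      overriding function Next
        (Object : Iterator; Position : Cursor) return Cursor;
   end Octet_Strings;

   package body Octet_Strings is
      function Has_Element (Position : Cursor) return Boolean is
      begin
         --  Range check as suggested earlier in the thread, plus a
         --  null guard for default-initialised cursors.
         return Position.Data /= null
           and then Position.Index in Position.Data'Range;
      end Has_Element;

      function Element (Position : Cursor) return Octets is
      begin
         return Position.Data (Position.Index);
      end Element;

      function Iterate (Container : aliased Octet_Array)
        return Iterators.Forward_Iterator'Class is
      begin
         return Result : Iterator do
            --  Assumption: the caller keeps Container alive for as
            --  long as the iterator and its cursors are in use.
            Result.Data := Container'Unchecked_Access;
         end return;
      end Iterate;

      overriding function First (Object : Iterator) return Cursor is
      begin
         return (Data => Object.Data, Index => Object.Data'First);
      end First;

      overriding function Next
        (Object : Iterator; Position : Cursor) return Cursor is
      begin
         return (Data => Object.Data, Index => Position.Index + 1);
      end Next;
   end Octet_Strings;

   --  Declared with the unconstrained subtype and marked aliased,
   --  as the container being iterated over has to be.
   Sample : aliased Octet_Strings.Octet_Array := (72, 105, 33);
begin
   for C in Octet_Strings.Iterate (Sample) loop
      Put (Octet_Strings.Octets'Image (Octet_Strings.Element (C)));
   end loop;
   New_Line;
end Iterator_Sketch;
```

The cursor carries the access value and an index only, so Has_Element can test the index against the bounds of the designated array without consulting the iterator object.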
* Re: Strange crash on custom iterator 2018-07-01 19:59 ` Simon Wright @ 2018-07-02 17:43 ` Luke A. Guest 2018-07-02 19:42 ` Simon Wright 0 siblings, 1 reply; 73+ messages in thread From: Luke A. Guest @ 2018-07-02 17:43 UTC (permalink / raw) Simon Wright <> wrote: >> You don't need to make it tagged, to pass it by reference. It is enough >> to make the formal parameter aliased. > > Yes, that works (except you have to make the container you're iterating > over aliased too). I had to make the Iterate function take “aliased in out” and make the array aliased, but it still dies in the same place. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-02 17:43 ` Luke A. Guest @ 2018-07-02 19:42 ` Simon Wright 2018-07-03 14:08 ` Lucretia 0 siblings, 1 reply; 73+ messages in thread From: Simon Wright @ 2018-07-02 19:42 UTC (permalink / raw) [-- Attachment #1: Type: text/plain, Size: 450 bytes --] Luke A. Guest <laguest@archeia.com> writes: > Simon Wright <> wrote: > >>> You don't need to make it tagged, to pass it by reference. It is enough >>> to make the formal parameter aliased. >> >> Yes, that works (except you have to make the container you're iterating >> over aliased too). > > I had to make the iterate for nation take “aliased in out” and make the > array aliased, but it still does in the same place. This worked for me .. [-- Attachment #2: gnatchop-me --] [-- Type: text/plain, Size: 9880 bytes --] -- Copyright 2018, Luke A. Guest -- License TBD. with Ada.Characters.Latin_1; with Ada.Text_IO; use Ada.Text_IO; with UCA.Encoding; with UCA.Iterators; procedure Test is package L1 renames Ada.Characters.Latin_1; package Octet_IO is new Ada.Text_IO.Modular_IO (UCA.Octets); use Octet_IO; -- D : UCA.Octets := Character'Pos ('Q'); -- A : UCA.Unicode_String := UCA.To_Array (D); -- A2 : UCA.Unicode_String := UCA.Unicode_String'(1, 0, 0, 0, 0, 0, 1, 0); -- D2 : UCA.Octets := UCA.To_Octet (A2); -- package OA_IO is new Ada.Text_IO.Integer_IO (Num => UCA.Bits); use UCA.Encoding; A : aliased UCA.Unicode_String := +("ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ" & L1.LF & "Hello, world" & L1.LF & "Sîne klâwen durh die wolken sint geslagen," & L1.LF & "Τη γλώσσα μου έδωσαν ελληνική" & L1.LF & "मैं काँच खा सकता हूँ और मुझे उससे कोई चोट नहीं पहुंचती." 
& L1.LF & "میں کانچ کھا سکتا ہوں اور مجھے تکلیف نہیں ہوتی"); B : aliased UCA.Unicode_String := (225, 154, 160, 225, 155, 135, 225, 154, 187, 225, 155, 171, 225, 155, 146, 225, 155, 166, 225, 154, 166, 225, 155, 171, 225, 154, 160, 225, 154, 177, 225, 154, 169, 225, 154, 160, 225, 154, 162, 225, 154, 177, 225, 155, 171, 225, 154, 160, 225, 155, 129, 225, 154, 177, 225, 154, 170, 225, 155, 171, 225, 154, 183, 225, 155, 150, 225, 154, 187, 225, 154, 185, 225, 155, 166, 225, 155, 154, 225, 154, 179, 225, 154, 162, 225, 155, 151, 10, 72, 101, 108, 108, 111, 44, 32, 119, 111, 114, 108, 100, 10, 83, 195, 174, 110, 101, 32, 107, 108, 195, 162, 119, 101, 110, 32, 100, 117, 114, 104, 32, 100, 105, 101, 32, 119, 111, 108, 107, 101, 110, 32, 115, 105, 110, 116, 32, 103, 101, 115, 108, 97, 103, 101, 110, 44, 10, 206, 164, 206, 183, 32, 206, 179, 206, 187, 207, 142, 207, 131, 207, 131, 206, 177, 32, 206, 188, 206, 191, 207, 133, 32, 206, 173, 206, 180, 207, 137, 207, 131, 206, 177, 206, 189, 32, 206, 181, 206, 187, 206, 187, 206, 183, 206, 189, 206, 185, 206, 186, 206, 174, 10, 224, 164, 174, 224, 165, 136, 224, 164, 130, 32, 224, 164, 149, 224, 164, 190, 224, 164, 129, 224, 164, 154, 32, 224, 164, 150, 224, 164, 190, 32, 224, 164, 184, 224, 164, 149, 224, 164, 164, 224, 164, 190, 32, 224, 164, 185, 224, 165, 130, 224, 164, 129, 32, 224, 164, 148, 224, 164, 176, 32, 224, 164, 174, 224, 165, 129, 224, 164, 157, 224, 165, 135, 32, 224, 164, 137, 224, 164, 184, 224, 164, 184, 224, 165, 135, 32, 224, 164, 149, 224, 165, 139, 224, 164, 136, 32, 224, 164, 154, 224, 165, 139, 224, 164, 159, 32, 224, 164, 168, 224, 164, 185, 224, 165, 128, 224, 164, 130, 32, 224, 164, 170, 224, 164, 185, 224, 165, 129, 224, 164, 130, 224, 164, 154, 224, 164, 164, 224, 165, 128, 46, 10, 217, 133, 219, 140, 218, 186, 32, 218, 169, 216, 167, 217, 134, 218, 134, 32, 218, 169, 218, 190, 216, 167, 32, 216, 179, 218, 169, 216, 170, 216, 167, 32, 219, 129, 217, 136, 218, 186, 32, 216, 167, 217, 136, 216, 177, 
32, 217, 133, 216, 172, 218, 190, 219, 146, 32, 216, 170, 218, 169, 217, 132, 219, 140, 217, 129, 32, 217, 134, 219, 129, 219, 140, 218, 186, 32, 219, 129, 217, 136, 216, 170, 219, 140); begin -- Put_Line ("A => " & To_UTF_8_String (A)); Put_Line ("A => " & L1.LF & String (+A)); Put_Line ("A => "); Put ('('); for E of A loop Put (Item => E, Base => 2); Put (", "); end loop; Put (')'); New_Line; Put_Line ("B => " & L1.LF & String (+B)); Put_Line ("A (Iterated) => "); for I in UCA.Iterators.Iterate (A) loop Put (UCA.Iterators.Element (I)); -- ERROR! Dies in Element, Data has nothing gdb => p position - $1 = (data => (), index => 1) end loop; New_Line; end Test; with Ada.Strings.UTF_Encoding; with Ada.Unchecked_Conversion; package UCA is use Ada.Strings.UTF_Encoding; type Octets is mod 2 ** 8 with Size => 8; type Unicode_String is array (Positive range <>) of Octets with Pack => True; type Unicode_String_Access is access all Unicode_String; -- This should match Wide_Wide_Character in size. type Code_Points is mod 2 ** 32 with Static_Predicate => Code_Points in 0 .. 16#0000_D7FF# or Code_Points in 16#0000_E000# .. 16#0010_FFFF#, Size => 32; private type Bits is range 0 .. 1 with Size => 1; type Bit_Range is range 0 .. Octets'Size - 1; end UCA; with Ada.Finalization; with Ada.Iterator_Interfaces; private with System.Address_To_Access_Conversions; package UCA.Iterators is --------------------------------------------------------------------------------------------------------------------- -- Iteration over code points. 
--------------------------------------------------------------------------------------------------------------------- type Cursor is private; pragma Preelaborable_Initialization (Cursor); function Has_Element (Position : in Cursor) return Boolean; function Element (Position : in Cursor) return Octets; package Code_Point_Iterators is new Ada.Iterator_Interfaces (Cursor, Has_Element); function Iterate (Container : aliased in Unicode_String) return Code_Point_Iterators.Forward_Iterator'Class; function Iterate (Container : aliased in Unicode_String; Start : in Cursor) return Code_Point_Iterators.Forward_Iterator'Class; --------------------------------------------------------------------------------------------------------------------- -- Iteration over grapheme clusters. --------------------------------------------------------------------------------------------------------------------- private use Ada.Finalization; package Convert is new System.Address_To_Access_Conversions (Unicode_String); type Cursor is record Data : Convert.Object_Pointer := null; Index : Positive := Positive'Last; end record; type Code_Point_Iterator is new Limited_Controlled and Code_Point_Iterators.Forward_Iterator with record Data : Convert.Object_Pointer := null; end record; overriding function First (Object : in Code_Point_Iterator) return Cursor; overriding function Next (Object : in Code_Point_Iterator; Position : Cursor) return Cursor; end UCA.Iterators; with Ada.Text_IO; use Ada.Text_IO; package body UCA.Iterators is package Octet_IO is new Ada.Text_IO.Modular_IO (UCA.Octets); use Octet_IO; use type Convert.Object_Pointer; function Has_Element (Position : in Cursor) return Boolean is begin return Position.Index in Position.Data'Range; end Has_Element; function Element (Position : in Cursor) return Octets is begin if Position.Data = null then raise Constraint_Error with "Fuck!"; end if; Put ("<< Element - " & Positive'Image (Position.Index) & " - "); Put (Position.Data (Position.Index)); 
Put_Line (" >>"); return Position.Data (Position.Index); end Element; function Iterate (Container : aliased in Unicode_String) return Code_Point_Iterators.Forward_Iterator'Class is begin Put_Line ("<< iterate >>"); return I : Code_Point_Iterator := (Limited_Controlled with Data => Convert.To_Pointer (Container'Address)) do if I.Data = null then Put_Line ("Data => null"); else Put_Line ("Data => not null - Length: " & Positive'Image (I.Data'Length)); end if; null; end return; end Iterate; function Iterate (Container : aliased in Unicode_String; Start : in Cursor) return Code_Point_Iterators.Forward_Iterator'Class is begin Put_Line ("<< iterate >>"); return I : Code_Point_Iterator := (Limited_Controlled with Data => Convert.To_Pointer (Container'Address)) do if I.Data = null then Put_Line ("Data => null"); else Put_Line ("Data => not null"); end if; null; end return; end Iterate; --------------------------------------------------------------------------------------------------------------------- -- Iteration over grapheme clusters. --------------------------------------------------------------------------------------------------------------------- overriding function First (Object : in Code_Point_Iterator) return Cursor is begin return (Data => Object.Data, Index => Positive'First); end First; overriding function Next (Object : in Code_Point_Iterator; Position : Cursor) return Cursor is begin return (Data => Object.Data, Index => Position.Index + 1); end Next; end UCA.Iterators; -- Copyright © 2018, Luke A. Guest with Ada.Unchecked_Conversion; package body UCA.Encoding is function To_Unicode_String (Str : in String) return Unicode_String is Result : Unicode_String (1 .. Str'Length) with Address => Str'Address; begin return Result; end To_Unicode_String; function To_String (Str : in Unicode_String) return String is Result : String (1 .. 
Str'Length) with Address => Str'Address; begin return Result; end To_String; end UCA.Encoding; package UCA.Encoding is use Ada.Strings.UTF_Encoding; function To_Unicode_String (Str : in String) return Unicode_String; function To_String (Str : in Unicode_String) return String; function "+" (Str : in String) return Unicode_String renames To_Unicode_String; function "+" (Str : in Unicode_String) return String renames To_String; end UCA.Encoding; ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-02 19:42 ` Simon Wright @ 2018-07-03 14:08 ` Lucretia 2018-07-03 14:17 ` J-P. Rosen 0 siblings, 1 reply; 73+ messages in thread From: Lucretia @ 2018-07-03 14:08 UTC (permalink / raw)

On Monday, 2 July 2018 20:42:59 UTC+1, Simon Wright wrote:
> This worked for me ..

Thanks, I needed the extra check in Has_Element as well. But there are other issues:

1) Cannot pass an array which has been declared and initialised to an aliased parameter:

   procedure Mem is
      type Unicode_String is array (Positive range <>) of Integer;

      procedure Inner (B : aliased in out Unicode_String) is null;

      S : aliased Unicode_String(1..10) := (others => Integer'First);
      -- S : aliased Unicode_String := (1..10 => Integer'First);
   begin
      Inner (S);
   end Mem;

2) raised STORAGE_ERROR : stack overflow or erroneous memory access, when using "'Access" instead of "package Convert is new System.Address_To_Access_Conversions (Unicode_String);" and "'Address"

^ permalink raw reply [flat|nested] 73+ messages in thread
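The commented-out declaration in the test above is the variant that declares S with the unconstrained subtype and takes its bounds from the initial value. A minimal sketch of that variant follows. It rests on an assumption not confirmed anywhere in this thread: that, per the rules for explicitly aliased parameters in RM 6.4.1, the nominal subtype of an untagged actual has to statically match the subtype of the formal. The names Mem_Sketch and Int_Array are hypothetical.

```ada
procedure Mem_Sketch is
   type Int_Array is array (Positive range <>) of Integer;

   procedure Inner (B : aliased in out Int_Array) is null;

   --  Declared with the unconstrained subtype; the bounds come from
   --  the initial value. Assumption: this keeps the nominal subtype
   --  of S the same as the subtype of the formal B.
   S : aliased Int_Array := (1 .. 10 => Integer'First);
begin
   Inner (S);
end Mem_Sketch;
```

Whether the constrained declaration in the original test ought to be accepted is the subject of the compiler report mentioned later in the thread; this sketch takes no position on that.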
* Re: Strange crash on custom iterator 2018-07-03 14:08 ` Lucretia @ 2018-07-03 14:17 ` J-P. Rosen 2018-07-03 15:06 ` Lucretia 0 siblings, 1 reply; 73+ messages in thread From: J-P. Rosen @ 2018-07-03 14:17 UTC (permalink / raw) Le 03/07/2018 à 16:08, Lucretia a écrit : > type Unicode_String is array (Positive range <>) of Integer; Array of Integer???? For a Unicode_String... Btw, do you know the package Ada.Strings.UTF_Encoding ? -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 14:17 ` J-P. Rosen @ 2018-07-03 15:06 ` Lucretia 2018-07-03 15:45 ` J-P. Rosen 0 siblings, 1 reply; 73+ messages in thread From: Lucretia @ 2018-07-03 15:06 UTC (permalink / raw) On Tuesday, 3 July 2018 15:17:14 UTC+1, J-P. Rosen wrote: > Le 03/07/2018 à 16:08, Lucretia a écrit : > > type Unicode_String is array (Positive range <>) of Integer; > Array of Integer???? For a Unicode_String... Firstly, read the rest of this thread; secondly, I should've renamed that in that simple test, because IT'S A TEST to show an error in the compiler. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86391 > Btw, do you know the package Ada.Strings.UTF_Encoding ? Yes, I'm well aware of this completely useless type. 1) It's a subtype of String, which is incorrect as UTF-8 is not a superset of Latin 1, this should never have been allowed. 2) Ada needs a decent Unicode library not this half-arsed crap we have now. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 15:06 ` Lucretia @ 2018-07-03 15:45 ` J-P. Rosen 2018-07-03 15:55 ` Lucretia 2018-07-03 15:57 ` Dmitry A. Kazakov 0 siblings, 2 replies; 73+ messages in thread From: J-P. Rosen @ 2018-07-03 15:45 UTC (permalink / raw) Le 03/07/2018 à 17:06, Lucretia a écrit : > 1) It's a subtype of String, which is incorrect as UTF-8 is not a > superset of Latin 1, this should never have been allowed. In the first version of the AI, it was a different type. This has been discussed, and found much more user-friendly to have it as a subtype of String. Please read the discussions. > 2) Ada needs a decent Unicode library not this half-arsed crap we > have now. This package is about encoding only. What would you expect from a Unicode library? -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 15:45 ` J-P. Rosen @ 2018-07-03 15:55 ` Lucretia 2018-07-03 17:00 ` J-P. Rosen 2018-07-03 15:57 ` Dmitry A. Kazakov 1 sibling, 1 reply; 73+ messages in thread From: Lucretia @ 2018-07-03 15:55 UTC (permalink / raw) On Tuesday, 3 July 2018 16:46:00 UTC+1, J-P. Rosen wrote: > Le 03/07/2018 à 17:06, Lucretia a écrit : > > 1) It's a subtype of String, which is incorrect as UTF-8 is not a > > superset of Latin 1, this should never have been allowed. > In the first version of the AI, it was a different type. This has been > discussed, and found much more user-friendly to have it as a subtype of > String. Please read the discussions. > > > 2) Ada needs a decent Unicode library not this half-arsed crap we > > have now. > This package is about encoding only. What would you expect from a > Unicode library? Iterators over basic elements, the octets, Iterators over code points, Iterators over grapheme clusters, BIDI Iterators , etc. Access to the UCD. Unicode Regexps, streams, to start. ^ permalink raw reply [flat|nested] 73+ messages in thread
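For comparison with the iterators asked for above, here is a hedged sketch of what the standard encoding package offers today: code points can be reached, but only by decoding the whole UTF-8 string into a Wide_Wide_String first. The procedure name Code_Points_Demo and the sample text are illustrative assumptions.

```ada
with Ada.Text_IO;              use Ada.Text_IO;
with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Code_Points_Demo is
   --  U+00E9 followed by 'A', written out as raw UTF-8 octets.
   Text : constant UTF_8_String :=
     Character'Val (16#C3#) & Character'Val (16#A9#) & 'A';

   --  Decoding builds a second, fully decoded string.
   Decoded : constant Wide_Wide_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode (Text);
begin
   --  One iteration per code point, not per octet.
   for CP of Decoded loop
      Put_Line (Natural'Image (Wide_Wide_Character'Pos (CP)));
   end loop;
end Code_Points_Demo;
```

The loop visits two code points for three octets of input. The cost is the intermediate copy, which an in-place code point iterator over the octet array would avoid.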
* Re: Strange crash on custom iterator 2018-07-03 15:55 ` Lucretia @ 2018-07-03 17:00 ` J-P. Rosen 0 siblings, 0 replies; 73+ messages in thread From: J-P. Rosen @ 2018-07-03 17:00 UTC (permalink / raw) Le 03/07/2018 à 17:55, Lucretia a écrit : >> This package is about encoding only. What would you expect from a >> Unicode library? > Iterators over basic elements, the octets, Iterators over code > points, Iterators over grapheme clusters, BIDI Iterators , etc. > Access to the UCD. Unicode Regexps, streams, to start. Fine, you are welcome to propose a specification (not for the next version of the standard, it's too late), and if it is useful it might interest compiler vendors anyway, or you may provide your own implementation as free software. Regarding the above package, it serves its purpose, no more, no less. -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 15:45 ` J-P. Rosen 2018-07-03 15:55 ` Lucretia @ 2018-07-03 15:57 ` Dmitry A. Kazakov 2018-07-03 16:07 ` Lucretia 1 sibling, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-03 15:57 UTC (permalink / raw) On 2018-07-03 17:45, J-P. Rosen wrote: > Le 03/07/2018 à 17:06, Lucretia a écrit : >> 1) It's a subtype of String, which is incorrect as UTF-8 is not a >> superset of Latin 1, this should never have been allowed. > In the first version of the AI, it was a different type. This has been > discussed, and found much more user-friendly to have it as a subtype of > String. Please read the discussions. It must be both a different type with a distinct representation and constraints and a subtype (in non-Ada sense, the way Integer is a subtype of Universal_Integer). >> 2) Ada needs a decent Unicode library not this half-arsed crap we >> have now. > This package is about encoding only. What would you expect from a > Unicode library? Proper typing, for a start? P.S. It is clear that no decent library may exist without fixing Ada type system. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 15:57 ` Dmitry A. Kazakov @ 2018-07-03 16:07 ` Lucretia 2018-07-03 16:36 ` Dmitry A. Kazakov 0 siblings, 1 reply; 73+ messages in thread From: Lucretia @ 2018-07-03 16:07 UTC (permalink / raw) On Tuesday, 3 July 2018 16:57:07 UTC+1, Dmitry A. Kazakov wrote: > On 2018-07-03 17:45, J-P. Rosen wrote: > >> 2) Ada needs a decent Unicode library not this half-arsed crap we > >> have now. > > This package is about encoding only. What would you expect from a > > Unicode library? > > Proper typing, for a start? > > P.S. It is clear that no decent library may exist without fixing Ada > type system. In what way is it broken? ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 16:07 ` Lucretia @ 2018-07-03 16:36 ` Dmitry A. Kazakov 2018-07-03 16:42 ` Lucretia ` (2 more replies) 0 siblings, 3 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-03 16:36 UTC (permalink / raw) On 2018-07-03 18:07, Lucretia wrote: > On Tuesday, 3 July 2018 16:57:07 UTC+1, Dmitry A. Kazakov wrote: >> On 2018-07-03 17:45, J-P. Rosen wrote: > >>>> 2) Ada needs a decent Unicode library not this half-arsed crap we >>>> have now. >>> This package is about encoding only. What would you expect from a >>> Unicode library? >> >> Proper typing, for a start? >> >> P.S. It is clear that no decent library may exist without fixing Ada >> type system. > > In what way is it broken? It is not broken, it misses key features like interface inheritance. E.g. UTF8_String and String must share interfaces but have different representations. Strings [and characters] is a network of related mutually and implicitly convertible types. There is no way to design that in Ada without magic. Without having types related, we get a geometric explosion of packages: character type x encoding method x fixed/bounded/unbounded. Clearly nobody would ever add UTF-8 into this mess, because this will double the number of packages where strings are used. Static polymorphism (generics/overloading) does not work here. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 16:36 ` Dmitry A. Kazakov @ 2018-07-03 16:42 ` Lucretia 2018-07-03 16:45 ` Lucretia 2018-07-03 20:18 ` Dmitry A. Kazakov 2018-07-03 18:54 ` Dan'l Miller 2018-07-04 7:33 ` J-P. Rosen 2 siblings, 2 replies; 73+ messages in thread From: Lucretia @ 2018-07-03 16:42 UTC (permalink / raw) On Tuesday, 3 July 2018 17:36:12 UTC+1, Dmitry A. Kazakov wrote: > Without having types related, we get a geometric explosion of packages: > character type x encoding method x fixed/bounded/unbounded. Clearly > nobody would ever add UTF-8 into this mess, because this will double the > number of packages where strings are used. Static polymorphism > (generics/overloading) does not work here. Well, they kind of already did that by subtyping UTF_String from String, of which it's not a subtype, it's just they are both arrays of 8-bit entities. Am I wrong? Should I just implement what I need on top of the standard lib and just use the UTF* types in my code? What about unbounded_utf_strings? Just use the normal unbounded_string? It's not like it's going to be checking for it to be correct UTF-8, is it? But I can't write an iterator for that from outside the RTS though. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 16:42 ` Lucretia @ 2018-07-03 16:45 ` Lucretia 2018-07-03 20:18 ` Dmitry A. Kazakov 1 sibling, 0 replies; 73+ messages in thread From: Lucretia @ 2018-07-03 16:45 UTC (permalink / raw) On Tuesday, 3 July 2018 17:42:53 UTC+1, Lucretia wrote: >the normal unbounded_string? It's not like it's going to be checking for it to be correct utf8 is it, but I can't write an iterator for that from outside the rts though. Well, that's not quite correct, I could. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 16:42 ` Lucretia 2018-07-03 16:45 ` Lucretia @ 2018-07-03 20:18 ` Dmitry A. Kazakov 2018-07-03 21:04 ` Lucretia 1 sibling, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-03 20:18 UTC (permalink / raw) On 2018-07-03 18:42, Lucretia wrote: > On Tuesday, 3 July 2018 17:36:12 UTC+1, Dmitry A. Kazakov wrote: > >> Without having types related, we get a geometric explosion of packages: >> character type x encoding method x fixed/bounded/unbounded. Clearly >> nobody would ever add UTF-8 into this mess, because this will double the >> number of packages where strings are used. Static polymorphism >> (generics/overloading) does not work here. > > Well, they kind of already did that by subtyping UTF_String from String, of which it's not a subtype, it's just they are both arrays of 8-bit entities. No. Both are arrays of code points and arrays of octets. The ranges of code points are different. The correspondence between code points and octets are different. Thus the subtyping is broken. > Am i wrong, should I just implement what I need on top of the standard lib and just use the UTF* types in my code? What about unbounded_utf_strings? Just use the normal unbounded_string? It's not like it's going to be checking for it to be correct utf8 is it, but I can't write an iterator for that from outside the rts though. There is no way to do it right in Ada for now. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 20:18 ` Dmitry A. Kazakov @ 2018-07-03 21:04 ` Lucretia 2018-07-04 1:26 ` Dan'l Miller 2018-07-04 7:21 ` Dmitry A. Kazakov 0 siblings, 2 replies; 73+ messages in thread From: Lucretia @ 2018-07-03 21:04 UTC (permalink / raw) On Tuesday, 3 July 2018 21:18:28 UTC+1, Dmitry A. Kazakov wrote: > > Well, they kind of already did that by subtyping UTF_String from String, of which it's not a subtype, it's just they are both arrays of 8-bit entities. > > No. Both are arrays of code points and arrays of octets. The ranges of > code points are different. The correspondence between code points and > octets are different. Thus the subtyping is broken. I know the difference between code points and octets and their arrays. I was saying that UTF_String is not a valid subtype of String because String is Latin 1 and UTF_String is a superset of 7-bit ASCII, not 8-bit Latin 1. > > Am i wrong, should I just implement what I need on top of the standard lib and just use the UTF* types in my code? What about unbounded_utf_strings? Just use the normal unbounded_string? It's not like it's going to be checking for it to be correct utf8 is it, but I can't write an iterator for that from outside the rts though. > > There is no way to do it right in Ada for now. What do you mean exactly???? ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 21:04 ` Lucretia @ 2018-07-04 1:26 ` Dan'l Miller 2018-07-04 1:59 ` Lucretia 2018-07-04 7:21 ` Dmitry A. Kazakov 1 sibling, 1 reply; 73+ messages in thread From: Dan'l Miller @ 2018-07-04 1:26 UTC (permalink / raw)

On Tuesday, July 3, 2018 at 4:04:54 PM UTC-5, Lucretia wrote:
> On Tuesday, 3 July 2018 21:18:28 UTC+1, Dmitry A. Kazakov wrote:
>
> > > Well, they kind of already did that by subtyping UTF_String from String, of which it's not a subtype, it's just they are both arrays of 8-bit entities.
> >
> > No. Both are arrays of code points and arrays of octets. The ranges of
> > code points are different. The correspondence between code points and
> > octets are different. Thus the subtyping is broken.
>
> I know the difference between code points and octets and their arrays. I was saying that UTF_String is
> not a valid subtype of String because String is Latin 1 and UTF_String is a superset of 7-bit ASCII, not
> 8-bit Latin 1.

Well, there are 2 ways of looking at UTF-8: before versus after parsing.

is not a superset:
One is whether each 8-bit value in Latin-1 has the same value in the UTF-8 octet-by-octet representation •prior• to parsing. Using this analysis, all of the upper 128 values have a different meaning than in Latin-1.

is a superset:
But the other way of looking at UTF-8 is what character is represented by the multi-byte encoding •after• parsing. In this view, the lowest 256 values of Unicode/ISO10646 conform to Latin-1 (with some quibbling over whether the mark-parity control codes from 16#80 to 16#9F have precisely the same meaning versus reserved/unencoded at various editions of various standards).

> > > Am i wrong, should I just implement what I need on top of the standard lib and just use the UTF* types in my code? What about unbounded_utf_strings? Just use the normal unbounded_string? It's not like it's going to be checking for it to be correct utf8 is it, but I can't write an iterator for that from outside the rts though.
> >
> > There is no way to do it right in Ada for now.
>
> What do you mean exactly????

He means that it needs his extrapolation-of-Steelman-3-3F idea for compile-time tagged types that are not tagged records.

^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 1:26 ` Dan'l Miller @ 2018-07-04 1:59 ` Lucretia 2018-07-04 7:37 ` Dmitry A. Kazakov ` (2 more replies) 0 siblings, 3 replies; 73+ messages in thread From: Lucretia @ 2018-07-04 1:59 UTC (permalink / raw) On Wednesday, 4 July 2018 02:26:52 UTC+1, Dan'l Miller wrote: > > I know the difference between code points and octets and their arrays. I was saying that UTF_String is > > not a valid subtype of String because String is Latin 1 and UTF_String is a superset of 7-bit ASCII, not > > 8-bit Latin 1. > > Well, there are 2 ways of looking at UTF-8: before versus after parsing. > > is not a superset: > One is whether each 8-bit value in Latin-1 has the same value in the UTF-8 octet-by-octet representation •prior• to parsing. Using this analysis, all of the upper 128 values have a different meaning than in Latin-1. You're answering a question that wasn't asked. > is a superset: > But the other way of looking at UTF-8 is what character is represented by the multi-byte encoding •after• parsing. In this view, the lowest 256 values of Unicode/ISO10646 conform to Latin-1 (with some quibbling over whether the mark-parity control codes from 16#80 to 16#9F have precisely the same meaning versus reserved/unencoded at various editions of various standards). And again, wasn't asked. > > > > Am i wrong, should I just implement what I need on top of the standard lib and just use the UTF* types in my code? What about unbounded_utf_strings? Just use the normal unbounded_string? It's not like it's going to be checking for it to be correct utf8 is it, but I can't write an iterator for that from outside the rts though. > > > > > > There is no way to do it right in Ada for now. > > > > What do you mean exactly???? > > He means that it needs his extrapolation-of-Steelman-3-3F idea for compile-time tagged types that are not tagged records. I've just read it. 
Yeah, I agree that Ada should be able to extend records with data, not functions/procedures, but I don't see how the lack of that is a hindrance to creating a decent unicode lib. The fact that he refuses to answer such a simple question, i.e "WTF are you on about?" explains a lot. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 1:59 ` Lucretia @ 2018-07-04 7:37 ` Dmitry A. Kazakov 2018-07-04 12:46 ` Dan'l Miller 2018-07-04 13:37 ` Dennis Lee Bieber 2 siblings, 0 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 7:37 UTC (permalink / raw) On 2018-07-04 03:59, Lucretia wrote: > I've just read it. Yeah, I agree that Ada should be able to extend records with data, not functions/procedures, but I don't see how the lack of that is a hindrance to creating a decent unicode lib. The fact that he refuses to answer such a simple question, i.e "WTF are you on about?" explains a lot. I never refuse answering. Let me state the requirements of a sane implementation: 1. All types related. You can pass String where UTF8_String is expected and conversely keeping the *semantics*. That means that when one string contains a-umlaut it stays a-umlaut in another string. 2. All strings are arrays of Unicode points. You can iterate characters even in a UTF-8 string or a DEC RADIX-50 string. 3. All strings are arrays of the corresponding representation units. You can iterate representation units. 4. All string representations stripped of the bounds have machine representations in the stated encoding. You can pass a flat UTF-8 string down to a C library with no fuss. 5. For any string operation one can provide either a type-specific implementation or inherit a body from another string type (see #1). You can write a specific Put_Line for Latin-1 string or use (inherit) Put_Line for UTF-8 string. -------------- The point is that this is impossible in Ada. If you think otherwise, you are welcome to outline a way. What fixes are required to be able to implement it is a subject of serious discussion about the Ada type system, which nobody seemingly is interested in, at the time. Though I am ready when you are. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 1:59 ` Lucretia 2018-07-04 7:37 ` Dmitry A. Kazakov @ 2018-07-04 12:46 ` Dan'l Miller 2018-07-04 13:37 ` Dennis Lee Bieber 2 siblings, 0 replies; 73+ messages in thread From: Dan'l Miller @ 2018-07-04 12:46 UTC (permalink / raw) On Tuesday, July 3, 2018 at 8:59:59 PM UTC-5, Lucretia wrote: > On Wednesday, 4 July 2018 02:26:52 UTC+1, Dan'l Miller wrote: > > > > I know the difference between code points and octets and their arrays. I was saying that UTF_String is > > > not a valid subtype of String because String is Latin 1 and UTF_String is a superset of 7-bit ASCII, not > > > 8-bit Latin 1. > > > > Well, there are 2 ways of looking at UTF-8: before versus after parsing. > > > > is not a superset: > > One is whether each 8-bit value in Latin-1 has the same value in the UTF-8 octet-by-octet representation •prior• to parsing. Using this analysis, all of the upper 128 values have a different meaning than in Latin-1. > > You're answering a question that wasn't asked. > > > is a superset: > > But the other way of looking at UTF-8 is what character is represented by the multi-byte encoding •after• parsing. In this view, the lowest 256 values of Unicode/ISO10646 conform to Latin-1 (with some quibbling over whether the mark-parity control codes from 16#80 to 16#9F have precisely the same meaning versus reserved/unencoded at various editions of various standards). > > And again, wasn't asked. It is quite on-topic though. This difference of looking at it from the pre-parsed versus post-parsed perspectives is at the heart of the difference of opinion of Luke/Dmitry (String) versus J-P. Rosen (Wide_String and Wide_Wide_String) arising throughout this thread. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 1:59 ` Lucretia 2018-07-04 7:37 ` Dmitry A. Kazakov 2018-07-04 12:46 ` Dan'l Miller @ 2018-07-04 13:37 ` Dennis Lee Bieber 2 siblings, 0 replies; 73+ messages in thread From: Dennis Lee Bieber @ 2018-07-04 13:37 UTC (permalink / raw) On Tue, 3 Jul 2018 18:59:57 -0700 (PDT), Lucretia <laguest9000@googlemail.com> declaimed the following: > >I've just read it. Yeah, I agree that Ada should be able to extend records with data, not functions/procedures, but I don't see how the lack of that is a hindrance to creating a decent unicode lib. The fact that he refuses to answer such a simple question, i.e "WTF are you on about?" explains a lot. Ah, but what IS a "decent unicode lib(rary)"? There is a regular over in comp.lang.python who tends to rant that Python 3.2 (maybe 3.1) "broke" unicode handling because some of his non-real-world benchmarks run slower. Current Python3 internally uses 1, 2, or 4 bytes per character in a string based upon the widest individual character. If everything fits in 8-bits, it uses 1-byte/char strings. If even one character requires 16-bits, the entire string will use 2-byte/char. Of course, since strings are immutable in Python, there is no concern about replacing one char in a 1-byte/char string with a 2-byte char -- one has to create a whole new string, which operation detects the presence of a 2-byte char and allocates all characters as 2-byte wide. The scheme allows for direct indexing of characters -- no confusion of indexing a prefix byte, or misinterpreting a suffix byte. I don't think such a string type would go far in Ada: one loses mutation in place, and also can not define memory usage limits ahead of time (unless one provides for worst case 4-byte/char -- in which case one might just use that all the way through and retain in place mutation). 
-- Wulfraed Dennis Lee Bieber AF6VN wlfraed@ix.netcom.com HTTP://wlfraed.home.netcom.com/ ^ permalink raw reply [flat|nested] 73+ messages in thread
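[Editorial sketch, not part of the original message.] The width-selection rule described above (1, 2 or 4 bytes per character, chosen by the widest character in the string) can be illustrated in Ada. This is only a sketch of the selection step, assuming the thresholds 16#FF# and 16#FFFF#; the names Width_Demo, Unit_Width and Storage_Width are made up for the illustration, and nothing here reproduces Python's actual implementation.

```ada
with Ada.Text_IO;

procedure Width_Demo is

   --  Hypothetical helper type: bytes needed per character.
   subtype Unit_Width is Positive range 1 .. 4;

   --  Return the per-character storage width that the widest code
   --  point in Item would force on the whole string.
   function Storage_Width (Item : Wide_Wide_String) return Unit_Width is
      Widest : Natural := 0;
   begin
      for C of Item loop
         Widest := Natural'Max (Widest, Wide_Wide_Character'Pos (C));
      end loop;

      if Widest <= 16#FF# then
         return 1;
      elsif Widest <= 16#FFFF# then
         return 2;
      else
         return 4;
      end if;
   end Storage_Width;

   --  U+20AC (euro sign) needs 16 bits; U+1F600 needs more than 16.
   Euro  : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#20AC#));
   Emoji : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#1F600#));

begin
   Ada.Text_IO.Put_Line ("abc    :" & Integer'Image (Storage_Width ("abc")));
   Ada.Text_IO.Put_Line ("U+20AC :" & Integer'Image (Storage_Width (Euro)));
   Ada.Text_IO.Put_Line ("U+1F600:" & Integer'Image (Storage_Width (Emoji)));
end Width_Demo;
```

The sketch makes the objection above concrete: the width depends on the contents, so replacing one character in place can change the storage needed for every other character.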
* Re: Strange crash on custom iterator 2018-07-03 21:04 ` Lucretia 2018-07-04 1:26 ` Dan'l Miller @ 2018-07-04 7:21 ` Dmitry A. Kazakov 1 sibling, 0 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 7:21 UTC (permalink / raw) On 2018-07-03 23:04, Lucretia wrote: > On Tuesday, 3 July 2018 21:18:28 UTC+1, Dmitry A. Kazakov wrote: > >>> Well, they kind of already did that by subtyping UTF_String from String, of which it's not a subtype, it's just they are both arrays of 8-bit entities. >> >> No. Both are arrays of code points and arrays of octets. The ranges of >> code points are different. The correspondence between code points and >> octets are different. Thus the subtyping is broken. > > I know the difference between code points and octets and their arrays. I was saying that UTF_String is not a valid subtype of String because String is Latin 1 and UTF_String is a superset of 7-bit ASCII, not 8-bit Latin 1. No, that does not break subtyping if Constraint_Error is in the contract. Subtyping is broken when the array of Latin-1 code points (String) corresponds to the array of representation units (octets of UTF8_String). Array of Latin-1 code points corresponds to the array of Unicode code points. It has nothing to do with the underlying encoding, whatever it might be. Each string implements two unrelated array interfaces: 1. Array of encoding units, e.g. array of octets 2. Array of code points #1 and #2 are historically confused because one resembles another for a certain class encodings like ASCII, UCS-2, UCS-4. They are absolutely different for UTF-8 and UTF-16. >>> Am i wrong, should I just implement what I need on top of the standard lib and just use the UTF* types in my code? What about unbounded_utf_strings? Just use the normal unbounded_string? It's not like it's going to be checking for it to be correct utf8 is it, but I can't write an iterator for that from outside the rts though. >> >> There is no way to do it right in Ada for now. 
> > What do you mean exactly???? For simplicity start with designing character types: Character, Wide_Character and Wide_Wide_Character as related types. X : Character; -- Character'Size = 8 Y : Wide_Character := X; -- This must be legal Already this is impossible in Ada. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
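[Editorial sketch, not part of the original message.] The two array views described above, encoding units versus code points, can be seen with the standard Ada 2012 package Ada.Strings.UTF_Encoding.Wide_Wide_Strings. The procedure name Two_Views and the sample value are assumptions made for the illustration.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Two_Views is
   use Ada.Strings.UTF_Encoding;

   --  U+00E4 (a-umlaut) encoded in UTF-8 is the two octets C3 A4.
   Encoded : constant UTF_8_String :=
     Character'Val (16#C3#) & Character'Val (16#A4#);

   --  View 2: the same text as an array of code points.
   Decoded : constant Wide_Wide_String :=
     Wide_Wide_Strings.Decode (Encoded);
begin
   --  View 1: array of encoding units (octets).
   Ada.Text_IO.Put_Line
     ("Encoding units:" & Natural'Image (Encoded'Length));
   Ada.Text_IO.Put_Line
     ("Code points   :" & Natural'Image (Decoded'Length));
   Ada.Text_IO.Put_Line
     ("Code point    :"
      & Natural'Image (Wide_Wide_Character'Pos (Decoded (1))));
end Two_Views;
```

Encoded has length 2 while Decoded has length 1 (code point 228). Indexing Encoded as a String yields octets, not the a-umlaut, which is the mismatch the message is pointing at.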
* Re: Strange crash on custom iterator 2018-07-03 16:36 ` Dmitry A. Kazakov 2018-07-03 16:42 ` Lucretia @ 2018-07-03 18:54 ` Dan'l Miller 2018-07-03 20:22 ` Dmitry A. Kazakov 2018-07-04 7:33 ` J-P. Rosen 2 siblings, 1 reply; 73+ messages in thread From: Dan'l Miller @ 2018-07-03 18:54 UTC (permalink / raw) On Tuesday, July 3, 2018 at 11:36:12 AM UTC-5, Dmitry A. Kazakov wrote: > On 2018-07-03 18:07, Lucretia wrote: > > On Tuesday, 3 July 2018 16:57:07 UTC+1, Dmitry A. Kazakov wrote: > >> On 2018-07-03 17:45, J-P. Rosen wrote: > > > >>>> 2) Ada needs a decent Unicode library not this half-arsed crap we > >>>> have now. > >>> This package is about encoding only. What would you expect from a > >>> Unicode library? > >> > >> Proper typing, for a start? > >> > >> P.S. It is clear that no decent library may exist without fixing Ada > >> type system. > > > > In what way is it broken? > > It is not broken, it misses key features like interface inheritance. Wait, what? No extension of interfaces in Ada, eh? https://www.adacore.com/gems/gem-48 type Animal is interface; type Animal_Extension_1 is interface and Animal; Or are we now claiming that Ada's •extension• is not •inheritance•? (But ironically that Ada83's subtyping is inheritance, as per a prior recent thread's debate.) ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 18:54 ` Dan'l Miller @ 2018-07-03 20:22 ` Dmitry A. Kazakov 0 siblings, 0 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-03 20:22 UTC (permalink / raw) On 2018-07-03 20:54, Dan'l Miller wrote: > On Tuesday, July 3, 2018 at 11:36:12 AM UTC-5, Dmitry A. Kazakov wrote: >> On 2018-07-03 18:07, Lucretia wrote: >>> On Tuesday, 3 July 2018 16:57:07 UTC+1, Dmitry A. Kazakov wrote: >>>> On 2018-07-03 17:45, J-P. Rosen wrote: >>> >>>>>> 2) Ada needs a decent Unicode library not this half-arsed crap we >>>>>> have now. >>>>> This package is about encoding only. What would you expect from a >>>>> Unicode library? >>>> >>>> Proper typing, for a start? >>>> >>>> P.S. It is clear that no decent library may exist without fixing Ada >>>> type system. >>> >>> In what way is it broken? >> >> It is not broken, it misses key features like interface inheritance. > > Wait, what? No extension of interfaces in Ada, eh? Tagged record extensions are unsuitable for the purpose. If you don't believe me, try yourself. > Or are we now claiming that Ada's •extension• is not •inheritance•? Tagged extension is inheritance. The reverse is false. Not all inheritance is by tagged extension. > (But ironically that Ada83's subtyping is inheritance, as per a prior recent thread's debate.) It is, but it won't do this job either. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-03 16:36 ` Dmitry A. Kazakov 2018-07-03 16:42 ` Lucretia 2018-07-03 18:54 ` Dan'l Miller @ 2018-07-04 7:33 ` J-P. Rosen 2018-07-04 7:53 ` Dmitry A. Kazakov 2 siblings, 1 reply; 73+ messages in thread From: J-P. Rosen @ 2018-07-04 7:33 UTC (permalink / raw) Le 03/07/2018 à 18:36, Dmitry A. Kazakov a écrit : > E.g. UTF8_String and String must share interfaces but have different > representations. No. UTF_8 is useful only for IOs, as soon as you want to use a UTF string, you need to convert it to a Wide_String. Why? Because even the simplest operation (Length, Indexing) are O(N) and are mostly equivalent to decoding the whole string. -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
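[Editorial sketch, not part of the original message.] The O(N) cost of Length on UTF-8 mentioned above can be sketched as follows: every octet has to be inspected, and continuation octets (16#80# .. 16#BF#) are skipped. The names UTF8_Length_Demo and Code_Point_Count are hypothetical, and the function assumes its input is already valid UTF-8.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding;

procedure UTF8_Length_Demo is
   use Ada.Strings.UTF_Encoding;

   --  Count code points by counting every octet that is not a
   --  continuation octet. Assumes Item is valid UTF-8; no validation
   --  is attempted here.
   function Code_Point_Count (Item : UTF_8_String) return Natural is
      Count : Natural := 0;
   begin
      for Octet of Item loop
         if Character'Pos (Octet) not in 16#80# .. 16#BF# then
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end Code_Point_Count;

   --  "a", U+00E4 (C3 A4) and U+20AC (E2 82 AC): 1 + 2 + 3 octets.
   Sample : constant UTF_8_String :=
     "a"
     & Character'Val (16#C3#) & Character'Val (16#A4#)
     & Character'Val (16#E2#) & Character'Val (16#82#)
     & Character'Val (16#AC#);
begin
   Ada.Text_IO.Put_Line
     ("Octets     :" & Natural'Image (Sample'Length));
   Ada.Text_IO.Put_Line
     ("Code points:" & Natural'Image (Code_Point_Count (Sample)));
end UTF8_Length_Demo;
```

For this sample the octet count is 6 and the code point count is 3. Sample'Length is a constant-time attribute, whereas Code_Point_Count has to walk the whole string.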
* Re: Strange crash on custom iterator 2018-07-04 7:33 ` J-P. Rosen @ 2018-07-04 7:53 ` Dmitry A. Kazakov 2018-07-04 9:55 ` J-P. Rosen 2018-07-04 19:02 ` G. B. 0 siblings, 2 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 7:53 UTC (permalink / raw) On 2018-07-04 09:33, J-P. Rosen wrote: > Le 03/07/2018 à 18:36, Dmitry A. Kazakov a écrit : >> E.g. UTF8_String and String must share interfaces but have different >> representations. > No. UTF_8 is useful only for IOs, as soon as you want to use a UTF > string, you need to convert it to a Wide_String. I cannot. Wide_String is UCS-2 which is not full Unicode. Anyway, whatever conversion of representations needed it must be transparent to the user. > Why? Because even the simplest operation (Length, Indexing) are O(N) and > are mostly equivalent to decoding the whole string. Premature optimization, huh? And you still need UTF-8 string type even if you are going to convert it to something else. Back to the square one, how to design an UTF-8 string type? -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 7:53 ` Dmitry A. Kazakov @ 2018-07-04 9:55 ` J-P. Rosen 2018-07-04 10:01 ` Dmitry A. Kazakov 2018-07-04 19:02 ` G. B. 1 sibling, 1 reply; 73+ messages in thread From: J-P. Rosen @ 2018-07-04 9:55 UTC (permalink / raw) Le 04/07/2018 à 09:53, Dmitry A. Kazakov a écrit : > On 2018-07-04 09:33, J-P. Rosen wrote: >> Le 03/07/2018 à 18:36, Dmitry A. Kazakov a écrit : >>> E.g. UTF8_String and String must share interfaces but have >>> different representations. >> No. UTF_8 is useful only for IOs, as soon as you want to use a UTF >> string, you need to convert it to a Wide_String. > > I cannot. Wide_String is UCS-2 which is not full Unicode. For most purposes, Wide_String is sufficient, unless you really need to support emojis or ancient chinese. In those cases, decode to Wide_Wide_String, no problem. > Anyway, whatever conversion of representations needed it must be > transparent to the user. > >> Why? Because even the simplest operation (Length, Indexing) are >> O(N) and are mostly equivalent to decoding the whole string. > > Premature optimization, huh? And you still need UTF-8 string type > even if you are going to convert it to something else. Back to the > square one, how to design an UTF-8 string type? > Choosing a representation that allows a more efficient algorithm is proper design, not premature optimization. And the point is that when you receive a string, you don't know before looking at the BOM (or other recognition techniques) whether the octets you received are pure Latin-1 or UTF_8 encoded. So you need to store it in a plain String. We discussed that point, and the agreement was that making a different type would force the user to many conversions that would bring nothing but trouble, and make Ada once again look impractical out of excessive purism. -- J-P. 
Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 9:55 ` J-P. Rosen @ 2018-07-04 10:01 ` Dmitry A. Kazakov 2018-07-04 11:30 ` J-P. Rosen 0 siblings, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 10:01 UTC (permalink / raw) On 2018-07-04 11:55, J-P. Rosen wrote: > Le 04/07/2018 à 09:53, Dmitry A. Kazakov a écrit : >> On 2018-07-04 09:33, J-P. Rosen wrote: >> Premature optimization, huh? And you still need UTF-8 string type >> even if you are going to convert it to something else. Back to the >> square one, how to design an UTF-8 string type? >> > Choosing a representation that allows a more efficient algorithm is > proper design, not premature optimization. But UTF-8 is actually more efficient in most cases than Wide_Wide_String. Random string indexing is practically never used. > And the point is that when you receive a string, you don't know before > looking at the BOM (or other recognition techniques) whether the octets > you received are pure Latin-1 or UTF_8 encoded. So you need to store it > in a plain String. That is not a string at all, it is a stream array or an array of octets. > We discussed that point, and the agreement was that making a different > type would force the user to many conversions that would bring nothing > but trouble, and make Ada once again look impractical out of excessive > purism. Exactly my point. Explicit conversion are necessary because Ada's type system is unable to model strings in a type-safe way. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 10:01 ` Dmitry A. Kazakov @ 2018-07-04 11:30 ` J-P. Rosen 2018-07-04 13:27 ` Dmitry A. Kazakov 2018-07-04 17:51 ` Jacob Sparre Andersen 0 siblings, 2 replies; 73+ messages in thread From: J-P. Rosen @ 2018-07-04 11:30 UTC (permalink / raw) Le 04/07/2018 à 12:01, Dmitry A. Kazakov a écrit : > But UTF-8 is actually more efficient in most cases than > Wide_Wide_String. Random string indexing is practically never used. !!!! I, and many others, often need to search substrings within a string; actually, I would have a hard time finding an example of string manipulation without indexing... >> We discussed that point, and the agreement was that making a different >> type would force the user to many conversions that would bring nothing >> but trouble, and make Ada once again look impractical out of excessive >> purism. > > Exactly my point. Explicit conversion are necessary because Ada's type > system is unable to model strings in a type-safe way. So, you want different types, plus a typing system that would allow to mix the types and make them compatible... You might as well put everything in the same type! Anyway, the ARG has to deal with Ada as it is, not as Dmitry dreams it should be... -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 11:30 ` J-P. Rosen @ 2018-07-04 13:27 ` Dmitry A. Kazakov 2018-07-04 14:37 ` Dan'l Miller 2018-07-04 17:51 ` Jacob Sparre Andersen 1 sibling, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 13:27 UTC (permalink / raw) On 2018-07-04 13:30, J-P. Rosen wrote: > Le 04/07/2018 à 12:01, Dmitry A. Kazakov a écrit : >> But UTF-8 is actually more efficient in most cases than >> Wide_Wide_String. Random string indexing is practically never used. > !!!! I, and many others, often need to search substrings within a > string; actually, I would have a hard time finding an example of string > manipulation without indexing... > >>> We discussed that point, and the agreement was that making a different >>> type would force the user to many conversions that would bring nothing >>> but trouble, and make Ada once again look impractical out of excessive >>> purism. >> >> Exactly my point. Explicit conversion are necessary because Ada's type >> system is unable to model strings in a type-safe way. > So, you want different types, plus a typing system that would allow to > mix the types and make them compatible. Yes, because they are semantically same: arrays of code points. > .. You might as well put > everything in the same type! No, because they must have different representations. > Anyway, the ARG has to deal with Ada as it is, not as Dmitry dreams it > should be... It requires someone more influential, wise and knowledgeable than me to make and then push such a proposal. I would be satisfied if more people saw the roots of problems with strings etc. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 13:27 ` Dmitry A. Kazakov @ 2018-07-04 14:37 ` Dan'l Miller 2018-07-04 14:43 ` Dan'l Miller ` (2 more replies) 0 siblings, 3 replies; 73+ messages in thread From: Dan'l Miller @ 2018-07-04 14:37 UTC (permalink / raw) On Wednesday, July 4, 2018 at 8:27:53 AM UTC-5, Dmitry A. Kazakov wrote: > On 2018-07-04 13:30, J-P. Rosen wrote: > > Le 04/07/2018 à 12:01, Dmitry A. Kazakov a écrit : > >> But UTF-8 is actually more efficient in most cases than > >> Wide_Wide_String. Random string indexing is practically never used. > > !!!! I, and many others, often need to search substrings within a > > string; actually, I would have a hard time finding an example of string > > manipulation without indexing... > > > >>> We discussed that point, and the agreement was that making a different > >>> type would force the user to many conversions that would bring nothing > >>> but trouble, and make Ada once again look impractical out of excessive > >>> purism. > >> > >> Exactly my point. Explicit conversion are necessary because Ada's type > >> system is unable to model strings in a type-safe way. > > So, you want different types, plus a typing system that would allow to > > mix the types and make them compatible. > > Yes, because they are semantically same: arrays of code points. > > > .. You might as well put > > everything in the same type! > > No, because they must have different representations. > > > Anyway, the ARG has to deal with Ada as it is, not as Dmitry dreams it > > should be... > > It requires someone more influential, wise and knowledgeable than me to > make and then push such a proposal. I would be satisfied if more people > saw the roots of problems with strings etc. I think that perhaps /all/ readers of this see at least one •problem• with UTF-8 (and perhaps Unicode/ISO10646 in general in Ada, regardless of choice of encoding) in Ada's String (and perhaps Wide_String and Wide_Wide_String too). 
The difficulty is that •no one• has the single •solution• for this problem or these concomitant problems. Not even J-P. Rosen is a possessor of complete solution in his Wide_Wide_String recommendation, because his replies seem to factually-incorrectly imply that there exists a fully-normalized single-codepoint character in Unicode/ISO10646 for each grapheme/letter. The following article provides 7 examples in 4 languages (2 of which are European languages, no less!) where a single grapheme's most-compact representation in Unicode/ISO10646 is a multi-codepoint sequence. The absolutely most infamous of these 7 examples is the Lithuanian one. Because through flukes of sociopolitical history, Vietnamese, French, German, and so forth all had pre-1992 ISO standards or IBM-Microsoft-Apple code-pages for their letters with diacritics, their languages' letters with diacritics got standardized in Unicode/ISO10646 as single codepoints, e.g., ü as U+FC instead of ¨ U+308 followed by u U+75. Poor old Lithuania was under Soviet occupation from 1944 to 1991, during which the Soviets tried to suppress the Lithuanian language. Due to this suppression, the Soviet character-encoding standards never standardized encodings for Lithuanian letters with all the Lithuanian-specific diacritical marks, such as the 2 example letters given in the article linked above. Because the timespan was so short from the Soviet occupation leaving Lithuania in 1991 to the 1992 cut-off of pre-existing character-encoding standards to which Unicode/ISO10646 must be encode as single codepoints, poor old Lithuanian characters are 2nd-class citizens in Unicode/ISO10646, whereas all the Western European languages (and their former colonies) with diacritical marks are first-class citizens in Unicode/ISO10646. 
This is a cause of somewhat of a protracted slow-motion multidecade trench warfare between Lithuania and Unicode/ISO10646 over this issue, made worse every time someone elsewhere on the planet whips up a brand-new character-with-single-codepoint that has never ever existed in the history of humankind and then standardizes this brand-new contrived grapheme-with-single-codepoint in Unicode/ISO10646. Oh, but Japan and Silicon Valley can devise emojis galore in recent years and not be restricted by strict enforcement of this no-preexisting-character-encoding rule. Why? I guess because emojis are cool, but Lithuanian characters are booooorrrrrrrring. ^ permalink raw reply [flat|nested] 73+ messages in thread
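[Editorial sketch, not part of the original message.] The point about multi-codepoint graphemes applies even to Wide_Wide_String. In Unicode a combining mark follows its base letter, so the decomposed form of u-umlaut is U+0075 followed by U+0308, while the precomposed form is the single code point U+00FC. The sketch below uses the hypothetical name Grapheme_Demo and compares the two as plain arrays, with no normalization applied.

```ada
with Ada.Text_IO;

procedure Grapheme_Demo is

   --  Precomposed form: one code point, U+00FC.
   Precomposed : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#FC#));

   --  Decomposed form: base letter U+0075, then combining
   --  diaeresis U+0308.
   Decomposed : constant Wide_Wide_String :=
     (Wide_Wide_Character'Val (16#75#),
      Wide_Wide_Character'Val (16#308#));
begin
   Ada.Text_IO.Put_Line
     ("Precomposed length:" & Natural'Image (Precomposed'Length));
   Ada.Text_IO.Put_Line
     ("Decomposed length :" & Natural'Image (Decomposed'Length));
   Ada.Text_IO.Put_Line
     ("Equal as arrays   : "
      & Boolean'Image (Precomposed = Decomposed));
end Grapheme_Demo;
```

Both values render as the same letter, yet their lengths are 1 and 2 and the array comparison is False. Length and equality on code points are therefore not length and equality on what a reader would call characters.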
* Re: Strange crash on custom iterator 2018-07-04 14:37 ` Dan'l Miller @ 2018-07-04 14:43 ` Dan'l Miller 2018-07-04 14:57 ` J-P. Rosen 2018-07-04 15:41 ` Lucretia 2 siblings, 0 replies; 73+ messages in thread From: Dan'l Miller @ 2018-07-04 14:43 UTC (permalink / raw) On Wednesday, July 4, 2018 at 9:37:40 AM UTC-5, Dan'l Miller wrote: > On Wednesday, July 4, 2018 at 8:27:53 AM UTC-5, Dmitry A. Kazakov wrote: > > On 2018-07-04 13:30, J-P. Rosen wrote: > > > Le 04/07/2018 à 12:01, Dmitry A. Kazakov a écrit : > > >> But UTF-8 is actually more efficient in most cases than > > >> Wide_Wide_String. Random string indexing is practically never used. > > > !!!! I, and many others, often need to search substrings within a > > > string; actually, I would have a hard time finding an example of string > > > manipulation without indexing... > > > > > >>> We discussed that point, and the agreement was that making a different > > >>> type would force the user to many conversions that would bring nothing > > >>> but trouble, and make Ada once again look impractical out of excessive > > >>> purism. > > >> > > >> Exactly my point. Explicit conversion are necessary because Ada's type > > >> system is unable to model strings in a type-safe way. > > > So, you want different types, plus a typing system that would allow to > > > mix the types and make them compatible. > > > > Yes, because they are semantically same: arrays of code points. > > > > > .. You might as well put > > > everything in the same type! > > > > No, because they must have different representations. > > > > > Anyway, the ARG has to deal with Ada as it is, not as Dmitry dreams it > > > should be... > > > > It requires someone more influential, wise and knowledgeable than me to > > make and then push such a proposal. I would be satisfied if more people > > saw the roots of problems with strings etc. 
> > I think that perhaps /all/ readers of this see at least one •problem• with UTF-8 (and perhaps Unicode/ISO10646 in general in Ada, regardless of choice of encoding) in Ada's String (and perhaps Wide_String and Wide_Wide_String too). > > The difficulty is that •no one• has the single •solution• for this problem or these concomitant problems. Not even J-P. Rosen is a possessor of complete solution in his Wide_Wide_String recommendation, because his replies seem to factually-incorrectly imply that there exists a fully-normalized single-codepoint character in Unicode/ISO10646 for each grapheme/letter. The following article provides 7 examples in 4 languages (2 of which are European languages, no less!) where a single grapheme's most-compact representation in Unicode/ISO10646 is a multi-codepoint sequence. > > The absolutely most infamous of these 7 examples is the Lithuanian one. Because through flukes of sociopolitical history, Vietnamese, French, German, and so forth all had pre-1992 ISO standards or IBM-Microsoft-Apple code-pages for their letters with diacritics, their languages' letters with diacritics got standardized in Unicode/ISO10646 as single codepoints, e.g., ü as U+FC instead of ¨ U+308 followed by u U+75. Poor old Lithuania was under Soviet occupation from 1944 to 1991, during which the Soviets tried to suppress the Lithuanian language. Due to this suppression, the Soviet character-encoding standards never standardized encodings for Lithuanian letters with all the Lithuanian-specific diacritical marks, such as the 2 example letters given in the article linked above. 
Because the timespan was so short from the Soviet occupation leaving Lithuania in 1991 to the 1992 cut-off of pre-existing character-encoding standards to which Unicode/ISO10646 must be encode as single codepoints, poor old Lithuanian characters are 2nd-class citizens in Unicode/ISO10646, whereas all the Western European languages (and their former colonies) with diacritical marks are first-class citizens in Unicode/ISO10646. This is a cause of somewhat of a protracted slow-motion multidecade trench warfare between Lithuania and Unicode/ISO10646 over this issue, made worse every time someone elsewhere on the planet whips up a brand-new character-with-single-codepoint that has never ever existed in the history of humankind and then standardizes this brand-new contrived grapheme-with-single-codepoint in Unicode/ISO10646. > > Oh, but Japan and Silicon Valley can devise emojis galore in recent years and not be restricted by strict enforcement of this no-preexisting-character-encoding rule. Why? I guess because emojis are cool, but Lithuanian characters are booooorrrrrrrring. Oh, it would help if I would press the paste key: http://unicode.org/standard/where ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 14:37 ` Dan'l Miller 2018-07-04 14:43 ` Dan'l Miller @ 2018-07-04 14:57 ` J-P. Rosen 2018-07-04 15:41 ` Lucretia 2 siblings, 0 replies; 73+ messages in thread From: J-P. Rosen @ 2018-07-04 14:57 UTC (permalink / raw) Le 04/07/2018 à 16:37, Dan'l Miller a écrit : > The difficulty is that •no one• has the single •solution• for this > problem or these concomitant problems. Not even J-P. Rosen is a > possessor of complete solution in his Wide_Wide_String > recommendation, because his replies seem to factually-incorrectly > imply that there exists a fully-normalized single-codepoint character > in Unicode/ISO10646 for each grapheme/letter. You are right that characters not in normalized form (not only lithuanians!) may have a representation as several code points, which implies O(N) for some operations... But if it is encoded in UTF-8, you need an extra O(N) operation to first decode the code point. The difference between the two is still there. -- J-P. Rosen Adalog 2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00 http://www.adalog.fr ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 14:37 ` Dan'l Miller 2018-07-04 14:43 ` Dan'l Miller 2018-07-04 14:57 ` J-P. Rosen @ 2018-07-04 15:41 ` Lucretia 2018-07-04 16:55 ` Dan'l Miller 2 siblings, 1 reply; 73+ messages in thread From: Lucretia @ 2018-07-04 15:41 UTC (permalink / raw) On Wednesday, 4 July 2018 15:37:40 UTC+1, Dan'l Miller wrote: > The difficulty is that •no one• has the single •solution• for this problem or these concomitant problems. Not even J-P. Rosen is a possessor of complete solution in his Wide_Wide_String recommendation, because his replies seem to factually-incorrectly imply that there exists a fully-normalized single-codepoint character in Unicode/ISO10646 for each grapheme/letter. JP Rosen told me to go read the AI on the matter, which I did. He states they talked about it, there's not much talking in the AI at all! Bob Dewar states they shouldn't really abuse the *String types by subtyping and does exactly that by introducing a package he wrote to handle UTF using those subtypes. The rest of the AI is about how to fit that into the standard. Back then, they should've chosen the Unicode standard over the ISO10646 as it's freely available, yes the encodings are interchangeable, but that's not really the point. They should've decided to obsolete the current mess, the same way they did with ASCII and made String and Unbounded_String UTF-8 encoded. They could still have the old latin based strings as compatibility types. They should've made all source be encoded the same way, which they did anyway for the iso spec. Then defined a bunch of iterators for the types based on code points, grapheme clusters, word/line boundaries, bidi, etc. Then taken out all references to characters as that concept isn't really applicable to Unicode as a "character" can be one or more code points. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 15:41 ` Lucretia @ 2018-07-04 16:55 ` Dan'l Miller 2018-07-04 18:01 ` Shark8 0 siblings, 1 reply; 73+ messages in thread From: Dan'l Miller @ 2018-07-04 16:55 UTC (permalink / raw) On Wednesday, July 4, 2018 at 10:41:49 AM UTC-5, Lucretia wrote: > On Wednesday, 4 July 2018 15:37:40 UTC+1, Dan'l Miller wrote: > > > The difficulty is that •no one• has the single •solution• for this problem or these concomitant > > problems. Not even J-P. Rosen is a possessor of complete solution in his Wide_Wide_String > > recommendation, because his replies seem to factually-incorrectly imply that there exists a fully > > normalized single-codepoint character in Unicode/ISO10646 for each grapheme/letter. > > JP Rosen told me to go read the AI on the matter, which I did. He states they talked about it, there's not > much talking in the AI at all! Bob Dewar states they shouldn't really abuse the *String types by subtyping > and does exactly that by introducing a package he wrote to handle UTF using those subtypes. The rest > of the AI is about how to fit that into the standard. > > Back then, they should've chosen the Unicode standard over the ISO10646 as it's freely available, yes > the encodings are interchangeable, but that's not really the point. 1) As a fellow ISO standard (ISO8652), Ada is compelled by ISO rules to comply with ISO standards (instead of other standards bodies) when an ISO standard exists for that topic. 2) In the end, what difference to Ada would actually occur by the ARG considering Unicode the normative reference instead of ISO10646 the normative reference. The Unicode-specific extensions are higher in the food chain (e.g., bidirectional algorithms) than Ada's libraries (or language) have ever bitten off to chew. > They should've decided to obsolete the current mess, the same way they did with ASCII and made String > and Unbounded_String UTF-8 encoded. 
They could still have the old latin based strings as compatibility > types. They should've made all source be encoded the same way, which they did anyway for the iso > spec. > > Then defined a bunch of iterators for the types based on code points, grapheme clusters, word/line > boundaries, bidi, etc. Yes, parsing/decoding iterators over UTF-8 and UTF-16 would be awesome. Where-is-the-next-fully-formed-grapheme iterators would be awesome for UTF-32 and UCS4, to ease processing of combining characters (both in never-single-codepoint graphemes and in not-normalized-but-could-have-been multi-codepoint sequences). But then again, why bother waiting a decade or two for the standard library? Ada could have a Boost-esque library outside of the ISO8652 standard, where, say, Luke & Dmitry contribute such a better solution. ^ permalink raw reply [flat|nested] 73+ messages in thread
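As a rough idea of what the simplest such iterator does, here is a minimal sketch that steps through a UTF-8 string one code point at a time by looking only at lead octets. The names UTF8_Walk and Sequence_Length are hypothetical, and the code assumes well-formed UTF-8; a real library iterator would validate its input and would sit behind Ada 2012's iterator interfaces rather than a bare loop.

```ada
with Ada.Text_IO;

procedure UTF8_Walk is
   --  "naive" with a diaeresis on the i, as raw UTF-8 octets:
   --  U+00EF is encoded as 16#C3# 16#AF#.
   Text : constant String :=
     "na" & Character'Val (16#C3#) & Character'Val (16#AF#) & "ve";

   --  Number of octets in the sequence introduced by Lead.
   --  Assumes well-formed UTF-8 (no validation of continuation octets).
   function Sequence_Length (Lead : Character) return Positive is
      B : constant Natural := Character'Pos (Lead);
   begin
      if B < 16#80# then
         return 1;   --  ASCII
      elsif B < 16#E0# then
         return 2;
      elsif B < 16#F0# then
         return 3;
      else
         return 4;
      end if;
   end Sequence_Length;

   Pos   : Positive := Text'First;
   Count : Natural  := 0;
begin
   while Pos <= Text'Last loop
      Count := Count + 1;
      Ada.Text_IO.Put_Line
        ("Code point" & Natural'Image (Count)
         & " starts at octet" & Positive'Image (Pos));
      Pos := Pos + Sequence_Length (Text (Pos));
   end loop;
end UTF8_Walk;
```

The six octets yield five code points. A grapheme iterator would need a further layer on top of this, driven by Unicode's segmentation data, which this sketch does not attempt.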
* Re: Strange crash on custom iterator 2018-07-04 16:55 ` Dan'l Miller @ 2018-07-04 18:01 ` Shark8 2018-07-04 18:57 ` Dmitry A. Kazakov 0 siblings, 1 reply; 73+ messages in thread From: Shark8 @ 2018-07-04 18:01 UTC (permalink / raw) On Wednesday, July 4, 2018 at 10:55:08 AM UTC-6, Dan'l Miller wrote: > > 1) As a fellow ISO standard (ISO8652), Ada is compelled by ISO rules to comply with ISO standards (instead of other standards bodies) when an ISO standard exists for that topic. Except that, if you'll allow me to be blunt and politically incorrect, Unicode is a terrible [non-]solution to the problem. Mark my words: Building/standardizing on Unicode will only bring pain and suffering. "The purpose of standardization is to aid the creative craftsman, *not* to enforce the common mediocrity." — Author unknown; found on a blackboard at Eglin Air Force Base Unicode is fatally flawed because it does enforce the common mediocrity. Much like, eg, strings for representing paths is the common way to do things, so too is Unicode overly tied to the wrongheaded manner of doing things: discarding actual structure in favor of ad hoc calculation, eliminating semantically useful [and needed information] and hoping to be able to recover it with later processing. As an example, the sentence "The Hebrew word for 'man' is 'אדם' (Adam)." is *NOT* merely a sequence of graphemes, codepoints, and/or bytes. It is a semantically meaningful text consisting of multiple languages... and *this* is what Unicode discards. A much better way to handle something like this would be a sort of multi-lingual 'string'/'sequence' type, where the above would be in a Lisp-ish structure: ((English-string "The Hebrew word for " (quotation "man") "is "), (quotation (Hebrew-string "אדם")), English-string (parenthetical "Adam")). But Unicode discards all that information, instead opting for ('T', 'h', 'e', ' ', 'H', 'e' 'b' ...) 
and offloading the structure-recovery to whatever text-processing / display-method API there is. But this is all par for the course within computer-science and "the industry" -- Welcome to the wonderful world of unix/C "small tools" and "pipes" where text processing is mandatory and at every step of the problem you discard all type-information, forcing everything downstream to re-parse the text -- all over again; "enforce the common mediocrity" thy name is Unix/C. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 18:01 ` Shark8 @ 2018-07-04 18:57 ` Dmitry A. Kazakov 2018-07-04 19:53 ` Shark8 0 siblings, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 18:57 UTC (permalink / raw) On 2018-07-04 20:01, Shark8 wrote: > On Wednesday, July 4, 2018 at 10:55:08 AM UTC-6, Dan'l Miller wrote: > As an example, the sentence "The Hebrew word for 'man' is 'אדם' (Adam)." is *NOT* merely a sequence of graphemes, codepoints, and/or bytes. It is a semantically meaningful text consisting of multiple languages... and *this* is what Unicode discards. And rightly so. Like 91093835.6 is just a number instead "meaningful": the mass of a stationary electron. One fundamental principle of software design is abstraction in the sense of throwing away unnecessary information. A printer may know nothing about Hebrew. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 18:57 ` Dmitry A. Kazakov @ 2018-07-04 19:53 ` Shark8 2018-07-04 20:05 ` Lucretia 2018-07-04 20:43 ` Dmitry A. Kazakov 0 siblings, 2 replies; 73+ messages in thread From: Shark8 @ 2018-07-04 19:53 UTC (permalink / raw) On Wednesday, July 4, 2018 at 12:57:40 PM UTC-6, Dmitry A. Kazakov wrote: > On 2018-07-04 20:01, Shark8 wrote: > > On Wednesday, July 4, 2018 at 10:55:08 AM UTC-6, Dan'l Miller wrote: > > > As an example, the sentence "The Hebrew word for 'man' is 'אדם' (Adam)." is *NOT* merely a sequence of graphemes, codepoints, and/or bytes. It is a semantically meaningful text consisting of multiple languages... and *this* is what Unicode discards. > > And rightly so. Like 91093835.6 is just a number instead "meaningful": > the mass of a stationary electron. > > One fundamental principle of software design is abstraction in the sense > of throwing away unnecessary information. A printer may know nothing > about Hebrew. Interesting how you're ready, willing and able to conflate all portions of data-storage/-management into a single operation: printing. But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all having the ability to encode all of human language is its stated design-goal, so surely it must be well-suited to that, right? Not really. The mere fact of "combining characters" makes Unicode no more suited to textual display than a sort of hypothetical Forth/PostScript where each word/token/character is processed by the display driver and rendered appropriately. 
(The aforementioned Lisp-like structure being executed is the procedure: "painting/displaying the English text, switch-to-Hebrew, print/display Hebrew text, switch-to-English, print/display English text", which of course can be further decomposed to "print 'T' [horizontal-stroke, vertical stroke] print 'h' [vertical stroke, curved stroke, vertical stroke] print 'e' [horizontal stroke, curved stroke] ....") This is the essential idea behind PostScript printers; and it works well. (The same/analogous procedure must be executed in SW and transmitted to the printer in non-PostScript printers; usually using some proprietary printer-control-language, which is essentially what printer-drivers *ARE*.) So, even working backward from your example of printing, the claim that "knowledge of Hebrew is unneeded" is... well, dubious. It's certainly needed somewhere along the line for this example. My contention is that "sequence of codepoints + font" is flatly stupid for a multi-language system. Arguably it's stupid for a single-language system, too. As an example we could use paths: "root\projects\x\source" is flatly moronic*, and we can see this by how it pops up in multi-platform development: should there be a terminal '\'? do those '\' characters need to be escaped? do they need to be replaced with '/'? What we have is a sequence (root, projects, x, source) which corresponds to a path down a tree, but the common "industry practice" is to think of this as a string of characters and "parse/reparse/regex/reparse/whatever" textual manipulations to read what the structure is rather than sensibly save the structural information. * forced upon us by stupid, thin APIs to the OS. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 19:53 ` Shark8 @ 2018-07-04 20:05 ` Lucretia 2018-07-04 22:04 ` Shark8 2018-07-04 20:43 ` Dmitry A. Kazakov 1 sibling, 1 reply; 73+ messages in thread From: Lucretia @ 2018-07-04 20:05 UTC (permalink / raw) On Wednesday, 4 July 2018 20:53:21 UTC+1, Shark8 wrote: > But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all having the ability to encode all of human language is its stated design-goal, so surely it must be well-suited to that, right? You're wrong. Unicode is not about displaying text, it even says that in the spec, it's about representation. Stop trying to force Unicode into Lisp or Forth or whatever to try to add meaning to text. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 20:05 ` Lucretia @ 2018-07-04 22:04 ` Shark8 2018-07-05 0:12 ` Dan'l Miller 0 siblings, 1 reply; 73+ messages in thread From: Shark8 @ 2018-07-04 22:04 UTC (permalink / raw) On Wednesday, July 4, 2018 at 2:05:17 PM UTC-6, Lucretia wrote: > On Wednesday, 4 July 2018 20:53:21 UTC+1, Shark8 wrote: > > > But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all having the ability to encode all of human language is its stated design-goal, so surely it must be well-suited to that, right? > > You're wrong. Unicode is not about displaying text, it even says that in the spec, it's about representation. Stop trying to force Unicode into Lisp or Forth or whatever to try to add meaning to text. I didn't say it *was*, I used display as an example. But you bring up a good point: it's a terrible representation, for all that I've said, and more. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 22:04 ` Shark8 @ 2018-07-05 0:12 ` Dan'l Miller 2018-07-05 1:46 ` Shark8 0 siblings, 1 reply; 73+ messages in thread From: Dan'l Miller @ 2018-07-05 0:12 UTC (permalink / raw) On Wednesday, July 4, 2018 at 5:04:13 PM UTC-5, Shark8 wrote: > On Wednesday, July 4, 2018 at 2:05:17 PM UTC-6, Lucretia wrote: > > On Wednesday, 4 July 2018 20:53:21 UTC+1, Shark8 wrote: > > > > > But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all having the ability to encode all of human language is its stated design-goal, so surely it must be well-suited to that, right? > > > > You're wrong. Unicode is not about displaying text, it even says that in the spec, it's about representation. Stop trying to force Unicode into Lisp or Forth or whatever to try to add meaning to text. > > I didn't say it *was*, I used display as an example. > But you bring up a good point: it's a terrible representation, for all that I've said, and more. Shark8, it seems that your criticisms were that instead of representing the Hebrew letters, we ought to represent the whole Hebrew word. Isn't that an entirely different problem-space higher in the food chain? My qualm with Unicode is that it gets into far more topics than character encoding and then for some odd reason refuses to standardize single-codepoint representation of some languages' letters (and then for some even odder reason standardizes offbeat emojis far beyond the original Japanese single-codepoint representations of old 1980s emoticons). I guess all that billion codepoints beyond BMP is reserved for all the extra-terrestrial space-alien languages, not for us mere mortals on planet Earth. Poor old Lithuanian needs to not only stand in line behind all the Western European nations (and their former colonies) but also poor old Lithuanian needs to stand in line behind E.T. 
Shark8, what would be the better solution for character-encoding itself? (not whole words) ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 0:12 ` Dan'l Miller @ 2018-07-05 1:46 ` Shark8 2018-07-05 2:07 ` Luke A. Guest 0 siblings, 1 reply; 73+ messages in thread From: Shark8 @ 2018-07-05 1:46 UTC (permalink / raw) On Wednesday, July 4, 2018 at 6:12:06 PM UTC-6, Dan'l Miller wrote: > On Wednesday, July 4, 2018 at 5:04:13 PM UTC-5, Shark8 wrote: > > On Wednesday, July 4, 2018 at 2:05:17 PM UTC-6, Lucretia wrote: > > > On Wednesday, 4 July 2018 20:53:21 UTC+1, Shark8 wrote: > > > > > > > But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all having the ability to encode all of human language is its stated design-goal, so surely it must be well-suited to that, right? > > > > > > You're wrong. Unicode is not about displaying text, it even says that in the spec, it's about representation. Stop trying to force Unicode into Lisp or Forth or whatever to try to add meaning to text. > > > > I didn't say it *was*, I used display as an example. > > But you bring up a good point: it's a terrible representation, for all that I've said, and more. > > Shark8, it seems that your criticisms were that instead of representing the Hebrew letters, we ought to represent the whole Hebrew word. Isn't that an entirely different problem-space higher in the food chain? > > My qualm with Unicode is that it gets into far more topics than character encoding and then for some odd reason refuses to standardize single-codepoint representation of some languages' letters (and then for some even odder reason standardizes offbeat emojis far beyond the original Japanese single-codepoint representations of old 1980s emoticons). I guess all that billion codepoints beyond BMP is reserved for all the extra-terrestrial space-alien languages, not for us mere mortals on planet Earth. 
Poor old Lithuanian needs to not only stand in line behind all the Western European nations (and their former colonies) but also poor old Lithuanian needs to stand in line behind E.T. > > Shark8, what would be the better solution for character-encoding itself? (not whole words) Whole-word isn't a terrible idea, per se. But the thrust I was getting at is the delineation between languages: with Unicode it's a sequence of codepoints, independent of the actual item (word, sentence, etc) other than [perhaps] graphic-presented. That the example is (Eng,Eng,Eng...Eng, Heb,Heb,Heb,Heb, Eng,Eng,Eng...) codepoints is not the problem, though related, because it discards all information in favor of (num, num, num, num, ...) rather than actually considering alternate languages: IMO, ("The Hebrew word for man" (quote ADAM) (quote "Adam") ".") is much better as 'text' because we're preserving structure: [ENGLISH [THIS SECTION HEBREW] ENGLISH]. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 1:46 ` Shark8 @ 2018-07-05 2:07 ` Luke A. Guest 2018-07-05 16:47 ` Shark8 0 siblings, 1 reply; 73+ messages in thread From: Luke A. Guest @ 2018-07-05 2:07 UTC (permalink / raw) Shark8 <onewingedshark@gmail.com> wrote: >> Shark8, what would be the better solution for character-encoding itself? >> (not whole words) > > Whole-word isn't a terrible idea, per se. But the thrust I was getting at > is the delineation between languages: with Unicode it's a sequence of > codepoints, independent of the actual item (word, sentence, etc) other > than [perhaps] graphic-presented. That the example is (Eng,Eng,Eng...Eng, > Heb,Heb,Heb,Heb, Eng,Eng,Eng...) codepoints is not the problem, though > related, because it discards all information in favor of (num, num, num, > num, ...) rather than actually considering alternate languages: IMO, > ("The Hebrew word for man" (quote ADAM) (quote "Adam") ".") is much > better as 'text' because we're preserving structure: [ENGLISH [THIS > SECTION HEBREW] ENGLISH]. > I don’t understand why you think Unicode should carry linguistic information when all it has ever been designed to do is encode symbols across all languages and their direction. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 2:07 ` Luke A. Guest @ 2018-07-05 16:47 ` Shark8 2018-07-05 17:19 ` Dan'l Miller 0 siblings, 1 reply; 73+ messages in thread From: Shark8 @ 2018-07-05 16:47 UTC (permalink / raw) On Wednesday, July 4, 2018 at 8:07:56 PM UTC-6, Luke A. Guest wrote: > Shark8 wrote: > > >> Shark8, what would be the better solution for character-encoding itself? > >> (not whole words) > > > > Whole-word isn't a terrible idea, per se. But the thrust I was getting at > > is the delineation between languages: with Unicode it's a sequence of > > codepoints, independent of the actual item (word, sentence, etc) other > > than [perhaps] graphic-presented. That the example is (Eng,Eng,Eng...Eng, > > Heb,Heb,Heb,Heb, Eng,Eng,Eng...) codepoints is not the problem, though > > related, because it discards all information in favor of (num, num, num, > > num, ...) rather than actually considering alternate languages: IMO, > > ("The Hebrew word for man" (quote ADAM) (quote "Adam") ".") is much > > better as 'text' because we're preserving structure: [ENGLISH [THIS > > SECTION HEBREW] ENGLISH]. > > > > I don’t understand why you think Unicode should carry linguistic > information when all it has ever been designed to do is encode symbols > across all languages and their direction. I'm not saying that "Unicode should" do *anything* -- I'm saying Unicode solves *the wrong problem*. "Encoding symbols" ties everything to a stupidly primitive level, forcing everything to such lowest common denominator so as to apply "the unix way" processing to text: discard all structural information, all semantic information, and have "some tool" regenerate it later... just like "the unix way" discards type-information in favor of forcing ad-hoc parsing on unstructured-text at every step between its "small tools" connected together with 'pipes'. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 16:47 ` Shark8 @ 2018-07-05 17:19 ` Dan'l Miller 2018-07-05 19:14 ` Shark8 0 siblings, 1 reply; 73+ messages in thread From: Dan'l Miller @ 2018-07-05 17:19 UTC (permalink / raw) On Thursday, July 5, 2018 at 11:47:33 AM UTC-5, Shark8 wrote: > On Wednesday, July 4, 2018 at 8:07:56 PM UTC-6, Luke A. Guest wrote: > > Shark8 wrote: > > > > >> Shark8, what would be the better solution for character-encoding itself? > > >> (not whole words) > > > > > > Whole-word isn't a terrible idea, per se. But the thrust I was getting at > > > is the delineation between languages: with Unicode it's a sequence of > > > codepoints, independent of the actual item (word, sentence, etc) other > > > than [perhaps] graphic-presented. That the example is (Eng,Eng,Eng...Eng, > > > Heb,Heb,Heb,Heb, Eng,Eng,Eng...) codepoints is not the problem, though > > > related, because it discards all information in favor of (num, num, num, > > > num, ...) rather than actually considering alternate languages: IMO, > > > ("The Hebrew word for man" (quote ADAM) (quote "Adam") ".") is much > > > better as 'text' because we're preserving structure: [ENGLISH [THIS > > > SECTION HEBREW] ENGLISH]. > > > > > > > I don’t understand why you think Unicode should carry linguistic > > information when all it has ever been designed to do is encode symbols > > across all languages and their direction. > > I'm not saying that "Unicode should" do *anything* -- I'm saying Unicode solves *the wrong problem*. > > "Encoding symbols" ties everything to a stupidly primitive level, forcing everything to such lowest > common denominator so as to apply "the unix way" processing to text: discard all structural information, > all semantic information, and have "some tool" regenerate it later... just like "the unix way" discards > type-information in favor of forcing ad-hoc parsing on unstructured-text at every step between its > "small tools" connected together with 'pipes'. 
At some level I could conceivably agree with you in principle that a strictly-linear sequence of unadorned symbols is too low-level in some designs to be useful. For example, there was a time in the 1970s through early 1980s when Texas Instruments microprocessors excessively modeled a Turing machine's tapes (dual-tape model). No one nowadays would think that a processor should be strictly & intentionally designed to overtly model a Turing machine directly right down to the linear streams/tapes of symbols. Unicode/ISO10646 is asinine in its insistence on a sequence of •multiple• codepoints being the ••shortest possible•• representation of some individual letter in some natural language. Programmers want one-letter-one-codepoint representation in all languages—not some Turing-machine tape to process sequentially statefully, as Unicode demands even in its 32-bit UCS4 or UTF-32 representations. Programmers don't want any “well, yeah but …” situations at all when they just finished executing the fully-normalize-all-the-codepoints-in-this-string subprogram (but that “well yeah but …” is the world we suffer in with Unicode/ISO10646 as currently defined). But, Shark8, you seem to be criticizing something a little different than that. In some alternate universe where Unicode or ISO10646 transpired entirely differently, what would Unicode-done-right* look like, especially w.r.t. Ada strings? It seems that you are alluding to some sort of multiple-strand string or something like that (not merely allocating the billion nonBMP codepoints better so that we would have a one-letter-one-codepoint axiom). * Yeah, I know, in Unicode done right, there wouldn't be any Unicode or ISO10646 at all, but what would there be instead and what would the strawman look like at all in Ada? ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 17:19 ` Dan'l Miller @ 2018-07-05 19:14 ` Shark8 0 siblings, 0 replies; 73+ messages in thread From: Shark8 @ 2018-07-05 19:14 UTC (permalink / raw) On Thursday, July 5, 2018 at 11:20:00 AM UTC-6, Dan'l Miller wrote: > > But, Shark8, you seem to be criticizing something a little different than that. In some alternate universe where Unicode or ISO10646 transpired entirely differently, what would Unicode-done-right* look like, especially w.r.t. Ada strings? It seems that you are alluding to some sort of multiple-strand string or something like that (not merely allocating the billion nonBMP codepoints better so that we would have a one-letter-one-codepoint axiom). Well, Ada does like 'disassembling' things [concepts, etc] into usable component pieces, traditionally-speaking. So, I'd expect the multilingual problem-space would likely be decomposed into some usable/useful sets of types/subprograms. To borrow from other ISO stuff, perhaps something like:

    -- ISO 639-1
    PACKAGE LANGUAGES IS
       Type Code is ( ab, aa, [...], za, zu );
       -- other stuff.
    END LANGUAGES;

    PACKAGE LANGUAGES.CONSTRUCTS IS
       -- A Text is a full sequence of linguistically meaningful data, a sequence of contexts.
       Type Text is private;
       -- subprograms...

       -- Essentially a "string" w/ a language context.
       Type Context( Language : Code; Length : Natural ) is private;
       -- subprograms
    PRIVATE
       --...
    END LANGUAGES.CONSTRUCTS;

Or something; the point is the preservation of the structure/context of the sequence-of-symbols\words\graphemes\whatever to provide a solid multilingual foundation rather than throwing away all context, shoving everything in the Unicode-blender and having to deal with string-of-hexadecimal-sludge (codepoints) which, in-turn, forces reconstruction of the lost structures and contexts...
maybe involving the [ab]use of RegEx, that always seems to be an answer when dealing with textually-represented data, hence why so many of our peers seem to think that RegEx is suitable for parsing/processing HTML.... Yes, it bucks the "everything is a string" mentality of C/unix influenced OS-APIs; where the analog of a path would be an actual vector of names [eg ("root", "projects", "source", "file.adb")] rather than a plain text-string [eg "root\projects\source\file.adb"] if applied to the OS as well. The whole purpose is, as stated up-thread, "to aid the creative craftsman, not enforce mediocrity". ^ permalink raw reply [flat|nested] 73+ messages in thread
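One way the Languages.Constructs idea above might be made concrete is shown below. It is a hedged sketch, not the poster's design: Code is cut down to a hypothetical two-entry enumeration, Context is left visible instead of private, and the names Multilingual_Demo, Context_Vectors, Append and Adam are invented for the example. A "Text" is modelled simply as a vector of language-tagged runs.

```ada
with Ada.Text_IO;
with Ada.Containers.Indefinite_Vectors;

procedure Multilingual_Demo is

   --  Hypothetical two-entry stand-in for the ISO 639-1 list in the post.
   type Code is (en, he);

   --  A run of code points tagged with its language. Left visible here
   --  for brevity; the post keeps this type private.
   type Context (Language : Code; Length : Natural) is record
      Data : Wide_Wide_String (1 .. Length);
   end record;

   package Context_Vectors is new Ada.Containers.Indefinite_Vectors
     (Index_Type => Positive, Element_Type => Context);

   --  A "Text" is then a sequence of contexts.
   Text : Context_Vectors.Vector;

   procedure Append (Language : Code; Data : Wide_Wide_String) is
   begin
      Text.Append
        (Context'(Language => Language,
                  Length   => Data'Length,
                  Data     => Data));
   end Append;

   --  U+05D0 U+05D3 U+05DD, written numerically so the source stays ASCII.
   Adam : constant Wide_Wide_String :=
     Wide_Wide_Character'Val (16#05D0#)
     & Wide_Wide_Character'Val (16#05D3#)
     & Wide_Wide_Character'Val (16#05DD#);

begin
   Append (en, "The Hebrew word for 'man' is '");
   Append (he, Adam);
   Append (en, "' (Adam).");

   --  The language of each run survives; nothing has to be re-derived.
   for Run of Text loop
      Ada.Text_IO.Put_Line
        (Code'Image (Run.Language) & ":" & Natural'Image (Run.Length)
         & " code points");
   end loop;
end Multilingual_Demo;
```

Each run still stores plain code points internally, so a structure like this sits on top of a character encoding rather than replacing one.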
* Re: Strange crash on custom iterator 2018-07-04 19:53 ` Shark8 2018-07-04 20:05 ` Lucretia @ 2018-07-04 20:43 ` Dmitry A. Kazakov 1 sibling, 0 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 20:43 UTC (permalink / raw) On 2018-07-04 21:53, Shark8 wrote: > But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all having the ability to encode all of human language is its stated design-goal, so surely it must be well-suited to that, right? Some written languages, some human, some formal. > Not really. > The mere fact of "combining characters" makes Unicode no more suited to textual display than a sort of hypothetical Forth/PostScript where each word/token/character is processed by the display driver and rendered appropriately. (The aforementioned Lisp-like structure being executed is the procedure: "painting/displaying the English text, switch-to-Hebrew, print/display Hebrew text, switch-to-English, print/display English text", which of course can be further decomposed to "print 'T' [horizontal-stroke, vertical stroke] print 'h' [vertical stroke, curved stroke, vertical stroke] print 'e' [horizontal stroke, curved stroke] ....") Written languages evolve in order to adapt to the methods of writing. Many old methods do not fit well into Unicode. And, honestly, Unicode tried way too much to embrace things better dropped. I miss ASCII times, really. It forced English (and nobody cared about correct spelling of the word naïve (:-)) [*] > Arguably it's stupid for a single-language system, too. As an example we could use paths: "root\projects\x\source" is flatly moronic*, and we can see this by how it pops up in multi-platform development: should there be a terminal '\'? do those '\' characters need to be escaped? do they need to be replaced with '/'? 
What we have is a sequence (root, projects, x, source) which corresponds to a path down a tree, but the common "industry practice" is to think of this as a string of characters and "parse/reparse/regex/reparse/whatever" textual manipulations to read what the structure is rather than sensibly save the structural information. Textual representation is ambiguous and this has nothing to do with the text, but with its meaning. Path is ambiguous and its meaning (the target file) is even more ambiguous. If you want it less ambiguous use sector and block numbers and color stickers to mark hard drives. What about 001 vs 1 vs 3/2 ... infinite list follows. Meaning of a text is not the text itself. It is a fallacy. The meaning of a numeric literal is not the literal itself. The meaning of a Unicode string has nothing to do with Unicode. And, fundamentally, there is an infinite hierarchy of object-meta languages that never ends in some ultimate, final "Om", expressing everything and nothing. ----------------- * Do Chinese still write top-to-bottom? -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 11:30 ` J-P. Rosen 2018-07-04 13:27 ` Dmitry A. Kazakov @ 2018-07-04 17:51 ` Jacob Sparre Andersen 2018-07-04 18:06 ` Shark8 2018-07-05 18:06 ` Randy Brukardt 1 sibling, 2 replies; 73+ messages in thread From: Jacob Sparre Andersen @ 2018-07-04 17:51 UTC (permalink / raw) J-P. Rosen <rosen@adalog.fr> writes: > !!!! I, and many others, often need to search substrings within a > string; actually, I would have a hard time finding an example of > string manipulation without indexing... When you search for a substring within a string, you're typically treating it in a very sequential manner. Maintaining a "cursor" pointing at the octet position in the UTF-8 encoded string would be just as practical in most (all?) of the string processing I can remember doing? Counting the number of code points(?) in a string takes longer time, but if you want the actual number of graphemes in the string, Wide_Wide_Character is practically just as slow as a UTF-8 encoded string. > So, you want different types, plus a typing system that would allow to > mix the types and make them compatible... You might as well put > everything in the same type! It would be nice if the encoding and character set of a string were "implementation details". I'm not sure how to do it, but I think it is worth trying to find a solution for Ada. (I think I was introduced to how the KDE library does it once, but IIRC only encoding was abstracted away.) Greetings, Jacob -- »Saving keystrokes is the job of the text editor, not the programming language.« -- Preben Randhol ^ permalink raw reply [flat|nested] 73+ messages in thread
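The octet-cursor approach to substring search can be illustrated with nothing more than Ada.Strings.Fixed.Index applied to the raw UTF-8 octets. This is a minimal sketch with made-up sample data and a hypothetical procedure name; it relies on the fact that a well-formed UTF-8 pattern can only match at a code point boundary of well-formed UTF-8 text, and it does not address normalization.

```ada
with Ada.Text_IO;
with Ada.Strings.Fixed;

procedure UTF8_Search is
   --  U+00E9 (e with acute accent) is 16#C3# 16#A9# in UTF-8.
   E_Acute : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A9#);

   Haystack : constant String := "caf" & E_Acute & " au lait";
   Needle   : constant String := "au";

   --  Plain octet-wise search; no decoding step is needed.
   Octet_Pos : constant Natural :=
     Ada.Strings.Fixed.Index (Source => Haystack, Pattern => Needle);
begin
   Ada.Text_IO.Put_Line ("Match at octet" & Natural'Image (Octet_Pos));
end UTF8_Search;
```

The match is found at octet 7, whereas "au" begins at the sixth code point. The octet position is exactly what is needed for slicing Haystack afterwards, which is why an octet cursor is practical for this kind of processing.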
* Re: Strange crash on custom iterator 2018-07-04 17:51 ` Jacob Sparre Andersen @ 2018-07-04 18:06 ` Shark8 2018-07-04 18:59 ` Dan'l Miller ` (2 more replies) 2018-07-05 18:06 ` Randy Brukardt 1 sibling, 3 replies; 73+ messages in thread From: Shark8 @ 2018-07-04 18:06 UTC (permalink / raw) On Wednesday, July 4, 2018 at 11:51:20 AM UTC-6, Jacob Sparre Andersen wrote: > > It would be nice if the encoding and character set of a string were > "implementation details". I'm not sure how to do it, but I think it is > worth trying to find a solution for Ada. (I think I was introduced to > how the KDE library does it once, but IIRC only encoding was abstracted > away.) Indeed so! This is the way we /should/ have strings; where [[Wide_]Wide_]String are all generic with things like 'character-set' and 'search' and 'encoding' as formal parameters. Sadly this will likely never happen because it would break backwards compatibility. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 18:06 ` Shark8 @ 2018-07-04 18:59 ` Dan'l Miller 2018-07-04 19:01 ` Dmitry A. Kazakov 2018-07-04 21:00 ` Jacob Sparre Andersen 2 siblings, 0 replies; 73+ messages in thread From: Dan'l Miller @ 2018-07-04 18:59 UTC (permalink / raw) On Wednesday, July 4, 2018 at 1:06:17 PM UTC-5, Shark8 wrote: > On Wednesday, July 4, 2018 at 11:51:20 AM UTC-6, Jacob Sparre Andersen wrote: > > > > It would be nice if the encoding and character set of a string were > > "implementation details". I'm not sure how to do it, but I think it is > > worth trying to find a solution for Ada. (I think I was introduced to > > how the KDE library does it once, but IIRC only encoding was abstracted > > away.) > > Indeed so! > This is the way we /should/ have strings; where [[Wide_]Wide_]String are all generic with things like 'character-set' and 'search' and 'encoding' as formal parameters. > > Sadly this will likely never happen because it would break backwards compatibility. Then do it outside of the standardization process in a Boost-esque library on GitHub/GitLab/SourceForge to launch a de facto standard that establishes ISO's vaunted ‘established industry practice’. If C++ can do it, then so can Ada. That being said, I believe that a far better model than Boost's exists for the cream rising to the top. Instead of battle-of-the-emails-establishes-king-of-the-hill dominance hierarchies (with all due respect to the esteemed Jordan Peterson), I would recommend multiple concurrently-competing library designs, then a rigorous (repeated? annual?) bake-off among the competitors, evaluating multiple criteria: runtime performance, engineering-time design flexibility/tunability/ease-of-use, maintainability over time. Oh, call them the Yellow, Blue, Red, and Green libraries. Who did •that• before for language definition? … but for potential standard library content instead this time. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 18:06 ` Shark8 2018-07-04 18:59 ` Dan'l Miller @ 2018-07-04 19:01 ` Dmitry A. Kazakov 2018-07-05 18:08 ` Randy Brukardt 2018-07-04 21:00 ` Jacob Sparre Andersen 2 siblings, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 19:01 UTC (permalink / raw) On 2018-07-04 20:06, Shark8 wrote: > On Wednesday, July 4, 2018 at 11:51:20 AM UTC-6, Jacob Sparre Andersen wrote: >> >> It would be nice if the encoding and character set of a string were >> "implementation details". I'm not sure how to do it, but I think it is >> worth trying to find a solution for Ada. (I think I was introduced to >> how the KDE library does it once, but IIRC only encoding was abstracted >> away.) > > Indeed so! > This is the way we /should/ have strings; where [[Wide_]Wide_]String are all generic with things like 'character-set' and 'search' and 'encoding' as formal parameters. > > Sadly this will likely never happen because it would break backwards compatibility. It would break nothing. Old package will become renamings of new instances. Well, except for dire deforestation should new RM be ever printed... -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
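One possible reading of "old packages become renamings of new instances" can be sketched as follows. Everything here is hypothetical: Generic_Text_Ops, New_Ops and Old_Ops are invented names, and nothing like this exists in the standard library. The only thing the sketch shows is that a generic instantiated with the existing String type, and then renamed, still accepts plain String values:

```ada
with Ada.Text_IO;

procedure Renamed_Instance_Demo is
   --  Hypothetical generic: the character type and the string type are
   --  both formal parameters.
   generic
      type Character_Type is (<>);
      type String_Type is array (Positive range <>) of Character_Type;
   package Generic_Text_Ops is
      function Length (Item : String_Type) return Natural is (Item'Length);
   end Generic_Text_Ops;

   --  The "new" instance is built on the existing predefined types ...
   package New_Ops is new Generic_Text_Ops (Character, String);

   --  ... and the "old" name is just another view of the same package.
   package Old_Ops renames New_Ops;
begin
   --  Existing callers keep passing ordinary String values.
   Ada.Text_IO.Put_Line (Natural'Image (Old_Ops.Length ("Hello")));
end Renamed_Instance_Demo;
```

This says nothing about whether such a scheme would scale to the whole of Ada.Strings or Ada.Text_IO; it only illustrates that a renamed instance does not by itself introduce a new string type.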
* Re: Strange crash on custom iterator 2018-07-04 19:01 ` Dmitry A. Kazakov @ 2018-07-05 18:08 ` Randy Brukardt 2018-07-05 19:41 ` Dmitry A. Kazakov 0 siblings, 1 reply; 73+ messages in thread From: Randy Brukardt @ 2018-07-05 18:08 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message news:phj5j9$bju$1@gioia.aioe.org... > On 2018-07-04 20:06, Shark8 wrote: >> On Wednesday, July 4, 2018 at 11:51:20 AM UTC-6, Jacob Sparre Andersen >> wrote: >>> >>> It would be nice if the encoding and character set of a string were >>> "implementation details". I'm not sure how to do it, but I think it is >>> worth trying to find a solution for Ada. (I think I was introduced to >>> how the KDE library does it once, but IIRC only encoding was abstracted >>> away.) >> >> Indeed so! >> This is the way we /should/ have strings; where [[Wide_]Wide_]String are >> all generic with things like 'character-set' and 'search' and 'encoding' >> as formal parameters. >> >> Sadly this will likely never happen because it would break backwards >> compatibility. > > It would break nothing. Old package will become renamings of new > instances. Well, except for dire deforestation should new RM be ever > printed... That's not possible. As you like to say, String /= String'Class. The new libraries would almost all take String'Class (or whatever stand-in there is). Randy. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 18:08 ` Randy Brukardt @ 2018-07-05 19:41 ` Dmitry A. Kazakov 0 siblings, 0 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-05 19:41 UTC (permalink / raw) On 2018-07-05 20:08, Randy Brukardt wrote: > "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message > news:phj5j9$bju$1@gioia.aioe.org... >> On 2018-07-04 20:06, Shark8 wrote: >>> On Wednesday, July 4, 2018 at 11:51:20 AM UTC-6, Jacob Sparre Andersen >>> wrote: >>>> >>>> It would be nice if the encoding and character set of a string were >>>> "implementation details". I'm not sure how to do it, but I think it is >>>> worth trying to find a solution for Ada. (I think I was introduced to >>>> how the KDE library does it once, but IIRC only encoding was abstracted >>>> away.) >>> >>> Indeed so! >>> This is the way we /should/ have strings; where [[Wide_]Wide_]String are >>> all generic with things like 'character-set' and 'search' and 'encoding' >>> as formal parameters. >>> >>> Sadly this will likely never happen because it would break backwards >>> compatibility. >> >> It would break nothing. Old package will become renamings of new >> instances. Well, except for dire deforestation should new RM be ever >> printed... > > That's not possible. As you like to say, String /= String'Class. The new > libraries would almost all take String'Class (or whatever stand-in there > is). Possible, but as useless as the existing implementation. I wished to say that there is no difference between overloading string types and overloading string types from generic instances. If Ada.Text_IO became a renaming of Ada.Generic_Text_IO (...) that would change nothing. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 18:06 ` Shark8 2018-07-04 18:59 ` Dan'l Miller 2018-07-04 19:01 ` Dmitry A. Kazakov @ 2018-07-04 21:00 ` Jacob Sparre Andersen 2 siblings, 0 replies; 73+ messages in thread From: Jacob Sparre Andersen @ 2018-07-04 21:00 UTC (permalink / raw) Shark8 <onewingedshark@gmail.com> writes: > This is the way we /should/ have strings; where [[Wide_]Wide_]String > are all generic with things like 'character-set' and 'search' and > 'encoding' as formal parameters. If you made them generic, they would be different types. That breaks with one of the wishes Dmitry listed. Greetings, Jacob -- "Magnetohydrodynamics combines the intuitive nature of Maxwell's equations with the easy solvability of the Navier-Stokes equations." ^ permalink raw reply [flat|nested] 73+ messages in thread
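The remark that generic strings would be different types can be illustrated with a small sketch. Generic_Strings, Narrow and Wide are hypothetical names chosen for this example only. Each instantiation declares its own array type, so values from one instance are not accepted where the other is expected:

```ada
with Ada.Text_IO;

procedure Generic_String_Types is
   --  Hypothetical generic string package with the character type as
   --  its only formal parameter.
   generic
      type Character_Type is (<>);
   package Generic_Strings is
      type Text is array (Positive range <>) of Character_Type;
      function Length (Item : Text) return Natural is (Item'Length);
   end Generic_Strings;

   package Narrow is new Generic_Strings (Character);
   package Wide   is new Generic_Strings (Wide_Wide_Character);

   N : constant Narrow.Text := ('a', 'b', 'c');
   W : constant Wide.Text   := ('a', 'b', 'c');
begin
   Ada.Text_IO.Put_Line (Natural'Image (Narrow.Length (N)));
   Ada.Text_IO.Put_Line (Natural'Image (Wide.Length (W)));

   --  Narrow.Text and Wide.Text are unrelated types. A call such as
   --     Narrow.Length (W)
   --  is rejected at compile time, which is the incompatibility the
   --  message refers to.
end Generic_String_Types;
```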
* Re: Strange crash on custom iterator 2018-07-04 17:51 ` Jacob Sparre Andersen 2018-07-04 18:06 ` Shark8 @ 2018-07-05 18:06 ` Randy Brukardt 1 sibling, 0 replies; 73+ messages in thread From: Randy Brukardt @ 2018-07-05 18:06 UTC (permalink / raw) "Jacob Sparre Andersen" <jacob@jacob-sparre.dk> wrote in message news:87efginb3c.fsf@adaheads.home... > J-P. Rosen <rosen@adalog.fr> writes: ... >> So, you want different types, plus a typing system that would allow to >> mix the types and make them compatible... You might as well put >> everything in the same type! > > It would be nice if the encoding and character set of a string were > "implementation details". I'm not sure how to do it, but I think it is > worth trying to find a solution for Ada. (I think I was introduced to > how the KDE library does it once, but IIRC only encoding was abstracted > away.) It's relatively easy to do (see the first version of AI12-0021-1 for one way), but it is pervasive (if useful) and difficult to make efficient. And you have to throw away essentially everything that currently takes a String -- that's a bridge too far for almost everyone. A bit of additional language support (around conversions) would make it more possible as a library, but the "throw everything away" aspect makes it unlikely to get wide use. My personal opinion about this is that the ARG (as a whole) really does not care about these issues; the "solution" for Ada 2020 is a few more Wide_Wide_ madness packages. My view is that this is really more about checking off a box (we were asked to do *something* and we did *something*, now go away) than about any attempt to fix the issues. (Admittedly, it's too late to do anything else for Ada 2020 -- large new proposals are out-of-bounds now, they have to wait another cycle. But another set of junky patches doesn't really help anything other than to reduce the obvious pressure for a real solution.) Randy. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 7:53 ` Dmitry A. Kazakov 2018-07-04 9:55 ` J-P. Rosen @ 2018-07-04 19:02 ` G. B. 2018-07-04 19:16 ` Dmitry A. Kazakov 1 sibling, 1 reply; 73+ messages in thread From: G. B. @ 2018-07-04 19:02 UTC (permalink / raw) Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: > Back to the square > one, how to design an UTF-8 string type? Never. What is the proper representation of 3? Which role does a UTF play, other than during I/O operations? So, that’s a type that stands for certain I/O operations of certain objects... Practically, that’s properly typed proper procedures, no? ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 19:02 ` G. B. @ 2018-07-04 19:16 ` Dmitry A. Kazakov 2018-07-04 20:40 ` G. B. 0 siblings, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 19:16 UTC (permalink / raw) On 2018-07-04 21:02, G. B. wrote: > Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: > >> Back to the square >> one, how to design an UTF-8 string type? > > Never. > > What is the proper representation of 3? What is 3 here? > Which role does a UTF play, other than during I/O operations? UTF-8 is a preferable encoding for most text processing purposes. Should string types never be fixed, a quick and dirty solution would be throwing wide string types away and declaring String with all its bastards (Unbounded_String etc) UTF-8. > So, that’s a > type that stands for certain I/O operations of certain objects... No, it is a type that stands for string. > Practically, that’s properly typed proper procedures, no? You lost me here again. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 19:16 ` Dmitry A. Kazakov @ 2018-07-04 20:40 ` G. B. 2018-07-04 20:55 ` Dmitry A. Kazakov 0 siblings, 1 reply; 73+ messages in thread From: G. B. @ 2018-07-04 20:40 UTC (permalink / raw) Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: > On 2018-07-04 21:02, G. B. wrote: >> Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >> >>> Back to the square >>> one, how to design an UTF-8 string type? >> >> Never. >> >> What is the proper representation of 3? > > What is 3 here? It names a value of some type. >> Which role does a UTF play, other than during I/O operations? > > UTF-8 is a preferable encoding for most text processing purposes. Like finding the number of characters that some Ada string has? > Should string types never be fixed, a quick and dirty solution would be > throwing wide string types away Maybe. Sort of works, in Java. > and declaring String with all its > bastards (Unbounded_String etc) UTF-8. I’d not want encoding here. >> Practically, that’s properly typed proper procedures, no? > > You lost me here again. A string to be output somewhere may need an encoding. (‘H’, ‘e’, ‘l’, ‘l’, ‘o’) does not need one to be useful, but output is performed by a value of type File * String * Encoding -> Void: a properly typed procedure. ^ permalink raw reply [flat|nested] 73+ messages in thread
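The "File * String * Encoding -> Void" idea can be sketched in Ada as a procedure that takes an unencoded string of code points and applies the encoding only at output time. This is one interpretation, not something proposed verbatim in the thread. The procedure Put and its parameter names are invented here, while Encode and Encoding_Scheme are the standard ones from Ada.Strings.UTF_Encoding (Ada 2012):

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Encoded_Output_Demo is
   use Ada.Strings.UTF_Encoding;

   --  File * String * Encoding -> Void:
   --  Item carries code points only; the encoding is a separate
   --  argument that matters at the moment of output and nowhere else.
   procedure Put
     (File   : Ada.Text_IO.File_Type;
      Item   : Wide_Wide_String;
      Scheme : Encoding_Scheme)
   is
   begin
      Ada.Text_IO.Put
        (File, Wide_Wide_Strings.Encode (Item, Output_Scheme => Scheme));
   end Put;
begin
   Put (Ada.Text_IO.Standard_Output, "Hello", UTF_8);
   Ada.Text_IO.New_Line;
end Encoded_Output_Demo;
```

Ada.Text_IO is used only to keep the sketch short. For the UTF-16 schemes a stream-oriented file would be the more natural target, since the encoded octets are no longer text in the Text_IO sense.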
* Re: Strange crash on custom iterator 2018-07-04 20:40 ` G. B. @ 2018-07-04 20:55 ` Dmitry A. Kazakov 2018-07-04 21:21 ` G.B. 0 siblings, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-04 20:55 UTC (permalink / raw) On 2018-07-04 22:40, G. B. wrote: > Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >> On 2018-07-04 21:02, G. B. wrote: >>> Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >>> >>>> Back to the square >>>> one, how to design an UTF-8 string type? >>> >>> Never. >>> >>> What is the proper representation of 3? >> >> What is 3 here? > > It names a value of some type. Which type? You name the type, I name the representation. >>> Which role does a UTF play, other than during I/O operations? >> >> UTF-8 is a preferable encoding for most text processing purposes. > > Like finding the number of characters that some Ada string has? This operation is practically never required in text processing. The strength (and a design goal) of UTF-8 is that almost all useful operations defined in terms of characters are directly mapped into operations defined on octets. The rest may have whatever complexity, nobody cares. >> Should string types never be fixed, a quick and dirty solution would be >> throwing wide string types away > > Maybe. Sort of works, in Java. > >> and declaring String with all its >> bastards (Unbounded_String etc) UTF-8. > > I’d not want encoding here. There is always some. UTF-8 is a choice with the best balance of advantages vs disadvantage. >>> Practically, that’s properly typed proper procedures, no? >> >> You lost me here again. > > A string to be output somewhere may need an encoding. (‘H’, ‘e’, ‘l’, ‘l’, > ‘o’) does not need one to be useful, but output is performed by a value of > type File * String * Encoding -> Void: a properly typed procedure. It is almost never decomposed this way. Encoding is a part of string type representation. File I/O is usually untyped or weakly typed. 
At best the encoding is a parameter of file open. Ada text I/O packages are designed to deal with a single type of strings with encoding taken from the string type. But I still have no idea what you want to say by that. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 20:55 ` Dmitry A. Kazakov @ 2018-07-04 21:21 ` G.B. 2018-07-05 7:55 ` Dmitry A. Kazakov 0 siblings, 1 reply; 73+ messages in thread From: G.B. @ 2018-07-04 21:21 UTC (permalink / raw) On 04.07.18 22:55, Dmitry A. Kazakov wrote: > On 2018-07-04 22:40, G. B. wrote: >> Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >>> On 2018-07-04 21:02, G. B. wrote: >>>> Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >>>> >>>>> Back to the square >>>>> one, how to design an UTF-8 string type? >>>> >>>> Never. >>>> >>>> What is the proper representation of 3? >>> >>> What is 3 here? >> >> It names a value of some type. > > Which type? You name the type, Any type whose objects' values include the value 3 and which does not specify a representation in source, like Standard.Integer. >> I’d not want encoding here. > > There is always some. Not in source, where design is fixed explicitly. > But I still have no idea what you want to say by that. A properly typed procedure object handles the use case of encoding I/O in a type safe way. The type is not that of string-with-something composites. It is the type which covers the use case procedurally. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-04 21:21 ` G.B. @ 2018-07-05 7:55 ` Dmitry A. Kazakov 2018-07-06 8:28 ` G.B. 0 siblings, 1 reply; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-05 7:55 UTC (permalink / raw) On 2018-07-04 23:21, G.B. wrote: > On 04.07.18 22:55, Dmitry A. Kazakov wrote: >> On 2018-07-04 22:40, G. B. wrote: >>> Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >>>> On 2018-07-04 21:02, G. B. wrote: >>>>> Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote: >>>>> >>>>>> Back to the square >>>>>> one, how to design an UTF-8 string type? >>>>> >>>>> Never. >>>>> >>>>> What is the proper representation of 3? >>>> >>>> What is 3 here? >>> >>> It names a value of some type. >> >> Which type? You name the type, > > Any type whose objects' values include the value 3 > and which does not specify a representation in source, > like Standard.Integer. Any type from a set of types? You mean a class-wide object then. The representation of a class-wide object is (Tag, Value). So, name the specific type and you get the representation. You cannot skip that step. There is no values and representations of without types. >>> I’d not want encoding here. >> >> There is always some. > > Not in source, where design is fixed explicitly. > >> But I still have no idea what you want to say by that. > > A properly typed procedure object handles the use case of > encoding I/O in a type safe way. The type is not that of > string-with-something composites. It is the type which covers > the use case procedurally. I do not quite understand this either, but it sounds more right than wrong. So? P.S. It would be much easier, if you first stated a proposition and then illustrated it with an example, rather than throwing an example without any hints as to what class of circumstances this example is supposed to represent. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-07-05 7:55 ` Dmitry A. Kazakov @ 2018-07-06 8:28 ` G.B. 2018-07-06 8:57 ` Dmitry A. Kazakov 0 siblings, 1 reply; 73+ messages in thread From: G.B. @ 2018-07-06 8:28 UTC (permalink / raw) On 05.07.18 09:55, Dmitry A. Kazakov wrote: >>>>>>> Back to the square >>>>>>> one, how to design an UTF-8 string type? >>>>>> >>>>>> Never. >>>>>> >>>>>> What is the proper representation of 3? >>>>> >>>>> What is 3 here? >>>> >>>> It names a value of some type. >>> >>> Which type? You name the type, >> >> Any type whose objects' values include the value 3 >> and which does not specify a representation in source, >> like Standard.Integer. > > The representation of a class-wide object is (Tag, Value). Obviously, 3 is not, given integers in Ada. Also, your use of "representation" seems to exclude Ada representation of the Value part of the pair introduced above. How's that? So, is your "representation" an enthymeme that stipulates some definitions affecting 3 in Ada source texts? > So, name the specific type and you get the representation. What is the representation declared by Standard.Integer? > There is no values and representations of without types. A red herring. >>>> I’d not want encoding here. >>> >>> There is always some. >> >> Not in source, where design is fixed explicitly. Anything on this one? >>> But I still have no idea what you want to say by that. >> >> A properly typed procedure object handles the use case of >> encoding I/O in a type safe way. The type is not that of >> string-with-something composites. It is the type which covers >> the use case procedurally. > > I do not quite understand this either, but it sounds more right than wrong. So? So stop considering types for just data objects, consider types for operation objects instead, and then the need to perpetually entangle string objects with encoding objects is gone. P.S.: A question is neither a proposition nor an example, but it has been helpful in the past. 
* Re: Strange crash on custom iterator 2018-07-06 8:28 ` G.B. @ 2018-07-06 8:57 ` Dmitry A. Kazakov 0 siblings, 0 replies; 73+ messages in thread From: Dmitry A. Kazakov @ 2018-07-06 8:57 UTC (permalink / raw) On 2018-07-06 10:28, G.B. wrote: > On 05.07.18 09:55, Dmitry A. Kazakov wrote: > >>>>>>>> Back to the square >>>>>>>> one, how to design an UTF-8 string type? >>>>>>> >>>>>>> Never. >>>>>>> >>>>>>> What is the proper representation of 3? >>>>>> >>>>>> What is 3 here? >>>>> >>>>> It names a value of some type. >>>> >>>> Which type? You name the type, >>> >>> Any type whose objects' values include the value 3 >>> and which does not specify a representation in source, >>> like Standard.Integer. >> >> The representation of a class-wide object is (Tag, Value). > > Obviously, 3 is not, given integers in Ada. Of course it is. The representation of Integer 3 is the representation of Integer 3 and is not the representation of Integer'Class 3. > Also, your use of > "representation" seems to exclude Ada representation of > the Value part the pair introduced above. How's that? Not at all. If Integer'Class existed then representation of X : Integer'Class := 3; would be exactly Integer'Tag Integer'(3) whereas the representation of Y : Integer := 3; is, as always: Integer'(3) Types Integer'Class and Integer are different and have different representations. Each type has a representation of its own, no? >> So, name the specific type and you get the representation. > > What is the representation declared by Standard.Integer? It is not declared, it is implied, usually the machine representation of signed integer of the machine word length. >> There is no values and representations of without types. > > A red herring. But true, regardless. Questions like what is the representation of 3 are meaningless. The answer is "any". >>>>> I’d not want encoding here. >>>> >>>> There is always some. >>> >>> Not in source, where design is fixed explicitly. > > Anything on this one? 
If you formulate the question so that I could understand it then ... >>>> But I still have no idea what you want to say by that. >>> >>> A properly typed procedure object handles the use case of >>> encoding I/O in a type safe way. The type is not that of >>> string-with-something composites. It is the type which covers >>> the use case procedurally. >> >> I do not quite understand this either, but it sounds more right than >> wrong. So? > > So stop considering type for just data objects, consider types for > operation objects instead and then need to perpetually entangle > string objects with encoding objects is gone. Where I consider type as data objects and how is that relevant? -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 73+ messages in thread
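Since Integer'Class does not exist in Ada, the (Tag, Value) remark can only be sketched with a tagged stand-in. The types Number and Counter and the procedure Describe below are hypothetical and serve the illustration only. The same component value 3, viewed as Number'Class, carries a tag that distinguishes the specific type it was declared with:

```ada
with Ada.Tags;
with Ada.Text_IO;

procedure Class_Wide_Tag_Demo is
   package Numbers is
      --  Tagged stand-in for a hypothetical Integer with a 'Class.
      type Number is tagged record
         Value : Integer;
      end record;

      type Counter is new Number with null record;
   end Numbers;
   use Numbers;

   --  A class-wide parameter: the callee sees both the tag and the value.
   procedure Describe (Item : Number'Class) is
   begin
      Ada.Text_IO.Put_Line
        (Ada.Tags.Expanded_Name (Item'Tag)
         & " =>" & Integer'Image (Item.Value));
   end Describe;

   X : constant Number  := (Value => 3);
   Y : constant Counter := (Value => 3);
begin
   Describe (X);
   Describe (Y);
end Class_Wide_Tag_Demo;
```

How the tag and the value are actually laid out in memory is implementation-defined; the sketch only shows that the tag is observable through the class-wide view.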
* Re: Strange crash on custom iterator 2018-07-01 18:06 ` Jacob Sparre Andersen 2018-07-01 19:59 ` Simon Wright @ 2018-07-02 8:31 ` Lucretia 1 sibling, 0 replies; 73+ messages in thread From: Lucretia @ 2018-07-02 8:31 UTC (permalink / raw) On Sunday, 1 July 2018 19:06:43 UTC+1, Jacob Sparre Andersen wrote: > Luke A. Guest wrote: > > Simon Wright <> wrote: > > >> I suspect that Unicode_String would need to be by-reference. > > > > Yeah I think I’m going to have to make it tagged. > > You don't need to make it tagged, to pass it by reference. It is enough > to make the formal parameter aliased. Same crash. ^ permalink raw reply [flat|nested] 73+ messages in thread
* Re: Strange crash on custom iterator 2018-06-30 14:25 ` Simon Wright 2018-06-30 14:33 ` Lucretia @ 2018-06-30 14:34 ` Lucretia 1 sibling, 0 replies; 73+ messages in thread From: Lucretia @ 2018-06-30 14:34 UTC (permalink / raw) On Saturday, 30 June 2018 15:25:39 UTC+1, Simon Wright wrote: > First, I think Has_Element should probably be Thanks, BTW :) ^ permalink raw reply [flat|nested] 73+ messages in thread