* Data table text I/O package? @ 2005-06-15 9:57 Jacob Sparre Andersen 2005-06-15 11:43 ` Preben Randhol ` (2 more replies) 0 siblings, 3 replies; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-15 9:57 UTC (permalink / raw) I do quite a lot of work, where I manipulate data stored in (tabulator separated) text files [1]. Does anybody know of a package which handles the inclusion of a header line with the column names in an elegant way? It should preferably include automated testing that the header is correct, when a file is opened, and automated creation of the header when a file is created. TIA, Jacob [1] Yes, I know that binary files are faster to read and write, but they complicate file transfer between different platforms and "visual inspection" of the data. -- City X'ers mail van (building instructions): http://lego.jacob-sparre.dk/CityXers/Postbil/ ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 9:57 Data table text I/O package? Jacob Sparre Andersen @ 2005-06-15 11:43 ` Preben Randhol 2005-06-15 13:35 ` Jacob Sparre Andersen 2005-06-15 19:30 ` Simon Wright 2005-06-15 22:40 ` Lionel Draghi 2 siblings, 1 reply; 68+ messages in thread From: Preben Randhol @ 2005-06-15 11:43 UTC (permalink / raw) To: Jacob Sparre Andersen; +Cc: comp.lang.ada Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (10:01) : > I do quite a lot of work, where I manipulate data stored in (tabulator > separated) text files [1]. Does anybody know of a package which > handles the inclusion of a header line with the column names in an > elegant way? It should preferably include automated testing that the > header is correct, when a file is opened, and automated creation of > the header when a file is created. Not sure what you are asking. Do you want to load the data into lists comforming to a header name? You can use charles with maps and lists. Or you want something that splits the line ? Preben ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 11:43 ` Preben Randhol @ 2005-06-15 13:35 ` Jacob Sparre Andersen 2005-06-15 14:12 ` Preben Randhol [not found] ` <20050615141236.GA90053@pvv.org> 0 siblings, 2 replies; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-15 13:35 UTC (permalink / raw) Preben Randhol wrote: > Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (10:01) : > > I do quite a lot of work, where I manipulate data stored in > > (tabulator separated) text files [1]. Does anybody know of a > > package which handles the inclusion of a header line with the > > column names in an elegant way? It should preferably include > > automated testing that the header is correct, when a file is > > opened, and automated creation of the header when a file is > > created. > > Not sure what you are asking. Do you want to load the data into > lists comforming to a header name? That was not what I was trying to ask for. Generally I just run my data analysis tools as "filters", where I can manage with processing one line (or a few lines) at a time. The important part is to have the checking of the headers and the generation of Put_Line and Get_Line procedures automated based on a record type (and not too much more). Since I need records (for type checking) and not just simple arrays, I can't manage with a generic package, but have to put some code generation into the system (or can I play some tricks with streams?). > You can use charles with maps and lists. I'll see if I can find the package in Charles, which does this. > Or you want something that splits the line? I have that already. Jacob -- Brakzand II: http://lego.jacob-sparre.dk/Transport/Skibe/Brakzand_II/ ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 13:35 ` Jacob Sparre Andersen @ 2005-06-15 14:12 ` Preben Randhol 2005-06-15 15:02 ` Jacob Sparre Andersen [not found] ` <20050615141236.GA90053@pvv.org> 1 sibling, 1 reply; 68+ messages in thread From: Preben Randhol @ 2005-06-15 14:12 UTC (permalink / raw) To: Jacob Sparre Andersen; +Cc: comp.lang.ada Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : > The important part is to have the checking of the headers and the > generation of Put_Line and Get_Line procedures automated based on a > record type (and not too much more). Since I need records (for type > checking) and not just simple arrays, I can't manage with a generic > package, but have to put some code generation into the system (or can > I play some tricks with streams?). So the header might be: Integer Float Text and the data could be: 1 0.9 Start point 2 0.3 Minimum 3 6.0 End point and then you want to check the data and validate that they are of the correct type as indicated by the header? To generate the header you want that the package finds out which type a certain data type is and output this type in the header? Preben ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 14:12 ` Preben Randhol @ 2005-06-15 15:02 ` Jacob Sparre Andersen 2005-06-15 16:17 ` Preben Randhol 2005-06-15 18:58 ` Randy Brukardt 0 siblings, 2 replies; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-15 15:02 UTC (permalink / raw)

Preben Randhol wrote:
> Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) :

> > The important part is to have the checking of the headers and the
> > generation of Put_Line and Get_Line procedures automated based on
> > a record type (and not too much more). Since I need records (for
> > type checking) and not just simple arrays, I can't manage with a
> > generic package, but have to put some code generation into the
> > system (or can I play some tricks with streams?).
>
> So the header might be:
>
> Integer Float Text

Not quite. The headers would be field names, not just types. I.e.:

   Gene ID   p-value   Expression-level   Description   Human cromosome
   GE29031   0.04539   245.45             Cyclin-B1     17

> and the data could be:
>
> 1 0.9 Start point
> 2 0.3 Minimum
> 3 6.0 End point
>
> and then you want to check the data and validate that they are of
> the correct type as indicated by the header? To generate the header
> you want that the package finds out which type a certain data type
> is and output this type in the header?

Sort of. Except that I would use the names of the fields in the
record and not just the types of the fields.

One of my problems is that I have different kinds of files (in terms
of meaning of the numbers) where the types for all practical purposes
are the same.

But it seems like it might be more efficient to code a library like
that by hand for each case, even though it means that I miss the
automated checking (my main reason for using Ada).

Jacob (who should remember not to want the impossible every day)
-- 
"If you're going to have crime, it might as well be organized crime."
                                                   -- Lord Vetinari

^ permalink raw reply [flat|nested] 68+ messages in thread
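The header handling asked for above (write the header when a file is created, check it when a file is opened) can be sketched in plain Ada if the column names are listed by hand next to the record type. This is only a minimal sketch under that assumption: `Check_Header`, `Name_List`, `Header_Image` and `Header_Is_Valid` are hypothetical names, and nothing here derives the names from a record declaration automatically.

```ada
with Ada.Characters.Latin_1;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Check_Header is
   Tab : constant Character := Ada.Characters.Latin_1.HT;

   type Name_List is array (Positive range <>) of Unbounded_String;

   --  Hypothetical column names, maintained by hand alongside the
   --  record type (spelled as in the example in the thread).
   Expected : constant Name_List :=
     (To_Unbounded_String ("Gene ID"),
      To_Unbounded_String ("p-value"),
      To_Unbounded_String ("Expression-level"),
      To_Unbounded_String ("Description"),
      To_Unbounded_String ("Human cromosome"));

   --  Build the header line: the names joined by tabulators.
   function Header_Image (Names : Name_List) return String is
      Result : Unbounded_String;
   begin
      for I in Names'Range loop
         if I > Names'First then
            Append (Result, Tab);
         end if;
         Append (Result, Names (I));
      end loop;
      return To_String (Result);
   end Header_Image;

   --  True when Line is exactly the header expected for Names.
   function Header_Is_Valid
     (Line  : String;
      Names : Name_List) return Boolean is
   begin
      return Line = Header_Image (Names);
   end Header_Is_Valid;

   Good : constant String := Header_Image (Expected);
   Bad  : constant String := "Gene ID" & Tab & "p-value";
begin
   Ada.Text_IO.Put_Line (Good);
   Ada.Text_IO.Put_Line (Boolean'Image (Header_Is_Valid (Good, Expected)));
   Ada.Text_IO.Put_Line (Boolean'Image (Header_Is_Valid (Bad, Expected)));
end Check_Header;
```

The same `Header_Image` function serves both directions, so the header that is written and the header that is accepted cannot drift apart. Keeping the name list in step with the record fields is still manual, which is the part the original question wants automated.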
* Re: Data table text I/O package? 2005-06-15 15:02 ` Jacob Sparre Andersen @ 2005-06-15 16:17 ` Preben Randhol 2005-06-15 16:58 ` Dmitry A. Kazakov 2005-06-15 18:58 ` Randy Brukardt 1 sibling, 1 reply; 68+ messages in thread From: Preben Randhol @ 2005-06-15 16:17 UTC (permalink / raw) To: Jacob Sparre Andersen; +Cc: comp.lang.ada On Wed, Jun 15, 2005 at 05:02:37PM +0200, Jacob Sparre Andersen wrote: > Preben Randhol wrote: > > Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : > > > > The important part is to have the checking of the headers and the > > > generation of Put_Line and Get_Line procedures automated based on > > > a record type (and not too much more). Since I need records (for > > > type checking) and not just simple arrays, I can't manage with a > > > generic package, but have to put some code generation into the > > > system (or can I play some tricks with streams?). > > > > So the header might be: > > > > Integer Float Text > > Not quite. The headers would be field names, not just types. I.e.: > > Gene ID p-value Expression-level Description Human cromosome > GE29031 0.04539 245.45 Cyclin-B1 17 > > > and the data could be: > > > > 1 0.9 Start point > > 2 0.3 Minimum > > 3 6.0 End point > > > > and then you want to check the data and validate that they are of > > the correct type as indicated by the header? To generate the header > > you want that the package finds out which type a certain data type > > is and output this type in the header? > > Sort of. Except that I would use the names of the fields in the > record and not just the types of the fields. > > One of my problems is that I have different kinds of files (in terms > of meaning of the numbers) where the types for all practical purposes > are the same. So you have different files with for example p-value, and in all the p-value is a float? 
If so, then I would have made a map, something like:

   Gene             => "Text"
   ID               => "Float"
   p-value          => "My_Float"   (in case you have a special type)
   Expression-level => "Text"
   Description      => "Text"
   Human cromosome  => "Text"
   ...

and when you read in the values you can do a

   function Is_Valid_Type (Column_Type : String; Value : String) return Boolean
   is
   begin
      if Column_Type = "Float" then
         declare
            F : Float;
         begin
            --  Convert in the statement part: an exception raised in the
            --  declarative part would not reach the handler below.
            F := Float'Value (Value);
            return True;
         exception
            when others =>
               return False;
         end;
      elsif Column_Type = "Integer" then
         ...

-- 
Preben Randhol -------------- http://www.pvv.org/~randhol/Ada95 --
"Have another drink, not-Corporal Nobby?" said Sergeant Colon unsteadily.
"I do not mind if I do, not-Sgt Colon," said Nobby.
  -- The joys of working undercover (Terry Pratchett, Guards! Guards!)

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 16:17 ` Preben Randhol @ 2005-06-15 16:58 ` Dmitry A. Kazakov 2005-06-15 17:30 ` Marius Amado Alves 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-15 16:58 UTC (permalink / raw) On Wed, 15 Jun 2005 18:17:06 +0200, Preben Randhol wrote: > On Wed, Jun 15, 2005 at 05:02:37PM +0200, Jacob Sparre Andersen wrote: >> Preben Randhol wrote: >>> Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : >> >>> > The important part is to have the checking of the headers and the >>> > generation of Put_Line and Get_Line procedures automated based on >>> > a record type (and not too much more). Since I need records (for >>> > type checking) and not just simple arrays, I can't manage with a >>> > generic package, but have to put some code generation into the >>> > system (or can I play some tricks with streams?). >>> >>> So the header might be: >>> >>> Integer Float Text >> >> Not quite. The headers would be field names, not just types. I.e.: >> >> Gene ID p-value Expression-level Description Human cromosome >> GE29031 0.04539 245.45 Cyclin-B1 17 >> >>> and the data could be: >>> >>> 1 0.9 Start point >>> 2 0.3 Minimum >>> 3 6.0 End point >>> >>> and then you want to check the data and validate that they are of >>> the correct type as indicated by the header? To generate the header >>> you want that the package finds out which type a certain data type >>> is and output this type in the header? >> >> Sort of. Except that I would use the names of the fields in the >> record and not just the types of the fields. >> >> One of my problems is that I have different kinds of files (in terms >> of meaning of the numbers) where the types for all practical purposes >> are the same. > > So you have different files with for example p-value, and in all the > p-value is a float? 
>
> If so, then I would have made a map, something like:
>
> Gene => "Text"
> ID => "Float"
> p-value => "My_Float" (in case you have a special type)
> Expression-level => "Text"
> Description => "Text"
> Human cromosome => "Text"
> ...
>
>
> and when you read in the values you can do a
>
> function Is_Valid_Type (Column_Type : String; Value : String) return Boolean
> is
> begin
>    if Column_Type = "Float" then
>       declare
>          F : Float;
>       begin
>          F := Float'Value (Value);
>          return True;
>       exception
>          when others =>
>             return False;
>       end;
>    elsif Column_Type = "Integer" then
> ...

One could have tagged objects and handles to them. Then the header string
could form a list of handles to the objects. The read loop could then look
like:

   loop
      Get_Line (Buffer, Length);
      declare
         Line : constant String := Buffer (Buffer'First .. Length);
      begin
         Pointer := Line'First;
         for Field in Handles_List'Range loop
            Get (Line, Pointer); -- Skip blanks
            Get (Line, Pointer, Ptr (Handles_List (Field)).all); -- Dispatches
         end loop;
         Get (Line, Pointer); -- Skip blanks
         if Pointer <= Line'Last then
            null; -- Unrecognized rest
         end if;
      end;
   end loop;

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply [flat|nested] 68+ messages in thread
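The tagged-objects idea above can be made concrete in a small, self-contained way. The following is a sketch only: the package `Fields`, the types `Field`, `Integer_Field`, `Float_Field` and `Handle`, and the procedure name `Dispatching_Fields` are all hypothetical, and the `Get` used here is built on the standard string form of `Get` in `Ada.Integer_Text_IO` and `Ada.Float_Text_IO`, not on the library the post refers to.

```ada
with Ada.Characters.Latin_1;
with Ada.Float_Text_IO;
with Ada.Integer_Text_IO;
with Ada.Text_IO;

procedure Dispatching_Fields is

   package Fields is
      type Field is abstract tagged null record;

      --  Read one value from Line starting at Pointer, then move
      --  Pointer to the first character after the value.
      procedure Get
        (Item    : in out Field;
         Line    : in     String;
         Pointer : in out Positive) is abstract;

      type Integer_Field is new Field with record
         Value : Integer := 0;
      end record;
      procedure Get
        (Item    : in out Integer_Field;
         Line    : in     String;
         Pointer : in out Positive);

      type Float_Field is new Field with record
         Value : Float := 0.0;
      end record;
      procedure Get
        (Item    : in out Float_Field;
         Line    : in     String;
         Pointer : in out Positive);

      type Handle is access all Field'Class;
   end Fields;

   package body Fields is
      procedure Get
        (Item    : in out Integer_Field;
         Line    : in     String;
         Pointer : in out Positive)
      is
         Last : Positive;
      begin
         --  The string form of Get skips leading blanks (spaces, tabs).
         Ada.Integer_Text_IO.Get
           (Line (Pointer .. Line'Last), Item.Value, Last);
         Pointer := Last + 1;
      end Get;

      procedure Get
        (Item    : in out Float_Field;
         Line    : in     String;
         Pointer : in out Positive)
      is
         Last : Positive;
      begin
         Ada.Float_Text_IO.Get
           (Line (Pointer .. Line'Last), Item.Value, Last);
         Pointer := Last + 1;
      end Get;
   end Fields;

   use Fields;

   Chromosome : aliased Integer_Field;
   P_Value    : aliased Float_Field;

   --  In a real program this list would be built from the header line.
   Handles_List : constant array (1 .. 2) of Handle :=
     (Chromosome'Access, P_Value'Access);

   Line    : constant String :=
     "17" & Ada.Characters.Latin_1.HT & "0.04539";
   Pointer : Positive := Line'First;
begin
   for I in Handles_List'Range loop
      Get (Handles_List (I).all, Line, Pointer);  --  dispatching call
   end loop;
   Ada.Text_IO.Put_Line (Integer'Image (Chromosome.Value));
   Ada.Text_IO.Put_Line (Float'Image (P_Value.Value));
end Dispatching_Fields;
```

The loop body never names a concrete field type, so the column order can be decided at run time. The values still end up in separate objects rather than in one statically typed record, which is the limitation the original question is about.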
* Re: Data table text I/O package? 2005-06-15 16:58 ` Dmitry A. Kazakov @ 2005-06-15 17:30 ` Marius Amado Alves 2005-06-15 18:41 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Marius Amado Alves @ 2005-06-15 17:30 UTC (permalink / raw) To: comp.lang.ada > One could have tagged objects and handles to them. Then the header > string > could form a list of handles to the objects... Or that, yes, in the absence of reflexivity. A list of polymorphs instead of a (dynamically created) record type. Standard. The Ada way. The way of any current mainstream OO language, really. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 17:30 ` Marius Amado Alves @ 2005-06-15 18:41 ` Dmitry A. Kazakov 2005-06-15 19:09 ` Marius Amado Alves 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-15 18:41 UTC (permalink / raw) On Wed, 15 Jun 2005 18:30:52 +0100, Marius Amado Alves wrote: >> One could have tagged objects and handles to them. Then the header >> string >> could form a list of handles to the objects... > > Or that, yes, in the absence of reflexivity. A list of polymorphs > instead of a (dynamically created) record type. Standard. The Ada way. > The way of any current mainstream OO language, really. BTW, completely unrealistic, but. I'm unsure if this will be legal in Ada 2006: declare type Record is tagged null record; begin case Filed (1).Type is when Float_Type => declare type R1 is new Record with record Float_Field_1 : Float; end record; begin case Filed (2).Type is when Float_Type => declare type R2 is new R1 with record Float_Field_2 : Float; end record; begin ... -- somewhere dee-e-e-ply nested: type RN is new RN-1 with record ... -- has all fields of all types! end; end; when Int_Type => ... (:-)) Provided that this is legal, then one could then try to factor it out using generics... Though the number of fields has to be fixed. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 18:41 ` Dmitry A. Kazakov @ 2005-06-15 19:09 ` Marius Amado Alves 0 siblings, 0 replies; 68+ messages in thread From: Marius Amado Alves @ 2005-06-15 19:09 UTC (permalink / raw) To: comp.lang.ada I think this is legal even in Ada 95 (renamed Record to Record_Type). > I'm unsure if this will be legal in Ada 2006: > > declare > type Record_Type is tagged null record; > begin > case Filed (1).Type is > when Float_Type => > declare > type R1 is new Record_Type with record > Float_Field_1 : Float; > end record; > begin > case Filed (2).Type is > when Float_Type => > declare > type R2 is new R1 with record > Float_Field_2 : Float; > end record; > begin > ... > -- somewhere dee-e-e-ply nested: > type RN is new RN-1 with record ... > -- has all fields of all types! > end; > end; > when Int_Type => > ... ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 15:02 ` Jacob Sparre Andersen 2005-06-15 16:17 ` Preben Randhol @ 2005-06-15 18:58 ` Randy Brukardt 2005-06-16 9:55 ` Jacob Sparre Andersen 1 sibling, 1 reply; 68+ messages in thread From: Randy Brukardt @ 2005-06-15 18:58 UTC (permalink / raw) "Jacob Sparre Andersen" <sparre@nbi.dk> wrote in message news:m2hdfzek8i.fsf@hugin.crs4.it... > Preben Randhol wrote: > > Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : > > > > The important part is to have the checking of the headers and the > > > generation of Put_Line and Get_Line procedures automated based on > > > a record type (and not too much more). Since I need records (for > > > type checking) and not just simple arrays, I can't manage with a > > > generic package, but have to put some code generation into the > > > system (or can I play some tricks with streams?). > > > > So the header might be: > > > > Integer Float Text > > Not quite. The headers would be field names, not just types. I.e.: > > Gene ID p-value Expression-level Description Human cromosome > GE29031 0.04539 245.45 Cyclin-B1 17 I may be dense, but isn't this the purpose of XML? If so, why reinvent the wheel? (I personally think XML is way overused, more because it *can* be used than that it is worthwhile for the application. But this seems to be exactly the application that it was designed for. You'll end up with something like XML eventually anyway, why not start with it?) Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 18:58 ` Randy Brukardt @ 2005-06-16 9:55 ` Jacob Sparre Andersen 2005-06-16 10:53 ` Marius Amado Alves 2005-06-30 3:02 ` Randy Brukardt 0 siblings, 2 replies; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-16 9:55 UTC (permalink / raw)

Randy Brukardt wrote:

> I may be dense, but isn't this the purpose of XML? If so, why
> reinvent the wheel?

The purpose of XML is to be _the_ universal file format.

 a) I don't want a universal file format.

 b) I don't believe in a universal file format.

 c) XML is (almost) less readable than a binary file for my purposes.

 d) I'm _not_ going to switch away from tabulator separated tables for
    purposes, where tabulator separated tables are a sensible
    representation of the data in textual form.

> (I personally think XML is way overused, more because it *can* be
> used than that it is worthwhile for the application. But this seems
> to be exactly the application that it was designed for. You'll end
> up with something like XML eventually anyway, why not start with
> it?)

I'm afraid you completely misunderstood my problem. It is not a
matter of selecting a file format. It is a matter of automagically
generating code for reading and writing that file format.

Jacob
-- 
"I am an old man now, and when I die and go to Heaven there are
 two matters on which I hope enlightenment. One is quantum
 electro-dynamics and the other is turbulence of fluids. About
 the former, I am rather optimistic." Sir Horace Lamb.

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 9:55 ` Jacob Sparre Andersen @ 2005-06-16 10:53 ` Marius Amado Alves 2005-06-16 12:24 ` Robert A Duff 2005-06-16 14:01 ` Georg Bauhaus 2005-06-30 3:02 ` Randy Brukardt 1 sibling, 2 replies; 68+ messages in thread From: Marius Amado Alves @ 2005-06-16 10:53 UTC (permalink / raw) To: comp.lang.ada

On 16 Jun 2005, at 10:55, Jacob Sparre Andersen wrote:

> Randy Brukardt wrote:
>
>> I may be dense, but isn't this the purpose of XML? If so, why
>> reinvent the wheel?
>
> d) I'm _not_ going to switch away from tabulator separated tables for
> purposes, where tabulator separated tables are a sensible
> representation of the data in textual form.

Indeed. XML is for semi-structured data and/or text data with Unicode
etc. For tables of atomic data tab separated is better. More readable,
efficient, sensible, not requiring a monster XML library.

> It is the matter of
> automagically generating code for reading and writing that file
> format.

Yes. This is interesting, useful, and easy. From the header you get the
field names, from the first data line we deduce the data types. With
these elements you can generate the record type and procedures to read
the file. A trick I often use to deduce data types is based on 'Value:

   function Get_Type (Value : String) return Data_Type is
      F : Float;
      I : Integer;
   begin
      F := Float'Value (Value);
      return Type_Float;
   exception
      when Constraint_Error =>
         begin
            I := Integer'Value (Value);
            return Type_Integer;
         exception
            when Constraint_Error =>
               return Type_String;
         end;
   end;

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 10:53 ` Marius Amado Alves @ 2005-06-16 12:24 ` Robert A Duff 2005-06-16 14:01 ` Georg Bauhaus 1 sibling, 0 replies; 68+ messages in thread From: Robert A Duff @ 2005-06-16 12:24 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> writes: > Yes. This is interesting, useful, and easy. From the header you get the > field names, from the first data line with deduce the data types. With > these elements you can generate the record type and procedures to read > the file. Hmm. Interesting idea. But you will lose the full power of Ada's type system. You cannot, in general, deduce the type from the data, in Ada. I mean, 123 could be any integer type, and a typical Ada program has many integer types. For that matter, how do you know 123 is not intended to be Type_String, in your example below? >... A trick I often use to deduce data types is based on 'Value: I believe this trick will run afoul of RM-11.6. It probably works in practise, but I think that an implementation is allowed to return Type_Float, no matter what string you pass to Value! Did I mention that I don't like 11.6? ;-) > function Get_Type (Value : String) return Data_Type is > F : Float; > I : Integer; > begin > F := Float'Value (Value); > return Type_Float; > exception > when Constraint_Error => > begin > I := Integer'Value (Value); > return Type_Integer; > exception > when Constraint_Error => > return Type_String; > end; > end; - Bob ^ permalink raw reply [flat|nested] 68+ messages in thread
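One way to act on the two objections above is to try `Integer'Value` before `Float'Value` (since `Float'Value` also accepts integer literals, the integer branch of the quoted function would otherwise never be reached) and to return the converted value so that the conversion result is actually used. The sketch below assumes hypothetical names (`Deduce_Types`, `Data_Type`, `Cell`, `Classify`). Whether using the result fully settles the RM 11.6 concern is not something this sketch claims, and it cannot tell which of a program's integer types was intended, nor whether "123" was meant as text.

```ada
with Ada.Text_IO;

procedure Deduce_Types is
   type Data_Type is (Type_Integer, Type_Float, Type_String);

   --  The deduced kind together with the converted value, so that the
   --  result of 'Value is part of what the function returns.
   type Cell (Kind : Data_Type := Type_String) is record
      case Kind is
         when Type_Integer => I : Integer;
         when Type_Float   => F : Float;
         when Type_String  => null;
      end case;
   end record;

   function Classify (Text : String) return Cell is
   begin
      --  Integer first: Float'Value would accept "17" as well.
      return (Kind => Type_Integer, I => Integer'Value (Text));
   exception
      when Constraint_Error =>
         begin
            return (Kind => Type_Float, F => Float'Value (Text));
         exception
            when Constraint_Error =>
               return (Kind => Type_String);
         end;
   end Classify;
begin
   Ada.Text_IO.Put_Line (Data_Type'Image (Classify ("17").Kind));
   Ada.Text_IO.Put_Line (Data_Type'Image (Classify ("0.04539").Kind));
   Ada.Text_IO.Put_Line (Data_Type'Image (Classify ("Cyclin-B1").Kind));
end Deduce_Types;
```

The three calls classify the sample values as `TYPE_INTEGER`, `TYPE_FLOAT` and `TYPE_STRING` respectively. A column is only as well typed as its first data line suggests, so a generator built on this would probably want to scan more than one line.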
* Re: Data table text I/O package? 2005-06-16 10:53 ` Marius Amado Alves 2005-06-16 12:24 ` Robert A Duff @ 2005-06-16 14:01 ` Georg Bauhaus 2005-06-16 12:27 ` Dmitry A. Kazakov 2005-06-16 13:26 ` Marius Amado Alves 1 sibling, 2 replies; 68+ messages in thread From: Georg Bauhaus @ 2005-06-16 14:01 UTC (permalink / raw)

Marius Amado Alves wrote:

> For tables of atomic data tab separated is better.

Note the crucial bits in this general statement.

1) You had really better have *atomic* data.

2) You had better have the format as your own format and
   no data exchange with any system requiring "just
   your table files, please".

Tab separated atomic data can be "semi-structured"
too. Consider 04/06/05 and tell me which calendar date that
is, in [choose country here].

It makes little sense to say XML = semi, TAB = atomic without
specifying what exactly you mean by semi-structured data.
Consider

  <Date y="2005" m="June" d="04"/>

If a program maintains a table of calendar dates
for internal use, then 2005-06-04, or 2005 TAB 06 TAB 04
saves space and is easy to use. But it also restricts
the table to an internal data format.

Choice of TabSV depends on the requirements, doesn't it?
In particular on how many different programs will use the data,
who is going to "read" them in which ways, special purpose or not,
are there industry standards, etc.

I wonder whether Ada programmers will like a data format like

  Date'(y => 2005, m => -"June", d => 04)

and still keep saying that it must have scientifically proven
readability advantages, and that XML is verbose. And before
you answer, think of the word "habit".

Georg

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 14:01 ` Georg Bauhaus @ 2005-06-16 12:27 ` Dmitry A. Kazakov 2005-06-16 14:46 ` Georg Bauhaus 2005-06-16 13:26 ` Marius Amado Alves 1 sibling, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-16 12:27 UTC (permalink / raw) On Thu, 16 Jun 2005 16:01:57 +0200, Georg Bauhaus wrote: > Marius Amado Alves wrote: > >> For tables of atomic data tab separated is better. > > Note the crucial bits in this general statement. > > 1) You had really better have *atomic* data. > > 2) You had better have the format as your own format and > no data exchange with any system requiring "just > your table files, please". > > Tab separated atomic data can be "semi-structured" > too. Consider 04/06/05 and tell me wich calender date that > is, in [choose country here]. > > It makes litte sense to say XML = semi, TAB = atomic without > specifying what exactly you mean by semi-structure data. > Consider > > <Date y="2005" m="June" d="04"/> > > If a program maintains a table of calender dates > for internal use, then 2005-06-04, or 2005 TAB 06 TAB 04 > save space and is easy to use. But it also restricts > the table to an internal data format. Not necessarily. There is a better technique to parse strings than to tokenize them first. Get rid of scanner. Just take the date from the current position of the string and advance the position to the first character following the date. Because the procedure that gets the date knows the format it also knows where the date ends. It can also support various concurrent formats, provided that they are distinguishable. This way you can parse a string virtually knowing nothing about the formats of its fields. An additional advantage is that error messages (if it comes to a more advanced system) will be pretty easy to generate. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 12:27 ` Dmitry A. Kazakov @ 2005-06-16 14:46 ` Georg Bauhaus 2005-06-16 14:51 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-16 14:46 UTC (permalink / raw) Dmitry A. Kazakov wrote: > There is a better technique to parse strings than to tokenize them first. > Get rid of scanner. Just take the date from the current position of the > string and advance the position to the first character following the date. > Because the procedure that gets the date knows the format it also knows > where the date ends. It can also support various concurrent formats, > provided that they are distinguishable. This way you can parse a string > virtually knowing nothing about the formats of its fields. An additional > advantage is that error messages (if it comes to a more advanced system) > will be pretty easy to generate. IIUC, what you describe is a (more binary) DTD, either language-standardised or proprietary. And also, what does the sentence "don't scan a string, and don't produce tokens, but advance [something] to the first character following the date that was taken[?] form the string" mean, other than a contradiction in terms? ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 14:46 ` Georg Bauhaus @ 2005-06-16 14:51 ` Dmitry A. Kazakov 2005-06-20 11:19 ` Georg Bauhaus 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-16 14:51 UTC (permalink / raw)

On Thu, 16 Jun 2005 16:46:39 +0200, Georg Bauhaus wrote:

> Dmitry A. Kazakov wrote:
>
>> There is a better technique to parse strings than to tokenize them first.
>> Get rid of scanner. Just take the date from the current position of the
>> string and advance the position to the first character following the date.
>> Because the procedure that gets the date knows the format it also knows
>> where the date ends. It can also support various concurrent formats,
>> provided that they are distinguishable. This way you can parse a string
>> virtually knowing nothing about the formats of its fields. An additional
>> advantage is that error messages (if it comes to a more advanced system)
>> will be pretty easy to generate.
>
> IIUC, what you describe is a (more binary) DTD, either language-standardised
> or proprietary.
>
> And also, what does the sentence "don't scan a string, and don't
> produce tokens, but advance [something] to the first character
> following the date that was taken[?] form the string" mean,
> other than a contradiction in terms?

   Field_1 : Float;
   Field_2 : Integer;
   ...
   Line    : String := ...; -- The current line
   Pointer : Integer;       -- The current position in Line

   Pointer := Line'First;
   Get (Line, Pointer, Delimiters); -- Skip blanks
   Get (Line, Pointer, Field_1);    -- Get field and move Pointer
   Get (Line, Pointer, Delimiters); -- Skip blanks
   Get (Line, Pointer, Field_2);    -- Get field and move Pointer
   ...
   etc

Quite trivial.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply [flat|nested] 68+ messages in thread
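The position-advancing style in the fragment above, including the claim that error messages are easy to generate, can be tried with the standard library alone. This is a sketch under assumptions: `Parse_Line`, `Skip_Blanks`, `Get` and `Parse` are hypothetical names (not the `Get` procedures the post refers to), and the conversion relies on the string form of `Ada.Float_Text_IO.Get`, whose `Last` parameter reports the last character consumed.

```ada
with Ada.Characters.Latin_1;
with Ada.Float_Text_IO;
with Ada.IO_Exceptions;
with Ada.Text_IO;

procedure Parse_Line is
   Tab : constant Character := Ada.Characters.Latin_1.HT;

   --  Move Pointer past any tabs and spaces.
   procedure Skip_Blanks (Line : String; Pointer : in out Positive) is
   begin
      while Pointer <= Line'Last
        and then (Line (Pointer) = Tab or else Line (Pointer) = ' ')
      loop
         Pointer := Pointer + 1;
      end loop;
   end Skip_Blanks;

   --  Read a Float at Pointer and move Pointer just past it.
   procedure Get
     (Line    : in     String;
      Pointer : in out Positive;
      Item    :    out Float)
   is
      Last : Positive;
   begin
      Ada.Float_Text_IO.Get (Line (Pointer .. Line'Last), Item, Last);
      Pointer := Last + 1;
   end Get;

   --  Expect exactly two numbers on the line; report where parsing stops.
   procedure Parse (Line : String) is
      Pointer        : Positive := Line'First;
      P_Value, Level : Float;
   begin
      Skip_Blanks (Line, Pointer);
      Get (Line, Pointer, P_Value);
      Skip_Blanks (Line, Pointer);
      Get (Line, Pointer, Level);
      Skip_Blanks (Line, Pointer);
      if Pointer <= Line'Last then
         Ada.Text_IO.Put_Line
           ("unrecognized text at column" & Positive'Image (Pointer));
      else
         Ada.Text_IO.Put_Line
           ("ok:" & Float'Image (P_Value) & Float'Image (Level));
      end if;
   exception
      when Ada.IO_Exceptions.Data_Error | Ada.IO_Exceptions.End_Error =>
         Ada.Text_IO.Put_Line
           ("bad number at column" & Positive'Image (Pointer));
   end Parse;
begin
   Parse ("0.04539" & Tab & "245.45");
   Parse ("0.04539" & Tab & "abc");
   Parse ("0.04539" & Tab & "3.1 5");
end Parse_Line;
```

Because `Pointer` always marks where the next field should start, the same variable gives the column for the diagnostic: column 9 for the non-numeric second field, and column 13 for the leftover `5` in the last call. A stray space inside a number is therefore reported as trailing text rather than silently accepted, as long as the expected number of fields is known.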
* Re: Data table text I/O package? 2005-06-16 14:51 ` Dmitry A. Kazakov @ 2005-06-20 11:19 ` Georg Bauhaus 2005-06-20 11:39 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-20 11:19 UTC (permalink / raw) Dmitry A. Kazakov wrote: > Get (Line, Pointer, Field_2); -- Get field and move Pointer > ... > etc > > Quite trivial. And quite adventurous in any but an internal context. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-20 11:19 ` Georg Bauhaus @ 2005-06-20 11:39 ` Dmitry A. Kazakov 2005-06-20 18:25 ` Georg Bauhaus 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-20 11:39 UTC (permalink / raw) On Mon, 20 Jun 2005 13:19:44 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: > >> Get (Line, Pointer, Field_2); -- Get field and move Pointer >> ... >> etc >> >> Quite trivial. > > And quite adventurous in any but an internal context. Why? -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-20 11:39 ` Dmitry A. Kazakov @ 2005-06-20 18:25 ` Georg Bauhaus 2005-06-20 18:45 ` Preben Randhol 2005-06-20 18:54 ` Dmitry A. Kazakov 0 siblings, 2 replies; 68+ messages in thread From: Georg Bauhaus @ 2005-06-20 18:25 UTC (permalink / raw) Dmitry A. Kazakov wrote: > On Mon, 20 Jun 2005 13:19:44 +0200, Georg Bauhaus wrote: > > >>Dmitry A. Kazakov wrote: >> >> >>> Get (Line, Pointer, Field_2); -- Get field and move Pointer >>> ... >>> etc >>> >>>Quite trivial. >> >>And quite adventurous in any but an internal context. If you are parsing data from outside, you have to know the quality and structure of data (plus the pitfalls mentioned by Robert Duff.) As to quality, just one inadvertently typed space might be hazardous when it splits an atom in two... :) (Think of a medium quality CSV file, and a number typed 3.1 5. Oops!) XML can help with this for example by identifying the bounds of a data item, even if mistyped: <Distance km='3.1 5'/> This will be noticed by the XML parser if it knows about km's type (NMTOKEN). You could as well squeeze the space out using either Ada.Strings or XML related technology. But in any case there can be no doubt that the string "3.1 5" is a mistyped number. Georg ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-20 18:25 ` Georg Bauhaus @ 2005-06-20 18:45 ` Preben Randhol 2005-06-20 18:54 ` Dmitry A. Kazakov 1 sibling, 0 replies; 68+ messages in thread From: Preben Randhol @ 2005-06-20 18:45 UTC (permalink / raw) To: comp.lang.ada

On Mon, Jun 20, 2005 at 08:25:13PM +0200, Georg Bauhaus wrote:
> XML can help with this for example by identifying the bounds
> of a data item, even if mistyped:
> <Distance km='3.1 5'/>

However only if it is computer generated...

> This will be noticed by the XML parser if it knows about km's
> type (NMTOKEN). You could as well squeeze the space out using
> either Ada.Strings or XML related technology. But in any case
> there can be no doubt that the string "3.1 5" is a mistyped
> number.

This depends on whether the parser is validating or not. Many parsers
are not validating. Especially if one uses SAX.

-- 
Preben Randhol -------------- http://www.pvv.org/~randhol/Ada95 --
"For me, Ada95 puts back the joy in programming."

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-20 18:25 ` Georg Bauhaus 2005-06-20 18:45 ` Preben Randhol @ 2005-06-20 18:54 ` Dmitry A. Kazakov 2005-06-21 9:24 ` Georg Bauhaus 2005-06-25 16:38 ` Simon Wright 1 sibling, 2 replies; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-20 18:54 UTC (permalink / raw) On Mon, 20 Jun 2005 20:25:13 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: >> On Mon, 20 Jun 2005 13:19:44 +0200, Georg Bauhaus wrote: >> >> >>>Dmitry A. Kazakov wrote: >>> >>> >>>> Get (Line, Pointer, Field_2); -- Get field and move Pointer >>>> ... >>>> etc >>>> >>>>Quite trivial. >>> >>>And quite adventurous in any but an internal context. > > If you are parsing data from outside, you have to know > the quality and structure of data (plus the pitfalls mentioned > by Robert Duff.) As to quality, just one inadvertently typed > space might be hazardous when it splits an atom in two... :) > > (Think of a medium quality CSV file, and a number typed 3.1 5. > Oops!) No, you just have to use different delimiters between and within the fields. This is why in Ada parameters of a procedure call are separated by commas rather than spaces. Though is it about what syntax would be the best? Or is it about how to parse something in a defined syntax? > XML can help with this for example by identifying the bounds > of a data item, even if mistyped: > <Distance km='3.1 5'/> > This will be noticed by the XML parser if it knows about km's > type (NMTOKEN). Now consider a space between / and >: <Distance km='3.15'/ > XML adds here nothing, but a huge readability loss. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-20 18:54 ` Dmitry A. Kazakov @ 2005-06-21 9:24 ` Georg Bauhaus 2005-06-21 9:52 ` Jacob Sparre Andersen 2005-06-21 10:42 ` Dmitry A. Kazakov 2005-06-25 16:38 ` Simon Wright 1 sibling, 2 replies; 68+ messages in thread From: Georg Bauhaus @ 2005-06-21 9:24 UTC (permalink / raw) Dmitry A. Kazakov wrote: > No, you just have to use different delimiters between and within the > fields. "You just have to... ". No, gosh, the space was _mistyped_, it wasn't intended. This goes for any typo irrespective of what delimiter you choose. Now any reasonable CSV offers far fewer error correction facilities for typos like these than any reasonable XML. By definition. (And, yes, I know you can construct syntax errors in XML, too, if you think this is an argument ...) Is it the typical Ada programmer's attitude to promote self-documenting bracketing constructs only for program text, but never for data text? > This is why in Ada parameters of a procedure call are separated by > commas rather than spaces. > > Though is it about what syntax would be the best? Or is it about how to > parse something in a defined syntax? Having a "best syntax" requires a measure for syntax quality. If you measure what a syntax can do in a heterogeneous project by applying your personal aesthetic preferences, or your reading habits, or your programming skills, I have nothing to say. If you care about robust data interchange in a "sloppy field", you employ standard tools to help you get the correct data. > Now consider a space between / and >: > > <Distance km='3.15'/ > > > XML adds here nothing, but a huge readability loss. Oh well... You mean Distance'(km => 3.15) can be read well, whereas Distance'( km => 3.15 ) is a huger readability loss? Come on. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-21 9:24 ` Georg Bauhaus @ 2005-06-21 9:52 ` Jacob Sparre Andersen 2005-06-21 11:10 ` Georg Bauhaus 2005-06-21 10:42 ` Dmitry A. Kazakov 1 sibling, 1 reply; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-21 9:52 UTC (permalink / raw) Georg Bauhaus wrote: > "You just have to... ". No, gosh, the space was _mistyped_, it > wasn't intended. This goes for any typo irrespective of what > delimiter you choose. Now any reasonable CSV has far less offerings > for error correction facilities for typos like these than any > reasonable XML. By definition. (And, yes, I know you can construct > syntax errors in XML, too, if you think this is an argument ...) > > Is it the typical Ada programmer's attitude to promote > self-documenting bracketing constructs only for program text, but > never for data text? Unlike Ada, XML is _not_ human-readable. And if I want an error-correcting file format which isn't human-readable, there are plenty to choose from, which are faster than XML. Jacob -- "Sleep is just a cheap substitute for coffee" ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-21 9:52 ` Jacob Sparre Andersen @ 2005-06-21 11:10 ` Georg Bauhaus 2005-06-21 12:35 ` Jacob Sparre Andersen 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-21 11:10 UTC (permalink / raw) Jacob Sparre Andersen wrote: > Unlike Ada, XML is _not_ human-readable. First, this has been claimed many times without even an indication of why this might be so. Again, compare <Date year = "2006" month = "December" day = "24"/> and Date'(year => 2006, month => -"December", day => 24); Could it be that you just don't like reading angle brackets? Do the <...> smell like C++'s template parameter brackets? Again, habits? I won't say that XML *looks* nice, but its purpose is not to look nice, this is not a Miss Dataformat Competition where you cannot win without rounded curves OK'd by the latest fashion. XML is supposed to support identifying data in text form. Second, XML is meant to be easily accessible using text tools, not to be printed as novels. As such, it is not a language for writing prose, formal or not. This is why the relevant standards define a notion of rendition. > And if I want an error-correcting file format which isn't > human-readable, there are plenty to choose from, which are faster than > XML. Such as...? ASN.1 perhaps? ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-21 11:10 ` Georg Bauhaus @ 2005-06-21 12:35 ` Jacob Sparre Andersen 0 siblings, 0 replies; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-21 12:35 UTC (permalink / raw) Georg Bauhaus wrote: > Jacob Sparre Andersen wrote: > > Unlike Ada, XML is _not_ human-readable. > > First, this has been claimed many times without even an indication > of why this might be so. Again, compare > > <Date year = "2006" month = "December" day = "24"/> > > and > > Date'(year => 2006, month => -"December", day => 24); > > Could it be that you just don't like reading angle brackets? I definitely don't like reading _any_ brackets, when I'm looking at data. > Do the <...> smell like C++'s template parameter brackets? They may. But neither of the above two notations is sensible, when I am playing with 30 × 55k matrices. > Again, habits? I won't say that XML *looks* nice, but its purpose > is not to look nice, this is not a Miss Dataformat Competition where > you cannot win without rounded curves OK'd by the latest fashion. > XML is supposed to support identifying data in text form. Yes. But for tabular data XML has much too much overhead and is thus too difficult to read. > > And if I want an error-correcting file format which isn't > > human-readable, there are plenty to choose from, which are faster than > > XML. > > Such as...? > ASN.1 perhaps? I am not sure if ASN.1 includes error-correction, but it was one of the options I had on my mind, when I wrote the sentence. A much more effective format would be based on an instantiation of Ada.Direct_IO with some kind of checksum included in Element_Type. My astrophysics colleagues also have a nice format for multidimensional tables, but I can't remember the name at the moment. Jacob -- CAUTION BLADE EXTREMELY SHARP KEEP OUT OF CHILDREN ^ permalink raw reply [flat|nested] 68+ messages in thread
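A minimal sketch, not from the thread, of the idea of fixed-size binary records that carry their own checksum. It is written in Python rather than as an instantiation of Ada.Direct_IO, and the record layout (two doubles plus a CRC-32) and all names are assumptions chosen for illustration. Note that a CRC only detects damage; it does not correct it.

```python
import struct
import zlib

# Hypothetical record layout: two little-endian doubles, then a CRC-32.
PAYLOAD = struct.Struct("<dd")
RECORD = struct.Struct("<ddI")


def pack_record(distance_km, temperature_c):
    """Serialize one record and append the CRC-32 of its payload."""
    payload = PAYLOAD.pack(distance_km, temperature_c)
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_record(raw):
    """Deserialize one record; raise ValueError if the checksum disagrees."""
    distance_km, temperature_c, stored_crc = RECORD.unpack(raw)
    if zlib.crc32(raw[:PAYLOAD.size]) != stored_crc:
        raise ValueError("checksum mismatch: record is damaged")
    return distance_km, temperature_c


def read_record(buffer, index):
    """Direct access: record number `index` starts at index * RECORD.size."""
    start = index * RECORD.size
    return unpack_record(buffer[start:start + RECORD.size])


data = pack_record(3.15, 29.0) + pack_record(2.10, 14.4)
print(read_record(data, 1))
```

Because every record has the same size, any record can be read without scanning the ones before it, which is the property Direct_IO provides; the price is that the file can no longer be inspected by eye.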
* Re: Data table text I/O package? 2005-06-21 9:24 ` Georg Bauhaus 2005-06-21 9:52 ` Jacob Sparre Andersen @ 2005-06-21 10:42 ` Dmitry A. Kazakov 2005-06-21 11:41 ` Georg Bauhaus 1 sibling, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-21 10:42 UTC (permalink / raw) On Tue, 21 Jun 2005 11:24:34 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: > >> No, you just have to use different delimiters between and within the >> fields. > > "You just have to... ". No, gosh, the space was _mistyped_, > it wasn't intended. This goes for any typo irrespective of what > delimiter you choose. Now any reasonable CSV has far less offerings for > error correction facilities for typos like these than any reasonable > XML. By definition. (And, yes, I know you can construct syntax errors > in XML, too, if you think this is an argument ...) > > Is it the typical Ada programmer's attitude to promote self-documenting > bracketing constructs only for program text, but never for data text? See below. It is a table. It has bracketing: rows and columns. This form existed for centuries before XML. Who would print tables of logarithms in XML? >> This is why in Ada parameters of a procedure call are separated by >> commas rather than spaces. >> >> Though is it about what syntax would be the best? Or is it about how to >> parse something in a defined syntax? > > HAving a "best syntax" requires a measure for syntax quality. > If you measure what a syntax can do in a heterogenous project > by applying your personal aesthetic preferences, > or your reading habits, or your programming skills, I have nothing to say. > > If you care about robust data interchange in a "sloppy > field", you employ standard tools to help you get the correct > data. That is a different problem for which I would use a well-defined binary format instead of fancy 3.15. What is the *accuracy* of this value, huh? 
>> Now consider a space between / and >: >> >> <Distance km='3.15'/ > >> >> XML adds here nothing, but a huge readability loss. > > Oh well... You mean > > Distance'(km => 3.15) > > can be read well, whereas > > Distance'( km => 3.15 ) > > is a huger readability loss? Come on. Distance isn't a record. At least it should not be visible as such. Nor is distance a type. The closest Ada equivalent would be Distance => 3.15 km, or Distance := 3.15 km; But the lack of readability is not in the ugly </> brackets. Tabulated data are readable because they are tabulated. That is: the names, the types and units are *factored* out to the table header, which allows the reader to concentrate on the *values*. Thus a table looks like:

Distance [km]   Temperature [°C]   ...
3.15            29.0               ...
2.10            14.4               ...

This is readable. To make the difference more visible, consider bitmaps stored in XML format. Would you be able to recognize a person's face in it? -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
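A minimal sketch, not from the thread, of a tab-separated table whose names and units live only in the header line. The header is written automatically when the file is created and checked when it is read. The column names are taken from the example table; the function names and the choice of Python are assumptions for illustration.

```python
import io

EXPECTED_HEADER = ["Distance [km]", "Temperature [°C]"]


def write_table(stream, rows, header=EXPECTED_HEADER):
    """Write the header line first, then one tab-separated line per row."""
    stream.write("\t".join(header) + "\n")
    for row in rows:
        stream.write("\t".join(str(value) for value in row) + "\n")


def read_table(stream, header=EXPECTED_HEADER):
    """Check the header line, then return the rows as lists of floats."""
    first = stream.readline().rstrip("\n").split("\t")
    if first != list(header):
        raise ValueError(f"unexpected header: {first!r}")
    return [[float(cell) for cell in line.rstrip("\n").split("\t")]
            for line in stream if line.strip()]


buffer = io.StringIO()
write_table(buffer, [(3.15, 29.0), (2.10, 14.4)])
buffer.seek(0)
print(read_table(buffer))
```

The data lines stay free of names and units, so they remain easy to read; the cost is that a file with a lost or wrong header is rejected as a whole rather than repaired.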
* Re: Data table text I/O package? 2005-06-21 10:42 ` Dmitry A. Kazakov @ 2005-06-21 11:41 ` Georg Bauhaus 2005-06-21 12:44 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-21 11:41 UTC (permalink / raw) Dmitry A. Kazakov wrote: > See below. It is a table. It has bracketing: rows and columns. Back to step one: brackets in computer tables are not named, a computer doesn't have accountants' abilities in pattern matching when looking at rows and columns in a table. Again, I said XML is good for parsing of data if you cannot tell in advance that the data stream is totally free of errors. XML provides means to build robust data streams in the absence of tight definitions and reliable procedures. As for whitespace, read Stroustrup's article on defining operator whitespace. > This form > existed for centuries before XML. Who would print tables of logarithms in > XML? You're missing the point: XML is *not* about rendering data. Logarithms are logarithms, not printed logarithms, this is a second step. Data formats for exchange or storage on the one hand and a print-out of some data on the other hand are two very different beasts, with different purposes. Consider the MVC paradigm. >>If you care about robust data interchange in a "sloppy >>field", you employ standard tools to help you get the correct >>data. > > > That is a different problem for which I would use a well-defined binary > format instead of fancy 3.15. What is the *accuracy* of this value, huh? It is totally unimportant what you or I would want, sorry. For a robust data interchange, absent comprehensive definitions and guarantees about data production, you need redundancy, period. The accuracy is well defined and most importantly, it is up to the application, yours and mine respectively. We both use the accuracy that is most appropriate, and I won't tell you not to use an internal type when it suits your application. I expect the same of you.
If all I have to do is to store kilometers measuring straight lines inside the Netherlands in a relational database, I know the datatype I can use, no matter what you think is best in your application. This has been discussed for years during the development of XML Schema. What do you care about my accuracy as long as I compute values from your data that are within application bounds? 3.15 is as accurate as can be, and independent of bits. > Distance isn't a record. Huh? In data exchange it isn't your job to tell others how they should represent one particular distance. Likewise, it's not my job to tell you not to think of print, so to speak. But we both have to exchange all relevant data, and we have to agree on element types and their attributes to represent data we both need. This is about DTDs and the like, not about using XML or not. Going from XML to ASN.1 or some format based on Lisp lists doesn't make much difference. We still both have to know what an item means. Tags are good for helping with this because they add information about items. Qualified notation so to speak. > But, lack of readability is not in the ugly </> brackets. Tabulated data > are readable because they are tabulated. This is the *View* in MVC, XML is about *data*. So there is no point in talking about final looks, it is important to know how data will have to be seen. For example, can you debug datastreams using the simplest tools? Think of a log file of a concurrent application, processing data from several heterogeneous input sources on the net. > That is: the names, the types and > units are *factored* out to the table header, which allows the reader to > concentrate on the *values*. Thus a table looks as: > > Distance [km] Temperature [°C] ... > 3.15 29.0 ... > 2.10 14.4 ... > > This is readable. This is irrelevant in data exchange. This is print. > To make difference more visible, consider bitmaps stored XML format. Would > you be able to recognize a person's face in it?
You do know about NOTATION? I think it is very hard to find someone suggesting that we should recode bitmap graphics formats as pixel tags. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-21 11:41 ` Georg Bauhaus @ 2005-06-21 12:44 ` Dmitry A. Kazakov 2005-06-21 21:01 ` Georg Bauhaus 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-21 12:44 UTC (permalink / raw) On Tue, 21 Jun 2005 13:41:25 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: > >> See below. It is a table. It has bracketing: rows and columns. > > Back to step one: brackets in computer tables are not named, > a computer doesn't have accountants' abilities in pattern matching > when looking at rows and colums in a table. Again, I said XML is good > for parsing of data if you cannot tell in advance that the data stream > is totally free of errors. No it is bad, because missing one bracket may lead to loss of the whole data set. As a medium XML is as awful as readable. > XML provides means to build robust data > streams in the absence of tight definitions and reliable procedures. > As for whitespace, read Stroustrup's article on defining operator > whitespace. Delimiter /= whitespace. >> This form >> existed for centuries before XML. Who would print tables of logarithms in >> XML? > > You're missing the point: XML is *not* about rendering data. Sorry, but the thread's subject reads "Data table text I/O package". Text = rendered data. > Logarithms are logarithms, not printed logarithms, this is a second > step. Data formats for exchange or storage on the one hand and > a print-out of some data on the other hand are two very different beasts, > with different purposes. Consider the MVC paradigm. This is obviously wrong, clearly print-outs serve both data exchange and data storage when humans are involved. >>>If you care about robust data interchange in a "sloppy >>>field", you employ standard tools to help you get the correct >>>data. >> >> That is a different problem for which I would use a well-defined binary >> format instead of fancy 3.15. What is the *accuracy* of this value, huh? 
> > It is totally unimportant what you or I would want, sorry. > For a robust data interchange, absent comprehensive definitions > and guarantees about data production, you need redundancy, period. > > The accuracy is well defined and most importantly, > it is up to the application, yours and mine repectively. > We both use the accuracy that is most appropriate, and I won't > tell you not to use an internal type when it suits your application. > I expect the same of you. If all I have to do is to store kilometers > measuring straight lines inside the Netherlands in a relational database, > I known the datatype I can use, no matter what you think is best > in your application. This is a wrong approach of course. Because the accuracy of the data is *not* defined by the internal type used. And in any case the internal type is irrelevant to the data format used. Note that binary format has nothing to do with any internal format. > This has been discussed for years during the development of > XML Schema. What do you care about my accuracy as long as > I compute values from your data that are within application > bounds? 3.15 is as accurate as can be, and independent of > bits. Is it 3.14998751 or 3.150000? Floating-point numbers are intervals. Transporting them you should either use explicit bounds: [3.1499, 3.1600] or accuracy: 3.15 +/-0.0001. "As accurate as can be" is nice, but what if the application is a gateway, which reads 3.15 as accurate as 4 bytes float is and then sends it away? Two other applications communicating through it and using long long float will be quite perplexed... >> But, lack of readability is not in the ugly </> brackets. Tabulated data >> are readable because they are tabulated. > > This is the *View* in MVC, XML is about *data*. So there is no point in > talking about final looks, it is important to know how data will have > to be seen. For example, can you debug datastreams using the simplest > tools? 
Think of a log file of a concurrent application, processing data > from several heterogeneous input sources on the net. Really? A normal log file of our data acquisition and control system (3-4 nodes, 500-1000 channels each) is about 10-100 MB. A trace file of the same system is typically about 10-100 GB. The first is a highly dense binary format. The second is dense ASCII. Do you know any editor capable of loading 10 GB? In UltraEdit you need to wait about 10 minutes before it becomes ready to do anything. Now, you propose that I convert all that into XML? How much is a SCSI terabyte now? But more importantly each extra byte of rubbish you write is multiplied by the number of channels and their frequencies, and that costs system performance. >> That is: the names, the types and >> units are *factored* out to the table header, which allows the reader to >> concentrate on the *values*. Thus a table looks as: >> >> Distance [km] Temperature [°C] ... >> 3.15 29.0 ... >> 2.10 14.4 ... >> >> This is readable. > > This is irrelevant in data exchange. This is print. > >> To make difference more visible, consider bitmaps stored XML format. Would >> you be able to recognize a person's face in it? > > You do know about NOTATION? > I think it is very hard to find someone suggesting that we should recode > bitmap graphics formats as pixel tags. So an image is not print whereas a table is? -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
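A minimal sketch, not from the thread, of the gateway scenario: the text "3.15" is read into a 4-byte float and passed on, and the value that comes out is no longer the value that the text denotes. The second function reads the literal as an interval instead. The convention used for that interval (half a unit in the last digit written) is an assumption made here for illustration; it is not proposed by either poster.

```python
import struct
from decimal import Decimal


def through_float32(value):
    """Round-trip a 64-bit Python float through a 4-byte IEEE float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def decimal_interval(text):
    """Read a decimal literal as the interval implied by its last digit.

    Assumption: "3.15" is taken to mean 3.15 plus or minus half a unit in
    the last place written, that is [3.145, 3.155].
    """
    value = Decimal(text)
    half_unit = Decimal(1).scaleb(value.as_tuple().exponent) / 2
    return value - half_unit, value + half_unit


forwarded = through_float32(3.15)
low, high = decimal_interval("3.15")
print(forwarded == 3.15)                 # the gateway changed the value
print(low <= Decimal(forwarded) <= high)  # but it stayed inside the interval
```

The sketch shows both positions at once: the characters "3.15" are exact as text, while any binary representation of them is an approximation whose error is only bounded if bounds travel with the value.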
* Re: Data table text I/O package? 2005-06-21 12:44 ` Dmitry A. Kazakov @ 2005-06-21 21:01 ` Georg Bauhaus 2005-06-22 12:15 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-21 21:01 UTC (permalink / raw) Dmitry A. Kazakov wrote: Let me first guess that many here have their largely regular and homogeneous data in mind. I'm not talking about this. We went off from what to do if you don't have atomic, homogeneous, unambiguous data, sent around. 1) If you have a nice arrangement of exactly one set of array-like data of guaranteed quality, there is little to win by using XML. 2) Given a data format much like in (1), if you can pick up the phone and ring the other end of the data-sending connection, and say, 'Uhm, we have seen a slight change in the data text table, could you explain ...' or similar, you are privileged. 3) If you think that every bunch of data is sent in agreeable format, I could be telling you a few stories, though not in public. >>Back to step one: brackets in computer tables are not named, >>a computer doesn't have accountants' abilities in pattern matching >>when looking at rows and columns in a table. Again, I said XML is good >>for parsing of data if you cannot tell in advance that the data stream >>is totally free of errors. > > > No it is bad, because missing one bracket may lead to loss of the whole > data set. As a medium XML is as awful as readable. If you mean losing a closing tag, the parser can correct, though not always, and to different extents. If you mean somehow a '>' of a start tag is lost, then this is better or worse than in typical CSV or similar; a line end is a bracket, too. A separator is a two-way bracket, adding one more possibility for error and ambiguity. Imagine a CSV stream with _no_ record separators. (This is not fiction.) It is kind of efficient, you count fields.
However, if some data item contains a separator due to an error, you lose the whole stream, or use the wrong data without noticing this, in the worst case. >>As for whitespace, read Stroustrup's article on defining operator >>whitespace. > > > Delimiter /= whitespace. True, still Stroustrup demonstrates some effects we are discussing. >>You're missing the point: XML is *not* about rendering data. > > > Sorry, but the thread's subject reads "Data table text I/O package". Text = > rendered data. Notice that the thread title has I/O. I/O can mean pretty printing, and it can mean a reliable and robust data input-output facility, working well in the face of erroneous input. I was under the impression that we were discussing the latter, in particular I added: You better had such-and-such data if you want to reliably handle data in a sloppy setting, answering Marius Amado Alves IIRC. >>Logarithms are logarithms, not printed logarithms, this is a second >>step. Data formats for exchange or storage on the one hand and >>a print-out of some data on the other hand are two very different beasts, >>with different purposes. Consider the MVC paradigm. > > > This is obviously wrong, clearly print-outs serve both data exchange and > data storage when humans are involved. The point is whether print-outs serve *well* as a data exchange format, IN THE SITUATION described above, that is you do not know in advance that you will get the finest data. I doubt that this is the case in any but a few well defined situations. (I.e., you might meet it more often in contexts where Ada is used, or so I hope.) >>The accuracy is well defined and most importantly, >>it is up to the application, yours and mine respectively. > This is a wrong approach of course. There is no more accurate representation of 3.15 than the text "3.15", right under our noses. In a text data stream, tabular, XML, whatever. I appreciate that you care how I should read a "3.15" and store it.
Though, if my application uses decimal fixed point to represent money with 4 digits after the point, then you can add as many zeros as you like after .15, it's none of your business, it's the other application's business. I may not have your hardware, I may not have your rounding policies. I still find your data useful. > Because the accuracy of the data is > *not* defined by the internal type used. The accuracy of the data may not be defined at all, IN THE DATA STREAM. (Then again, some people may try, adding a schema.) >> 3.15 is as accurate as can be, and independent of >>bits. > Is it 3.14998751 or 3.150000? It is 3.15. This is data, text data. Not a computer floating point value, just data in textual external format, very flexible, and with the number of digits that you see. Do with it what _your_ application wants to do with it. This is what you get. "Is the light On, or Off?" -- "It is On." Data, "Off" or "On". No matter how any program represents On or Off, all that can be said about On or Off AS PARTS OF THE DATA IN THE TEXT STREAM is in the stream. Use a Boolean, or use an enumeration type, your choice. (A data stream does not in general define semantics. On the contrary, the standards talk about applications defining meaning, in the end.) > Floating-point numbers are intervals. > Transporting them you should either use explicit bounds: Who said floating point? I said "3.15", ('3', '.', '1', '5'). You do not have a solution to the problem of exactly representing R-eal values in a data transport context, do you? (And note that not every important number originates inside a computer's FPU.) > [3.1499, 3.1600] Well, someone will ask you, 'and what exactly is 3.1499?' on *our* machine? > Really? A normal log file [...] You argue from your log files, let me argue from a heterogeneity point of view. (BTW, I use text pipes and stream analysis to look at files of about this size.)
A server is running, you can look at the trace log, some parser fails, you want to know why. Say there are three lines of ';'-separated data, each at most 400 characters long. Ideally one appears right after the other. These lines are what they send you, no way to change that. Each field has varying length. Your job will be to associate matching fields. Because 370 characters don't fit in a single display line, you end up counting ';'s in each line and take notes, or c&p, to find the matching fields. Now consider separated key=value lines. They will be longer, but you can scan the line looking for the key strings. A big step up. XML isn't worse in my view. >>>That is: the names, the types and >>>units are *factored* out to the table header, which allows the reader to >>>concentrate on the *values*. Thus a table looks as: >>> >>>Distance [km] Temperature [°C] ... >>>3.15 29.0 ... >>>2.10 14.4 ... >>> >>>This is readable. Sure, I'm of course not saying a table isn't readable. I can even use XSL-FO or TeX to produce a table from XML, no problem. In fact, I have done this many times. A formatted table just isn't that robust. Consider the case where the headline gets lost. The missing redundancy will leave you with a puzzle, not a robust set of self describing text. >>This is irrelevant in data exchange. This is print. >> >> >>>To make difference more visible, consider bitmaps stored XML format. Would >>>you be able to recognize a person's face in it? I'm sure you know one can make text images, but I won't argue about this for the same reasons that have been explained for years when discussing SGML and data for which no parsing is desired. > So an image is not print whereas a table is? Now we are entering the realm of robust image encoding... No. Georg Bauhaus ^ permalink raw reply [flat|nested] 68+ messages in thread
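A minimal sketch, not from the thread, of the difference between counting ';' separators and scanning for keys. The field names and sample lines are made up for illustration. When one field is missing from a positional line, every later value is silently assigned to the wrong name; in a key=value line each value still carries its own name, and the missing key is visible as missing.

```python
def parse_positional(line, names):
    """Positional ';'-separated record: meaning depends only on field order."""
    return dict(zip(names, line.split(";")))


def parse_keyed(line):
    """'key=value' pairs separated by ';': each field names itself."""
    pairs = (field.split("=", 1) for field in line.split(";") if field)
    return {key.strip(): value.strip() for key, value in pairs}


names = ["id", "distance", "temperature"]

# The sender dropped the first field; the positional reader cannot tell.
print(parse_positional("3.15;29.0", names))

# The keyed reader returns the two fields it was given, correctly named.
print(parse_keyed("distance=3.15;temperature=29.0"))
```

The keyed line is longer, which is the redundancy being argued for; whether that overhead is acceptable for large, regular tables is the point the thread disagrees on.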
* Re: Data table text I/O package? 2005-06-21 21:01 ` Georg Bauhaus @ 2005-06-22 12:15 ` Dmitry A. Kazakov 2005-06-22 22:24 ` Georg Bauhaus 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-22 12:15 UTC (permalink / raw) On Tue, 21 Jun 2005 23:01:59 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: > > Let me first guess that many here have their largely > regular and homogenuous data in mind. I'm not talking > about this. We went off from what to do if you > don't have atomic, homogenous, unambigous data, sent > around. > > 1) If you have a nice arrangement of exactly one set of > array-like data of guaranteed quality, there is little > to win by using XML. OK, that is a big difference. Tables representing tree-like structures are awful. >> Sorry, but the thread's subject reads "Data table text I/O package". Text = >> rendered data. > > Notice that the thread title has I/O. I/O can mean pretty printing, > and it can mean a reliable and robust data input-output facility, > working well in the face of erroneous input. But for data exchange there are better techniques than XML. Even if you mean [far stretched] objects brokering and active agents performed over a stream or printable characters, even then I wouldn't take XML. >>>The accuracy is well defined and most importantly, >>>it is up to the application, yours and mine repectively. > >> This is a wrong approach of course. > > There is no more accurate representation of 3.15 than the text "3.15", > right under our noses. In a text data stream, tabular, XML, whatever. The text "3.15" represents what? Everything of course depends on the OSI layer we are talking about. (:-)) [...] > The accuracy of the data may not be defined at all, IN THE DATA > STREAM. (The again, some peoply may try, adding a schema.) Then you cannot talk about numbers transferred. You said "3.15" is a text. So let it be a text. "3.1 5" is also a text, as valid as "3.15" [at this level of abstraction.] 
BTW, again there are better ways to send texts than XML offers. >> [3.1499, 3.1600] > > Well, someone will ask you, 'and what exactly is 3.1499?' on > *our* machine? 3.1499 is the lower bound. So on your machine you can represent it by any number less or equal to 3.1499. You lose precision, but retain correctness. The true value is always within the bounds. There is still a problem, but a much lesser one. > Now consider separated key=value lines. They will be longer, > but you can scan the line looking for the key strings. A big > step up. XML isn't worse in my view. Unfortunately in our case it is not that simple. key=value does not help. The problem is that data need to be sorted and filtered using various criteria. In other words a value has more than one key. A relational DB would probably help, but to load that amount of data would take too long. So it ends up with a specialized tool chain, integrated diagnostic etc. BTW, 80% of that would probably be unnecessary if Ada were used! (:-)) But the customer wished otherwise... > A formatted table just isn't that robust. > Consider the case where the headline > gets lost. The missing redundancy will leave you with a > puzzle, not a robust set of self describing text. It is a bad idea to correct I/O errors using syntax anyway. Relevant errors are only ones made by humans. It is very unlikely that somebody would forget to read a table header [I don't talk about writing, because to write in XML is beyond anybody's capability anyway.] Humans are unbeatable in pattern recognition. This is the whole idea behind tables. Tab stops and lines are very easy patterns to detect and any error becomes immediately visible long before inspecting the table contents. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-22 12:15 ` Dmitry A. Kazakov @ 2005-06-22 22:24 ` Georg Bauhaus 2005-06-23 9:03 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-22 22:24 UTC (permalink / raw) Dmitry A. Kazakov wrote: > But for data exchange there are better techniques than XML. Such as ...? > Then you cannot talk about numbers transferred. You said "3.15" is a text. > So let it be a text. "3.1 5" is also a text, as valid as "3.15" [at this > level of abstraction.] How is a number transferred from one human to another? How do you explain the number three to a person who cannot see? > [sending two literals instead of one] > The true value is always within the bounds. There is still a > problem, but a much lesser one. I don't agree because you are actually introducing two intervals. And mine might be different from yours anyway. So why not use "3.15" as per the needs of the application? > Relevant errors are only ones made by humans. Ahh, no. Think of the last time you have been watching satellite TV with a strong cloud in the way. Where is your nice data stream... (No I'm not suggesting XML here, of course, but satellites aren't just used for MPEG streams. They can transmit XML data too.) > [I don't talk about writing, because to write > in XML is beyond anybody's capability anyway.] I suggest you have a look at oXygen or nXML mode for Emacs, or PSGML mode for Emacs. (Serna is also nice, though it is, uhm, stabilizing.) They all provide functions similar to a good programmer's IDE, analysing source text to help you with typing, inserting completions automatically, running the validator in the background etc. (In fact, XML lends itself well to syntax directed editing, whether you see the tags or not. :-)) > Humans are unbeatable in pattern recognition. This is the whole idea behind > tables.
Tab stops and lines are very easy patterns to detect and any error > becomes immediately visible long before inspecting the table contents. Right. So next time someone sends you an HTML table full of data, use this for a start, to get a nice plain text table. (It's verbose, I know :)

<?xml version='1.0' encoding='UTF-8'?>
<transform xmlns:html="http://www.w3.org/1999/xhtml"
           xmlns="http://www.w3.org/1999/XSL/Transform"
           version="1.0">

  <!-- Transforms an XHTML table into a plain text table.

       Input: An XHTML document containing tables.
       Output: A text document containing tables.

       The tables should contain small portions of text in their
       cells, for example matrix data or some tabular array of small
       strings. The default width of columns in this transformation
       is 8, see "pad". -->

  <output method="text"/>

  <param name="my-line-terminator">
    <!-- do not use system defaults for terminating lines.
         Use these characters instead. See new-line. -->
    <text>
</text>
  </param>

  <template match="/">
    <!-- insert some empty lines and then start the plain text table -->
    <for-each select="descendant::table">
      <call-template name="new-line">
        <with-param name="count">2</with-param>
      </call-template>
      <apply-templates select="tbody"/>
    </for-each>
  </template>

  <template match='tbody'>
    <!-- print the head, then a separating line, then the rows -->
    <apply-templates select="tr/th"/>
    <call-template name="new-line"/>
    <text>================================================</text>
    <for-each select="tr">
      <apply-templates select="td"/>
      <call-template name="new-line"/>
    </for-each>
  </template>

  <template match="td | th">
    <!-- place the text content inside a cell padded with blanks -->
    <call-template name="pad">
      <with-param name="characters">
        <apply-templates/>
      </with-param>
    </call-template>
  </template>

  <template match="td//*">
    <!-- inside a table cell, discard everything but text -->
    <value-of select="text()"/>
  </template>

  <template name='pad'>
    <param name="characters"/>
    <!-- the text to which padding blanks might be added -->
    <param name="default-width">8</param>
    <!-- default column width measured in number of characters -->
    <variable name="fill"
              select="$default-width - string-length($characters)"/>
    <choose>
      <when test="$fill &lt; 0">
        <message>Please choose a wider display</message>
      </when>
      <otherwise>
        <value-of select="$characters"/>
        <value-of select="substring('        ', 1, $fill)"/>
      </otherwise>
    </choose>
  </template>

  <template name="new-line">
    <!-- 1 or more lines will be terminated -->
    <param name="count">1</param>
    <if test="$count > 0">
      <value-of select="$my-line-terminator"/>
      <call-template name="new-line">
        <with-param name="count">
          <value-of select="$count - 1"/>
        </with-param>
      </call-template>
    </if>
  </template>

</transform>

^ permalink raw reply [flat|nested] 68+ messages in thread
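For comparison, the same idea (cells of an HTML table padded to a fixed column width of 8) can be sketched in a general-purpose language. This is only a rough sketch in Python, not a translation of the stylesheet: the names `TableToText` and `table_to_text` are made up for illustration, no separator line is printed, and an over-wide cell raises an error where the stylesheet emits a message.

```python
from html.parser import HTMLParser


class TableToText(HTMLParser):
    """Collect the text of td/th cells, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between rows) is dropped.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_text(html, width=8):
    """Return the table cells as plain text, each padded to `width` characters."""
    parser = TableToText()
    parser.feed(html)
    parser.close()
    lines = []
    for row in parser.rows:
        for cell in row:
            if len(cell) > width:
                raise ValueError(f"cell {cell!r} is wider than {width}")
        lines.append("".join(cell.ljust(width) for cell in row).rstrip())
    return "\n".join(lines)
```

Calling `table_to_text` on a document with one header row and one data row returns two lines of text whose columns start at multiples of 8 characters.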
* Re: Data table text I/O package? 2005-06-22 22:24 ` Georg Bauhaus @ 2005-06-23 9:03 ` Dmitry A. Kazakov 2005-06-23 9:47 ` Georg Bauhaus 2005-06-23 14:16 ` Marc A. Criley 0 siblings, 2 replies; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-23 9:03 UTC (permalink / raw) On Thu, 23 Jun 2005 00:24:30 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: > >> But for data exchange there are better techniques than XML. > > Such as ...? Take any middleware available. >> Then you cannot talk about numbers transferred. You said "3.15" is a text. >> So let it be a text. "3.1 5" is also a text, as valid as "3.15" [at this >> level of abstraction.] > > How is a number transfered from one human to another? As a description of some [usually trivial] problem. The solution of that problem conveys the number. Most people are very bad in memorizing raw numbers or even in recognizing them from an acoustic stream. > How do you explain the number three to a person who > cannot see? The number second to two. (:-)) >> [sending two literals instead of one] >> The true value is always within the bounds. There is still a >> problem, but a much lesser one. > > I don't agree because you are actually introducing two intervals. No, it is still one interval that contains the true number. This is the way floating-point arithmetic functions. The result of a+b is c, such that [c'Pred, c'Succ] contains the exact result. [*] The problem is that 'Pred and 'Succ are of course machine dependent. So when you send c you should also convey the range. Depending on that the receiver should chose an appropriate internal representation for c, which might require a "true" interval. > And mine might be different from yours anyway. So why not use "3.15" > as per the needs of the application? It is no problem. But then in your XML format it should rather be: <model="float", dimension="km", digits="4", value="3.15"> This might look close to Ada's ideology, but I would rather say it does not. 
It smells much of structural types equivalence, I don't like it. What if the application expects a fixed point number? Would you convert? It is too slippery... BTW, I'm not arguing against the idea of using type descriptions in protocols. It is a great idea. I think Ada will definitely confront this issue some day, because presently Ada is completely unable to handle it. But XML isn't a right answer here. >> Relevant errors are only ones made by humans. > > Ahh, no. Think of the last time you have been watching satellite > TV with a strong cloud in the way. Where is your nice data stream... > (No I'm not suggesting XML here, of course, but satellites aren't > just used for MPEG streams. They can transmit XML data too.) Never use UDP, and you'll have no problems with that! (:-)) But seriously, do you really want to collapse all OSI levels into one big mess and make an application responsible for error correction? >> [I don't talk about writing, because to write >> in XML is beyond anybody's capability anyway.] > > I suggest you have a look at oXygen or nXML mode for Emacs, > or PSGML mode for Emacs. (Serna is also nice, > though it is, uhm, stabilizing.) > They all provide functions similar to a good programmer's IDE, > analysing source text to help you with typing, inserting > completions automatically, running the validator in the > background etc. That's the point, to write something as a table, you need nothing more elaborated than a notepad editor... Actually I enjoy XML in postings. It is an excellent spam flag. Any post which isn't plain text immediately goes into the recycle bin. (:-)) >> Humans are unbeatable in pattern recognition. This is whole idea behind >> tables. Tab stops and lines are very easy patterns to detect and any error >> becomes immediately visible long before inspecting the table contents. > > Right. So next time someone sends you an HTML table full of data, > use this for a start, to get a nice plain text table. 
(It's verbose, > I know :) [...] And why this nightmare cannot be written in Ada? ---------- * Depending on how the machine rounds a narrower interval can be used. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
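The claim that the computed sum c lies close enough to the exact result for [c'Pred, c'Succ] to contain it can be checked numerically. The sketch below uses Python instead of Ada, with `math.nextafter` (Python 3.9 or later) standing in for the 'Pred and 'Succ attributes, and assumes finite IEEE double values; `enclosing_interval` is a name invented for this example.

```python
import math
from fractions import Fraction


def enclosing_interval(a, b):
    """Return (pred, c, succ) for c = a + b computed in floating point."""
    c = a + b
    return math.nextafter(c, -math.inf), c, math.nextafter(c, math.inf)


a, b = 0.1, 0.2
lo, c, hi = enclosing_interval(a, b)
# Fraction converts each stored double exactly, so this sum has no rounding.
exact = Fraction(a) + Fraction(b)
assert Fraction(lo) <= exact <= Fraction(hi)
```

A sender that transmits `lo` and `hi` along with `c` gives the receiver what it needs to choose a representation that still brackets the exact value.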
* Re: Data table text I/O package? 2005-06-23 9:03 ` Dmitry A. Kazakov @ 2005-06-23 9:47 ` Georg Bauhaus 2005-06-23 10:34 ` Dmitry A. Kazakov 2005-06-23 14:16 ` Marc A. Criley 1 sibling, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-23 9:47 UTC (permalink / raw) Dmitry A. Kazakov wrote: > On Thu, 23 Jun 2005 00:24:30 +0200, Georg Bauhaus wrote: > > >>Dmitry A. Kazakov wrote: >> >> >>>But for data exchange there are better techniques than XML. >> >>Such as ...? > > > Take any middleware available. Uhm, yes, such as ...? > No, it is still one interval that contains the true number. This is the way > floating-point arithmetic functions. The result of a+b is c, such that > [c'Pred, c'Succ] contains the exact result. [*] The problem is that 'Pred > and 'Succ are of course machine dependent. So when you send c you should > also convey the range. Depending on that the receiver should chose an > appropriate internal representation for c, which might require a "true" > interval. This amounts to specifying the precise details of a floating point computation in a data stream; a rather special case I think. Take for example prices, guesstimates of future price changes, insurance rates, direction of tomorrow's winds, day temperature, and the like. It seems quite enough to transmit one fpt number literal in these cases. > Never use UDP, and you'll have no problems with that! (:-)) But seriously, > do you really want to collapse all OSI levels into one big mess No. > and make an > application responsible for error correction? How can hard/software at OSI levels guarantee correct data? As soon as there is something real in there (i.e., real software, real hardware, real interference, humans, ...), degradation is possible. > [...] [XSL transformation] > And why this nightmare cannot be written in Ada? Because it would explode even more. 
Reading hint: if you see <apply-templates select="descendant::tbody"/> forget about the mosaic impression conveyed by <-="::"/> for a moment and read it as Apply templates to the selection of descendant tbodies. Or read the above aloud. I believe that XSL might look nightmarish to those who expect few characters on a PL text page, but if you actually read the text, it is quite natural :-) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-23 9:47 ` Georg Bauhaus @ 2005-06-23 10:34 ` Dmitry A. Kazakov 2005-06-23 11:37 ` Georg Bauhaus 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-23 10:34 UTC (permalink / raw) On Thu, 23 Jun 2005 11:47:10 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: >> On Thu, 23 Jun 2005 00:24:30 +0200, Georg Bauhaus wrote: >> >>>Dmitry A. Kazakov wrote: >>> >>>>But for data exchange there are better techniques than XML. >>> >>>Such as ...? >> >> Take any middleware available. > > Uhm, yes, such as ...? CORBA, OPC (hmm), RPC etc, Ada.Streams after all. >> No, it is still one interval that contains the true number. This is the way >> floating-point arithmetic functions. The result of a+b is c, such that >> [c'Pred, c'Succ] contains the exact result. [*] The problem is that 'Pred >> and 'Succ are of course machine dependent. So when you send c you should >> also convey the range. Depending on that the receiver should chose an >> appropriate internal representation for c, which might require a "true" >> interval. > > This amounts to specifying the precise details of a floating > point computation in a data stream; a rather special case I think. > Take for example prices, guesstimates of future price changes, Those are fixed point with the problems of their own. You are bound to a definite radix, because all values need to be exact. > insurance rates, direction of tomorrow's winds, day temperature, > and the like. These are fuzzy numbers. They are characterized by a distribution of possible values. You need more than one value here. In natural languages we are using "approximately 3.15", "between 3 and 4", "close to 5" etc. > It seems quite enough to transmit one fpt > number literal in these cases. You mean a decimal literal for the case where a fixed-point decimal number is expected. (:-)) >> Never use UDP, and you'll have no problems with that! 
(:-)) But seriously, >> do you really want to collapse all OSI levels into one big mess > > No. > >> and make an >> application responsible for error correction? > > How can hard/software at OSI levels guarantee correct data? > As soon as there is something real in there (i.e., real software, > real hardware, real interference, humans, ...), degradation > is possible. That is true, but error correction codes (Hamming etc) are *known* to be optimal. This is a hard mathematical fact. So any bandwidth available should be invested there rather than at the application level in fancy things like </> brackets. Further we should never mix this class of errors with ones made by humans while writing and reading texts. These errors have completely different nature. The first ones should be eliminated on the transport level. The application level should consider all data free of any errors of this kind. >> [...] > [XSL transformation] > >> And why this nightmare cannot be written in Ada? > > Because it would explode even more. I don't believe it! (:-)) > Reading hint: if you see > > <apply-templates select="descendant::tbody"/> > > forget about the mosaic impression conveyed by <-="::"/> for > a moment and read it as > > Apply templates to the selection of descendant tbodies. > > Or read the above aloud. I believe that XSL might look nightmarish > to those who expect few characters on a PL text page, but > if you actually read the text, it is quite natural :-) I think this is an essence of misunderstanding of the readability issue. You cannot control your perception. It isn't programmable. You can only train yourself to ignore the loathing your "hardware" generates while seeing XML! No matter how good you could be in that, it will cost you many extra "CPU cycles" in any case. I prefer to spare my cycles. So I vote for Ada and plain tables. (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
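To make the transport-level argument concrete, here is a small sketch of the kind of code Dmitry mentions: a Hamming(7,4) code, which adds three parity bits to four data bits and can correct any single flipped bit. It is written in Python for brevity; the function names are illustrative, and it demonstrates only the mechanism, not any optimality claim.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

The application above this layer never sees the flipped bit, which is the separation of concerns argued for in the post.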
* Re: Data table text I/O package? 2005-06-23 10:34 ` Dmitry A. Kazakov @ 2005-06-23 11:37 ` Georg Bauhaus 2005-06-23 12:59 ` Dmitry A. Kazakov 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-06-23 11:37 UTC (permalink / raw) Dmitry A. Kazakov wrote: > On Thu, 23 Jun 2005 11:47:10 +0200, Georg Bauhaus wrote: >>>>Dmitry A. Kazakov wrote: >>>>>But for data exchange there are better techniques than XML. >>>> >>>>Such as ...? >>> >>>Take any middleware available. >> >>Uhm, yes, such as ...? > > > CORBA, OPC (hmm), RPC etc, Ada.Streams after all. How does one debug data passed using RPC etc, for example when the P could not be called due to some data error? > That is true, but error correction codes (Hamming etc) are *known* to be > optimal. This is a hard mathematical fact. So any bandwidth available > should be invested there rather than at the application level in fancy > things like </> brackets. Here is an interesting point. SGML comes with SDIF (not the digital sound thing, but SGML Document Interchange Format). SDIF by default is kind of defined in ASN.1. So there are actually two layers of data... >>[XSL transformation] >> >> >>>And why this nightmare cannot be written in Ada? >> >>Because it would explode even more. > > > I don't believe it! (:-)) Some XSL "primitives" are quite powerful, when compared to Ada "primitives". > So I vote for > Ada and plain tables. (:-)) For your eyes' pleasure yes, for the robust transmission of non-regular data in a heterogenous uncontrolled setting, no. ;) ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-23 11:37 ` Georg Bauhaus @ 2005-06-23 12:59 ` Dmitry A. Kazakov 0 siblings, 0 replies; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-06-23 12:59 UTC (permalink / raw) On Thu, 23 Jun 2005 13:37:20 +0200, Georg Bauhaus wrote: > Dmitry A. Kazakov wrote: >> CORBA, OPC (hmm), RPC etc, Ada.Streams after all. > > How does one debug data passed using RPC etc, for example > when the P could not be called due to some data error? There should be no difference to debugging conventional calls, objects (depending on the paradigm.) I would readily agree that available middlewares aren't that good. We have our own, just because CORBA and OPC don't fulfill our requirements. Returning to the point. In our middleware you can monitor each byte sent or received for any hardware interface. It is an integrated functionality. You can also see a summary of how pieces of these raw data were interpreted as the application level values: velocity, temperature etc. Viewing some complex hierarchical structures was never requested. Maybe because there aren't any (:-)), but largely because when it comes to debugging, timings and relationships *between* values are of much greater importance. Typically there is some periodic activity that involves values x1, x2, ..., xN, say, each 10ms. The range checking subsystem reports that x34 violates its bounds. That happens once per hour [rather an easy case, in one real case it was once per 3 months.] So, you turn logging on, and try to analyze the system state around these points. That's nasty! Protocol errors are trivial compared to that. Honestly, I cannot remember any difficult case, though we are supporting many quite strange devices. It might sound like an anecdote, but in one case we indeed used print-outs read from a serial port! There was no other way to access the device data, than through its serial printer. We were lucky that the printer wasn't used in the graphic mode... 
(:-)) The situation would change if more complex (OO) structures were involved. But I don't think that XML would be an answer. I would prefer an OO-ish approach where each object knows how to construct itself out of a segment of raw data. > Some XSL "primitives" are quite powerful, when compared > to Ada "primitives". There is nothing more powerful than a call to the procedure Do_It! (:-)) -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-23 9:03 ` Dmitry A. Kazakov 2005-06-23 9:47 ` Georg Bauhaus @ 2005-06-23 14:16 ` Marc A. Criley 1 sibling, 0 replies; 68+ messages in thread From: Marc A. Criley @ 2005-06-23 14:16 UTC (permalink / raw) Dmitry A. Kazakov wrote: > On Thu, 23 Jun 2005 00:24:30 +0200, Georg Bauhaus wrote: > >>Dmitry A. Kazakov wrote: >> >>>But for data exchange there are better techniques than XML. >> >>Such as ...? > > Take any middleware available. This exchange seems analogous to: "For information exchange there are better techniques than conversing in English." "Such as ...?" "A Telephone." :-) At some point, somewhere, the data has to be put on the wire using some defined format--XML, application specific raw binary, eXternal Data Representation (XDR), etc. Don't conflate the mechanism of data transfer, be it CORBA, RPC, Ada Streams, whatever, with the representation of the data. Marc A. Criley www.mckae.com ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-20 18:54 ` Dmitry A. Kazakov 2005-06-21 9:24 ` Georg Bauhaus @ 2005-06-25 16:38 ` Simon Wright 1 sibling, 0 replies; 68+ messages in thread From: Simon Wright @ 2005-06-25 16:38 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes: > <Distance km='3.15'/ > > > XML adds here nothing, but a huge readability loss. More likely to write <Distance unit="km">3.15</Distance> Anyway, XML is a means for programs to communicate, not people .. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 14:01 ` Georg Bauhaus 2005-06-16 12:27 ` Dmitry A. Kazakov @ 2005-06-16 13:26 ` Marius Amado Alves 2005-06-16 18:10 ` Georg Bauhaus 1 sibling, 1 reply; 68+ messages in thread From: Marius Amado Alves @ 2005-06-16 13:26 UTC (permalink / raw) To: comp.lang.ada On 16 Jun 2005, at 15:01, Georg Bauhaus wrote: [a lot on data formats] Georg, there was an example earlier (tabs simulated by 3 spaces here):

Gene ID   p-value   Expression-level   Description   Human cromosome
GE29031   0.04539   245.45   Cyclin-B1   17

So it's "really" atomic. Your arguments are valid, but do not apply to this case. Incidentally, this would generate

   type Record_Type is
      record
         Gene_ID          : String_Ptr;
         P_Value          : Float;
         Expression_Level : Float;
         Description      : String_Ptr;
         Human_Cromosome  : Integer;
      end record;

^ permalink raw reply [flat|nested] 68+ messages in thread
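How such a record could be derived from a header line and one sample row can be sketched as follows. This is a guess at the mapping, written in Python for brevity: the helper names are invented, and the rule "Integer if it parses as an integer, Float if it parses as a number, String_Ptr otherwise" is an assumption based on the single example above.

```python
import re


def ada_identifier(column_name):
    """Turn a column title such as 'p-value' into an identifier like P_Value."""
    words = re.split(r"[^A-Za-z0-9]+", column_name.strip())
    return "_".join(w[:1].upper() + w[1:] for w in words if w)


def ada_type(sample):
    """Guess a field type from one sample value (an assumption, not a rule)."""
    for convert, name in ((int, "Integer"), (float, "Float")):
        try:
            convert(sample)
            return name
        except ValueError:
            pass
    return "String_Ptr"


def record_fields(header_line, sample_line, sep="\t"):
    """Pair each column name with the type guessed from the sample row."""
    names = [ada_identifier(n) for n in header_line.split(sep)]
    types = [ada_type(v) for v in sample_line.split(sep)]
    if len(names) != len(types):
        raise ValueError("header and sample row differ in length")
    return list(zip(names, types))


def record_declaration(fields):
    """Render the fields as the text of an Ada record type declaration."""
    width = max(len(name) for name, _ in fields)
    body = [f"      {name.ljust(width)} : {kind};" for name, kind in fields]
    return "\n".join(["type Record_Type is", "   record", *body, "   end record;"])
```

Fed the tab-separated form of the two lines in the example, `record_fields` yields the five fields of the record shown, in the same order.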
* Re: Data table text I/O package? 2005-06-16 13:26 ` Marius Amado Alves @ 2005-06-16 18:10 ` Georg Bauhaus 0 siblings, 0 replies; 68+ messages in thread From: Georg Bauhaus @ 2005-06-16 18:10 UTC (permalink / raw) Marius Amado Alves wrote: > > On 16 Jun 2005, at 15:01, Georg Bauhaus wrote: > > [a lot on data formats] > > Georg, there was an example earlier (tabs simulated by 3 spaces here): > > Gene ID p-value Expression-level Description Human cromosome > GE29031 0.04539 245.45 Cyclin-B1 17 I did notice this example. > So it's "really" atomic. Your arguments are valid, but do not apply to > this case. That's hard to tell from this example. There is no TAB inside the values, oK, but that doesn't make data atomic in an application sense -- only the application knows. And this is precisely a point of a well designed XML format: you have a chance of naming the beginning and end of a value. It is up to the designer of the document type to choose a suitable level of detail for marking up the structure (and type) of both values and collections of values. (Elements with attributes, subtrees of the document, using domain specific notation in text.) And let us hope that the text pipeline will leave the Tab characters alone :-) Georg ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-16 9:55 ` Jacob Sparre Andersen 2005-06-16 10:53 ` Marius Amado Alves @ 2005-06-30 3:02 ` Randy Brukardt 2005-06-30 18:43 ` Jacob Sparre Andersen 2005-06-30 19:24 ` Björn Persson 1 sibling, 2 replies; 68+ messages in thread From: Randy Brukardt @ 2005-06-30 3:02 UTC (permalink / raw) "Jacob Sparre Andersen" <sparre@nbi.dk> wrote in message news:m2k6ku8w2s.fsf@hugin.crs4.it... > Randy Brukardt wrote: > > > I may be dense, but isn't this the purpose of XML? If so, why > > reinvent the wheel? > > The purpose of XML is to be _the_ universal file format. > > a) I don't want a universal file format. > > b) I don't believe in a universal file format. > > c) XML is (almost) less readable than a binary file my purposes. > > d) I'm _not_ going to switch away from tabulator separated tables for > purposes, where tabulator separated tables are a sensible > representation of the data in textual form. > > > (I personally think XML is way overused, more because it *can* be > > used than that it is worthwhile for the application. But this seems > > to be exactly the application that it was designed for. You'll end > > up with something like XML eventually anyway, why not start with > > it?) > > I'm afraid you completely misunderstood my problem. It is not a > matter of a selecting a file format. It is the matter of > automagically generating code for reading and writing that file > format. Not at all. We like to say around here that you need to describe what your needs are, because often the program you are trying to write isn't appropriate for Ada. We usually use that for people trying to write C in Ada, but it should apply to everyone. :-) For program-to-program communication, there really are only two sensible options. If both ends are under your control, then using a binary format (with versioning and error detection if needed) is preferable, because it has the least overhead and there is no need for data conversion. 
This certainly is the only option with reasonable performance. And this is usually the appropriate choice. OTOH, if the performance of the connection isn't critical, then using a well-known standard format that already has needed tools for it seems like the best option. Even if you don't currently need to allow access by other systems, you're leaving the door open for future programs outside your system to use the data. The cases that are neither of these and thus would make sense to use some internal, non-portable text format are essentially non-existent. Note that human readability of program-to-program data is a non-issue. Indeed, it is a mistake to try to bring that into the equation, as it adds a huge amount of overhead to the task. I've always used agile methods for debugging such data: if, in fact, I need to examine such a data stream, I'll write a program to display it. But I don't worry about that until/unless the need arises. It often does not arise, and even when it does, it's often not necessary to be able to display everything -- and it's often better to write a monitor for an interesting condition than filling a disk with 10 GB of text! So, all in all, I think you're trying to solve the wrong problem (finding a way to write a specific file format), rather than using an appropriate file format for Ada programs (usually binary). But, as a friend of mine likes to say, "do what you want, because you will anyway!". :-) Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
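What "a binary format (with versioning and error detection if needed)" might look like can be sketched briefly. The layout below is purely hypothetical (a 4-byte signature, a version number, a payload length, rows of two doubles, and a trailing CRC-32), and it is written in Python with the standard `struct` and `zlib` modules rather than with Ada streams.

```python
import struct
import zlib

MAGIC = b"TBL1"                    # hypothetical file signature
VERSION = 1                        # bumped whenever the row layout changes
HEADER = struct.Struct("<4sHI")    # signature, version, payload length
ROW = struct.Struct("<dd")         # two little-endian IEEE doubles per row


def pack_rows(rows):
    """Serialize rows of (x, y) floats with a header and a CRC-32 trailer."""
    payload = b"".join(ROW.pack(x, y) for x, y in rows)
    body = HEADER.pack(MAGIC, VERSION, len(payload)) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_rows(blob):
    """Verify checksum, signature and version, then return the rows."""
    if len(blob) < HEADER.size + 4:
        raise ValueError("truncated data")
    body = blob[:-4]
    (crc,) = struct.unpack("<I", blob[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    magic, version, length = HEADER.unpack_from(body)
    if magic != MAGIC or version != VERSION:
        raise ValueError("unsupported format or version")
    payload = body[HEADER.size:]
    if len(payload) != length or length % ROW.size:
        raise ValueError("bad payload length")
    return [ROW.unpack_from(payload, i) for i in range(0, length, ROW.size)]
```

Fixing the byte order and field widths in the layout is what keeps such a file portable between platforms, which is the concern raised in the original question.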
* Re: Data table text I/O package? 2005-06-30 3:02 ` Randy Brukardt @ 2005-06-30 18:43 ` Jacob Sparre Andersen 2005-07-01 1:22 ` Randy Brukardt 2005-06-30 19:24 ` Björn Persson 1 sibling, 1 reply; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-30 18:43 UTC (permalink / raw) Randy Brukardt wrote: > "Jacob Sparre Andersen" <sparre@nbi.dk> wrote in message > news:m2k6ku8w2s.fsf@hugin.crs4.it... > > Randy Brukardt wrote: > > > > > I may be dense, but isn't this the purpose of XML? If so, why > > > reinvent the wheel? > > > > The purpose of XML is to be _the_ universal file format. > > > > a) I don't want a universal file format. > > > > b) I don't believe in a universal file format. > > > > c) XML is (almost) less readable than a binary file my purposes. > > > > d) I'm _not_ going to switch away from tabulator separated tables > > for purposes, where tabulator separated tables are a sensible > > representation of the data in textual form. > > > > > (I personally think XML is way overused, more because it *can* > > > be used than that it is worthwhile for the application. But this > > > seems to be exactly the application that it was designed > > > for. You'll end up with something like XML eventually anyway, > > > why not start with it?) > > > > I'm afraid you completely misunderstood my problem. It is not a > > matter of a selecting a file format. It is the matter of > > automagically generating code for reading and writing that file > > format. > > Not at all. We like to say around here that you need to describe > what your needs are, because often the program you are trying to > write isn't appropriate for Ada. We usually use that for people > trying to write C in Ada, but it should apply to everyone. :-) I thought I had specified my needs. But in case I forgot: a) A format for storing experimental data in tabular form. b) A format I easily can manipulate with my standard Unix toolbox. 
c) A format I easily can read and get an overview of (sections of) the data. d) A format that easily can be imported into programs I'm not in control of. (concrete examples are Gnuplot, R, OOo Calc and Excel) e) A format I easily can read and write from my own programs. Tabulator separated text files handle this quite fine (although OOo and Excel users have to be careful about their number format settings when they import the files). > For program-to-program communication, there really are only two > sensible options. If both ends are under your control, then using a > binary format (with versioning and error detection if needed) is > preferable, because it has the least overhead and there is no need > for data conversion. Yes. But this doesn't handle b), c) and d). > OTOH, if the performance of the connection isn't critical, then > using a well-known standard format that already has needed tools for > it seems like the best option. Even if you don't currently need to > allow access by other systems, you're leaving the door open for > future programs outside your system to use the data. And which formats, besides tabulator separated text files, handle the requirements? XML doesn't handle b), c), d) and e). > The cases that are neither of these and thus would make sense to use > some internal, non-portable text format are essentially > non-existent. I think I have one of these "essentially non-existent" cases. And almost everything I do seems to be one of those cases. > Note that human readability of program-to-program data is a > non-issue. You're apparently working in a very different area than I am. Almost all data going from one program to another should also be available in a human-readable format. My work is to look at data, not to program. The programs are just written to process the data from one form into another form - which hopefully can teach us something new and interesting. 
> Indeed, it is a mistake to try to bring that into the equation, as > it adds a huge amount of overhead to the task. I've always used > agile methods for debugging such data: if, in fact, I need to > examine such a data stream, I'm write a program to display it. But I > don't worry about that until/unless the need arises. It seems that you're a programmer and not a researcher. I am (almost) always interested in the data. I have yet to run into a case where I wasn't interested in seeing the output of a program. > It often does not arise, and even when it does, it's often not > necessary to be able to display everything -- and it's often better > to write a monitor for an interesting condition than filling a disk > with 10 GB of text! I would spend all my time writing monitors that way. > So, all in all, I think you're trying to solve the wrong problem > (finding a way to write a specific file format), rather than using > an appropriate file format for Ada programs (usually binary). It may be a long-time bad habit to use tabulator separated text files for (intermediate) analysis results from experiments, but I haven't found a convincing argument yet. -- If I could auto-generate the monitor and the conversion programs to the programs I interact with, then I might be convinced, but I would still have to hack some type checking on top of Ada.Sequential_IO. And the program for auto-generating the export to Gnuplot would practically be identical to the one I asked for initially anyway. > But, as a friend of mine likes to say, "do what you want, because > you will anyway!". :-) A clever friend. :-) Jacob -- "Hungh. You see! More bear. Yellow snow is always dead give-away." ^ permalink raw reply [flat|nested] 68+ messages in thread
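The behaviour asked for at the start of the thread (write the header automatically when a file is created, check it when a file is opened) is small enough to sketch. The version below is in Python rather than Ada, and `write_table`, `read_table` and `HeaderMismatch` are names made up for this illustration; a generated Ada package would presumably fix the columns and types at compile time instead of passing them as arguments.

```python
import io


class HeaderMismatch(Exception):
    """Raised when a file's header line is not the expected one."""


def write_table(stream, columns, rows):
    """Write the header line first, then one tab-separated line per row."""
    stream.write("\t".join(columns) + "\n")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match header")
        stream.write("\t".join(str(value) for value in row) + "\n")


def read_table(stream, columns, converters):
    """Check the header immediately, then return an iterator of typed rows."""
    header = stream.readline().rstrip("\n").split("\t")
    if header != list(columns):
        raise HeaderMismatch(f"expected {list(columns)}, found {header}")

    def typed_rows():
        for number, line in enumerate(stream, start=2):
            fields = line.rstrip("\n").split("\t")
            if len(fields) != len(columns):
                raise ValueError(f"line {number}: expected {len(columns)} fields")
            yield tuple(conv(field) for conv, field in zip(converters, fields))

    return typed_rows()


COLUMNS = ("Gene ID", "p-value", "Human cromosome")
buffer = io.StringIO()
write_table(buffer, COLUMNS, [("GE29031", 0.04539, 17)])
buffer.seek(0)
rows = list(read_table(buffer, COLUMNS, (str, float, int)))
```

The files stay plain tab-separated text, so they remain usable with the usual Unix tools and importable into Gnuplot, R or a spreadsheet.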
* Re: Data table text I/O package? 2005-06-30 18:43 ` Jacob Sparre Andersen @ 2005-07-01 1:22 ` Randy Brukardt 2005-07-01 3:01 ` Alexander E. Kopilovich 0 siblings, 1 reply; 68+ messages in thread From: Randy Brukardt @ 2005-07-01 1:22 UTC (permalink / raw) "Jacob Sparre Andersen" <sparre@nbi.dk> wrote in message news:m2br5nd6sk.fsf@hugin.crs4.it... replying to me: ... > I thought I had specified my needs. But in case I forgot: > > a) A format for storing experimental data in tabular form. > > b) A format I easily can manipulate with my standard Unix toolbox. > > c) A format I easily can read and get an overview of (sections of) > the data. > > d) A format that easily can be imported into programs I'm not in > control of. (concrete examples are Gnuplot, R, OOo Calc and > Excel) > > e) A format I easily can read and write from my own programs. > > Tabulator separated text files handle this quite fine (although OOo > and Excel users have to be careful about their number format settings > when they import the files). Perhaps. But it's your "needs" that I question. (b) for instance doesn't really buy anything, as you can't do any *real* data transformations that way. Sure, you can add or delete a column, but that's trivial to code in the unusual case that you need it. And in about the same time that a text processing tool could do that job. As far as (c) goes, I don't believe that mixing human output with data storage/transmission is a good idea. Period. So that leaves us with (a), (d), and (e). [Certainly real requirements.] > > For program-to-program communication, there really are only two > > sensible options. If both ends are under your control, then using a > > binary format (with versioning and error detection if needed) is > > preferable, because it has the least overhead and there is no need > > for data conversion. > > Yes. But this doesn't handle b), c) and d). Of course it doesn't handle (d) [because (d) violates the premise]. 
And as mentioned above, I don't think (b) and (c) should even be goals. > > OTOH, if the performance of the connection isn't critical, then > > using a well-known standard format that already has needed tools for > > it seems like the best option. Even if you don't currently need to > > allow access by other systems, you're leaving the door open for > > future programs outside your system to use the data. > > And which formats, besides tabulator separated text files, handle the > requirements? XML doesn't handle b), c), d) and e). Certainly (e) is handled by using tools like XMLOUT. (It can't be much harder to write than HTML, which is trivial.) I'd be surprised if most modern tools that can handle CSV couldn't handle a similar XML file. (Certainly Excel can read XML files.) And I don't want to sound like a broken record about (b) and (c). > > The cases that are neither of these and thus would make sense to use > > some internal, non-portable text format are essentially non-existent. > > I think I have one of these "essentially non-existent" cases. And > almost everything I do seems to be one of those cases. Could be, but I think it is because you have a bogus set of requirements. > > Note that human readability of program-to-program data is a > > non-issue. > > You're apparently working in a very different area than I am. Almost > all data going from one program to another should also be available in > a human-readable format. My work is to look at data, not to program. > The programs are just written to process the data from one form into > another form - which hopefully can teach us something new and > interesting. I hate to split hairs, but I think your job is to analyze data, not to "look at data". If there is enough data to make sense processing it with a program, there is little point at looking at it manually. You had mentioned a large data set (50 MB?) earlier; I hope you're looking at the analysis, not at the data. 
I hardly ever look at raw web logs (the closest analog I have); I use a program and look at the results of its analysis. Truthfully, if what you described above is true, you probably ought to be programming in Perl (ugh) or Python. Because Ada's text processing is its weak link, and it makes little sense to write any significant amount of text processing code in Ada. (I say that, despite the fact that I do exactly that -- but that's because I use Ada for everything that I can't do with a simple batch file.) > > Indeed, it is a mistake to try to bring that into the equation, as > > it adds a huge amount of overhead to the task. I've always used > > agile methods for debugging such data: if, in fact, I need to > > examine such a data stream, I'd write a program to display it. But I > > don't worry about that until/unless the need arises. > > It seems that you're a programmer and not a researcher. I am (almost) > always interested in the data. I have yet to run into a case where I > wasn't interested in seeing the output of a program. Sure, but the output of the program is an analysis of the data, not some raw (and huge) data stream. > > It often does not arise, and even when it does, it's often not > > necessary to be able to display everything -- and it's often better > > to write a monitor for an interesting condition than filling a disk > > with 10 GB of text! > > I would spend all my time writing monitors that way. Yes, formatting input/results usefully for humans is the hard part of programming. Documentation, GUI input/output, and log files (that is, the stuff for humans) take up approximately 4 times as much time to create as the actual filter for our spam filter. For our compiler, (which needs little documentation or specialized I/O), it was always much less, but it still is a significant part (perhaps as much as half) of the effort. The other tools (CLAW, the web log analyzer, the web server, etc.) 
all have fallen somewhere in between those extremes -- but that's the real job that we get paid for (because it's not fun and not interesting -- someone will do the fun and interesting stuff for free, but not the hard work, most of the time anyway). Like I said before, your mileage may differ. If you're stuck with lame tools that can't process a sane data format, it might make sense to use some junk text format to match it. (I'd rather get better tools, but I realize that isn't always possible.) But I'd hardly expect any help in creating such stuff. Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-07-01 1:22 ` Randy Brukardt @ 2005-07-01 3:01 ` Alexander E. Kopilovich 2005-07-01 5:59 ` Jeffrey Carter 2005-07-02 1:54 ` Randy Brukardt 0 siblings, 2 replies; 68+ messages in thread From: Alexander E. Kopilovich @ 2005-07-01 3:01 UTC (permalink / raw) To: comp.lang.ada Randy Brukardt wrote: > If there is enough data to make sense processing it with a > program, there is little point at looking at it manually. It would be fine if you said "then I see" instead of "there is". How do you know what is there, as you aren't a scientist, but a software engineer, regardless of your professional skills in your domain? You obviously don't like data very much, but for a scientist that scientific data (often including raw experimental data) is one of the most valuable things. It certainly deserves attentive look (at least, from time to time), not just a bureaucratic "analysis". > Truthfully, if what you described above is true, you probably ought to be > programming in Perl (ugh) or Python. Because Ada's text processing is its > weak link, and it makes little sense to write any significant amount of text > processing code in Ada. It would be interesting to hear reply from Robert Dewar to this opinion about text processing capabilities of Ada -:) . Actually, serious text processing is perfectly possible with Ada, and in fact Ada is more suitable for it than Perl. Ada is unsuitable for quick scripting (especially by novice), but it is true for all application domains, it is true for numerical computations as well as for text processing. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-07-01 3:01 ` Alexander E. Kopilovich @ 2005-07-01 5:59 ` Jeffrey Carter 2005-07-02 1:54 ` Randy Brukardt 1 sibling, 0 replies; 68+ messages in thread From: Jeffrey Carter @ 2005-07-01 5:59 UTC (permalink / raw) Alexander E. Kopilovich wrote: > > You obviously don't like data very much, but for a scientist that scientific > data (often including raw experimental data) is one of the most valuable > things. It certainly deserves attentive look (at least, from time to time), > not just a bureaucratic "analysis". I developed SW for a researcher for a while once. What I found interesting was that what I, as a SW engineer, wanted to hide was usually what he wanted to see. -- Jeff Carter "We call your door-opening request a silly thing." Monty Python & the Holy Grail 17 ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-07-01 3:01 ` Alexander E. Kopilovich 2005-07-01 5:59 ` Jeffrey Carter @ 2005-07-02 1:54 ` Randy Brukardt 2005-07-02 10:24 ` Dmitry A. Kazakov 1 sibling, 1 reply; 68+ messages in thread From: Randy Brukardt @ 2005-07-02 1:54 UTC (permalink / raw) "Alexander E. Kopilovich" <aek@VB1162.spb.edu> wrote in message news:mailman.122.1120188122.17633.comp.lang.ada@ada-france.org... > Randy Brukardt wrote: ... > You obviously don't like data very much, but for a scientist that scientific > data (often including raw experimental data) is one of the most valuable > things. It certainly deserves attentive look (at least, from time to time), > not just a bureaucratic "analysis". I'm not speaking about data in general (that would be silly), but about it in the context of Ada programming. (Or have you forgotten the purpose of this newsgroup?) It makes perfect sense to look at raw data if you don't know what to analyze for and you need to find some patterns to give some insight. I suppose there also is an amount of idle curiosity, too (certainly that happens to me in these sorts of circumstances -- that's why I might look at web logs or the results of a game analysis). But I hardly think it makes sense to design software based on idle curiosity. And if you don't know what you are analyzing for, Ada is hardly the programming language to be using. (Unless you're a hard-core Ada nut [a category that I qualify in]; but then you hardly need advice from this group.) You need a much more dynamic language, perhaps even those Unix filters. It's quite possible that Jacob shouldn't be using Ada at all for his tasks, and thus he's trying to fit a square peg into a round hole. > > Truthfully, if what you described above is true, you probably ought to be > > programming in Perl (ugh) or Python. Because Ada's text processing is its > > weak link, and it makes little sense to write any significant amount of text > > processing code in Ada. 
> > It would be interesting to hear reply from Robert Dewar to this opinion about > text processing capabilities of Ada -:) . Actually, serious text processing > is perfectly possible with Ada, and in fact Ada is more suitable for it than > Perl. Ada is unsuitable for quick scripting (especially by novice), but it > is true for all application domains, it is true for numerical computations > as well as for text processing . Certainly, serious text processing is *possible* in Ada. (My Trash Finder spam filter certainly is an extensive text processing application!!) And of course, the benefits of Ada do apply (particularly type checking and good runtime checks). But, Ada text processing code is just painful to write, and it's quite hard to read. That's true no matter whether you use plain strings or unbounded strings. One of my original intents with TF was to show a good example of Ada code to non-Ada programmers. But the code got so long-winded that I gave up on that idea fairly early on. Moreover, the standard routines in Ada.Strings.Unbounded were just not fast enough in some cases, and I had to write special routines that understand the internal representation of an unbounded string. Yuck. (Ada 200Y will help this a bit, at least the searching has been improved.) For instance, there isn't a way to search for an unbounded string in another unbounded string. [TF puts pretty much everything into lists of unbounded strings, because it's impossible to predict what sort of string lengths items will have.] 
You have to use To_String to convert to a regular string, which is ugly (especially without use clauses): if Ada.Strings.Unbounded.Index (Ada.Strings.Unbounded.Translate (Current.Line, Ada.Strings.Maps.Constants.Lower_Case_Map), Ada.Strings.Unbounded.To_String (Pattern.Line)) /= 0 then Even with a use clause for Ada.Strings.Unbounded (in which case you can't have one for Ada.Strings.Fixed, else things get very ambiguous): if Index (Translate (Current.Line, Ada.Strings.Maps.Constants.Lower_Case_Map), To_String (Pattern.Line)) /= 0 then So, it's possible to write this sort of code in Ada, and get decent performance, too, but the result isn't particularly readable, understandable, or maintainable. It's a lot easier to write this in Perl, although the result would probably be a bit harder to maintain. Not having used Python, I can't say for sure, but I'd certainly hope that it would be easier than this to write (and read!) something simple like a case-insensitive search for a pattern. If I had used regular strings, the complexity would have been about the same, just in different places. (In hindsight, I probably wouldn't have used unbounded strings at all, they just didn't buy enough simplification.) So, I stand by my statements. There is more than 8,000 lines of text processing code in TF, all of which looks like this. And all I can say is that I certainly hope that there is a better way somewhere, even though such a way isn't really possible for Ada. Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
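A minimal sketch of the case-insensitive search discussed in the message above, assuming the pattern can be lower-cased once up front. It passes Lower_Case_Map through the Mapping parameter of Ada.Strings.Unbounded.Index rather than calling Translate on each line first. The names Search_Demo and Contains_Ignoring_Case are hypothetical and are not taken from Trash Finder; whether this is faster in practice is not something the thread establishes.

```ada
with Ada.Characters.Handling;
with Ada.Strings.Maps.Constants;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Search_Demo is

   --  Hypothetical helper (an assumption for illustration, not a
   --  standard library routine): True if Pattern occurs in Source,
   --  ignoring letter case.
   function Contains_Ignoring_Case
     (Source, Pattern : Unbounded_String) return Boolean
   is
      --  Lower-case the pattern once; Index maps the source characters.
      Lower_Pattern : constant String :=
        Ada.Characters.Handling.To_Lower (To_String (Pattern));
   begin
      --  Index raises Pattern_Error for an empty pattern, so this
      --  sketch treats an empty pattern as trivially found.
      if Lower_Pattern'Length = 0 then
         return True;
      end if;

      return Index
        (Source  => Source,
         Pattern => Lower_Pattern,
         Mapping => Ada.Strings.Maps.Constants.Lower_Case_Map) /= 0;
   end Contains_Ignoring_Case;

   Line : constant Unbounded_String :=
     To_Unbounded_String ("Subject: Cheap WATCHES here");

begin
   Ada.Text_IO.Put_Line
     (Boolean'Image
        (Contains_Ignoring_Case (Line, To_Unbounded_String ("Watches"))));
   Ada.Text_IO.Put_Line
     (Boolean'Image
        (Contains_Ignoring_Case (Line, To_Unbounded_String ("rolex"))));
end Search_Demo;
```

Wrapping the call in a small function keeps the long package names out of the calling code, which is the readability complaint raised above; the conversion with To_String is still there, only hidden.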
* Re: Data table text I/O package? 2005-07-02 1:54 ` Randy Brukardt @ 2005-07-02 10:24 ` Dmitry A. Kazakov 2005-07-06 22:04 ` Randy Brukardt 0 siblings, 1 reply; 68+ messages in thread From: Dmitry A. Kazakov @ 2005-07-02 10:24 UTC (permalink / raw) On Fri, 1 Jul 2005 20:54:18 -0500, Randy Brukardt wrote: > And if you don't know what you are analyzing for, Ada is hardly the > programming language to be using. (Unless you're a hard-core Ada nut [a > category that I qualify in]; but then you hardly need advice from this > group.) You need a much more dynamic language, perhaps even those Unix > filters. I think it depends. I have a quite opposite experience. I'm lazy and always start to write a UNIX script. After a couple of hours fighting with that mess I note (usually too late) that to write it in Ada (or even in ANSI C) would take half as long. > For instance, there isn't a way to search for an unbounded string in another > unbounded string. [TF puts pretty much everything into lists of unbounded > strings, because it's impossible to predict what sort of string lengths > items will have.] You have to use To_String to convert to a regular string, > which is ugly (especially without use clauses): Yes. All built-in string types should have a common ancestor. Ada.Strings.Unbounded was and remains an ugly hack. > if Ada.Strings.Unbounded.Index (Ada.Strings.Unbounded.Translate > (Current.Line, Ada.Strings.Maps.Constants.Lower_Case_Map), > Ada.Strings.Unbounded.To_String (Pattern.Line)) /= 0 then I'm using a table of tokens instead. The string is matched against the table for a longest token that matches. And I always use anchored search. I tend to do everything in one pass and Ada fits here well. -- Regards, Dmitry A. Kazakov http://www.dmitry-kazakov.de ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-07-02 10:24 ` Dmitry A. Kazakov @ 2005-07-06 22:04 ` Randy Brukardt 0 siblings, 0 replies; 68+ messages in thread From: Randy Brukardt @ 2005-07-06 22:04 UTC (permalink / raw) "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message news:1vlfc01w9jkzj$.k4rp7yhtuoj3$.dlg@40tude.net... ... > > if Ada.Strings.Unbounded.Index (Ada.Strings.Unbounded.Translate > > (Current.Line, Ada.Strings.Maps.Constants.Lower_Case_Map), > > Ada.Strings.Unbounded.To_String (Pattern.Line)) /= 0 then > > I'm using a table of tokens instead. The string is matched against the > table for a longest token that matches. And I always use anchored search. I > tend to do everything in one pass and Ada fits here well. That's actually what the above is doing: a single match in a list of patterns. It usually is inside of a loop. Some of the matching uses a special Match_Start routine which is cheaper than Index; but of course that only works because I know how an unbounded string works, and Ada lets me create a child to use that information. I'm not certain what you mean by an "anchored search", but I don't expect that to work too well on e-mail (which is just a mass of text). I do think it wouldn't have been any harder to have used type String items here instead of Unbounded_String. The only reason I used Unbounded_String was to see how easy or hard it really was to use that package - I won't make that mistake again. Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-30 3:02 ` Randy Brukardt 2005-06-30 18:43 ` Jacob Sparre Andersen @ 2005-06-30 19:24 ` Björn Persson 2005-07-01 0:54 ` Randy Brukardt 1 sibling, 1 reply; 68+ messages in thread From: Björn Persson @ 2005-06-30 19:24 UTC (permalink / raw) Randy Brukardt wrote: > OTOH, if the performance of the connection isn't critical, then using a > well-known standard format that already has needed tools for it seems like > the best option. I consider text/tab-separated-values a standard format. Whether it's well-known is debatable. The definition is here: http://www.iana.org/assignments/media-types/text/tab-separated-values I'm not going to try to decide whether it's the right choice for Jacob. -- Björn Persson PGP key A88682FD omb jor ers @sv ge. r o.b n.p son eri nu ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-30 19:24 ` Björn Persson @ 2005-07-01 0:54 ` Randy Brukardt 2005-07-01 21:36 ` TSV and CSV Björn Persson 2005-07-02 0:07 ` Data table text I/O package? Georg Bauhaus 0 siblings, 2 replies; 68+ messages in thread From: Randy Brukardt @ 2005-07-01 0:54 UTC (permalink / raw) "Björn Persson" <spam-away@nowhere.nil> wrote in message news:bGXwe.141215$dP1.494536@newsc.telia.net... > Randy Brukardt wrote: > > OTOH, if the performance of the connection isn't critical, then using a > > well-known standard format that already has needed tools for it seems like > > the best option. > > I consider text/tab-separated-values a standard format. Whether it's > well-known is debatable. The definition is here: > http://www.iana.org/assignments/media-types/text/tab-separated-values Never heard of this one. It seems like the world's worst choice for a file format, since the first thing any decent text tool will do is discard any tabs. I'm amazed that anyone would actually standardize such junk. I'm much more familiar with CSV files (which also seemed pretty silly to me, but I kinda think the entire data-in-a-text file thing is pretty silly). Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
* TSV and CSV 2005-07-01 0:54 ` Randy Brukardt @ 2005-07-01 21:36 ` Björn Persson 2005-07-01 22:08 ` Martin Dowie 2005-07-02 0:07 ` Data table text I/O package? Georg Bauhaus 1 sibling, 1 reply; 68+ messages in thread From: Björn Persson @ 2005-07-01 21:36 UTC (permalink / raw) Randy Brukardt wrote: > "Björn Persson" <spam-away@nowhere.nil> wrote in message > news:bGXwe.141215$dP1.494536@newsc.telia.net... >>I consider text/tab-separated-values a standard format. Whether it's >>well-known is debatable. The definition is here: >>http://www.iana.org/assignments/media-types/text/tab-separated-values > > Never heard of this one. It seems like the world's worst choice for a file > format, since the first thing any decent text tool will do is discard any > tabs. Well, such a tool isn't the right tool for manipulating TSV files. As always, use the right tool for the job. The only tool I can think of right now that discards tabs is a web browser, and that's when it thinks the content type is text/html. > I'm much more familiar with CSV files CSV works as long as there are no commas in the data fields, but commas can occur in text fields, and comma is also the decimal sign in large parts of the world (and preferred in ISO documents, I hear). TSV works in these cases, as there's usually no need to allow tabs in the fields. (If you find that you want tabs inside the data fields, then it's probably time to look for a more sophisticated file format - perhaps XML based.) -- Björn Persson PGP key A88682FD omb jor ers @sv ge. r o.b n.p son eri nu ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: TSV and CSV 2005-07-01 21:36 ` TSV and CSV Björn Persson @ 2005-07-01 22:08 ` Martin Dowie 2005-07-02 0:05 ` Georg Bauhaus 0 siblings, 1 reply; 68+ messages in thread From: Martin Dowie @ 2005-07-01 22:08 UTC (permalink / raw) Björn Persson wrote: > CSV works as long as there are no commas in the data fields, but commas > can occur in text fields, and comma is also the decimal sign in large > parts of the world (and preferred in ISO documents, I hear). TSV works > in these cases, as there's usually no need to allow tabs in the fields. > (If you find that you want tabs inside the data fields, then it's > probably time to look for a more sophisticated file format - perhaps XML > based.) If you want commas in the data fields, simply wrap the data fields in quotes, e.g. "1","alpha, beta, gamma","foo" -- Martin ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: TSV and CSV 2005-07-01 22:08 ` Martin Dowie @ 2005-07-02 0:05 ` Georg Bauhaus 2005-07-02 1:10 ` Randy Brukardt 0 siblings, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-07-02 0:05 UTC (permalink / raw) Martin Dowie wrote: > If you want commas in the data fields, simply wrap the data fields in > quotes, e.g. > > "1","alpha, beta, gamma","foo" You can't be seriously suggesting this? "If you want quotes in the fields..." -- Georg ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: TSV and CSV 2005-07-02 0:05 ` Georg Bauhaus @ 2005-07-02 1:10 ` Randy Brukardt 2005-07-02 1:20 ` Ed 2005-07-03 9:08 ` Georg Bauhaus 0 siblings, 2 replies; 68+ messages in thread From: Randy Brukardt @ 2005-07-02 1:10 UTC (permalink / raw) "Georg Bauhaus" <bauhaus@futureapps.de> wrote in message news:42c5e46e$0$10818$9b4e6d93@newsread4.arcor-online.net... > Martin Dowie wrote: > > > If you want commas in the data fields, simply wrap the data fields in > > quotes, e.g. > > > > "1","alpha, beta, gamma","foo" > > You can't be seriously suggesting this? Of course he's seriously suggesting this, it's how these files work. > "If you want quotes in the fields..." You escape them, I forget how. (There is a standard for CSV files.) Same as you would do in Ada or any other language. There's no place that you can put quotes without some sort of escape (I usually use the Ada syntax if I'm inventing my own) if you plan to read them afterwards. But in any case, if your data is at all complex, you are going to need complex reading/writing to handle it. The original query was about tables of numbers, not random sequences of characters. Pretty much any format can be made to work for that. Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: TSV and CSV 2005-07-02 1:10 ` Randy Brukardt @ 2005-07-02 1:20 ` Ed 2005-07-03 9:08 ` Georg Bauhaus 1 sibling, 0 replies; 68+ messages in thread From: Ed @ 2005-07-02 1:20 UTC (permalink / raw) On 01/07/2005 6:10 PM, Randy Brukardt wrote: > You escape them, I forget how. (There is a standard for CSV files.) Same as http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm has a good description of the file format. Ed. ^ permalink raw reply [flat|nested] 68+ messages in thread
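A minimal sketch of the quoting question raised above, assuming the common CSV convention in which a field is wrapped in double quotes and each embedded double quote is written twice (the same doubling rule Ada uses inside string literals). The names CSV_Quote_Demo and Quote_Field are hypothetical, and CSV dialects that escape quotes differently would need a different rule.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure CSV_Quote_Demo is

   --  Hypothetical helper: wrap Field in double quotes and double any
   --  quote character inside it (assumed convention, see lead-in).
   function Quote_Field (Field : String) return String is
      Result : Unbounded_String := To_Unbounded_String ("""");
   begin
      for I in Field'Range loop
         if Field (I) = '"' then
            Append (Result, """""");     --  two quote characters
         else
            Append (Result, Field (I));
         end if;
      end loop;
      Append (Result, '"');              --  closing quote
      return To_String (Result);
   end Quote_Field;

begin
   --  The third field contains the text: say "foo"
   Ada.Text_IO.Put_Line
     (Quote_Field ("1") & ","
      & Quote_Field ("alpha, beta, gamma") & ","
      & Quote_Field ("say ""foo"""));
end CSV_Quote_Demo;
```

Writing is the easy half, as the thread points out: a reader has to undo the same doubling and must agree with the writer on the convention, which is where undocumented variants cause trouble.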
* Re: TSV and CSV 2005-07-02 1:10 ` Randy Brukardt 2005-07-02 1:20 ` Ed @ 2005-07-03 9:08 ` Georg Bauhaus 1 sibling, 0 replies; 68+ messages in thread From: Georg Bauhaus @ 2005-07-03 9:08 UTC (permalink / raw) Randy Brukardt wrote: > "Georg Bauhaus" <bauhaus@futureapps.de> wrote in message > news:42c5e46e$0$10818$9b4e6d93@newsread4.arcor-online.net... > >>Martin Dowie wrote: >> >> >>>If you want commas in the data fields, simply wrap the data fields in >>>quotes, e.g. >>> >>>"1","alpha, beta, gamma","foo" >> >>You can't be seriously suggesting this? I was addressing the "simply" in the sentence above about wrapping the data fields, because it only shifts the problem to the next escaping level, which you then have mentioned. It's there where the problems usually start, "simply do this, and, uhm that, and, oh, I forgot you should...". Bottom line: We don't have standardised CSV document types. Even considering the CSV description Ed has mentioned, with all its buts and don'ts which speak for themselves... In fact, they repeat some of the input to the XML design discussion, which led to a standard. Just to make sure, it is easy to think of a (one) set of rules for producing good CSV data. However, like Ada programs, producing them is far less important than using them later, from a consumption point of view. At least if you care about the recipients at all. When reading CSV data, you can think of more than one set of rules, in sharp contrast to just one when producing CSV data. One average CSV stream we read contains no line breaks, probably for reasons of transmission speed. As if this weren't enough (excuse: "simply" count fields) some fields can *contain* non-escaped separators (excuse: "simply" inspect context to find out whether the comma is actually a separator...). It is rare that I have been given a CSV file/stream to process together with a clear description. (So maybe I'm biased.) The streams have almost always had some hack or some "cleverness" in them. 
I believe that a standardised data format helps, in practice, to reduce undocumented hacks and cleverness. One such format type can be based on XML. > Of course he's seriously suggesting this, it's how these files work. This is how these files *should* work, ideally. As you can see on http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm#FileFormat, you still have to climb up a decision tree and visit this or that branch in order to parse CSV data in a reliable fashion, unless you know exactly how they are produced. All in all you end with: > Pretty much any format can be > made to work for that. ...provided you sort of reinvent the markup rules and wheels. And disregard your own advice to use a really standardised format (in applications not all under your control.) ;-) -- Georg ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-07-01 0:54 ` Randy Brukardt 2005-07-01 21:36 ` TSV and CSV Björn Persson @ 2005-07-02 0:07 ` Georg Bauhaus 2005-07-02 1:21 ` Randy Brukardt 1 sibling, 1 reply; 68+ messages in thread From: Georg Bauhaus @ 2005-07-02 0:07 UTC (permalink / raw) Randy Brukardt wrote: > I > kinda think the entire data-in-a-text file thing is pretty silly). It's not that silly when your data is actually text, though. -- Georg ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-07-02 0:07 ` Data table text I/O package? Georg Bauhaus @ 2005-07-02 1:21 ` Randy Brukardt 0 siblings, 0 replies; 68+ messages in thread From: Randy Brukardt @ 2005-07-02 1:21 UTC (permalink / raw) "Georg Bauhaus" <bauhaus@futureapps.de> wrote in message news:42c5e4de$0$10818$9b4e6d93@newsread4.arcor-online.net... > Randy Brukardt wrote: > > > I > > kinda think the entire data-in-a-text file thing is pretty silly). > > It's not that silly when your data is actually text, though. That would be one of the weird special cases. It's not that unusual to have some text components in your data, but for *all* of the data to be text is pretty rare. For instance, in a web log, the URL is text, but most of the other components (access time, result code, IP address) are really values of one sort or another. They can be represented as text, but any significant manipulation ought to be done on the value (strongly typed, if you're using Ada). And I generally don't think of raw text (like an e-mail message) as data. Text and data are usually different things. You're welcome to view text as data if you want, but that's not at all the sorts of applications that I've been thinking about here. Randy. ^ permalink raw reply [flat|nested] 68+ messages in thread
[parent not found: <20050615141236.GA90053@pvv.org>]
* Re: Data table text I/O package? [not found] ` <20050615141236.GA90053@pvv.org> @ 2005-06-15 15:40 ` Marius Amado Alves 2005-06-15 19:18 ` Oliver Kellogg [not found] ` <7adf1648bb99ca2bb4055ed8e6e381f4@netcabo.pt> 1 sibling, 1 reply; 68+ messages in thread From: Marius Amado Alves @ 2005-06-15 15:40 UTC (permalink / raw) To: comp.lang.ada On 15 Jun 2005, at 15:12, Preben Randhol wrote: > Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : >> The important part is to have the checking of the headers and the >> generation of Put_Line and Get_Line procedures automated based on a >> record type (and not too much more). Since I need records (for type >> checking) and not just simple arrays, I can't manage with a generic >> package, but have to put some code generation into the system (or can >> I play some tricks with streams?). (Didn't get this message from Jacob.) You have to generate code. I did that in the past. Ada records or types cannot be created dynamically. Ada is not reflexive. Open Ada is, but I haven't tried it yet. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 15:40 ` Marius Amado Alves @ 2005-06-15 19:18 ` Oliver Kellogg 2005-06-17 9:02 ` Jacob Sparre Andersen 0 siblings, 1 reply; 68+ messages in thread From: Oliver Kellogg @ 2005-06-15 19:18 UTC (permalink / raw) Marius Amado Alves <amado.alves@netcabo.pt> wrote: > > On 15 Jun 2005, at 15:12, Preben Randhol wrote: > >> Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : >>> The important part is to have the checking of the headers and the >>> generation of Put_Line and Get_Line procedures automated based on a >>> record type (and not too much more). Since I need records (for type >>> checking) and not just simple arrays, I can't manage with a generic >>> package, but have to put some code generation into the system (or can >>> I play some tricks with streams?). > > (Didn't get this message from Jacob.) > > You have to generate code. I did that in the past. Ada records or types > cannot be created dynamically. Ada is not reflexive. Open Ada is, but I > haven't tried it yet. > Auto_Text_IO ? http://www.toadmail.com/~ada_wizard/ada/auto_text_io.html HTH ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 19:18 ` Oliver Kellogg @ 2005-06-17 9:02 ` Jacob Sparre Andersen 0 siblings, 0 replies; 68+ messages in thread From: Jacob Sparre Andersen @ 2005-06-17 9:02 UTC (permalink / raw) Oliver Kellogg wrote: > Marius Amado Alves wrote: > >> Jacob Sparre Andersen <sparre@nbi.dk> wrote on 15/06/2005 (13:38) : > >>> The important part is to have the checking of the headers and > >>> the generation of Put_Line and Get_Line procedures automated > >>> based on a record type (and not too much more). Since I need > >>> records (for type checking) and not just simple arrays, I can't > >>> manage with a generic package, but have to put some code > >>> generation into the system (or can I play some tricks with > >>> streams?). > > You have to generate code. Yes. > Auto_Text_IO ? > > http://www.toadmail.com/~ada_wizard/ada/auto_text_io.html I will have to hack it a bit for my purpose, but it looks like the tool I need. Now I just have to work around the lack of ASIS on Debian/PPC, but that's relatively trivial. Still, it would be nice if somebody could explain (and solve) this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=117788 Jacob -- "What fun is it being "cool" if you can't wear a sombrero?" ^ permalink raw reply [flat|nested] 68+ messages in thread
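A minimal sketch of the header check asked for in this subthread, assuming the column names are already available as a list. It does not derive them from a record type; as noted above, that step needs code generation such as Auto_Text_IO. All names here (Header_Demo, Column_Names, Header_Line, Header_Matches, and the column labels) are hypothetical placeholders.

```ada
with Ada.Characters.Latin_1;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Header_Demo is

   HT : constant Character := Ada.Characters.Latin_1.HT;

   type Column_Names is array (Positive range <>) of Unbounded_String;

   --  Build the header line to write when a file is created:
   --  the column names joined by tabulator characters.
   function Header_Line (Columns : Column_Names) return String is
      Result : Unbounded_String;
   begin
      for I in Columns'Range loop
         if I > Columns'First then
            Append (Result, HT);
         end if;
         Append (Result, Columns (I));
      end loop;
      return To_String (Result);
   end Header_Line;

   --  Check the first line of an opened file against the expected header.
   function Header_Matches
     (First_Line : String; Columns : Column_Names) return Boolean is
   begin
      return First_Line = Header_Line (Columns);
   end Header_Matches;

   --  Placeholder column names, chosen only for this example.
   Columns : constant Column_Names :=
     (To_Unbounded_String ("time"),
      To_Unbounded_String ("voltage"),
      To_Unbounded_String ("current"));

begin
   Ada.Text_IO.Put_Line
     (Boolean'Image
        (Header_Matches
           ("time" & HT & "voltage" & HT & "current", Columns)));
   Ada.Text_IO.Put_Line
     (Boolean'Image (Header_Matches ("time" & HT & "current", Columns)));
end Header_Demo;
```

Using one function for both writing and checking keeps the two in step: a file created with Header_Line always passes Header_Matches for the same column list. The first call above should report TRUE and the second FALSE.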
[parent not found: <7adf1648bb99ca2bb4055ed8e6e381f4@netcabo.pt>]
* Re: Data table text I/O package? [not found] ` <7adf1648bb99ca2bb4055ed8e6e381f4@netcabo.pt> @ 2005-06-15 15:46 ` Preben Randhol [not found] ` <20050615154640.GA1921@pvv.org> 1 sibling, 0 replies; 68+ messages in thread From: Preben Randhol @ 2005-06-15 15:46 UTC (permalink / raw) To: comp.lang.ada On Wed, Jun 15, 2005 at 04:40:53PM +0100, Marius Amado Alves wrote: > You have to generate code. I did that in the past. Ada records or types > cannot be created dynamically. Ada is not reflexive. Open Ada is, but I > haven't tried it yet. Open Ada? -- Preben Randhol -------------- http://www.pvv.org/~randhol/Ada95 -- "For me, Ada95 puts back the joy in programming." ^ permalink raw reply [flat|nested] 68+ messages in thread
[parent not found: <20050615154640.GA1921@pvv.org>]
* Re: Data table text I/O package? [not found] ` <20050615154640.GA1921@pvv.org> @ 2005-06-15 16:14 ` Marius Amado Alves [not found] ` <f04ccd7efd67fe197cc14cda89340779@netcabo.pt> 1 sibling, 0 replies; 68+ messages in thread From: Marius Amado Alves @ 2005-06-15 16:14 UTC (permalink / raw) To: comp.lang.ada > Open Ada? Sorry, OpenAda. Originally from the USAF, now from Rational I think. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? [not found] ` <f04ccd7efd67fe197cc14cda89340779@netcabo.pt> @ 2005-06-15 16:20 ` Preben Randhol 0 siblings, 0 replies; 68+ messages in thread From: Preben Randhol @ 2005-06-15 16:20 UTC (permalink / raw) To: Marius Amado Alves; +Cc: comp.lang.ada On Wed, Jun 15, 2005 at 05:14:39PM +0100, Marius Amado Alves wrote: > >Open Ada? > > Sorry, OpenAda. Originally from the USAF, now from Rational I think. I see. More info here: http://www.cs.york.ac.uk/ftpdir/reports/YCS-2000-331.pdf -- Preben Randhol -------------- http://www.pvv.org/~randhol/Ada95 -- "For me, Ada95 puts back the joy in programming." ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 9:57 Data table text I/O package? Jacob Sparre Andersen 2005-06-15 11:43 ` Preben Randhol @ 2005-06-15 19:30 ` Simon Wright 2005-06-15 22:40 ` Lionel Draghi 2 siblings, 0 replies; 68+ messages in thread From: Simon Wright @ 2005-06-15 19:30 UTC (permalink / raw) Jacob Sparre Andersen <sparre@nbi.dk> writes: > I do quite a lot of work, where I manipulate data stored in > (tabulator separated) text files [1]. Does anybody know of a > package which handles the inclusion of a header line with the column > names in an elegant way? It should preferably include automated > testing that the header is correct, when a file is opened, and > automated creation of the header when a file is created. I think this sounds like an ASIS application. You might look at Stephe Leake's Auto_Text_IO .. Google finds it easily enough. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: Data table text I/O package? 2005-06-15 9:57 Data table text I/O package? Jacob Sparre Andersen 2005-06-15 11:43 ` Preben Randhol 2005-06-15 19:30 ` Simon Wright @ 2005-06-15 22:40 ` Lionel Draghi 2 siblings, 0 replies; 68+ messages in thread From: Lionel Draghi @ 2005-06-15 22:40 UTC (permalink / raw) Jacob Sparre Andersen wrote: > I do quite a lot of work, where I manipulate data stored in (tabulator > separated) text files [1]. Does anybody know of a package which > handles the inclusion of a header line with the column names in an > elegant way? Not an answer, but you may grab some idea from ploticus input formats: http://ploticus.sourceforge.net/doc/dataformat.html And maybe some idea from the C code... -- Lionel Draghi ^ permalink raw reply [flat|nested] 68+ messages in thread