comp.lang.ada
From: Jacob Sparre Andersen <sparre@nbi.dk>
Subject: Re: Data table text I/O package?
Date: 30 Jun 2005 20:43:39 +0200
Message-ID: <m2br5nd6sk.fsf@hugin.crs4.it>
In-Reply-To: -pGdnVJqme2I_V7fRVn-qA@megapath.net

Randy Brukardt wrote:
> "Jacob Sparre Andersen" <sparre@nbi.dk> wrote in message
> news:m2k6ku8w2s.fsf@hugin.crs4.it...
> > Randy Brukardt wrote:
> >
> > > I may be dense, but isn't this the purpose of XML? If so, why
> > > reinvent the wheel?
> >
> > The purpose of XML is to be _the_ universal file format.
> >
> >  a) I don't want a universal file format.
> >
> >  b) I don't believe in a universal file format.
> >
> >  c) XML is (almost) less readable than a binary file for my purposes.
> >
> >  d) I'm _not_ going to switch away from tabulator separated tables
> >     for purposes, where tabulator separated tables are a sensible
> >     representation of the data in textual form.
> >
> > > (I personally think XML is way overused, more because it *can*
> > > be used than that it is worthwhile for the application. But this
> > > seems to be exactly the application that it was designed
> > > for. You'll end up with something like XML eventually anyway,
> > > why not start with it?)
> >
> > I'm afraid you completely misunderstood my problem.  It is not a
> > matter of selecting a file format.  It is the matter of
> > automagically generating code for reading and writing that file
> > format.
> 
> Not at all. We like to say around here that you need to describe
> what your needs are, because often the program you are trying to
> write isn't appropriate for Ada. We usually use that for people
> trying to write C in Ada, but it should apply to everyone. :-)

I thought I had specified my needs.  But in case I forgot:

 a) A format for storing experimental data in tabular form.

 b) A format I easily can manipulate with my standard Unix toolbox.

 c) A format I easily can read and get an overview of (sections of)
    the data.

 d) A format that easily can be imported into programs I'm not in
    control of.  (concrete examples are Gnuplot, R, OOo Calc and
    Excel)

 e) A format I easily can read and write from my own programs.

Tabulator separated text files handle this quite well (although OOo
and Excel users have to be careful about their number format settings
when they import the files).
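
As an illustration of requirement e), here is a minimal sketch of writing
and reading such a table with per-column type conversion.  It is in Python
rather than Ada purely for brevity.  The column names, the `COLUMNS`
layout and the helper names are hypothetical and not from this thread,
and fields are assumed never to contain tabs or newlines.

```python
import io

# Hypothetical column layout for one experiment: (name, converter).
COLUMNS = [("time_s", float), ("count", int), ("label", str)]


def write_table(rows, stream):
    """Write a header line, then one tab-separated line per row."""
    stream.write("\t".join(name for name, _ in COLUMNS) + "\n")
    for row in rows:
        stream.write("\t".join(str(value) for value in row) + "\n")


def read_table(stream):
    """Read the table back, converting each field to its declared type."""
    header = stream.readline().rstrip("\n").split("\t")
    expected = [name for name, _ in COLUMNS]
    if header != expected:
        raise ValueError(f"unexpected header: {header!r}")
    rows = []
    for line in stream:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(f"wrong number of fields: {fields!r}")
        # int() and float() raise ValueError on malformed fields,
        # which gives a crude form of type checking on input.
        rows.append(tuple(convert(text)
                          for (_, convert), text in zip(COLUMNS, fields)))
    return rows


def demo_round_trip():
    """Write two rows to an in-memory buffer and read them back."""
    buffer = io.StringIO()
    write_table([(0.5, 3, "a"), (1.0, 7, "b")], buffer)
    buffer.seek(0)
    return read_table(buffer)
```

Because the file is plain tab-separated text with a single header line,
the same data stays usable with the usual Unix tools (for example `cut`
and `sort`) and can be opened in a spreadsheet.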

> For program-to-program communication, there really are only two
> sensible options. If both ends are under your control, then using a
> binary format (with versioning and error detection if needed) is
> preferable, because it has the least overhead and there is no need
> for data conversion.

Yes.  But this doesn't handle b), c) and d).

> OTOH, if the performance of the connection isn't critical, then
> using a well-known standard format that already has needed tools for
> it seems like the best option. Even if you don't currently need to
> allow access by other systems, you're leaving the door open for
> future programs outside your system to use the data.

And which formats, besides tabulator separated text files, handle the
requirements?  XML doesn't handle b), c), d) and e).

> The cases that are neither of these and thus would make sense to use
> some internal, non-portable text format are essentially
> non-existent.

I think I have one of these "essentially non-existent" cases.  And
almost everything I do seems to be one of those cases.

> Note that human readability of program-to-program data is a
> non-issue.

You're apparently working in a very different area than I am.  Almost
all data going from one program to another should also be available in
a human-readable format.  My work is to look at data, not to program.
The programs are just written to process the data from one form into
another form - which hopefully can teach us something new and
interesting.

> Indeed, it is a mistake to try to bring that into the equation, as
> it adds a huge amount of overhead to the task. I've always used
> agile methods for debugging such data: if, in fact, I need to
> examine such a data stream, I write a program to display it. But I
> don't worry about that until/unless the need arises.

It seems that you're a programmer and not a researcher.  I am (almost)
always interested in the data.  I have yet to run into a case where I
wasn't interested in seeing the output of a program.

> It often does not arise, and even when it does, it's often not
> necessary to be able to display everything -- and it's often better
> to write a monitor for an interesting condition than filling a disk
> with 10 GB of text!

I would spend all my time writing monitors that way.

> So, all in all, I think you're trying to solve the wrong problem
> (finding a way to write a specific file format), rather than using
> an appropriate file format for Ada programs (usually binary).

It may be a long-standing bad habit to use tabulator separated text files
for (intermediate) analysis results from experiments, but I haven't
yet found a convincing argument against it. -- If I could auto-generate the
monitor and the conversion programs to the programs I interact with,
then I might be convinced, but I would still have to hack some type
checking on top of Ada.Sequential_IO.  And the program for
auto-generating the export to Gnuplot would practically be identical
to the one I asked for initially anyway.

> But, as a friend of mine likes to say, "do what you want, because
> you will anyway!". :-)

A clever friend. :-)

Jacob
-- 
"Hungh. You see! More bear. Yellow snow is always dead give-away."


