comp.lang.ada
From: Georg Bauhaus <bauhaus@futureapps.de>
Subject: Re: TSV and CSV
Date: Sun, 03 Jul 2005 11:08:54 +0200
Date: 2005-07-03T11:51:43+02:00
Message-ID: <42c7b52b$0$10804$9b4e6d93@newsread4.arcor-online.net>
In-Reply-To: <z8SdnQAFJ8pRdVjfRVn-gw@megapath.net>

Randy Brukardt wrote:
> "Georg Bauhaus" <bauhaus@futureapps.de> wrote in message
> news:42c5e46e$0$10818$9b4e6d93@newsread4.arcor-online.net...
> 
>>Martin Dowie wrote:
>>
>>
>>>If you want commas in the data fields, simply wrap the data fields in
>>>quotes, e.g.
>>>
>>>"1","alpha, beta, gamma","foo"
>>
>>You can't be seriously suggesting this?

I was addressing the "simply" in the sentence above about wrapping
the data fields, because quoting only shifts the problem to the next
escaping level, which you then mentioned yourself.
  That is where the problems usually start:
"simply do this, and, uhm, that, and, oh, I forgot, you should...".
Bottom line: we don't have standardised CSV document types.
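
To illustrate the "next escaping level", here is a minimal sketch
using Python's standard csv module. The helper name to_csv_line is
made up for this example, and doubling embedded quotes is only one
convention among several; it is an assumption, not a rule every
producer follows.

```python
import csv
import io

def to_csv_line(fields):
    """Serialize one record, quoting every field as in Martin's example."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(fields)
    return buffer.getvalue()

# A comma inside a field is covered by the surrounding quotes ...
plain = to_csv_line(["1", "alpha, beta, gamma", "foo"])

# ... but a quote inside a field needs a second rule (here: doubling it).
nested = to_csv_line(["1", 'alpha, "beta", gamma', "foo"])

print(plain, end="")
print(nested, end="")
```

The second record only survives because writer and reader agree on
the doubling rule, which is exactly the agreement that tends to be
missing.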

Even the CSV description Ed has mentioned comes
with all its buts and don'ts, which speak for themselves...
In fact, they repeat some of the input to the XML design
discussion, which led to a standard.

To be clear: it is easy to think of a (one)
set of rules for producing good CSV data. However, as with
Ada programs, producing the data is far less important than
using it later, from a consumption point of view.
At least if you care about the recipients at all.
When reading CSV data, you have to allow for more than one set
of rules, in sharp contrast to just one when producing
CSV data.
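
A small sketch of the reading side, again with Python's csv module.
The sample line is invented for illustration. The same bytes are read
under two plausible sets of rules, and the reader cannot tell from
the data alone which one the producer had in mind.

```python
import csv

# One line of input, invented for this example.
line = r'1,"x\",y",2'

# Reader A assumes quotes inside a field are doubled (the module's default).
doubled = next(csv.reader([line]))

# Reader B assumes quotes inside a field are escaped with a backslash.
escaped = next(csv.reader([line], doublequote=False, escapechar="\\"))

print(len(doubled), doubled)
print(len(escaped), escaped)
```

Reader A sees four fields and reader B sees three, and neither
raises an error.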

One average CSV stream we read contains no line breaks,
probably for reasons of transmission speed.
As if this weren't enough (excuse: "simply" count fields),
some fields can *contain* non-escaped separators (excuse:
"simply" inspect context to find out whether the comma is
actually a separator...).
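
The "simply count fields" excuse can be sketched as follows. The
function name and the record width of three are assumptions made for
this example, not properties of the stream described above.

```python
import csv

def records_from_flat_stream(stream, fields_per_record):
    """Regroup a CSV stream without line breaks by counting fields."""
    fields = next(csv.reader([stream]), [])
    if len(fields) % fields_per_record != 0:
        raise ValueError(
            f"{len(fields)} fields do not divide into records of "
            f"{fields_per_record}"
        )
    return [
        fields[i:i + fields_per_record]
        for i in range(0, len(fields), fields_per_record)
    ]

# Works as long as no field contains a separator of its own.
good = records_from_flat_stream("1,a,x,2,b,y", 3)

# One non-escaped comma inside a field shifts every later field.
try:
    records_from_flat_stream("1,a, b,x,2,b,y", 3)
    shifted_detected = False
except ValueError:
    shifted_detected = True
```

The check only catches the shift when the total happens not to
divide evenly; a stream with several stray commas can pass it and
still be regrouped wrongly.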

I have rarely been given a CSV file/stream to process
together with a clear description. (So maybe I'm biased.)
The streams have almost always had some hack or some
"cleverness" in them. I believe that a standardised data
format helps, in practice, to reduce undocumented hacks and
cleverness. One such format type can be based on XML.
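
As a rough sketch of what an XML-based record could look like, using
Python's xml.etree.ElementTree: the element names "row" and "field"
are invented for this example and do not come from any particular
document type.

```python
import xml.etree.ElementTree as ET

def record_to_xml(fields):
    """Serialize one record; escaping is left to the XML library."""
    row = ET.Element("row")
    for value in fields:
        ET.SubElement(row, "field").text = value
    return ET.tostring(row, encoding="unicode")

def xml_to_record(text):
    """Parse one record back into a list of strings."""
    return [field.text or "" for field in ET.fromstring(text).findall("field")]

# Commas, quotes and ampersands need no hand-written escaping rules.
record = ["1", 'alpha, "beta" & gamma', "foo"]
serialized = record_to_xml(record)
restored = xml_to_record(serialized)
```

The point is not that XML is shorter, but that the escaping rules are
fixed by the standard instead of by each producer.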


> Of course he's seriously suggesting this, it's how these files work.

This is how these files *should* work, ideally. As you can see
on http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm#FileFormat,
you still have to climb up a decision tree and visit this or that
branch in order to parse CSV data in a reliable fashion,
unless you know exactly how they are produced.

All in all you end with:

>  Pretty much any format can be
> made to work for that.

...provided you sort of reinvent the markup rules and wheels.
And disregard your own advice to use a really standardised
format (in applications that are not entirely under your control). ;-)


-- Georg


