From: Georg Bauhaus
Date: Sun, 03 Jul 2005 11:08:54 +0200
Newsgroups: comp.lang.ada
Subject: Re: TSV and CSV
References: <-pGdnVJqme2I_V7fRVn-qA@megapath.net> <9tOdnboZwYMIDlnfRVn-iw@megapath.net> <42c5e46e$0$10818$9b4e6d93@newsread4.arcor-online.net>
Message-ID: <42c7b52b$0$10804$9b4e6d93@newsread4.arcor-online.net>

Randy Brukardt wrote:
> "Georg Bauhaus" wrote in message
> news:42c5e46e$0$10818$9b4e6d93@newsread4.arcor-online.net...
>
>>Martin Dowie wrote:
>>
>>>If you want commas in the data fields, simply wrap the data fields in
>>>quotes, e.g.
>>>
>>>"1","alpha, beta, gamma","foo"
>>
>>You can't be seriously suggesting this?

I was addressing the "simply" in the sentence above about wrapping the
data fields, because wrapping only shifts the problem to the next
escaping level, which you have since mentioned. That is where the
problems usually start: "simply do this, and, uhm, that, and, oh, I
forgot, you should...".

Bottom line: We don't have standardised CSV document types.
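The "next escaping level" can be made concrete with a small sketch. It
is not from the thread: Python's csv module and its default dialect
(double-quote wrapping, embedded quotes written as "") are assumed here
purely for illustration, and the helper names are hypothetical.

```python
import csv
import io


def naive_split(line):
    # The "simply split on commas" reader: it knows nothing about quotes.
    return line.split(",")


def quoted_read(line):
    # Reader using Python's default csv dialect (an assumption, see above):
    # fields may be wrapped in double quotes, embedded quotes are doubled.
    return next(csv.reader(io.StringIO(line)))


# Martin Dowie's example line from the quoted text.
line = '"1","alpha, beta, gamma","foo"'
naive_fields = naive_split(line)    # cut apart at every comma
quoted_fields = quoted_read(line)   # three fields, commas preserved

# Next escaping level: the data itself contains a quote character.
tricky = 'say "hi", then leave'
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(["1", tricky])
encoded = buf.getvalue().rstrip("\n")
round_trip = quoted_read(encoded)

print(len(naive_fields), naive_fields)
print(len(quoted_fields), quoted_fields)
print(encoded)
print(round_trip)
```

The round trip only works because writer and reader agree on the same
quoting rule. A producer that escapes embedded quotes some other way,
or not at all, needs a different reader, and nothing in the file says
which one.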
Even considering the CSV description Ed has mentioned, with all its
buts and don'ts, which speak for themselves... In fact, they repeat
some of the input to the XML design discussion, which led to a
standard.

To be clear: it is easy to think of one set of rules for producing
good CSV data. However, as with Ada programs, producing the data
matters far less than consuming it later, at least if you care about
the recipients at all. When producing CSV data you follow just one set
of rules; when reading CSV data you have to be prepared for more than
one.

One average CSV stream we read contains no line breaks, probably for
reasons of transmission speed. As if this weren't enough (excuse:
"simply" count fields), some fields can *contain* non-escaped
separators (excuse: "simply" inspect the context to find out whether
the comma is actually a separator...).

I have rarely been given a CSV file or stream to process together with
a clear description. (So maybe I'm biased.) The streams have almost
always had some hack or some "cleverness" in them. I believe that a
standardised data format helps, in practice, to reduce undocumented
hacks and cleverness. One such format type can be based on XML.

> Of course he's seriously suggesting this, it's how these files work.

This is how these files *should* work, ideally. As you can see on
http://www.creativyst.com/Doc/Articles/CSV/CSV01.htm#FileFormat
you still have to climb a decision tree and visit this or that branch
in order to parse CSV data reliably, unless you know exactly how the
data are produced.

All in all you end up with:

> Pretty much any format can be
> made to work for that.

...provided you sort of reinvent the markup rules and wheels. And
disregard your own advice to use a really standardised format (in
applications not all under your control). ;-)

-- 
Georg