From: Georg Bauhaus
Newsgroups: comp.lang.ada
Subject: Re: strange behaviour of utf-8 files
Date: Sun, 17 Nov 2013 20:05:26 +0100
Message-ID: <52891372$0$6636$9b4e6d93@newsspool2.arcor-online.net>

On 17.11.13 15:07, Dmitry A. Kazakov wrote:
>> ASCII-ism is the soil in which dangerous bugs keep many things
>> from working.(*)
>
> On the contrary, it is a reasonable precaution against sloppy OSes (Linux,
> Windows) incapable to handle encoding safely [*]. The OP just ran into
> that.
> If he followed the advice he would never have any problems of this
> kind.
> -------
> * Preventing a file encoded as X, being read and written as if it were
> encoded as Y.

Precaution? ASCII could just as well be EBCDIC. When the OS's
programming interface does not suggest studying the file type,
then the best thing one can do when reading a text file is to rely
on the data: UTF-NN has a BOM, which is better than nothing,
and certainly better than any 7-bit (or 8-bit) ambiguities.

It is unfortunate that 7-bit engineers can't swallow their pride
and use the extended file attributes available in all semi-modern
and modern file systems and archive formats.
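
To make "rely on the data" concrete, here is a minimal sketch in Python of sniffing a BOM before decoding. The names sniff_bom and decode_text and the "latin-1" fallback are assumptions made for this illustration, not anything from the thread. A BOM is only a hint, and a file without one stays as ambiguous as before.

```python
import codecs

# Check the longer BOMs first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) when no BOM is present."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Decode using the BOM if there is one, else the (assumed) fallback."""
    encoding, skip = sniff_bom(data)
    if encoding is None:
        # No BOM: the data carries no hint, so the caller has to guess.
        return data.decode(fallback)
    return data[skip:].decode(encoding)
```

For example, decode_text(codecs.BOM_UTF8 + "Grüße".encode("utf-8")) gives back the text without the BOM, while the same bytes without a BOM would be decoded with the fallback and come out garbled. That second case is the ambiguity complained about above.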