From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: strange behaviour of utf-8 files
Date: Sun, 17 Nov 2013 21:38:52 +0100
Organization: cbb software GmbH
Message-ID: <10ec0vuld83fy.1t7bduzwsrfe.dlg@40tude.net>
References: <73e0853b-454a-467f-9dc7-84ca5b9c29b2@googlegroups.com> <1ghx537y5gbfq.17oazom68d4n6.dlg@40tude.net> <9d00683c-949c-4e88-a161-ebd78b350d39@googlegroups.com> <1w23uq33ul2i8$.wzjpp3evot36.dlg@40tude.net> <5288c584$0$6639$9b4e6d93@newsspool2.arcor-online.net> <52891372$0$6636$9b4e6d93@newsspool2.arcor-online.net>
Reply-To: mailbox@dmitry-kazakov.de

On Sun, 17 Nov 2013 20:05:26 +0100, Georg Bauhaus wrote:

> On 17.11.13 15:07, Dmitry A. Kazakov wrote:
>
>>> ASCII-ism is the soil in which dangerous bugs keep many things
>>> from working.(*)
>>
>> On the contrary, it is a reasonable precaution against sloppy OSes (Linux,
>> Windows) incapable of handling encoding safely [*]. The OP just ran into
>> that. If he followed the advice he would never have any problems of this
>> kind.
>
>> -------
>> * Preventing a file encoded as X, being read and written as if it were
>> encoded as Y.
>
> Precaution?
> ASCII could just as well be EBCDIC.

Firstly, EBCDIC is practically dead. Secondly, you simply cannot compile any
Ada program encoded in EBCDIC as if it were ASCII. No chance.

UTF-8 was intentionally designed to be compatible with ASCII, which is why
there is trouble with Latin1, which was also an extension of ASCII. The same
would happen if somebody used KOI-8 thinking it was Latin1 or UTF-8. The
problem is that the common part (ASCII) is sufficient for Ada programming,
while the varying part is subtle enough to cause difficult-to-detect bugs in
string literals. Bugs that cannot be detected by the compiler.

> It is unfortunate that 7bit engineers can't swallow their pride
> and use extended file attributes available with all semi-modern
> and modern file systems and archive formats.

What for? In order to get silly bugs like the one the OP did?

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
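
To illustrate the point about the common ASCII part and the varying part, here is a minimal sketch in Python (not Ada, purely because the codecs are built in). The sample literal "für" and all variable names are made up for illustration, and KOI8-R is assumed as the concrete KOI-8 variant; none of this comes from the thread itself.

```python
# Hypothetical illustration; the literals and names are not from the thread.

# The common part: pure ASCII text has identical bytes in all three encodings.
ascii_text = "Hello"
same_bytes = (
    ascii_text.encode("ascii")
    == ascii_text.encode("utf-8")
    == ascii_text.encode("latin-1")
    == ascii_text.encode("koi8_r")  # assumption: KOI8-R as the KOI-8 variant
)

# The varying part: a literal with one non-ASCII character.
literal = "f\u00fcr"  # "für"
utf8_bytes = literal.encode("utf-8")      # two bytes for the u-umlaut
latin1_bytes = literal.encode("latin-1")  # one byte for the u-umlaut

# UTF-8 bytes read as Latin-1: no error is raised, the text is silently wrong.
misread = utf8_bytes.decode("latin-1")

# Latin-1 bytes read as KOI8-R: again no error, just a different character.
misread_koi8 = latin1_bytes.decode("koi8_r")

# Latin-1 bytes read as UTF-8: this direction happens to be rejected here.
try:
    latin1_bytes.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(same_bytes, len(literal), len(misread), misread_koi8 != literal, rejected)
```

In this sketch only one of the three mix-ups is rejected; the other two pass silently and change the contents of the string, which is the kind of bug a compiler has no way to notice.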