From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: strange behaviour of utf-8 files
Date: Thu, 21 Nov 2013 19:03:29 -0600
Organization: Jacob Sparre Andersen Research & Innovation

"Stoik" wrote in message
news:7464679c-6b98-4e23-a337-83b671473553@googlegroups.com...
> Thanks for your comments. It is obviously a question of having a different
> encoding in the editor and the compiler. I forgot to add the -gnatW8 switch
> to the compiler (this should be a default, I believe).

Ada 2012 requires compilers to accept UTF-8 source code. But given that Ada source code historically is Latin-1, it's very unlikely that compilers would change the default setting.
The effect would be to break the compilation of much existing source, a step that most compiler vendors would never take.

Speaking as a vendor, Janus/Ada has a number of default switches that would never be the default choices today. But changing the defaults breaks *everyone's* build scripts; it's just so disruptive that it's not something that we would do unless there was no other choice. It makes command line use of compilers with an extensive history harder than we would like, but that's the price of having customers that go way back.

If UTF-8 files were somehow identified as such, we could have friendlier defaults -- but since the use of the BOM is optional (and discouraged in recent Unicode standards), and there are no encoding attributes in common file systems (Windows, Linux) -- there really isn't much that we can do. This is going to remain a mess for a long time to come, I fear.

Randy.

P.S. Truth-in-advertising: Janus/Ada *only* takes Latin-1 input; it has no support for any other encoding (of course it supports Wide_String at runtime). That will have to change as we migrate to Ada 2012, but it probably will be a while before that happens (not a lot of demand).
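
To illustrate the identification problem described above, here is a minimal sketch in Python (not Ada, and not how GNAT, Janus/Ada, or any other compiler actually decides). It assumes a hypothetical front end that treats a file as UTF-8 only when it starts with the UTF-8 BOM and otherwise falls back to Latin-1. The function names `guess_source_encoding` and `decode_source` are invented for this example.

```python
import codecs


def guess_source_encoding(data: bytes) -> str:
    """Hypothetical rule: UTF-8 only if the BOM is present, else Latin-1."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    return "latin-1"


def decode_source(data: bytes) -> str:
    """Decode source bytes using the guess above, dropping the BOM if any."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # Every byte sequence is valid Latin-1, so this never raises,
    # even when the file was really written as UTF-8.
    return data.decode("latin-1")


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + "é".encode("utf-8")
    without_bom = "é".encode("utf-8")
    print(guess_source_encoding(with_bom), repr(decode_source(with_bom)))
    print(guess_source_encoding(without_bom), repr(decode_source(without_bom)))
```

Under this assumed rule, a UTF-8 file saved without a BOM is silently read as Latin-1, so a two-byte character turns into two unrelated Latin-1 characters. That is the kind of mismatch the original poster saw until the encoding was given explicitly with a switch such as -gnatW8.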