From: Adam Beneschan
Newsgroups: comp.lang.ada
Subject: Re: Interpretation of extensions different from Unix/Linux?
Date: Mon, 17 Aug 2009 17:32:59 -0700 (PDT)
Message-ID: <6f80c882-fa03-4ca9-a53e-fae34cea160d@b15g2000yqd.googlegroups.com>
References: <8a5f3b98-1c5a-4d47-aca7-e106d1223fa9@a26g2000yqn.googlegroups.com> <87skg7952j.fsf@jspa-nykredit.sparre-andersen.dk> <1f999bfa99erz$.9b8p6yymr8x7$.dlg@40tude.net>

On Aug 17, 3:28 pm, "Randy Brukardt" wrote:

> The problem here is that String really is not the right type, but since you
> can't have string literals for private types in Ada, you can't make it a
> private type. (And if you could have string literals, it still couldn't be
> used with the existing I/O packages, it would be way too incompatible.)

That wouldn't even be an issue if UTF-8 were strictly a "storage
format" as you called it above. If that were the case, you wouldn't
need string literals for it.

I think the problem is that UTF-8 is something of a hybrid. If all
characters in the string are in the 32..126 range, the "sequence of
octets" stored in the UTF-8 string is identical to the graphic
characters stored in a String. (UTF-8 was designed purposefully so
that would happen.) In cases like that, it makes sense to use a
string literal.

I can understand how "hybrid" types like that can cause headaches for
fans of strong typing. A "String" should be an array of graphic or
control characters; an encoded type should be a sequence of octets;
but here we need a type that is sometimes one thing and sometimes
another. So I can respect the decision to make these String types,
but I don't like it.

Also, I'm afraid that using String can backfire. If I understand it
correctly, the decision was that the Name parameter of Text_IO.Open
should be interpreted as a UTF-8 octet sequence even though it's a
String. But the intent is to allow string literals. At some point,
though, some poor innocent programmer in Germany or Spain is going to
try to use a string literal (or a Latin-1 string variable) with an
umlaut or an accented vowel in it and get totally screwed up since
those characters don't represent themselves in UTF-8 encoding, and
they'll end up puzzling over how their program created a file with a
Chinese character in the middle of the name. (Yeah, I know, that's
very unlikely; most likely the UTF-8 encoding will simply be invalid.)

-- Adam
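
A rough sketch of the failure mode described above, written in Python rather than Ada purely to keep it short. The helper name as_utf8 and the file names are made up for illustration. The sketch assumes that an implementation following that reading of Name would, in effect, take the Latin-1 octets of the String and decode them as UTF-8; it is not a model of any particular Ada runtime.

```python
def as_utf8(latin1_text: str) -> str:
    """Reinterpret the Latin-1 octets of a string as a UTF-8 sequence.

    Hypothetical helper: it mimics treating a Latin-1 String as if it
    already held UTF-8 octets.
    """
    return latin1_text.encode("latin-1").decode("utf-8")


# Pure ASCII name: the octets are the same in Latin-1 and UTF-8,
# so the literal "represents itself".
plain = "report.txt"
assert plain.encode("latin-1") == plain.encode("utf-8")
plain_result = as_utf8(plain)

# Name with an umlaut: Latin-1 stores u-umlaut as the single octet 0xFC,
# which cannot start a UTF-8 sequence, so decoding fails.
try:
    as_utf8("M\u00fcller.txt")
    umlaut_result = "decoded"
except UnicodeDecodeError:
    umlaut_result = "invalid UTF-8"

# Contrived name: the Latin-1 characters U+00E4, U+00B8, U+00AD have the
# octets E4 B8 AD, which happen to be the UTF-8 encoding of U+4E2D,
# a Chinese character.
contrived = "a\u00e4\u00b8\u00adb.txt"
reinterpreted = as_utf8(contrived)

print(plain_result)
print(umlaut_result)
print(reinterpreted == "a\u4e2db.txt")
```

The contrived case needs three specific Latin-1 characters in a row (one of them a soft hyphen), which fits the remark that an invalid encoding is by far the more likely outcome.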