From: "Adam Beneschan"
Newsgroups: comp.lang.ada
Subject: Re: Reading "normal" text files with Wide_Text_IO in GNAT
Date: 4 Dec 2006 10:17:35 -0800
Message-ID: <1165256255.486012.132810@l12g2000cwl.googlegroups.com>
References: <1164916470.648544.256710@n67g2000cwd.googlegroups.com>

Björn Persson wrote:
> Adam Beneschan wrote:
>
> > However, at first glance, I didn't see a way to get Wide_Text_IO to
> > read a UCS-1 text file.
>
> Hmm, I've never heard of UCS-1. Is such an encoding really defined?

I don't know if that's the correct name. I have seen it referenced in
a few places.

> > This is the encoding where each byte in the
> > range 16#00#..16#FF# represents a character in the range
> > Wide_Character'Val(16#0000#) .. Wide_Character'Val(16#00FF#), and there
> > is no way to represent wide characters from 16#0100# to 16#FFFF#.
>
> OK, so it's identical to ISO 8859-1.

Technically, I thought ISO-8859-1 was a mapping from a range of
integers to a set of characters, rather than a specification of how
characters are represented in bits in an actual file. I could be
wrong. The distinction gets blurry at times.

> > Does GNAT's Wide_Text_IO have a way to read a file like this?
>
> It does indeed look like it can't. Gnat's approach to character encodings is
> amazingly faulty.
>
> Does EAstrings fill your needs? If not, would you like to join me in
> finishing the implementation so we can get rid of these problems?
>
> http://adacl.sourceforge.net/AdaBrowse/adacl-eastrings.html

My question was more theoretical than anything; I was looking at that
section of the manual for other reasons, and happened to notice what
seemed like an omission. But thanks for the pointer. I'll take a look
at it.

-- Adam
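
For reference, the byte-to-wide-character mapping described above can be
done by hand outside Wide_Text_IO. The following is only a sketch under
some assumptions: the file name "latin1.txt" and the names Latin1_To_Wide
and To_Wide are hypothetical, the input is read with plain Ada.Text_IO
(which interprets line and page terminators itself, so this is not a raw
byte-for-byte read), and the output encoding is whatever Wide_Text_IO
defaults to in the implementation at hand.

```ada
with Ada.Text_IO;
with Ada.Wide_Text_IO;

procedure Latin1_To_Wide is

   --  Hypothetical file name, used only for illustration.
   File_Name : constant String := "latin1.txt";

   Input : Ada.Text_IO.File_Type;

   --  Map each Character (positions 16#00# .. 16#FF#) to the
   --  Wide_Character with the same position value.
   function To_Wide (Item : String) return Wide_String is
      Result : Wide_String (Item'Range);
   begin
      for I in Item'Range loop
         Result (I) := Wide_Character'Val (Character'Pos (Item (I)));
      end loop;
      return Result;
   end To_Wide;

begin
   Ada.Text_IO.Open (Input, Ada.Text_IO.In_File, File_Name);

   while not Ada.Text_IO.End_Of_File (Input) loop
      --  The function form of Get_Line requires Ada 2005.
      Ada.Wide_Text_IO.Put_Line (To_Wide (Ada.Text_IO.Get_Line (Input)));
   end loop;

   Ada.Text_IO.Close (Input);
end Latin1_To_Wide;
```

The hand-written loop is there to make the mapping explicit; the
predefined Ada.Characters.Conversions.To_Wide_String (Ada 2005) performs
the same String to Wide_String conversion.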