From: Martin Krischik
Newsgroups: comp.lang.ada
Subject: Re: Filenames in Ada
Date: Sun, 27 Nov 2005 11:21:19 +0100
Organization: Cablecom Newsserver
Message-ID: <1255659.7PSTQaQJvX@linux1.krischik.com>
References: <1653090.31FM62oI6I@linux1.krischik.com>

Björn Persson wrote:

> Let's see if I understand the problem. Windows has two functions for
> each file operation, one -A version that expects or returns a file name
> in some 8-bit encoding like Windows-1252, and one -W version that
> expects or returns a file name in UTF-16 or maybe UCS-2?

Well, the Windows APIs in question were designed at a time when UTF-16 and
UCS-2 were still the same - that is, Unicode had no codes defined above the
65535 border. At that time programmers did not care about - or understand -
the difference between the two.

VFAT-32 is most likely a UCS-2 filesystem (anyone from China to confirm
that?). I remember an article about the "new" VFAT technology wasting an
"enormous" amount of storage by using UCS-2 for character encoding.
Obviously the article came from a Latin-1 based country ;-) .

> And all the
> file operations in the Ada library take and return file names as String,
> that is, Latin-1? And Gnat's implementation pretends that Latin-1 is
> identical to whatever 8-bit encoding Windows is using, and passes these
> Strings to Windows' -A functions, leaving you with no way to handle
> filenames that can't be expressed in said 8-bit encoding? Is that right?

Yes indeed. But I take it that on a Russian system the Windows-1251 code
page is active and all filenames are expressed using that and not Latin-1.

> It is my intention to add an encoding-aware interface to Ada.Directories
> under EAstrings.OS. For that to work reasonably on Windows, this problem
> needs to be solved. I suppose I also need to fix this in EAstrings.IO. I
> will need help from a Windows programmer to do this. (Of course I also
> need to get transcoding implemented on Windows before EAstrings will be
> of any use there.)

It is sad that XML/Ada has no UCS-2 and UCS-4 conversion available - but
AdaCL already has that - so no problem for you really.

> It seems that the right thing to do would be to tap into the Gnat
> library and make UTF-16 (or UCS-2) versions of the file operations. It
> could be as easy as changing the parameter type and replacing calls to
> the Windows functions with their -W equivalents, or it could be very
> hairy.

I had that idea as well and did take a look. Lots of "pragma Import" there.

> We'll need to determine whether it is UTF-16 or UCS-2. This page lists
> code page numbers for a whole lot of encodings, but UTF-16 is missing:
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp
>
> I take that as a hint that UTF-16 is Windows' idea of wide strings, and
> that all the others are considered "multi-byte character sets" or
> whatever the term is.
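
As an aside on UTF-16 versus UCS-2: the two only differ for code points
above 65535, which UTF-16 stores as a surrogate pair of two 16-bit units and
which UCS-2 cannot represent at all. A minimal sketch in Python (not Ada, and
nothing taken from the Windows API or Gnat; the helper names utf16_units and
fits_in_ucs2 are made up for illustration):

```python
# Illustration only: hypothetical helpers, standard library only.

def utf16_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character has a code point <= 0xFFFF,
    i.e. the text needs no surrogate pairs."""
    return all(ord(ch) <= 0xFFFF for ch in text)


bmp_name = "Bj\u00f6rn.txt"       # every character is below the 65535 border
astral_name = "\U00010400.txt"    # U+10400 lies above the 65535 border

# One code unit per character: UCS-2 and UTF-16 agree on this name.
print(len(bmp_name), utf16_units(bmp_name), fits_in_ucs2(bmp_name))

# One code unit more than there are characters: the first character
# needs a surrogate pair, so this name is UTF-16 only.
print(len(astral_name), utf16_units(astral_name), fits_in_ucs2(astral_name))
```

A file name of the first kind is byte-for-byte identical in both encodings,
so only names of the second kind could tell the two apart.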
Well, there seems to be a better article:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp

I wonder about that \\?\ stuff and what it really means.

Martin
--
mailto://krischik@users.sourceforge.net
Ada programming at: http://ada.krischik.com