comp.lang.ada
* Text_IO on WinNT  problem
@ 2001-09-25 16:04 Alfred Hilscher
  2001-09-25 16:13 ` Lutz Donnerhacke
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alfred Hilscher @ 2001-09-25 16:04 UTC


Hi,

can I read (ASCII-) files that are stored as UNICODE with Text_IO ?

I have a file that contains only ASCII text but is coded in UNICODE
(e.g. "ABC" is stored as hex 41 00 42 00 43 00 instead of 41 42 43). Is
there a way to handle such files via Text_IO ? I work with GNAT on WinNT
4.0.



* Re: Text_IO on WinNT problem
From: Lutz Donnerhacke @ 2001-09-25 16:13 UTC


* Alfred Hilscher wrote:
>can I read (ASCII-) files that are stored as UNICODE with Text_IO ?

Have you tried using Wide_Character?



* Re: Text_IO on WinNT problem
From: Ted Dennison @ 2001-09-25 17:36 UTC


In article <3BB0AAFD.9F006F92@icn.siemens.de>, Alfred Hilscher says...
>I have a file that contains only ASCII text but is coded in UNICODE
>(e.g. "ABC" is stored as hex 41 00 42 00 43 00 instead of 41 42 43). Is
>there a way to handle such files via Text_IO ? I work with GNAT on WinNT
>4.0.

Have you tried Wide_Text_IO?

---
T.E.D.    homepage   - http://www.telepath.com/dennison/Ted/TED.html
          home email - mailto:dennison@telepath.com



* Re: Text_IO on WinNT problem
From: David Starner @ 2001-09-25 17:41 UTC


On Tue, 25 Sep 2001 18:04:13 +0200, Alfred Hilscher wrote:
> Hi,
> 
> can I read (ASCII-) files that are stored as UNICODE with Text_IO ?
> 
> I have a file that contains only ASCII text but is coded in UNICODE
> (e.g. "ABC" is stored as hex 41 00 42 00 43 00 instead of 41 42 43). Is
> there a way to handle such files via Text_IO ? I work with GNAT on WinNT
> 4.0.

Looking at the GNAT Reference manual, under Wide_Text_IO, there's no way
to load it in as Wide_Character. There's no standard way to handle it
with Text_IO - this seems like a fairly unusual case - but you could
always try reading them in one by one and discarding half of them.
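
A minimal sketch of the "read them one by one and discard half" idea,
assuming the file is UTF-16LE, that Character is 8 bits wide (as it is
in GNAT), and that "input.txt" stands in for the real file name. The
procedure name is made up for illustration.

```ada
with Ada.Sequential_IO;
with Ada.Text_IO;

procedure Read_Low_Bytes is
   --  Byte-at-a-time access to the file.
   package Char_IO is new Ada.Sequential_IO (Character);

   File      : Char_IO.File_Type;
   Low, High : Character;
begin
   --  "input.txt" is a placeholder file name.
   Char_IO.Open (File, Char_IO.In_File, "input.txt");
   while not Char_IO.End_Of_File (File) loop
      Char_IO.Read (File, Low);
      exit when Char_IO.End_Of_File (File);  --  odd trailing byte
      Char_IO.Read (File, High);
      --  Keep only code units that are plain ASCII (high byte zero).
      --  This also drops a leading FF FE byte order mark.
      if High = Character'Val (0) and then Character'Pos (Low) < 128 then
         Ada.Text_IO.Put (Low);
      end if;
   end loop;
   Char_IO.Close (File);
end Read_Low_Bytes;
```

Anything outside ASCII is silently dropped here; whether that is
acceptable depends on what the file can really contain.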

BTW, Unicode refers to an enumerated set of characters and the associated
standard. There are various encoding forms; the one you refer to is
UTF-16 - UTF-16LE, to be specific, as whether UTF-16 is big-endian or
little-endian is ambiguous. To that end, UTF-16 often starts with FEFF
so you can tell whether it's big endian (FE FF) or little endian (FF
FE). You probably want to make sure you don't accidentally read that in
as ASCII, and that the byte order really is the one you expect.
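
A rough sketch of that check, assuming 8-bit stream elements (true for
GNAT) and a placeholder file name "input.txt". It looks only at the
first two bytes and reports which byte order mark, if any, is present.

```ada
with Ada.Streams;            use Ada.Streams;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Check_BOM is
   package SIO renames Ada.Streams.Stream_IO;

   File : SIO.File_Type;
   BOM  : Stream_Element_Array (1 .. 2);
   Last : Stream_Element_Offset;
begin
   --  "input.txt" is a placeholder file name.
   SIO.Open (File, SIO.In_File, "input.txt");
   SIO.Read (File, BOM, Last);
   SIO.Close (File);

   if Last < 2 then
      Ada.Text_IO.Put_Line ("file too short for a byte order mark");
   elsif BOM (1) = 16#FF# and then BOM (2) = 16#FE# then
      Ada.Text_IO.Put_Line ("UTF-16, little-endian");
   elsif BOM (1) = 16#FE# and then BOM (2) = 16#FF# then
      Ada.Text_IO.Put_Line ("UTF-16, big-endian");
   else
      Ada.Text_IO.Put_Line ("no byte order mark");
   end if;
end Check_BOM;
```

A file without the mark may still be UTF-16; in that case the byte
order has to be known from somewhere else.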

-- 
David Starner - dstarner98@aasaa.ofe.org
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - "Freakin' Friends"



* Re: Text_IO on WinNT problem
From: David Botton @ 2001-09-25 18:07 UTC
  To: comp.lang.ada

Take a look in the RM (A.11) for Ada.Wide_Text_IO.

David Botton

"Alfred Hilscher" <Alfred.Hilscher@icn.siemens.de> wrote in message
news:3BB0AAFD.9F006F92@icn.siemens.de...
> can I read (ASCII-) files that are stored as UNICODE with Text_IO ?





* Re: Text_IO on WinNT problem
From: Richard Riehle @ 2001-09-25 18:48 UTC


David Starner wrote:

> Looking at the GNAT Reference manual, under Wide_Text_IO, there's no way
> to load it in as Wide_Character. There's no standard way to handle it
> with Text_IO - this seems like a fairly unusual case - but you could
> always try reading them in one by one and discarding half of them.

See Ada.Characters.Handling,  ALRM, Annex A.3.2

      function To_Character (Item       : in Wide_Character;
                             Substitute : in Character := ' ')
         return Character;

and other useful functions for this kind of thing.
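
For illustration, a small sketch of how these functions might be used
once the text is already in a Wide_String. The literal "ABC" and the
procedure name are made up; To_String and Is_Character come from the
same package, as defined in the Ada 95 RM.

```ada
with Ada.Characters.Handling; use Ada.Characters.Handling;
with Ada.Text_IO;

procedure Narrow_Demo is
   Wide : constant Wide_String := "ABC";
begin
   --  To_String applies To_Character to each element; anything that
   --  does not fit in Character is replaced by Substitute.
   Ada.Text_IO.Put_Line (To_String (Wide, Substitute => '?'));

   --  Is_Character tells whether a conversion would be lossless.
   for I in Wide'Range loop
      if not Is_Character (Wide (I)) then
         Ada.Text_IO.Put_Line ("character outside Latin-1 found");
      end if;
   end loop;
end Narrow_Demo;
```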

Richard Riehle
richard@adaworks.com






* Re: Text_IO on WinNT problem
From: David Starner @ 2001-09-25 20:07 UTC


On Tue, 25 Sep 2001 11:48:04 -0700, Richard Riehle <richard@adaworks.com> wrote:
> David Starner wrote:
> 
>> Looking at the GNAT Reference manual, under Wide_Text_IO, there's no way
>> to load it in as Wide_Character. There's no standard way to handle it
>> with Text_IO - this seems like a fairly unusual case - but you could
>> always try reading them in one by one and discarding half of them.
> 
> See Ada.Characters.Handling,  ALRM, Annex A.3.2
> 
>       function To_Character (Item       : in Wide_Character;
>                              Substitute : in Character := ' ')
>          return Character;

As I pointed out, GNAT can't read in UTF-16LE using Wide_Text_IO. The 
problem is getting them into the program, not converting them to the 
right form.

-- 
David Starner - dstarner98@aasaa.ofe.org
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - "Freakin' Friends"



* Re: Text_IO on WinNT problem
From: Larry Kilgallen @ 2001-09-25 21:44 UTC


In article <9oqo67$8a21@news.cis.okstate.edu>, David Starner <dvdeug@x8b4e53cd.dhcp.okstate.edu> writes:

> As I pointed out, GNAT can't read in UTF-16LE using Wide_Text_IO.

Why is that ?  (asked by somebody who does not know GNAT)



* Re: Text_IO on WinNT problem
From: Mark Johnson @ 2001-09-25 23:14 UTC


Larry Kilgallen wrote:

> In article <9oqo67$8a21@news.cis.okstate.edu>, David Starner <dvdeug@x8b4e53cd.dhcp.okstate.edu> writes:
>
> > As I pointed out, GNAT can't read in UTF-16LE using Wide_Text_IO.
>
> Why is that ?  (asked by somebody who does not know GNAT)

A quick look at the GNAT reference manual describes six different encoding schemes - none of which appear
to look like UTF-16LE. I don't have enough background to tell if one of the following matches UTF-16LE or
not.... After reading the description that goes with each one, my guess is not.

`h'
     Hex ESC encoding
`u'
     Upper half encoding
`s'
     Shift-JIS encoding
`e'
     EUC Encoding
`8'
     UTF-8 encoding
`b'
     Brackets encoding

I'll assume the original author may have already gone down that path.

Another idea - if UTF-16LE is a series of 16 bit values, you could read the file as a series of 16 bit
integers & convert it into something you can use the hard way.
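
A sketch of that approach, assuming a little-endian machine (such as
x86 under WinNT) so that each 16-bit value read from a UTF-16LE file
comes out in the right byte order. "input.txt" and the procedure name
are placeholders.

```ada
with Ada.Sequential_IO;
with Ada.Text_IO;
with Interfaces;

procedure Read_UTF16 is
   use type Interfaces.Unsigned_16;

   --  One element per 16-bit code unit.
   package U16_IO is new Ada.Sequential_IO (Interfaces.Unsigned_16);

   File : U16_IO.File_Type;
   Unit : Interfaces.Unsigned_16;
begin
   --  "input.txt" is a placeholder file name.
   U16_IO.Open (File, U16_IO.In_File, "input.txt");
   while not U16_IO.End_Of_File (File) loop
      U16_IO.Read (File, Unit);
      if Unit = 16#FEFF# then
         null;  --  byte order mark, skip it
      elsif Unit < 128 then
         Ada.Text_IO.Put (Character'Val (Unit));  --  plain ASCII
      else
         Ada.Text_IO.Put ('?');  --  not representable as ASCII
      end if;
   end loop;
   U16_IO.Close (File);
end Read_UTF16;
```

On a big-endian machine the two bytes of each value would need to be
swapped first, and a file with an odd number of bytes would need extra
handling that this sketch leaves out.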

Another alternative is to request the enhancement (assuming ACT support).
 --Mark





* Re: Text_IO on WinNT problem
From: David Starner @ 2001-09-26  2:39 UTC


On Tue, 25 Sep 2001 18:14:43, Mark Johnson <mark_h_johnson@raytheon.com> wrote:
> Larry Kilgallen wrote:
> 
>> In article <9oqo67$8a21@news.cis.okstate.edu>, David Starner <dvdeug@x8b4e53cd.dhcp.okstate.edu> writes:
>>
>> > As I pointed out, GNAT can't read in UTF-16LE using Wide_Text_IO.
>>
>> Why is that ?  (asked by somebody who does not know GNAT)

Because it hasn't been implemented yet?
 
> A quick look at the GNAT reference manual describes six different encoding schemes - none of which appear
> to look like UTF-16LE. I don't have enough background to tell if one of the following matches UTF-16LE or
> not.... After reading the description that goes with each one, my guess is not.
[...]
> I'll assume the original author may have already gone down that path.

Yep. Basically, the only real Unicode encoding that GNAT can read in
is UTF-8. 
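
If converting the file to UTF-8 beforehand is an option, a sketch along
these lines might work. It assumes the GNAT-specific Form string
"WCEM=8" described in the GNAT Reference Manual, and a made-up file
name "input-utf8.txt" for the converted file.

```ada
with Ada.Wide_Text_IO;

procedure Read_UTF8 is
   package WIO renames Ada.Wide_Text_IO;

   File : WIO.File_Type;
   Line : Wide_String (1 .. 256);
   Last : Natural;
begin
   --  "WCEM=8" selects UTF-8 as the wide character encoding (GNAT only).
   WIO.Open (File, WIO.In_File, "input-utf8.txt", Form => "WCEM=8");
   while not WIO.End_Of_File (File) loop
      --  Lines longer than the buffer arrive in several pieces.
      WIO.Get_Line (File, Line, Last);
      WIO.Put_Line (Line (1 .. Last));
   end loop;
   WIO.Close (File);
end Read_UTF8;
```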

> Another alternative is to request the enhancement (assuming ACT support).

To someone familiar with GNAT, the addition would probably take less than
an hour. If someone with a contract requested it, it would probably be
added promptly. I have considered doing some minor GNAT hacking for 
more Unicode support, when it gets into GCC. I may try adding this then.

-- 
David Starner - dstarner98@aasaa.ofe.org
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - "Freakin' Friends"


