comp.lang.ada
* re: characters with codes >= 128
@ 1987-09-14 14:28 Jim Moody, DCA C342
  1987-09-15 20:55 ` Erland Sommarskog
From: Jim Moody, DCA C342 @ 1987-09-14 14:28 UTC


Erland Sommarskog (sommar@enea.uucp) gets to the heart of the
disagreement when he writes:
    And, guys, can't we agree on that it would have been much easier
    if the language definition in one way or another had given place
    for a wider character concept than 128 ASCII codes?
No, it wouldn't.  Or at least, easier for whom?
Text_IO, remember, is standard.  That means that all vendors must support
it.  And must support it to all output devices (not just bright terminals).
That means printers with hammers which are limited to the ASCII 95 graphic
characters.  The only reasonable way of requiring vendors to support
something more than the 95 characters plus ASCII.HT would be to make
Text_IO generic.  This brings its own problems:  there is currently no
provision for a generic formal parameter to be restricted to character
type and indeed no requirement that a compiler recognise character types
as a separate semantic category.  I do not know that LRM 3.5.2 is
referenced elsewhere in the LRM.  This means that doing what Sommarskog
wants imposes costs on a vendor/implementor which are not limited to
the Text_IO package but spread into the middle part of the compiler.

If we have a cost of such magnitude, we are entitled to ask what benefit
it produces for the user community as a whole.  I think that it was a
reasonable decision to limit the standard to the 95 ASCII printables plus
ASCII.HT, which means that if someone wants to use other characters, he/she
has to shoulder the cost himself/herself rather than have the entire user
community pay.  I emphasise that this is a cost/benefit decision which
could change in the future.  One of these days, Ada standardisation will
be reopened.  If at that point, it is clear that a substantial segment of
the user community is using or wants to use a bigger character set, the
benefits of centralising the cost of supporting them may outweigh that cost.  I
doubt that they do now.  That is, the cost to Sommarskog of implementing
the subset of text_io which he needs plus the cost to the other users
of implementing the subsets of text_io that they need for the character
sets they want to use is less than the cost (for 137 compilers at last 
count) of requiring vendors to support bigger character sets.  Maybe I'm
wrong.  Maybe there are a thousand applications out there which need
bigger character sets (I think that's the order of magnitude needed for
it to be cheaper on the whole for vendors to support).  If there are, 
then ISO/ANSI/AJPO probably need to be told.
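
To illustrate the point about generic formal parameters: the closest a
generic formal can come to "character type" is a formal discrete type,
(<>), which accepts any discrete type at all.  What follows is a minimal
sketch only.  Generic_Text_Sketch, Put_Codes and Flags are hypothetical
names invented for the illustration, not part of Text_IO or any vendor
library, and the declaration order is chosen so that it should be
acceptable to an Ada 83 compiler as well as a later one.

```ada
with Text_IO;

procedure Generic_Demo is

   generic
      -- A formal discrete type: matches ANY discrete type, not only
      -- character types.  There is no way to say "character type only".
      type Char is (<>);
      type Str is array (Positive range <>) of Char;
   package Generic_Text_Sketch is   -- hypothetical package
      procedure Put_Codes (Item : in Str);
   end Generic_Text_Sketch;

   -- A non-character array type, used below to show what the contract allows.
   type Flags is array (Positive range <>) of Boolean;

   package body Generic_Text_Sketch is
      procedure Put_Codes (Item : in Str) is
      begin
         -- Print the position number of every component.
         for I in Item'Range loop
            Text_IO.Put (Integer'Image (Char'Pos (Item (I))));
         end loop;
         Text_IO.New_Line;
      end Put_Codes;
   end Generic_Text_Sketch;

   -- Both instantiations are legal.
   package Char_Out is new Generic_Text_Sketch (Character, String);
   package Bool_Out is new Generic_Text_Sketch (Boolean, Flags);

begin
   Char_Out.Put_Codes ("Ada");
   Bool_Out.Put_Codes ((True, False));
end Generic_Demo;
```

Since the Boolean instantiation is accepted, nothing in the generic
contract tells the compiler that Char is a character type, which is the
gap described above.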

Usual disclaimer:  the opinions expressed are my own and should not be
construed as the opinions of the US Government.

Sorry to go on at such length.

Jim Moody

* Re: Characters with codes >= 128
@ 1987-09-10 12:51 "MARTIN J. MOORE"
From: "MARTIN J. MOORE" @ 1987-09-10 12:51 UTC


> From: colbert <hermix!colbert@rand-unix.ARPA>

> Unfortunately, Martin Moore's solution is NOT portable either.  It only works
> because:
>                      [list of reasons]

He is absolutely correct.  My solution is non-portable and, as I pointed 
out in my original message, erroneous as defined by the LRM.  My purpose in
posting it was to possibly help the original questioner, since the solution
does work on the VAX and may work on other machines.  It wasn't intended to be 
a universal solution.  The approach suggested by Colbert et al is obviously 
the way to go to provide portability.  

				Martin Moore
------

* Re: Characters with codes >= 128
@ 1987-09-10  3:47 colbert
  1987-09-10 18:39 ` Barry Margolin
  1987-09-12 14:47 ` Erland Sommarskog
From: colbert @ 1987-09-10  3:47 UTC



In response to my answer to his question about character types
sommar@seismo.css.gov  (Erland Sommarskog) writes:

>I think Martin Moore's solution was much more simple and elegant. It will
>work on any Ada system that doesn't check character assignments for 
>Constraint_error.
>  This solution requires one hell of a lot of work and it isn't portable from
>one OS to another. Yes, I can write my own Text_IO, but guess how fun I find
>that. And, I will have to write one Text_IO for each OS I want to work
>with. Guess why there is a standard Text_IO. It gives you a standard 
>interface.

Unfortunately, Martin Moore's solution is NOT portable either.  It only works
because:

	1) Unchecked_Conversion is implemented in DEC Ada.

	2) The size of type Character objects in DEC is 8 bits.

	3) DEC did not give a Constraint_Error on the assignment (which may
	   be a bug in DEC's implementation).

	4) DEC does not "place restrictions on unchecked conversions"
	   (13.10.2 P2);

	5) DEC truncates the high-order bits of the source value if its size is
	   greater than the size of the target type (this is really only a
	   problem with the specific example given by Moore, in that he used
	   the type Integer as the source type as opposed to an 8 bit type).

The principal benefit of my proposed solution is the creation of a portable
abstraction that represents the problem. Re-implementing a Text I/O for this
type is a small price to pay for this benefit (especially when Moore's
technique can be used in the implementation of this Text I/O, sufficiently
isolated to prevent major impact on the system that I'm implementing and later
porting [as pointed out by another reader of this group]).


Take Care,
Ed Colbert
hermix!colbert@rand-unix.arpa

P.S.

As an additional comment, at the recent SIGAda Conference, Dr. Dewar indicated
that Unchecked_Conversion could be legally implemented to always return 0 no
matter what the "value" of the source object was.  I did not get a chance to
fully nail him down on what he meant by this comment, so maybe he will respond
to this message.

* Re: Characters with codes >= 128
@ 1987-09-09 13:29 Jim Moody, DCA C342
From: Jim Moody, DCA C342 @ 1987-09-09 13:29 UTC


It's not clear that there's a conflict between Martin Moore's solution
to the problem and that of colbert @hermix.UUCP.  Colbert is clearly
correct that formally one should create a new version of text_io.
Martin tells you how to do that on certain targets.  There is little
thought required to turn Martin's solution into a full-blown text_io
package (about three minutes, don't invite A. E. Housman's scorn), and
not more than a couple of hours' typing.  The point is that that makes all
applications which use (say) Thai_text_io portable in the sense that it
isolates the machine dependencies (does Martin's trick work?) into a
single package.  Which, I thought, was the point.
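
The isolation argument can be sketched in a few lines.  This is an
illustration under stated assumptions, not anyone's actual package:
Eight_Bit_Code is a hypothetical stand-in for a full national character
type, the package is nested in a procedure only to keep the example in
one compilation unit, and the Unchecked_Conversion is exactly the
machine-dependent trick from earlier in the thread, so under the Ada 83
LRM it is still erroneous for codes above 127.

```ada
with Text_IO;
with Unchecked_Conversion;

procedure Isolation_Demo is

   -- Hypothetical 8-bit code type; a real package would export a proper
   -- character type instead.
   type Eight_Bit_Code is range 0 .. 255;
   for Eight_Bit_Code'Size use 8;

   package Thai_Text_IO is
      -- Clients see only this portable interface.
      procedure Put (Item : in Eight_Bit_Code);
   end Thai_Text_IO;

   package body Thai_Text_IO is

      -- The ONLY machine-dependent declaration: it assumes that Character
      -- objects occupy 8 bits and that the implementation does not check
      -- the converted value.
      function To_Character is
         new Unchecked_Conversion (Eight_Bit_Code, Character);

      procedure Put (Item : in Eight_Bit_Code) is
      begin
         Text_IO.Put (To_Character (Item));
      end Put;

   end Thai_Text_IO;

begin
   Thai_Text_IO.Put (233);   -- an arbitrary code above 127
   Text_IO.New_Line;
end Isolation_Demo;
```

Porting to a target where the trick fails then means rewriting one
package body, while every caller of Thai_Text_IO.Put stays unchanged.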

Jim Moody
DCA/JDSSC

* Re: Characters with codes >= 128
@ 1987-09-02  2:35 colbert
  1987-09-05 20:43 ` Characters " sommar
From: colbert @ 1987-09-02  2:35 UTC


You can create your own Character type by defining an enumeration type that
has character literals.

e.g.

	type Character_Type is (Nul, Del, ..., 'A', 'B', ...,
				Koo_Kai, Khoo_Khai, ....);


Where Koo_Kai and Khoo_Khai (etc) are the special characters in your language
(these letters are phonetic Thai characters).

You can then use an enumeration representation specification to map the
enumeration literals to the appropriate extended ASCII values.

Once you have this character type defined, you can create a string type by
defining an array of this character type:

e.g.

	type String_Type is array (Positive range <>) of Character_Type;

This allows you to use string literals such as the following:

	"This is a string of String_Type"  -- may require type qualification

However, you will have to use catenation to create String_Type expressions that
contain your country's special characters (and of course non-printable
characters).

E.g.

	"This is a '" & Koo_Kai & "' while this is a '" & Khoo_Khai & "'"
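
Put together, the pieces above might look like the following.  It is a
heavily abbreviated sketch: a real Character_Type would list every code
in use, and the values 161 and 162 are assumed codes chosen for the
illustration, so substitute whatever your terminal or printer expects.

```ada
with Text_IO;

procedure Thai_Demo is

   -- Abbreviated character type: a few character literals plus two
   -- identifiers naming national characters.
   type Character_Type is
      (Nul, ' ', 'A', 'B', 'a', 'b', Koo_Kai, Khoo_Khai);

   -- Enumeration representation specification: maps each literal to its
   -- code.  Values must be given in ascending order.
   for Character_Type use
      (Nul       => 0,
       ' '       => 32,
       'A'       => 65,
       'B'       => 66,
       'a'       => 97,
       'b'       => 98,
       Koo_Kai   => 161,    -- assumed code
       Khoo_Khai => 162);   -- assumed code

   for Character_Type'Size use 8;

   type String_Type is array (Positive range <>) of Character_Type;

   -- String literals may contain only the character literals declared
   -- above; the national characters are added by catenation.
   S : constant String_Type := "Ab " & Koo_Kai & " " & Khoo_Khai;

begin
   -- Text_IO knows nothing about Character_Type, so this prints only the
   -- image of each literal, one per line.
   for I in S'Range loop
      Text_IO.Put_Line (Character_Type'Image (S (I)));
   end loop;
end Thai_Demo;
```

Note that the program can only print literal names through the standard
Text_IO; writing the actual codes to a device is the job of the separate
I/O package.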

As for the I/O of your language specific characters, you will need to create
a Thai_Text_IO (or something equivalent).  Ada does not say that Text_IO is
the ONLY text I/O package, only that it is the standard text I/O package.  In
this case you need something non-standard.

I hope this is of help.

Take care,

Ed Colbert
Absolute Software
4593 Orchid Dr
Los Angeles, CA 90043-3320
USA
(213) 293-0783
hermix!colbert@rand-unix.arpa			ARPA
(trwrb!, sesmo!, ...)hermix!colbert		UUCP


P.S.  See LRM Sections 2.6 (String Literals), 3.5.2 (Character Types),
		       3.6.3 (The Type String), 4.2 (Literals).

* Re: Characters with codes >= 128
@ 1987-08-31 20:47 "MARTIN J. MOORE"
From: "MARTIN J. MOORE" @ 1987-08-31 20:47 UTC


I encountered the same problem in attempting to use DEC's extended character
set.  I worked around it by using an UNCHECKED_CONVERSION to stick 8-bit
values into CHARACTER objects (thereby making the program erroneous according
to the LRM; however, it worked.) 

For example, to use the DEC control character CSI (= 155) I did:

  function EIGHT_BIT_CHARACTER is new UNCHECKED_CONVERSION (INTEGER, CHARACTER);

  CSI : constant CHARACTER := EIGHT_BIT_CHARACTER (155);

Characters so defined could then be used in string constants, such as
the following:

  ERASE_SCREEN : constant STRING := CSI & "2J";	-- ANSI erase screen command

------------------------------------------------------------------------------
Martin Moore
mooremj@eglin-vax.arpa
------

* Characters with codes >= 128
@ 1987-08-30 21:28 sommar
  1987-09-02 14:24 ` stt
From: sommar @ 1987-08-30 21:28 UTC



I'd like to write a programme that can handle text which contains
characters from an extended ASCII set covering national
characters. The LRM seems to totally disregard this, since it
states that the character type is ASCII with 128 possible values.
Also, Ada only allows you to have printable characters within
strings. And printable is defined as the range ' '..'~'.
  Easy, you might say. Just define a new character type. How?
I can't have quoted strings for the new characters, since they
are "non-printing". I can't extend the ASCII package (in STANDARD),
since it relies on the character type already being defined.
  And even if I succeed somehow, what to do about Text_io? Will the
compiler accept an attempt to give Text_io the new character type,
even if it's called "character"? Hardly.
  Have I missed something? I hope so. If not, THIS IS A VERY SERIOUS
RESTRICTION IN ADA.

I should add that to some extent it is possible to handle these
characters. My Ada system (Verdix 5.2A for VAX/Unix) doesn't mind
if I read an extended character from a file or if I try to write
it. Character'pos(ch) on the character returns the correct code.
But Character'val(Character'pos(ch)) raises Constraint_error if
ch is from the upper half.
  But this only helps a little. If I want string constants in my
programme, it's dead. What to do? Read them from a file at start-up? :-)
-- 

Erland Sommarskog       
ENEA Data, Stockholm    
sommar@enea.UUCP        

