"Manuel Collado" <m.collado@lml.ls.fi.upm.es> wrote in message
news:4507de42@news.upm.es...
> Randy Brukardt wrote:
...
> > In any case, one of the big advantages of using ASIS over writing your
> > own parser is that the resulting program is independent of the character
> > set used. So it works with anything supported by your compiler vendor
> > (and still does if you change vendors). ASIS code that depends on the
> > input source representation (which is not defined by Ada anyway) is
> > probably broken. And there is no chance of any sort of agreement on
> > source representations for ASIS (or even the naming of them) if there
> > isn't any for Ada.
>
> I'm not sure I understand you. Some style checks depend on source code
> representation, such as non-uniform casing for identifiers (mixing alpha
> and Alpha in the same source).
>
> Am I missing anything?

Apparently. The source of an Ada 2005 program is described in terms of
Unicode characters (Ada 95 is similar). Likewise, Wide_String and
Wide_Wide_String are defined in terms of Unicode. The actual source
representation is implementation-defined, but it is logically converted into
Unicode characters when it is processed. (For efficiency reasons, not all
compilers actually perform this conversion, but that's what the Standard
says.)
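
To illustrate the point about source representation, here is a minimal
sketch in Python (not Ada, and not anything ASIS provides). The identifier
"Señal" and the two encodings are assumptions chosen purely for
illustration. The bytes on disk differ, but once each is decoded the
logical Unicode characters are identical, which is all a tool working at
that level should ever see.

```python
# Hypothetical identifier as it might be stored in two source encodings.
latin1_bytes = "Señal".encode("latin-1")
utf8_bytes = "Señal".encode("utf-8")

# The on-disk representations are different byte sequences...
bytes_differ = latin1_bytes != utf8_bytes

# ...but decoding each with its own encoding yields the same characters.
decoded_a = latin1_bytes.decode("latin-1")
decoded_b = utf8_bytes.decode("utf-8")
same_text = decoded_a == decoded_b
```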

So, an ASIS routine that returns an identifier in a Wide_String should be
returning it in a particular Unicode encoding. If it doesn't do that, it's
wrong.

Indeed, Ada 2005 defines identifier equivalence in terms of the Unicode
casing rules; if you are using a non-Unicode encoding, that will require
some translation somewhere.

Because of this, an ASIS program to check style only needs to be written in
terms of Wide_String and/or Wide_Wide_String encoding -- you shouldn't see
anything else. (Another message here says that GNAT gets this wrong, which
doesn't surprise me at all given past ARG discussions on this topic.)
Encodings (other than that defined for Wide_String) have nothing to do with
it (unless you want to write a modified version of the program in a
different encoding - but I suggest just sticking to UTF-8).
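
As a rough sketch of the kind of casing check discussed in this thread,
the following Python fragment (again, not Ada or ASIS) groups identifiers
that differ only by case. It assumes the identifiers have already been
handed over as decoded Unicode strings; the function name mixed_casing and
the sample list are hypothetical. Note that Python's str.casefold applies
full Unicode case folding, which only approximates the equivalence rule
the Ada standard defines.

```python
from collections import defaultdict


def mixed_casing(identifiers):
    """Return identifiers that appear with more than one casing.

    The result maps each case-folded name to its sorted distinct spellings.
    """
    spellings = defaultdict(set)
    for name in identifiers:
        # Case folding stands in for the language's equivalence rule here.
        spellings[name.casefold()].add(name)
    return {key: sorted(forms)
            for key, forms in spellings.items()
            if len(forms) > 1}


# Hypothetical identifiers, already decoded to Unicode strings.
names = ["alpha", "Alpha", "Beta", "Beta", "Gamma"]
report = mixed_casing(names)
```

Nothing in the check refers to how the source file was encoded; it only
ever compares Unicode strings.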

                         Randy.