From: anon@anon.org (anon)
Newsgroups: comp.lang.ada
Subject: Re: Range types
Date: Tue, 23 Oct 2007 23:52:58 GMT

--
-- Package example:
--
-- For non-Greek keyboards use the Wide_Character bracket notation
--   { ["hhhh"]["hhhh"] }
-- where each hhhh is a 4-hex-digit code that represents one of
-- the two Wide_Character values
--
-- Example:
--   ["03d6"]["1f00"]  -- valid Greek string
--   ["03d6"]h         -- not valid, because the character "h" is
--                     -- not a valid Greek character.
--
with Ada.Wide_Text_IO ;
use  Ada.Wide_Text_IO ;

procedure tst is

  --
  -- Internal Packages:
  --
  package Greek is

    function Is_Greek_Character ( GC : Wide_Character )
               return Boolean ;

    function Is_Greek_Character_2 ( GC : Wide_Character )
               return Boolean ;

    function Is_Greek_Character ( GS : Wide_String )
               return Boolean ;

  end Greek ;

  --
  -- Internal Body Package
  --
  package body Greek is

    -- --------------------------- --
    -- Use for Is_Greek_Character  --
    -- --------------------------- --

    --
    -- creates a Greek constraint type that spans both Greek
    -- ranges (0370 .. 03D7 and 1F00 .. 1FFF) and the gap between
    --
    subtype Greek_Base is Wide_Character
              range Wide_Character'Val ( 16#0370# ) ..
                    Wide_Character'Val ( 16#1FFF# ) ;
    --
    -- creates an excluded type: the non-Greek gap between
    -- the two Greek ranges
    --
    subtype Greek_Exclude_Subtype is Greek_Base
              range Greek_Base'Val ( 16#03D8# ) ..
                    Greek_Base'Val ( 16#1EFF# ) ;

    -- ----------------------------- --
    -- Use for Is_Greek_Character_2  --
    -- ----------------------------- --

    --
    -- create lower Greek characters type
    --
    subtype Lower_Greek_Character is Wide_Character
              range Wide_Character'Val ( 16#0370# ) ..
                    Wide_Character'Val ( 16#03D7# ) ;
    --
    -- create upper Greek characters type
    --
    subtype Upper_Greek_Character is Wide_Character
              range Wide_Character'Val ( 16#1F00# ) ..
                    Wide_Character'Val ( 16#1FFF# ) ;

    --
    -- Is_Greek_Character
    --
    function Is_Greek_Character ( GC : Wide_Character )
               return Boolean is
      begin
        --
        -- Is character within the Greek base
        --
        if GC in Greek_Base then
          --
          -- Is character part of the non-Greek subtype
          --
          if GC in Greek_Exclude_Subtype then
            return False ;
          else
            return True ;
          end if ;
        else
          return False ;
        end if ;
      end Is_Greek_Character ;

    --
    -- Is_Greek_Character version number 2
    --
    function Is_Greek_Character_2 ( GC : Wide_Character )
               return Boolean is
      begin
        --
        -- Could use:
        --
        --   when Lower_Greek_Character | Upper_Greek_Character =>
        --     return True ;
        --
        case GC is
          when Lower_Greek_Character =>
            return True ;
          when Upper_Greek_Character =>
            return True ;
          when others =>
            return False ;
        end case ;
      end Is_Greek_Character_2 ;

    --
    -- Is_Greek_Character for a whole string
    --
    function Is_Greek_Character ( GS : Wide_String )
               return Boolean is
      begin
        --
        -- Iterate over GS'Range so that strings whose lower
        -- bound is not 1 are handled as well
        --
        for Index in GS'Range loop
          --
          -- if index-character of the string is not a Greek character
          --
          if not Is_Greek_Character_2 ( GS ( Index ) ) then
            return False ;
          end if ;
        end loop ;
        --
        -- String contains all Greek characters
        --
        return True ;
      end Is_Greek_Character ;

  end Greek ;

  stz : Wide_String ( 1 .. 2 ) ;

  use Greek ;

begin
  --
  Put ( "Enter (2 character Greek string) => " ) ;
  Get ( stz ) ;
  --
  Put ( "Testing => " ) ;
  Put ( stz ) ;
  New_Line ;
  --
  if Is_Greek_Character ( stz ) then
    Put_Line ( "Greek String ? => Yes" ) ;
  else
    Put_Line ( "Greek String ? => No" ) ;
    --
    -- Char 1 ?
    --
    if Is_Greek_Character ( stz ( 1 ) ) then
      Put_Line ( "Character (1) Greek ? => Yes" ) ;
    else
      Put_Line ( "Character (1) Greek ? => No" ) ;
    end if ;
    --
    -- Char 2 ?
    --
    if Is_Greek_Character_2 ( stz ( 2 ) ) then
      Put_Line ( "Character (2) Greek ? => Yes" ) ;
    else
      Put_Line ( "Character (2) Greek ? => No" ) ;
    end if ;
  end if ;
  --
end tst ;


In <1193051690.350063@athprx04>, Christos Chryssochoidis writes:
>Jacob Sparre Andersen wrote:
>> Christos Chryssochoidis wrote:
>>
>>> I would like to define a subtype of Wide_Character for a program
>>> that processes (Unicode) text.
>>> This type would represent the Greek
>>> letters.
>>
>> This sounds like what enumerated types are for. You could do it like
>> this:
>>
>>    type Faroese_Letter is ('a', 'A', 'b', 'B', 'd', 'D', 'ð', 'Ð',
>>                            'e', 'E', [...],
>>                            'y', 'Y', 'ý', 'Ý', 'æ', 'Æ', 'ø', 'Ø');
>>    -- optional representation clause
>>
>>    function To_Wide_Wide_Character (Item : in Faroese_Letter)
>>      return Wide_Wide_Character;
>>
>>    function To_Faroese_Letter (Item : in Wide_Wide_Character)
>>      return Faroese_Letter;
>>
>> The conversion functions could make use of representation clauses,
>> "Image" and "Value" functions, or tables.
>>
>>> Greek letters in Unicode, with all their diacritics, are
>>> located in two separate ranges: 0370 - 03D7 and 1F00 - 1FFF. That's
>>> 360 characters to write in an enumeration... Since gaps are not
>>> allowed in ranges, I'm thinking, instead of defining such a type, of
>>> defining a function that would accept a Wide_Character as argument
>>> and return a Boolean value indicating whether the given
>>> Wide_Character falls in the ranges of the Greek characters.
>>
>> This could be done very simply using Ada.Strings.Maps.
>>
>> How you should do it depends strongly on what you actually need your
>> Greek_Letter type for.
>>
>> Greetings,
>>
>> Jacob
>
>Thanks! Ada.Strings.Wide_Maps seems very helpful for what I want to do.
>Basically, I would like to write a program that takes a UTF-8 encoded
>text file containing ancient Greek text, written with all the diacritic
>marks on the letters, loads its contents into memory, strips from the
>in-memory text all diacritics except those used in today's "modern"
>Greek, and writes the modified contents to a new file of the user's
>choosing. For this it would be nice if there were some package for
>regular expressions for Ada. Then, if I succeeded in that task, I'd
>like to do some natural language processing (NLP, that is, linguistic
>processing) with my program, but I don't know if Ada would be an
>appropriate language for such a task. I've seen on the web references
>to NLP applications written in functional or logic programming
>languages, but not many implemented in imperative languages...
>(Sorry for getting off topic...)
>
>Thanks very much,
>Christos
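
Since Ada.Strings.Wide_Maps came up in the quoted text, here is a
minimal sketch of how the same Greek test, plus a first step toward
the diacritic stripping, might look with that package. It is only an
illustration under stated assumptions: the names Greek_Maps_Demo, WC,
Greek_Set, Is_Greek, Polytonic, Monotonic and Fold are made up for
this example, and the four-entry folding table is a hypothetical
stand-in for a complete table covering 1F00 .. 1FFF.

```ada
with Ada.Strings.Wide_Fixed ;
with Ada.Strings.Wide_Maps ;  use Ada.Strings.Wide_Maps ;
with Ada.Wide_Text_IO ;       use Ada.Wide_Text_IO ;

procedure Greek_Maps_Demo is

  -- Hypothetical helper: build a Wide_Character from its code point
  function WC ( Code : Natural ) return Wide_Character is
    begin
      return Wide_Character'Val ( Code ) ;
    end WC ;

  -- The two ranges named in the thread: 0370 .. 03D7 and 1F00 .. 1FFF
  Greek_Set : constant Wide_Character_Set :=
    To_Set ( Wide_Character_Ranges'
               ( ( Low => WC ( 16#0370# ), High => WC ( 16#03D7# ) ),
                 ( Low => WC ( 16#1F00# ), High => WC ( 16#1FFF# ) ) ) ) ;

  -- Illustrative (incomplete) folding table, an assumption of this
  -- sketch: alpha with psili / dasia -> plain alpha,
  --         alpha with varia / perispomeni -> alpha with tonos
  Polytonic : constant Wide_String :=
    ( WC ( 16#1F00# ), WC ( 16#1F01# ), WC ( 16#1F70# ), WC ( 16#1FB6# ) ) ;
  Monotonic : constant Wide_String :=
    ( WC ( 16#03B1# ), WC ( 16#03B1# ), WC ( 16#03AC# ), WC ( 16#03AC# ) ) ;
  Fold : constant Wide_Character_Mapping :=
    To_Mapping ( From => Polytonic, To => Monotonic ) ;

  -- True when every character of Item is a member of Greek_Set
  function Is_Greek ( Item : Wide_String ) return Boolean is
    begin
      for Index in Item'Range loop
        if not Is_In ( Item ( Index ), Greek_Set ) then
          return False ;
        end if ;
      end loop ;
      return True ;
    end Is_Greek ;

  -- Sample input and the result the table above should produce;
  -- characters not listed in the table (here beta) map to themselves
  Sample   : constant Wide_String :=
    ( WC ( 16#1F00# ), WC ( 16#03B2# ), WC ( 16#1F70# ) ) ;
  Expected : constant Wide_String :=
    ( WC ( 16#03B1# ), WC ( 16#03B2# ), WC ( 16#03AC# ) ) ;
  Folded   : constant Wide_String :=
    Ada.Strings.Wide_Fixed.Translate ( Sample, Fold ) ;

begin
  -- Fail loudly if the sketch does not behave as described
  if not Is_Greek ( Sample ) or else Is_Greek ( "h" ) then
    raise Program_Error ;
  end if ;
  if Folded /= Expected then
    raise Program_Error ;
  end if ;
  --
  Put_Line ( "Sample is Greek ? => "
             & Boolean'Wide_Image ( Is_Greek ( Sample ) ) ) ;
  Put_Line ( "Folded is Greek ? => "
             & Boolean'Wide_Image ( Is_Greek ( Folded ) ) ) ;
end Greek_Maps_Demo ;
```

With GNAT this would go in a file named greek_maps_demo.adb. A set
keeps the two ranges in one place, so adding or correcting a range
means changing only Greek_Set, and the mapping approach needs no
regular expression package for a simple character-for-character fold.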