Newsgroups: comp.lang.ada
Subject: Re: string and wide string usage
From: ytomino
Date: Thu, 7 Mar 2013 06:20:05 -0800 (PST)
Message-ID: <5e5e7e80-7d69-47e1-9550-19e2e0a211a9@googlegroups.com>

On Thursday, March 7, 2013 8:12:01 PM UTC+9, Ali Bendriss wrote:
> I've got some problem with some string in example:
> a base 64 encoded string
> V2luZG93c8KgNyBQcm9mZXNzaW9ubmVsIE4=
> wich decode to 'Windows\xa07 Professionnel N' in utf-8
> every thing is working if I feed directly the database, but if want to
> apply Ada.Characters.Handling.To_Lower on the string before feeding the
> database postgres is not happy
> 'ERROR: invalid byte sequence for encoding "UTF8": 0xe2 0xa0 0x37'
> it's not really a big deal, but I would like to understand where the
> problem is. Do I have to use wide string ?
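
Because the functions in Ada.Characters.Handling expect Latin-1, not UTF-8. To_Lower treats every byte as one Latin-1 character, so the UTF-8 lead byte 0xC2 (Latin-1 'Â') is turned into 0xE2 ('â'). That is where the invalid sequence 0xe2 0xa0 0x37 reported by PostgreSQL comes from.

You have two options:

1. Convert the UTF-8 String to Wide_Wide_String, process it as UTF-32, and convert the result back to UTF-8. (Ada.Characters.Conversions also assumes Latin-1, so use GNAT.Encode_String/GNAT.Decode_String or Ada.Strings.UTF_Encoding for the conversion.)
2. Look for an external library that processes UTF-8 directly.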
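
Option 1 could look roughly like the sketch below. It assumes an Ada 2012 compiler, because both Ada.Strings.UTF_Encoding.Wide_Wide_Strings and the string version of Ada.Wide_Wide_Characters.Handling.To_Lower were added in Ada 2012. The names Lower_UTF8_Demo, To_Lower_UTF_8, NBSP, Broken and Fixed are made up for this example, and the input is the decoded string from the question.

```ada
with Ada.Characters.Handling;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;
with Ada.Wide_Wide_Characters.Handling;

procedure Lower_UTF8_Demo is

   subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

   package UTF renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  Hypothetical helper: decode UTF-8 to UTF-32, lower-case the
   --  code points, then encode the result as UTF-8 again.
   function To_Lower_UTF_8 (Item : UTF_8_String) return UTF_8_String is
      Decoded : constant Wide_Wide_String := UTF.Decode (Item);
   begin
      return UTF.Encode
        (Ada.Wide_Wide_Characters.Handling.To_Lower (Decoded));
   end To_Lower_UTF_8;

   --  U+00A0 (no-break space) is the byte pair C2 A0 in UTF-8.
   NBSP : constant String :=
     Character'Val (16#C2#) & Character'Val (16#A0#);

   Input : constant UTF_8_String :=
     "Windows" & NBSP & "7 Professionnel N";

   --  Latin-1 lower-casing works byte by byte and alters the lead byte.
   Broken : constant String := Ada.Characters.Handling.To_Lower (Input);

   --  The round trip through Wide_Wide_String keeps the sequence intact.
   Fixed : constant UTF_8_String := To_Lower_UTF_8 (Input);

begin
   --  "Windows" is 7 characters long, so the 8th byte is the lead byte
   --  of the no-break space.
   Ada.Text_IO.Put_Line
     ("Latin-1 To_Lower, byte 8:"
      & Integer'Image (Character'Pos (Broken (Broken'First + 7))));
   Ada.Text_IO.Put_Line
     ("UTF-8 round trip, byte 8:"
      & Integer'Image (Character'Pos (Fixed (Fixed'First + 7))));
   Ada.Text_IO.Put_Line (Fixed);
end Lower_UTF8_Demo;
```

The first line should report 226 (0xE2, the damaged byte from the PostgreSQL error) and the second 194 (0xC2, the original lead byte). Note that Decode raises Ada.Strings.UTF_Encoding.Encoding_Error when the input is not valid UTF-8, so real code would probably want a handler for that.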