From: Georg Bauhaus
Newsgroups: comp.lang.ada
Subject: Re: Representation clauses for base-64 encoding
Date: Thu, 22 Dec 2011 12:37:22 +0100
Message-ID: <4ef31672$0$6574$9b4e6d93@newsspool3.arcor-online.net>

On 22.12.11 10:41, Natasha Kerensikova wrote:
> Hello,
>
> the recent discussion about representation clauses vs explicit shifting
> made me wonder about what is the Right Way of performing base-64
> encoding (rfc 1421).
> My first thoughts were along the following lines:
>
>    type Octet is mod 256;
>       -- or Character or Storage_Element or Stream_Element
>       -- or whatever 8-bit type relevant for the application
>
>    for Octet'Size use 8;
>    for Octet'Component_Size use 8;

Here I would stop. The RFC says that a value from the range 0 ..
63 is associated with a character from a specific set of characters,
for encoding it: 'A' .. 'Z', 'a' .. 'z', '0' .. '9', '+', '/'. And there
is a "pad", '='.

Since the characters shall stand for 0 .. 25, 26 .. 51, 52 .. 61, 62
and 63, this specifies an ordering, actually. In Ada, the 1:1
translation into a type can be:

   type Repertoire is
     ('A','B','C','D','E','F','G','H','I','J','K','L','M',
      'N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
      'a','b','c','d','e','f','g','h','i','j','k','l','m',
      'n','o','p','q','r','s','t','u','v','w','x','y','z',
      '0','1','2','3','4','5','6','7','8','9',
      '+','/','=');

   subtype Base_64_Character is Repertoire range 'A' .. '/';
   subtype Padding is Repertoire range '=' .. '=';

Note that you could have string literals of these:

   type Base_64_String is array (Positive range <>) of Repertoire;

   S : Base_64_String := "ABC=";   -- but not "ABC!"

The language guarantees that each of the literals is associated with
just the positional number that Base 64 encoding requires.
Let the compiler choose the best representation for Repertoire
subtypes when encoding. Or simply use subtypes of Character.

Only if you need some representation in memory or other storage
that has 'Size /= Character'Size, or 'Size /= Repertoire'Size etc.,
derive new types as needed, and add representation clauses:

http://www.adacore.com/2008/03/03/gem-27/
http://www.adacore.com/2008/03/17/gem-28/

For streaming encoded text over the wire, a subtype of String
should do the job just fine; convert as necessary. Or use
Base_64_String. Packing and unpacking can be quite expensive.

I remember at least two publicly available Base 64 encoding
packages, one by Tom Moran IIRC, and one in AWS. There are
probably more in the PAL.

-- 
Georg
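
To illustrate the point about positional numbers made above, here is a
minimal sketch of an encoder that relies only on 'Val of the enumeration
type and uses no representation clauses. Repertoire, Base_64_Character and
Base_64_String are taken from the text; Octet_Array, Group_Value, Encode
and To_String are hypothetical names introduced for the example, and the
arithmetic (division and mod instead of shifts or packed records) is just
one possible choice, not a recommendation about performance.

```ada
with Ada.Text_IO;

procedure Base_64_Sketch is

   --  Literal order as in the text: the position number of each
   --  literal is the 6-bit value it encodes; '=' (position 64) pads.
   type Repertoire is
     ('A','B','C','D','E','F','G','H','I','J','K','L','M',
      'N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
      'a','b','c','d','e','f','g','h','i','j','k','l','m',
      'n','o','p','q','r','s','t','u','v','w','x','y','z',
      '0','1','2','3','4','5','6','7','8','9',
      '+','/','=');

   subtype Base_64_Character is Repertoire range 'A' .. '/';

   type Base_64_String is array (Positive range <>) of Repertoire;

   --  Assumed names, not from the original post.
   type Octet is mod 256;
   type Octet_Array is array (Positive range <>) of Octet;
   type Group_Value is range 0 .. 2**24 - 1;  --  three octets

   function Encode (Data : Octet_Array) return Base_64_String is
      --  Pre-filled with padding; only data characters get overwritten.
      Result : Base_64_String (1 .. (Data'Length + 2) / 3 * 4) :=
        (others => '=');
      Group  : Group_Value;
      Count  : Natural;
      Src    : Integer  := Data'First;
      Dst    : Positive := 1;
   begin
      while Src <= Data'Last loop
         Count := Natural'Min (3, Data'Last - Src + 1);

         --  Collect up to three octets, zero-filled on the right.
         Group := 0;
         for K in 0 .. 2 loop
            Group := Group * 256;
            if K < Count then
               Group := Group + Group_Value (Data (Src + K));
            end if;
         end loop;

         --  Count octets yield Count + 1 characters; the rest stay '='.
         for K in 0 .. Count loop
            Result (Dst + K) :=
              Base_64_Character'Val ((Group / 64 ** (3 - K)) mod 64);
         end loop;

         Src := Src + 3;
         Dst := Dst + 4;
      end loop;
      return Result;
   end Encode;

   --  For output only: the image of a character literal is the
   --  literal including its apostrophes, so index 2 is the character.
   function To_String (S : Base_64_String) return String is
      Result : String (S'Range);
   begin
      for I in S'Range loop
         declare
            Img : constant String := Repertoire'Image (S (I));
         begin
            Result (I) := Img (2);
         end;
      end loop;
      return Result;
   end To_String;

begin
   --  77, 97, 110 are the ASCII codes of "Man".
   Ada.Text_IO.Put_Line (To_String (Encode ((77, 97, 110))));  --  TWFu
   Ada.Text_IO.Put_Line (To_String (Encode ((77, 97))));       --  TWE=
   Ada.Text_IO.Put_Line (To_String (Encode ((1 => 77))));      --  TQ==
end Base_64_Sketch;
```

Since the result is an array of Repertoire, it can be compared directly
with a string literal such as "TWFu" of type Base_64_String; conversion to
String is needed only at the point where the text leaves the program.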