From: Ludovic Brenta
Newsgroups: comp.lang.ada
Subject: Re: How to put 200 into an integer sub-type of 16 bits (code included)
Date: Wed, 14 Jan 2009 15:11:04 -0800 (PST)
Message-ID: <7466b6ce-5d31-4012-b93c-5ac786783438@i24g2000prf.googlegroups.com>

ChristopherL wrote:
> Please remove the rounding operation for now from this discussion.
>
> This problem was given to me to try to solve who said it's probably
> impossible!
>
> The size of floating point number is 32 bits on my system, and my
> float will always be positive.
>
> General information about floating point numbers in general:
>
> Floating-point numbers are typically packed into a computer datum as
> the sign bit, the exponent field, and the significand (mantissa).
>
> Mantissa  Exponent  Value
> 71         0        71
> 71         1        710
> 71         2        7100
> 71        -1        7.1
> 2          2        200
>
> For example, mathematical PI , rounded to 24 bits of precision, has:
>
> sign = 0 ; e = 1 ; s = 110010010000111111011011 (including the hidden
> bit)
>
> The sum of the exponent bias (127) and the exponent (1) is 128, so
> this is represented in single precision format as
>
> 0 10000000 10010010000111111011011 (excluding the hidden bit).
>
> So, what is the proper way to store a number (never being greater
> than 200.5) in a 8 bit short number as outlined above?
>
> Chris L.

Are you talking about an 8-bit floating-point number? One that uses
e.g. 1 bit for sign, 3 for exponent and 4 for mantissa?

An exact floating-point representation for 200.5 requires more than 8
bits. At a minimum you'd need:

0 1000 10010001

i.e. 13 bits. Is that what you were thinking about when you first
mentioned 16-bit types?

And since your hardware probably supports neither 8-bit nor 16-bit
floating-point types anyway, do you plan to implement floating-point
arithmetic in software?

--
Ludovic Brenta.
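
For anyone who wants to check the bit patterns quoted in this thread, here is a minimal sketch in Python (not Ada, and not part of the original exchange). It assumes IEEE 754 single precision for the 32-bit float; the helper names float32_fields and minimal_fraction_bits are made up for illustration. The first function splits a value into its sign, biased exponent and stored fraction; the second counts how many stored fraction bits (hidden bit excluded) a positive value needs to be held exactly.

```python
import math
import struct
from fractions import Fraction


def float32_fields(x):
    """Round x to IEEE 754 single precision and return its
    (sign, biased exponent, fraction) fields as bit strings."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = format(bits, "032b")
    return s[0], s[1:9], s[9:]


def minimal_fraction_bits(x):
    """Return (exponent, n) for a positive float x, where x is
    1.f * 2**exponent and n is the number of bits in f needed to
    represent x exactly (hidden leading 1 not counted)."""
    if x <= 0:
        raise ValueError("x must be positive")
    f = Fraction(x)  # exact value of the binary float
    exponent = 0
    # Normalise so that 1 <= f < 2.
    while f >= 2:
        f /= 2
        exponent += 1
    while f < 1:
        f *= 2
        exponent -= 1
    f -= 1  # drop the hidden bit
    n = 0
    # Shift fraction bits out one at a time until nothing is left.
    while f != 0:
        f *= 2
        if f >= 1:
            f -= 1
        n += 1
    return exponent, n


if __name__ == "__main__":
    print(float32_fields(math.pi))
    print(float32_fields(200.5))
    print(minimal_fraction_bits(200.5))
```

For pi the fields come out as 0, 10000000 and 10010010000111111011011, matching the quoted single precision layout. For 200.5 the second function reports an exponent of 7 and 8 fraction bits (10010001), so with 1 sign bit and the 4 exponent bits used in the reply above, the total is the 13 bits mentioned there.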