From: Martin Krischik
Newsgroups: comp.lang.ada
Subject: Re: UTF-8 in strings - a bug?
Date: Mon, 10 May 2004 08:29:50 +0200
Message-ID: <3878175.nfHeE0N58X@linux1.krischik.com>
References: <200456-112553-85684@foorum.com> <2178612.8V5KANFFf5@linux1.krischik.com> <4b3nc.58514$mU6.237399@newsb.telia.net> <3171026.RJblE7u9LK@linux1.krischik.com>

Georg Bauhaus wrote:

> Martin Krischik wrote:
>
> : The UTF-X encodings can start with a BOM "Byte-order mark".
>
> However, systems are allowed to define protocols which may
> restrict the use of a BOM in case of UTF-8 (require/forbid).
> A #!/shell script is an example.
>
> A BOM is said to be useful to distinguish a UTF-8 Unicode file
> from a file using another 8bit encoding. Though I wonder how by
> the absence of the Unicode BOM they think a program can find
> out which of the other encodings has been used...

XML/Ada does some guessing based on the usual beginning of an XML file.
Apart from that, I guess they can't.
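
To illustrate the kind of guessing meant here, the following is a minimal Python sketch. It is not XML/Ada's actual code: it assumes the heuristic described in Appendix F of the XML 1.0 specification (look for a BOM first, then for the byte pattern of a leading "<?xml"). The function name sniff_encoding and the label "ascii-compatible" are made up for this example.

```python
import codecs

# Known BOMs. UTF-32 is listed before UTF-16 because the UTF-16-LE BOM
# (FF FE) is a prefix of the UTF-32-LE BOM (FF FE 00 00).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Byte patterns of "<?" / "<?xm" when no BOM is present.
# "ascii-compatible" is a hypothetical label: it only says the file is
# UTF-8 or some other 8-bit encoding, which the XML declaration must name.
_XML_PREFIXES = [
    (b"\x00<\x00?", "utf-16-be"),
    (b"<\x00?\x00", "utf-16-le"),
    (b"<?xm", "ascii-compatible"),
]


def sniff_encoding(data):
    """Return a guessed encoding name for the given bytes, or None if unknown."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for prefix, name in _XML_PREFIXES:
        if data.startswith(prefix):
            return name
    # No BOM and no XML-like start: nothing reliable can be said.
    return None
```

For a file that has neither a BOM nor an XML-like start (a #!/shell script, for example), the function returns None, which is the "they can't" case.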

With Regards

Martin
--
mailto://krischik@users.sourceforge.net
http://www.ada.krischik.com