From: Georg Bauhaus
Newsgroups: comp.lang.ada
Subject: Re: UTF-8 in strings - a bug?
Date: Sun, 9 May 2004 12:16:07 +0000 (UTC)

Martin Krischik wrote:
: The UTF-X encodings can start with a BOM "Byte-order mark".

However, systems are allowed to define protocols which may restrict
the use of a BOM in the case of UTF-8 (require it or forbid it).
A #! shell script is an example.

A BOM is said to be useful to distinguish a UTF-8 Unicode file from
a file using another 8-bit encoding. Though I wonder how, from the
absence of the Unicode BOM, they think a program can find out which
of the other encodings has been used...

-- 
Georg
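
To make the BOM point concrete, here is a minimal sketch in Python (not Ada, and not anything taken from the thread). The function names are made up for illustration. It only checks for the three-byte UTF-8 BOM (EF BB BF) and for a leading "#!"; it assumes, as is typical on Unix-like systems, that a script is recognised by its first two bytes, which is why a BOM in front of "#!" gets in the way.

```python
from typing import Tuple

# The UTF-8 encoding of U+FEFF, i.e. the UTF-8 "BOM".
UTF8_BOM = b"\xef\xbb\xbf"


def split_utf8_bom(data: bytes) -> Tuple[bool, bytes]:
    """Return (had_bom, payload). Hypothetical helper, for illustration only."""
    if data.startswith(UTF8_BOM):
        return True, data[len(UTF8_BOM):]
    return False, data


def starts_with_shebang(data: bytes) -> bool:
    """True if the very first two bytes are '#!' (assumed script-detection rule)."""
    return data.startswith(b"#!")


script = b"#!/bin/sh\necho hello\n"
print(starts_with_shebang(script))             # plain script
print(starts_with_shebang(UTF8_BOM + script))  # same script with a BOM in front
print(split_utf8_bom(UTF8_BOM + script)[0])    # BOM present
print(split_utf8_bom(script)[0])               # BOM absent
```

Note that a False from split_utf8_bom says only that there is no BOM. It tells the program nothing about which 8-bit encoding (or BOM-less UTF-8) the file actually uses, which is the doubt raised above.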