
Re: UTF-8 on stdout?



On 2002-07-06 at 21:42 -0400, James K. Lowden wrote:

> PMJI.  You might want to have a look at http://czyborra.com/
> Mr. Czyborra has a pretty good overview of what's what
> regarding encoding and character sets, and does a good job of
> distinguishing between fonts, glyphs, and characters.  You
> may in particular want to look at:
>
> 	http://czyborra.com/unicode/terminals.html

This certainly seems a pretty good site. I'll have to take a
long look at it sometime but it's answered a few questions
already... thanks for the reference!

> What you bumped into was, as Lars said, a problem with xterm.
> If you push UTF-8 to stdout, it falls to the application
> whose job it is to convert encoded values into glyphs that
> your brain can interpret as characters (I'm skipping a few
> steps).  The standard xterm is *not* going to expect UTF-8;
> it will instead interpret the bytestream as ASCII or Latin-1
> or whatever your locale settings indicate.

Yep. However, according to czyborra.com there is an escape
sequence that makes UTF-8-hacked 4.0 xterms switch into UTF-8
mode. I'll investigate this and give it a try. I'm not sure
it's the kind of thing that Dia should be outputting,
though... it's probably more of a user or system-wide setting.
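
For what it's worth, here is a rough Python sketch of the two things
discussed above: what a Latin-1 terminal makes of UTF-8 bytes, and what
wrapping output in a mode-switch sequence might look like. I'm assuming
the sequences are the ISO 2022 ones, ESC % G to enter UTF-8 and ESC % @
to leave it; I haven't confirmed which xterm builds honour them, and the
function names are just made up for illustration.

```python
# Assumption: ISO 2022 "other coding system" sequences.
# ESC % G switches to UTF-8, ESC % @ switches back to the default.
# Whether a particular xterm honours them is not verified here.
ENTER_UTF8 = b"\x1b%G"
LEAVE_UTF8 = b"\x1b%@"


def wrap_utf8(text: str) -> bytes:
    """Return text as UTF-8 bytes, bracketed by the mode-switch sequences."""
    return ENTER_UTF8 + text.encode("utf-8") + LEAVE_UTF8


def as_latin1_terminal(text: str) -> str:
    """Show what a terminal expecting Latin-1 displays for UTF-8 output.

    Each byte of the UTF-8 encoding is read as one Latin-1 character.
    """
    return text.encode("utf-8").decode("latin-1")
```

To try it on a real terminal, something like
`sys.stdout.buffer.write(wrap_utf8("some name\n"))` would send the bytes
unmodified. Writing the sequences from an application changes terminal
state behind the user's back, which is part of why this feels like a
user or system-wide decision rather than Dia's.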

> dia --credits |sort
>
> how is sort(1) supposed to know what's incoming?  It doesn't
> guess; it assumes, and unless the answer is 7-bit ascii, it
> assumes wrong.  Its only defense is, it's got a lot of good
> company.

Good point. In this case I'm not going to worry, because the
names are not in "surname, forename" order anyway (which is
conventional in most locales, I think), and there is surrounding
bumf too. In the more general case, though, I guess it is very
important.
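
To make the sort(1) point concrete, here is a small Python sketch
comparing a raw byte-order sort with a locale-aware one. The name list is
invented for the example, and the locale-aware result depends entirely on
which locales are installed, so I give no expected output for it.

```python
import locale

# Hypothetical sample names, not taken from Dia's credits.
NAMES = ["Zoe", "Émile", "Andrew"]


def bytewise_sorted(items):
    """Sort by the raw UTF-8 bytes, as a tool unaware of the encoding would."""
    return sorted(items, key=lambda s: s.encode("utf-8"))


def locale_sorted(items, locale_name=""):
    """Sort using the collation rules of the given locale.

    An empty locale_name means "use the environment's settings".
    Raises locale.Error if that locale is not available.
    """
    locale.setlocale(locale.LC_COLLATE, locale_name)
    return sorted(items, key=locale.strxfrm)
```

With the byte-order sort, "Émile" lands after "Zoe", because its first
byte (0xC3) is larger than any ASCII letter. A collation-aware sort in a
suitable locale would normally place it next to the other E names.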

> Interesting place.  In particular, the -u8 option for xterm
> does exactly what Andrew wants.  We should get Akira and Xing
> Wang to use their utf8 encodings for their names.

Yes, I guess so. I'll continue outputting in UTF-8 then: I'll
assume it's the responsibility of the user to sort out their
terminal if they want 'correct' output.
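
As an illustration of "always output UTF-8", here is a minimal Python
sketch that writes bytes directly rather than relying on the locale's
encoding. This is not Dia's code (Dia is written in C); the function
names and the sample name "Émile" are hypothetical.

```python
import sys


def encode_credits(names):
    """Return the names, one per line, encoded as UTF-8 bytes."""
    return b"".join(name.encode("utf-8") + b"\n" for name in names)


def write_credits(names, stream=None):
    """Write the UTF-8 encoded credits to a binary stream.

    Defaults to sys.stdout.buffer, which bypasses the text layer and
    therefore ignores whatever encoding the locale would have selected.
    """
    out = stream if stream is not None else sys.stdout.buffer
    out.write(encode_credits(names))
```

Calling `write_credits(["Akira", "Émile"])` sends the same bytes whatever
the locale says; whether they display correctly is then up to the
terminal.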

Cheers for all that, guys,
Andrew.

-- 
Andrew Ferrier

email: andrew.junk@new-destiny.co.uk
web:   http://www.new-destiny.co.uk/andrew/




