On Mon, May 28, 2001, at 03:51:15 +0800, James Henstridge wrote:
> On Mon, 28 May 2001, Cyrille Chepelov wrote:
>
> > Normally, XML files already embed character set encoding information in the
> > XML declaration on the very first line (<?xml version="1.0" encoding="foo"?>).
>
> That sounds like the best way to discriminate between old and new files.
Yes, as long as the libxml we use either exposes this information to us or
converts everything to UTF-8 before we even see it.
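If libxml turns out not to expose the declared encoding, we could sniff it
ourselves from the first bytes of the file. Below is a minimal sketch of that
idea in plain C, not a proposal for the final code. The function name
sniff_xml_encoding is made up for illustration, and the scan only works for
ASCII-compatible encodings. Treating a file with no encoding attribute as an
"old" file in the local 8-bit charset is my assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libxml): skip XML whitespace. */
static const char *skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
        p++;
    return p;
}

/*
 * Copy the value of the encoding pseudo-attribute of the XML declaration
 * at the start of `head` into `out`. Returns 1 if an encoding was found,
 * 0 if there is no declaration, no encoding attribute, or `out` is too
 * small. Assumes `head` is NUL-terminated and ASCII-compatible.
 */
int sniff_xml_encoding(const char *head, char *out, size_t out_size)
{
    const char *decl_end, *p;
    char quote;
    size_t n = 0;

    if (out_size == 0 || strncmp(head, "<?xml", 5) != 0)
        return 0;
    decl_end = strstr(head, "?>");
    if (decl_end == NULL)
        return 0;

    /* Only accept "encoding" if it sits inside the declaration itself. */
    p = strstr(head, "encoding");
    if (p == NULL || p >= decl_end)
        return 0;
    p = skip_ws(p + strlen("encoding"));
    if (*p != '=')
        return 0;
    p = skip_ws(p + 1);
    if (*p != '"' && *p != '\'')
        return 0;
    quote = *p++;

    while (p < decl_end && *p != quote) {
        if (n + 1 >= out_size)
            return 0;           /* value does not fit in out */
        out[n++] = *p++;
    }
    if (p >= decl_end)
        return 0;               /* unterminated quoted value */
    out[n] = '\0';
    return 1;
}
```

A loader could call this on the first few hundred bytes and fall back to the
legacy 8-bit path whenever it returns 0.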
> > Is (libxml1 8-bit only) and (libxml2 dependent on gtk2) ? If not, we
>
> libxml1 doesn't care about character encodings, so 8-bit characters pass
> through without problems (you may run into trouble with some multibyte
> charsets though).
> libxml2 doesn't rely on gtk2, but cares about encodings. We already have
> conditional support for using libxml2, but it breaks on 8-bit chars. The
> reason is that it assumes that the internal encoding used by the app is
> UTF-8, so it occasionally mangles the second highest bit of some characters.
Mmmmhh, OK. So, basically, if we give it UTF-8 strings and assume it hands
us back UTF-8 strings, it should be OK. I'll look into that.
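To make "give it UTF-8 strings" concrete, here is a rough sketch of the
conversion we would need at the boundary for legacy 8-bit strings. It assumes
those strings are ISO-8859-1 (Latin-1), which is only one possible case, and
the name latin1_to_utf8 is hypothetical. Other 8-bit charsets would need a
real conversion table.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Convert a NUL-terminated ISO-8859-1 string to a newly allocated UTF-8
 * string. Each Latin-1 byte maps to the Unicode code point of the same
 * value, so bytes >= 0x80 become a two-byte sequence 110000xx 10xxxxxx.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *latin1_to_utf8(const char *in)
{
    const unsigned char *r = (const unsigned char *)in;
    /* Worst case: every input byte expands to two output bytes. */
    char *out = malloc(2 * strlen(in) + 1);
    char *w = out;

    if (out == NULL)
        return NULL;
    for (; *r != 0; r++) {
        if (*r < 0x80) {
            *w++ = (char)*r;                    /* plain ASCII */
        } else {
            *w++ = (char)(0xC0 | (*r >> 6));    /* lead byte */
            *w++ = (char)(0x80 | (*r & 0x3F));  /* continuation byte */
        }
    }
    *w = '\0';
    return out;
}
```

For example, the Latin-1 byte 0xE9 (e with acute accent) comes out as the two
bytes 0xC3 0xA9.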
> > However, this will be no small task (basically it requires auditing the whole
> > code for (gchar *) arithmetic and moving that to the unicode_* functions,
> > and defining wrappers for these when !HAVE_UNICODE). I'm very motivated to
> > tackle this, but I'd like 0.88.1 to not be the new 0.86. I think enough
> > problems have been fixed in the CVS head relative to 0.88.1 that making
> > a new release (either 0.88.3 or 0.89) before going utf8 actually makes sense.
>
> If you want a new release, we can do one whenever you want. Probably
> better to call it 0.89 rather than 0.88.3.
Mmmm... at the end of the week might be a good time to consider release
candidates.
> If we are going to have unicode as the default, I am inclined to make it a
> required library. The fewer conditionals, the easier it is to test that a
> tarball will build correctly. What do others think about this?
Let's make it the default, with a big fat warning if it's not available. And
let's promise to make it mandatory for the next version. We'll see if
anyone cares.
> We may as well use the libunicode calls unconditionally. That way, it
> will be a simple sed job to convert over to the glib unicode calls found
> in glib-2.0 (which will be in a required library for gtk2, so we may as
> well use it :)
The other way around: let's keep our own, simpler interface, and rewrite the
backend on top of glib2 if that's nicer to use.
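Roughly what I have in mind, as a sketch only: a couple of small functions
that the rest of the code calls instead of doing raw (gchar *) arithmetic.
The names my_utf8_next and my_utf8_strlen are placeholders I made up here,
and the bodies below are a plain C fallback that assumes valid UTF-8 input.
Later the bodies could be replaced by calls to glib 2's g_utf8_next_char and
g_utf8_strlen without touching the callers.

```c
#include <stddef.h>

/*
 * Return a pointer to the start of the next character after `p`.
 * UTF-8 continuation bytes look like 10xxxxxx, so we step one byte and
 * then skip any continuation bytes. At the terminating NUL we stay put.
 * Assumes `p` points at the start of a character in valid UTF-8.
 */
const char *my_utf8_next(const char *p)
{
    if (*p == '\0')
        return p;
    p++;
    while (((unsigned char)*p & 0xC0) == 0x80)
        p++;
    return p;
}

/* Count characters (not bytes) in a NUL-terminated UTF-8 string. */
size_t my_utf8_strlen(const char *s)
{
    size_t n = 0;

    while (*s != '\0') {
        s = my_utf8_next(s);
        n++;
    }
    return n;
}
```

Code that today does `p++` or `strlen(s)` on text would call these instead,
which is the part that needs the audit.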
-- Cyrille
--
Grumpf.