
Re: dia 0.90RC3



At 10:13 03.06.02 +0400, Vitaly Lipatov wrote:
>I tried the new 0.90RC3 version with my old dia files from version 0.88.1
>and I have trouble with the encoding.
>In the old files Russian letters look like
><dia:string>#&#235;&#207;&#206;&#212;&#210;&#207;&#204;&#204;&#197;
>&#240;&#254;#</dia:string>
>The new files use UTF-8.
>Dia can't translate the old files correctly (for Russian letters).
>Can I convert files from the old format by hand?
>Any suggestions?

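There is some code in lib/dia_xml.c (around line 134) which tries
to be smart about the default encoding and valid UTF-8. If
no bytes with the MSB set are found, it assumes
well-formed UTF-8. In your case this is plain wrong: your old file
stores the Russian letters as '&#...;' character references, which
are plain ASCII, so the check never triggers.
Placing the correct encoding declaration, like: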

<?xml version="1.0" encoding="CP1252"?>

in your dia file should help. Beware: this encoding string works
for German on win32. I don't know the correct encoding
string for Russian on Linux ...

But if you prepare a file which has the offending bits set
(without an encoding declaration), Dia will complain about
the missing encoding and show what it assumes to be the
default.

You could also apply the attached patch, which looks not
only for the MSB but also for the '&' character that introduces
an encoded character. Finally, it fixes the rewriting of the
temporary file so that it includes your default encoding.
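
To illustrate the difference between the two checks, here is a
minimal standalone sketch. It is not the code from lib/dia_xml.c;
the function names has_high_bit() and needs_encoding_fixup() are
made up for this example, and it works on an in-memory buffer
instead of reading through gzread().

```c
#include <stddef.h>

/* Old check (sketch): only a byte with the most significant bit set
 * makes the buffer count as "not plain UTF-8". */
int has_high_bit(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (buf[i] & 0x80)
            return 1;
    return 0;
}

/* Patched check (sketch): a '&' is treated as suspicious too, because
 * old files may hide non-ASCII text behind "&#NNN;" references. */
int needs_encoding_fixup(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if ((buf[i] & 0x80) || buf[i] == '&')
            return 1;
    return 0;
}
```

A string like the one from your file contains only ASCII bytes, so
the first function lets it pass while the second one flags it. The
price is that any file containing '&' (for example "&lt;") is
flagged as well and goes through the slower rewrite path.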

Hope this helps,
	Hans
diff --exclude-from=c:\util\tool\diff.ign -u -r from-cvs/dia/lib/dia_xml.c my-gtk/dia/lib/dia_xml.c
--- from-cvs/dia/lib/dia_xml.c	Mon May 20 19:26:30 2002
+++ my-gtk/dia/lib/dia_xml.c	Mon Jun 03 21:23:08 2002
@@ -133,13 +133,17 @@
   do {
     int i;
     for (i = 0; i < len; i++)
-      if (buf[i] & 0x80)
+      if (buf[i] & 0x80 || buf[i] == '&')
         well_formed_utf8 = FALSE;
     len = gzread(zf,buf,BUFLEN);
   } while (len > 0 && well_formed_utf8);
   if (well_formed_utf8) {
     gzclose(zf); /* this file is utf-8 compatible  */
     return filename;
+  } else {
+    gzclose(zf); /* poor mans fseek */
+    zf = gzopen(filename,"rb"); 
+    len = gzread(zf,buf,BUFLEN);
   }
 
   if (0 != strcmp(default_enc,"UTF-8")) {

-------- Hans "at" Breuer "dot" Org -----------
Tell me what you need, and I'll tell you how to 
get along without it.                -- Dilbert

