Re: printing on the Simpl. Chinese and other non-latin1 locales
From: Zhang Lin-bo <zlb lsec cc ac cn>
To: dia-list gnome org
Subject: Re: printing on the Simpl. Chinese and other non-latin1 locales
Date: Tue, 28 May 2002 22:54:00 +0800 (CST)
On Tue, 28 May 2002, Cyrille Chepelov wrote:
Bonjour,
> On Tue, May 28, 2002, 3:49:50PM +0800, Zhang Lin-bo wrote:
>
> OK. I'll try your patch, hopefully today, to check whether it breaks or not
> on a latin0 workload.
Thank you.
>
> I'm afraid I'll need the ag1.sh script as well (I'm not running Red Hat).
It's part of the ghostscript distribution of rh7.3 (the filename is
ag14.sh in rh7.2)
>
> It would be interesting if you could send me a couple sample files
> (privately). I'm not a Postscript wizard, and while I can read some
> non-latin alphabets, I'm totally at a loss with the CJK writing systems (big
> surprise)
I have attached some sample PS files in ps_samples.tar.bz2.
They all contain the same two Chinese characters (汉字, or Han Zi,
which means "Chinese characters"). I don't know if the attachment
is too large for the mailing list (131KB). If it can't get through,
I'll send it to your address.
>
> Then, what is shipped by RH? I mean, how are the Chinese fonts encoded in
> the RH 7.3 Ghostscript package?
I don't know much. A set of font names is defined in the CIDFont directory,
and the ag1.sh script can create more font names in the Font directory
(both are subdirectories of /usr/share/ghostscripts/Resource). It seems
that none of them works with Dia's EPS files.
>
> Would extending the table of known symbol names in lib/ps-utf8.c (to avoid
> \uni1234 notations on Chinese glyphs) help ?
I don't think so. I have tried with a document containing the single
letter 'A' (whose unicode name is /A), and I got the same result as
with Chinese characters.
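
For readers unfamiliar with the notation, here is a minimal Python sketch of
the naming scheme being discussed. It assumes the Adobe convention of
'uniXXXX' names for code points that have no entry in a table of known
names. The table KNOWN_NAMES and the function glyph_name are hypothetical
and are not taken from lib/ps-utf8.c.

```python
# Hypothetical table of code points that have well-known PostScript glyph names.
KNOWN_NAMES = {
    0x0041: "A",
    0x00E9: "eacute",
    0x20AC: "Euro",
}


def glyph_name(codepoint: int) -> str:
    """Return a PostScript glyph name for a code point in the BMP."""
    if not 0 <= codepoint <= 0xFFFF:
        raise ValueError("this sketch only handles BMP code points")
    if codepoint in KNOWN_NAMES:
        return KNOWN_NAMES[codepoint]
    # Fallback: 'uni' followed by four uppercase hexadecimal digits.
    return f"uni{codepoint:04X}"


print(glyph_name(ord("A")))   # A
print(glyph_name(ord("汉")))  # uni6C49
```

Extending the table only changes which branch is taken; it does not change
how the interpreter looks the glyph up in the font.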
>
>
> I will try your patch with latin0 input; if people working on
> latin(!1 && !0) systems had the time to test your patch as well, to see
> whether UTF-8 latin2 or KOI8-R is acceptable to Ghostscript, it would be a
> good data point.
>
> Then, what would be even more wonderful, would be to assemble the following
> test PS files:
> latin0-custom-encoding # I can make this yesterday
> latin0-UTF8 # I'll do it this evening.
> latin2-custom-encoding # I've got nell's EPS, which works on my l0 system
> latin2-UTF8
> KOI8-custom-encoding
> KOI8-UTF8
> zh_CN-custom-encoding # what you call nonworking
> zh_CN-UTF8 # what you call working
maybe also "zh_CN-GB-EUC", "zh_CN-Adobe-GB1", etc. I know nothing
about these encodings, but they must represent some 'standard'.
> ja_JP-custom-encoding # working, according to Akira TAGOH
> ja_JP-UTF8
>
> build a second set of files (the same but after going through ps2ps), and
> then pass the whole batch of files to non-Ghostscript Postscript devices
> (printers, high-end photocopiers, etc.), and see what works and what
> doesn't.
>
> For the moment, I assume the -UTF8 files are out of the PS spec; maybe they
> aren't after all?
>
> I'm able to generate only the latin0 files; I'll need volunteers for the
> rest (the files you attached to your mail will be fine). And of course,
> volunteers to print 20 pages of paper on high-end devices I have no access
> to ;-) and on various-locale Ghostscripts.
As far as I know, there's no need to do this kind of test for simplified
Chinese, because there are very few PostScript devices in Mainland
China, and even fewer with Chinese PostScript fonts.
BTW, I have found the bug in the freetype support:
With freetype support enabled, the fonts listed in 'font_data'
are ignored (see the '#ifdef' line at font.c:904; I don't know
if this is a bug or a feature). Since rh7.3 uses xfs, no font path is
found through the 'XGetFontPath' function. Robert Young's patch
did not work because of a bug in his code that parses the
fs/config file: most path names (on my system) were skipped because
they have a trailing comma. I also think the code needs to be
rewritten to correctly retrieve the font paths from the 'catalogue = ...'
entry of the fs/config file (or one could borrow some lines from xfs,
or simply run the 'chkfontpath' program).
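
To illustrate the trailing-comma problem, here is a rough Python sketch of
how the 'catalogue = ...' entry could be read. It is not Dia's code and not
Robert Young's patch. It assumes the entry is a comma-separated list that
may continue over several lines, that '#' starts a comment, and that a path
may carry a suffix such as ':unscaled'. The function name parse_catalogue
and the sample text are made up for the example.

```python
def parse_catalogue(config_text: str) -> list:
    """Extract the font directories from the 'catalogue = ...' entry."""
    # Drop comments and blank lines first.
    lines = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if line:
            lines.append(line)

    paths = []
    in_catalogue = False
    for line in lines:
        if not in_catalogue:
            key, sep, value = line.partition("=")
            if not sep or key.strip() != "catalogue":
                continue
            line = value.strip()
            in_catalogue = True
        # The entry continues on the next line only if this one ends with
        # a comma (or if nothing followed the '=' sign).
        continues = line.endswith(",") or not line
        for item in line.split(","):
            item = item.strip()
            if item:
                # Remove an attribute suffix such as ':unscaled'.
                paths.append(item.split(":", 1)[0])
        if not continues:
            break
    return paths


# Hypothetical excerpt, for illustration only.
SAMPLE = """\
client-limit = 10
catalogue = /usr/X11R6/lib/X11/fonts/misc:unscaled,
            /usr/X11R6/lib/X11/fonts/Type1,
            /usr/share/fonts/default/TrueType
default-point-size = 120
"""

print(parse_catalogue(SAMPLE))
```

Splitting on commas and discarding empty items means a trailing comma marks
a continuation instead of causing the path to be skipped.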
With freetype support enabled, dia now works fine for displaying and
for exporting to png with the "Ar pl ..." Chinese fonts (gkai00mp.ttf
and gbsn00lp.ttf), but exporting to eps does not work (even for ASCII
characters). Another problem is that the output file is too large
(10-20MB for a few characters) when using Chinese TTF fonts. So
there's still a lot of work to do ...
Finally, a suggestion: I think dia should save the locale
information with a diagram, since the interpretation of characters
is locale dependent. (I have a diagram which contains some Chinese
characters; when I try to open it in a non-zh_CN locale, I get
a lot of warnings, such as "** WARNING **: unicode_iconv(u2l,
utf=å¼ ...) failed, because 'Invalid or incomplete multibyte
or wide character'...", and the diagram is displayed
incorrectly.)
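
A small Python sketch of why the interpretation depends on the locale: the
same bytes give different text, or fail to decode, depending on the encoding
assumed by the reader. The 'record' dictionary and the 'load' function are
hypothetical; they only illustrate the idea of storing the encoding next to
the data and are not Dia's file format.

```python
text = "汉字"  # "Han Zi", i.e. "Chinese characters"

# Bytes as a zh_CN (GB2312) locale would store them.
gb_bytes = text.encode("gb2312")
print(gb_bytes)  # b'\xba\xba\xd7\xd6'

# The same bytes read as Latin-1 decode without error, but to the wrong text.
print(gb_bytes.decode("latin-1"))  # ºº×Ö

# Read as UTF-8, the bytes are not valid at all.
try:
    gb_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    print("decode failed:", err.reason)

# If the encoding is saved together with the data, the reader no longer
# has to guess it from the current locale.
record = {"encoding": "gb2312", "data": gb_bytes}


def load(rec: dict) -> str:
    """Decode the stored bytes using the stored encoding."""
    return rec["data"].decode(rec["encoding"])


print(load(record) == text)  # True
```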
LB