
Wombat-Chinese

Welcome to the amazing world of Wombat-Chinese

Hello, everyone.
First of all, this page and its sister page "Wombat Japanese" have been reorganized so that their contents better reflect their titles. This was done by sorting information about each language onto the page that bears the name of the language. Of course, a good deal of the information deals with both Chinese and Japanese, and this information has been duplicated on both pages. Also, a new "Wombat Korean" page has been added. Please feel free to provide feedback, as it will be most welcome!
Next, it has been brought to my attention that some readers may feel slightly offended by the former title of this page, which was PIG-Chinese. Of course, I was not implying that the Chinese language or the Chinese people are in any way pig-like. Actually, I am Pig. It is kind of a nickname. Anyway, just to avoid confusion, I have decided to rename this page "Wombat-Chinese". For those who are unfamiliar with wombats, they are Australian marsupials related to koalas. They do look a bit like pigs, only cuter and not nearly as clever.

By the way, this document was originally copied from an (unfortunately) forgotten source for personal use. After realizing that many other people were referring to it, I started to update and add to it. I would like to apologize to whomever I have unwittingly plagiarized.

Titus North, December 8, 1997


 Write to me at: ctnst3+@pitt.edu


Index

  • Software
  • Information
  • Other links

  • Press here to go to WOMBAT-CHINESE, which has lots of links to information and software for Chinese!

    Press here to go to WOMBAT-JAPANESE, which has lots of links to information and software for Japanese!

    Press here to go to WOMBAT-KOREAN, which has lots of links to information and software for Korean!


    Squeeze here for pkunzip, which you will need to set up other software.
    ftp://ifcss.org/pub/software/dos/pkunzip.exe

    Or here:
    ftp://cnd.org/pub/software/dos/pkunzip.exe

    Here is a tip. You will find many FTP links on this page to "ftp.ifcss.org" and "cnd.org". These are basically mirror sites of each other. If you have trouble connecting to one of these sites, just replace the "cnd" in the URL with "ftp.ifcss" or vice versa, and you will likely connect. Also, if you get a message saying that the file does not exist, just remove the file name from the URL and try to connect to the parent directory.
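
    The tip above can be sketched in a few lines of Python. This is only an illustration: the host table is an assumption based on the host names that appear in the links on this page (including the "ftp.cnd.org" and "ifcss.org" variants), the function names are made up, and nothing here checks whether either server still answers.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Assumed mirror pairs, taken from the host names used on this page.
MIRRORS = {
    "cnd.org": "ftp.ifcss.org",
    "ftp.cnd.org": "ftp.ifcss.org",
    "ftp.ifcss.org": "cnd.org",
    "ifcss.org": "cnd.org",
}


def mirror_url(url):
    """Return the same URL on the other mirror, or None if the host is unknown."""
    parts = urlsplit(url)
    other = MIRRORS.get(parts.hostname)
    if other is None:
        return None
    return urlunsplit(parts._replace(netloc=other))


def parent_directory(url):
    """Drop the file name so the URL points at the containing directory."""
    parts = urlsplit(url)
    parent = posixpath.dirname(parts.path.rstrip("/"))
    if not parent.endswith("/"):
        parent += "/"
    return urlunsplit(parts._replace(path=parent))


if __name__ == "__main__":
    url = "ftp://cnd.org/pub/software/dos/pkunzip.exe"
    print(mirror_url(url))        # same path on the other host
    print(parent_directory(url))  # directory to browse if the file is missing
```

    If a download fails, try mirror_url() first, and if the file is reported missing, browse the result of parent_directory() instead.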


    SOFTWARE!

  • Back to Index

    

    software: NJWin
    function: CJK view for Windows
    comment: This is a solid viewer with automatic detection of the various Japanese codes and conversion between traditional and simplified Chinese characters. It also has Korean capability, but no input system. I found occasional corrupt characters at the beginning or end of a line when using it with Microsoft Internet Explorer. I assume this happens because the browser unknowingly splits a two-byte Kanji across lines, and that I would get better results using a CJK version of the browser.

    ftp://ftp.cc.monash.edu.au/pub/nihongo/njwin160.exe
    ftp://ftp.cc.monash.edu.au/pub/nihongo/njwin160.txt
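
    The corrupt characters mentioned in the NJWin comment are what you would expect if a double-byte character is cut in half. The following Python sketch reproduces that effect under stated assumptions: Shift-JIS is chosen only as an example encoding, the split position is arbitrary, and the function names are hypothetical. It is not a description of how NJWin or any browser actually works.

```python
import codecs


def decode_chunks_naively(chunks, encoding):
    """Decode each chunk on its own, as software unaware of double-byte text would."""
    return "".join(chunk.decode(encoding, errors="replace") for chunk in chunks)


def decode_chunks_incrementally(chunks, encoding):
    """Decode chunks with a stateful decoder that carries a dangling lead byte over."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    pieces = [decoder.decode(chunk) for chunk in chunks]
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


if __name__ == "__main__":
    text = "漢字"                       # two Kanji, two bytes each in Shift-JIS
    data = text.encode("shift_jis")
    chunks = [data[:3], data[3:]]       # the cut falls inside the second Kanji
    print(decode_chunks_naively(chunks, "shift_jis") == text)
    print(decode_chunks_incrementally(chunks, "shift_jis") == text)
```

    The naive version loses the character that straddles the cut, while the incremental decoder holds the first byte back until its partner arrives.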


    

    UW-DBM (UNIWAY)


    software: UW-DBM
    version : 4.0f working demo version, 1996.1.6
    function: UW-DBM 4.0 is the first and only Windows program that allows you to use Chinese (Big5, GB or HZ), Japanese (S-JIS, JIS, EUC) and Korean (KSC) at the same time.
    author : UnionWay International Corp.
    comments: I have found this to be the most convenient program for reading CJK text. However, its text input function is not particularly good.
    -- Titus

    You will need the following two files:

    ftp://cnd.org/pub/software/ms-win/c-sys/uwdbm4_e.exe
    ftp://cnd.org/pub/software/ms-win/input/uwhk.zip
    You will need to press here for UnionWay product ID in order to activate the software.
    Directory : ftp://cnd.org/pub/software/ms-win/c-sys
    Help file : ftp://cnd.org/pub/software/ms-win/c-sys/uwdbm4_c.exe


    

    software: ZW-DOS
    function: Chinese for DOS.
    comment: This is a simple and elegant DOS program that allows for convenient viewing and input of Chinese characters. Both simplified and traditional characters are supported.

    You will need both a font file and the software file:

    Simplified font:

    ftp://ftp.cnd.org/pub/software/dos/ZWDOS/zw220/cclib.16
    Traditional font:

    ftp://cnd.org/pub/software/fonts/big5/hbf/chinese.16

    Software

    ftp://ftp.cnd.org/pub/software/dos/ZWDOS/zw220/zwdos220.zip


    

    software: TwinBridge
    function: CJK viewer for Windows
    comment: TwinBridge is a package that adds CJK functionality to non-CJK Windows. A demo version of TwinBridge for Japanese is at the following URL:


    A corresponding README file that provides details of the demo (and contact information) is at the following URL:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/windows/twnbdemo.txt


    software: MView

    version: 1.00, 1995.7.12

    function: Super-Software MView is an integrated Microsoft Windows program that allows viewing of Chinese, Japanese and Korean characters under Windows. The supported coding types include GB, HZ, BIG5, JIS, EUC, SJIS, KSC, UTF7 and UTF8. MView is distributed as shareware. It is an inexpensive program with some distinctive features that are not found in other software. MView has been tested on Netscape 1.1n, Netscape 1.2b2, Windows Notepad, Windows Write and Norton Desktop Editor.

    author: Albert Chong

    comment: MView is a smaller program than Uniway and very useful as a viewer, but it does not seem to run with as many applications.

    You will need the zipped software file and the fonts for whichever character set you plan to use.

    Software:

    ftp://cnd.org/pub/software/ms-win/viewer/ss-mview.zip

    Simplified Chinese font:

    ftp://cnd.org/pub/software/fonts/gb/hbf/cclib.16

    Traditional Chinese font:

    ftp://cnd.org/pub/software/fonts/big5/hbf/chinese.16

    Japanese font:

    ftp://cnd.org/pub/software/fonts/misc/hbf/jis.16

    Korean font:

    ftp://cnd.org/pub/software/fonts/misc/hbf/ksc.16

    Unified font:

    ftp://cnd.org/pub/software/fonts/unicode/hbf/unihan16.hbf

    Gzipped set of all the fonts (unconfirmed):

    ftp://cnd.org/pub/software/fonts/unicode/hbf/*16.bin.gz


    software: MView

    version: 1.00 Alt, 1995.7.16

    function: IMPORTANT: This is an alternative version of MView Version 1.00. This version uses only Unicode font files. Refer to mview_at.txt for details.


    

    software: Multi-Language Conversion System "Wnn"

    version: 4.2

    function: This package includes the following software:

    Japanese Conversion Server

    Simplified Chinese Conversion Server

    Traditional Chinese Conversion Server

    Korean Conversion Server

    Multi-Language Input Manager for X Window(R5/R6)

    author: KUWARI Seiji

    comment: I am not sure of the file name listed below, so you will have to look around the directory.

    ftp://cnd.org/pub/software/x-win/Wnn4.2.tar.gz


    

    software: mule

    version: 2.3 of 1995.7.24 (SUETSUMUHANA)

    function: MULtilingual Enhancement to GNU Emacs 19.28 can handle ASCII (7 bits), ISO Latin-1 (8 bits), Japanese, Chinese, Korean (16 bits) coded in the ISO2022 standard and its variants (e.g. EUC, Compound Text). For Chinese there is support for both GB and Big5. In addition, Arabic, IPA, Thai (based on TIS620) and Vietnamese (based on VISCII and VSCII) are also supported.

    authors: Ken'ichi HANDA, Satoru TOMURA, Mikiko NISIKIMI

    ftp://cnd.org/pub/software/mule/mule-2.3.tar.gz

    ftp://cnd.org/pub/software/mule/diff-2.2.2-2.3.gz

    ftp://cnd.org/pub/software/mule/diff-19.28-2.3.gz


    

    software: Support Table for Hanzi Convert (hc)

    version: 1994/05/01

    function: Conversion tables that support the Hanzi Convert program by Fung F. Lee and Ricky Yeung (GB<->Big5). They cover Russian, numbers, Japanese, graphing symbols and "incorrect" codes. Text files, comments included.

    author: Chi-Ming Tsai

    ftp://cnd.org/pub/software/unix/convert/sym-supp.tab

    ftp://cnd.org/pub/software/unix/convert/in-corr.tab


    

    software: CJK

    version: 3.0 beta 1, 1995.12.6

    function: Enables Chinese/Japanese/Korean for use with LaTeX2e; supports GB, BIG5, CNS, JIS, KSC, and UTF8.

    author: Werner Lemberg

    ftp://cnd.org/pub/software/tex/CJK.3_0.1.tar.gz


    

    Hey kids, here is another interesting site!

    It includes that smash-hit CWIN in file chdemo.

    ftp://ftp.technet.sg/pub/chinese/ms-win/chinese-sys/


    

    INFORMATION

  • Back to Index

    This unusual dictionary uses traditional etymologies and a unique series of charts based on them to show the close relationships between Chinese characters. While Chinese characters are often thought of as overly complex, in fact they are all derived from about 200 simple pictographs and ideographs (these wen are more fundamental than bushous) in ways that are usually quite logical and easy to remember. Since Chinese characters form a self-contained system, their etymologies are far easier to understand and far more helpful than, for instance, English etymologies with their myriad of foreign roots.
    http://www.zhongwen.com
    Use of this dictionary NO LONGER requires a Chinese language system, an add-on or helper program to allow your browser to see characters, or Chinese fonts for your browser. However, your browser should support FRAMES.


    

    This is an excellent and comprehensive online document by Ken Lunde that provides information on CJK (that is, Chinese, Japanese, and Korean) character set standards and encoding systems. In short, it provides detailed information on how CJK text is handled electronically. Mr. Lunde is happy to share this information with others and would appreciate any comments/feedback on its content. The current version (master copy) of this document is maintained at:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
    This file may also be obtained by contacting Mr. Lunde directly using one of the e-mail addresses listed in the CONTACT INFORMATION section.

    The following files contain the latest information about changes (additions and corrections) made to UJIP (Ken Lunde's book "Understanding Japanese Information Processing") for various printings, both for those that have taken place (such as for the second printing) and for those that are planned (the first digit is the edition, and the second is the printing):

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/errata/ujip-errata-1-2.*
    ftp://ftp.ora.com/pub/examples/nutshell/ujip/errata/ujip-errata-1-3.*
    The asterisk is used for the file extension that indicates the Japanese code used (jis, euc, or sjs) -- pick the one that is best suited for your display or printing environment. I *highly* recommend that all readers of UJIP obtain these errata files. Those without FTP access can request copies directly from me.

    If you happen to be running X Windows, it is very easy to display these CJK character sets (if a bitmapped font for the character set exists, that is). Here is what I usually do:
    o Obtain a BDF (Bitmap Distribution Format) font for the target character set. Try the following URLs:

    ftp://cair-archive.kaist.ac.kr/pub/hangul/fonts/
    ftp://etlport.etl.go.jp/pub/mule/fonts/
    ftp://ftp.ifcss.org/pub/software/fonts/{big5,cns,gb,misc,unicode}/bdf/
    ftp://ftp.ora.com/pub/examples/nutshell/ujip/unix/
    BDF files usually have the string "bdf" somewhere in their file name. If the file is compressed (noticing that it ends in .gz or .Z is a good indication), decompress it. BDF files are text files.
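
    The decompress-and-check step above can be sketched in Python as follows. Some assumptions: the function names are hypothetical, only .gz files are handled (Python's standard library has no reader for the older .Z format, so those still need an external uncompress tool), and the check relies on BDF files opening with a STARTFONT line.

```python
import gzip
from pathlib import Path


def read_bdf_text(path):
    """Return the text of a BDF font file, decompressing it first if it ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        with gzip.open(path, "rt", encoding="latin-1") as handle:
            return handle.read()
    if path.suffix == ".Z":
        # Unix "compress" format: not supported by the standard library.
        raise ValueError("decompress .Z files with an external tool first")
    return path.read_text(encoding="latin-1")


def looks_like_bdf(text):
    """BDF files are plain text and begin with a STARTFONT line."""
    return text.lstrip().startswith("STARTFONT")
```

    Calling looks_like_bdf(read_bdf_text(name)) on a downloaded file is a quick way to confirm you really have a BDF font before trying to install it.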

    Base64 encoding is most commonly used for encoding non-ASCII text that appears in e-mail headers. Of all the portions of an e-mail message, its header gets manipulated the most during transmission, and Base64 encoding offers a safe way to further encode non-ASCII text so that it is not altered by mail-routing software.

    One typically does not need to worry about encoding text as Base64 (MIME-compliant mailing software usually performs this task for you). The problem is usually trying to decode Base64-encoded text. A Base64 decoder is available in Perl at the following URL:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/perl/b64decode.pl
    Note that this program takes "raw" Base64 data as input. Any non-Base64 stuff must be stripped. I usually run this from within Mule (C-u M-| b64decode.pl) after defining a region around the Base64-encoded material.

    Most MIME-compliant e-mail software can decode Base64-encoded text.
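
    For readers without Perl, here is a minimal Python sketch of the same idea. It is not a port of b64decode.pl. The function names are hypothetical, and the default charset of ISO-2022-JP is an assumption chosen because it is the usual encoding for Japanese e-mail headers; pass another charset if your text uses one.

```python
import base64
from email.header import decode_header, make_header


def decode_raw_base64(raw, charset="iso2022_jp"):
    """Decode "raw" Base64 text (nothing but Base64 characters and whitespace)."""
    compact = "".join(raw.split())  # remove line breaks and stray spaces
    return base64.b64decode(compact).decode(charset)


def decode_mime_header(value):
    """Decode a header value containing MIME encoded-words (=?charset?B?...?=)."""
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    # Build a sample instead of hard-coding one, so the round trip is visible.
    raw = base64.b64encode("日本語".encode("iso2022_jp")).decode("ascii")
    print(raw)
    print(decode_raw_base64(raw) == "日本語")
    print(decode_mime_header("=?ISO-2022-JP?B?" + raw + "?=") == "日本語")
```

    The first function expects stripped Base64 only, as the Perl script does. The second accepts the header value as it appears in the message, with the charset label included.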


    

    For software that handles Chinese code conversion (this includes conversion to and from Japanese), I suggest browsing at the following URLs:

    ftp://ftp.ifcss.org/pub/software/dos/convert/
    ftp://ftp.ifcss.org/pub/software/mac/convert/
    ftp://ftp.ifcss.org/pub/software/ms-win/convert/
    ftp://ftp.ifcss.org/pub/software/unix/convert/
    ftp://ftp.ifcss.org/pub/software/vms/convert/
    ftp://ftp.ora.com/pub/examples/nutshell/ujip/map/
    The last URL mirrors some material made available from Koichi Yasuoka (yasuoka@kudpc.kyoto-u.ac.jp).


     http://condor.stcloud.msus.edu:20020/netscape.html
    I *highly* suggest reading it.


    7.1: MULE

    Mule (multilingual enhancement to GNU Emacs), written by Kenichi Handa (handa@etl.go.jp), is the first (and only?) CJK-capable editor for UNIX systems, and is freely available under the terms of the GNU General Public License. Mule was developed from Nemacs (Nihongo Emacs).

    Mule is available at the following URL:

    ftp://etlport.etl.go.jp/pub/mule/


    7.2: CNPRINT

    CNPRINT, developed by Yidao Cai (cai@neurophys.wisc.edu), is a utility to print CJK text (or convert it to a PostScript file), and is available for DOS, VMS, and UNIX systems. A wide range of encoding methods are supported by CNPRINT.

    CNPRINT is available at the following URLs:

    ftp://ftp.ifcss.org/pub/software/{dos,unix,vms}/print/
    ftp://neurophys.wisc.edu/[public.cn]/
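
    The curly braces in URLs such as ftp://ftp.ifcss.org/pub/software/{dos,unix,vms}/print/ appear to be shell-style shorthand for several directories, not a literal path, so pasting such a URL into a browser as-is will probably fail. Assuming that reading is right, this Python sketch (with a hypothetical function name) expands one into the individual URLs.

```python
import re

_BRACES = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern):
    """Expand shell-style {a,b,c} groups into a list of plain strings."""
    match = _BRACES.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    results = []
    for option in match.group(1).split(","):
        # Recurse so that a second group later in the string is expanded too.
        results.extend(expand_braces(head + option + tail))
    return results


if __name__ == "__main__":
    for url in expand_braces("ftp://ftp.ifcss.org/pub/software/{dos,unix,vms}/print/"):
        print(url)
```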


    7.3: NISUS WRITER

    Nisus Writer, written by Nisus Software and available for Macintosh, is fully CJK-capable as long as you have the appropriate scripts installed (such as CLK for Chinese or JLK for Japanese). A "Language Key" is also required for Chinese and Korean (and some one-byte scripts such as Arabic and Hebrew).

    The CJK encodings that are supported by Nisus Writer are the same as made available by the underlying Macintosh operating system. No import/export of other encodings is supported. You must run separate conversion utilities for both import and export.

    A demo version of Nisus Writer is available at the following URL:

    ftp://ftp.nisus-soft.com/pub/nisus/demos/NisusWriterDemo.sea.hqx
    Give it a try! Updaters are also available at the same FTP site. Nisus Software can be contacted using the following e-mail address or through their WWW page:

    info@nisus-soft.com http://www.nus/


    

    The following Usenet Newsgroups typically have postings with information relevant to issues discussed in CJK.INF (in alphabetical order):

    alt.chinese.computing

    alt.chinese.text (HZ encoding used for Chinese text)

    alt.chinese.text.big5 (Big Five encoding used for Chinese text)

    alt.japanese.text (JIS encoding used for Japanese text)

    comp.software.international

    comp.std.internat

    fj.editor.mule (JIS encoding used for Japanese text)

    fj.kanji (JIS encoding used for Japanese text)

    sci.lang.japan (JIS encoding used for Japanese text)

    If your local news host does not provide a feed of the fj.* newsgroups (shame on them!), or if you do not have access to Usenet News, you can alternatively fetch them from the following URL:

    ftp://kuso.shef.ac.uk/pub/News/
    The subdirectories correspond to the newsgroup name, but with the "dots" being replaced by "slashes." For example, the "fj.binaries.mac" newsgroup is archived in the "fj/binaries/mac" subdirectory. Many thanks to Earl Kinmonth (jp1ek@sunc.shef.uc.uk) for this service.
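
    The dots-to-slashes rule is simple enough to write down as a one-line Python function. This is only a sketch: the function name is hypothetical, and the trailing slash on the result is an assumption about how the archive directories are addressed.

```python
ARCHIVE_ROOT = "ftp://kuso.shef.ac.uk/pub/News/"


def newsgroup_archive_url(group, root=ARCHIVE_ROOT):
    """Map a newsgroup name to its archive directory by turning dots into slashes."""
    return root.rstrip("/") + "/" + group.strip().replace(".", "/") + "/"


if __name__ == "__main__":
    print(newsgroup_archive_url("fj.binaries.mac"))
    print(newsgroup_archive_url("fj.kanji"))
```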


    A.2.1: USEFUL FTP SITES

    Below are the URLs for useful FTP sites. The directory specified is the recommended place from which to start poking around for useful files.

    ftp://cair-archive.kaist.ac.kr/pub/hangul/
    ftp://etlport.etl.go.jp/pub/mule
    ftp://ftp.adobe.com/pub/adobe/
    ftp://ftp.cc.monash.edu.au/pub/nihongo/
    ftp://ftp.ifcss.org/pub/software/
    ftp://ftp.ora.com/pub/examples/nutshell/ujip/
    ftp://ftp.uwtc.washington.edu/pub/Japanese/
    ftp://kuso.shef.ac.uk/pub/Japanese/
    ftp://unicode.org/pub/
    This list is expected to grow.


    

    A.2.4: USEFUL WWW SITES

    Because the World-Wide Web is a constantly changing place (and more importantly, because I don't want to re-issue a new version of this document every month!), I will maintain links to useful documents at my WWW Home Page. Its URL is as follows:

    http://jasper.ora.com/lunde/


    A.3.4: RFCs

    Many RFCs (Request for Comments) are relevant to this document. They are:

    o RFC 1341: "MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies," by Nathaniel Borenstein and Ned Freed, June 1992.

    o RFC 1468: "Japanese Character Encoding for Internet Messages," by Jun Murai et al., June 1993.

    o RFC 1554: "ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP," by Masataka Ohta and Kenichi Handa, December 1993.

    o RFC 1557: "Korean Character Encoding for Internet Messages," by Uhhyung Choi et al., December 1993.

    These RFCs can be obtained from FTP archives that contain all RFC documents (such as URLs ftp://nic.ddn.mil/rfc/ or ftp://ftp.uu.net/inet/rfc/), but these specific ones are mirrored at the following URL:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/Ch9/
    An RFC to cover Chinese is currently in the works, and will be made available at the above URL as soon as it is ready.


    A.3.5: FAQs

    There are several FAQ (Frequently Asked Questions) files that provide useful information. The following is a listing of some along with their URLs:

    o "sci.lang.japan" FAQ by Rafael Santos (santos@mickey.ai.kyutech.ac.jp) at:

    http://www.mickey.ai.kyutech.ac.jp/japanese/
    Update announcements are usually posted to the sci.lang.japan newsgroup.

    o "Programming for Internationalization" FAQ by Michael Gschwind (mike@vlsivie.tuwien.ac.at) at:

    ftp://ftp.vlsivie.tuwien.ac.at/pub/8bit/ISO-programming
    Also posted to the comp.software.international newsgroup.

    o "Japanese Internet Service Providers" FAQ by Mike Collinson (mike@uxp.bs2.mt.nec.co.jp) at:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/faq/japan-internet.FAQ

    o "Internationalization Reference List" by Eugene Dorr (gdorr@pgh.legent.com) at:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/i18n-books.txt
    Not really a FAQ, but quite useful.

    o "How to Use Japanese on the Internet with a PC: From Login to WWW" by Hideki Hirayama (sgw01623@niftyserve.or.jp) at:

    ftp://ftp.ora.com/pub/examples/nutshell/ujip/faq/jpn-inet.FAQ
    

    OTHER LINKS

  • Back to Index

    Check it out!


    Press here to learn All About Sports in Pittsburgh.

    Press here to learn Miscellaneous Computer stuff, like Pascal and shell accounts.

    Press here for an English text book for Japanese speakers (a great place to test out Japanese language software!)

    Press here to go back to Titus North's Home Page, featuring All About CHON WOLSON, Links to Japanese and Chinese language shareware, information about the International Chindogu Society, and much much more.

    Squeeze here for the University of Pittsburgh's East Asian Library, which can be interesting at times.