Uncategorized – CJK Type Blog http://ccjktype.fonts.adobe.com/ CJK Fonts, Character Sets & Encodings. Thu, 19 Nov 2020 05:24:15 +0000 en-US hourly 1 https://wordpress.org/?v=5.4.4 Year of the Dog https://ccjktype.fonts.adobe.com/2018/02/year-of-the-dog.html https://ccjktype.fonts.adobe.com/2018/02/year-of-the-dog.html#comments Thu, 15 Feb 2018 13:35:55 +0000 http://blogs.adobe.com/CCJKType/?p=7057 Continue reading ]]>

I’d like to use this opportunity to welcome the year of the dog, which is expressed using the CJK Unified Ideograph 戌 (U+620C), and to wish a Happy Chinese New Year to all of my friends, colleagues, and blog readers who are celebrating this holiday. May this year be safe, prosperous, and enjoyable.

As a side note, all 12 of the animals that are associated with the Chinese New Year have a corresponding emoji, such as U+1F415 🐕 DOG for this year.

🐡

]]>
25 Years of #AdobeLife & @AdobeType https://ccjktype.fonts.adobe.com/2016/07/25-years-adobe.html Fri, 01 Jul 2016 11:38:35 +0000 http://blogs.adobe.com/CCJKType/?p=4813 Continue reading ]]>

Today is Friday, July 1st, 2016, which is a date that has a special significance for me. I am publishing this from Hot Springs, South Dakota where I am enjoying a few days away from work.

My life was put on a new path exactly 25 years ago, on Monday, July 1st, 1991. I was 25 years old at the time, and I am therefore 50 years old now. It was on this date that I started working at Adobe as a member of its Type Development team. My employee number is 879, though at the time there were approximately 500 employees in total. It was a much smaller company back then. As you can see from my very first business card below, I was involved in things related to Japanese type from the very beginning:

 

This event effectively launched a 25-year career that is still going strong, and which has been in the same department doing essentially the same thing, though the technologies and related standards have changed or evolved.

The rest of this somewhat lengthy article will be used to highlight some of my accomplishments during each five-year period.

The First 5 Years

One of my first projects at Adobe was to use a new typeface-design technology that we called Cube to create a small proof-of-concept font based on the Heisei Mincho W3 (平成明朝 W3) typeface design. This technology was so named because it was thought that three design axes could be used for the individual multiple master–like elements that are used as stroke-like components to compose glyphs. I ended up presenting the results to Adobe’s co-founders, John Warnock and Chuck Geschke, in late 1991. This technology—though with additional design axes beyond the initial three—was eventually used to design the glyphs for kanji in the Kozuka Mincho (小塚明朝) and Kozuka Gothic (小塚ゴシック) typeface designs, along with the glyphs for ideographs (aka hànzì, kanji, and hanja) and hangul syllables in the Adobe-branded Source Han Sans (思源黑体 or 思源黑體 or 源ノ角ゴシック or 본고딕) and Google-branded Noto Sans CJK typeface designs.

Other significant projects during this period were the production of the first two Heisei (平成) fonts, Heisei Mincho W3 (平成明朝 W3) and Heisei Kaku Gothic W5 (平成角ゴシック W5), that were released as a package called Adobe ValuePack-J, along with a package called 平成明朝W3 GaijiPack that included the glyphs for JIS X 0212-1990, along with 250 glyphs for JIS C 6226-1978 (aka JIS78) kanji.

Speaking of glyphs, I published Adobe’s first Simplified Chinese, Traditional Chinese, and Korean glyph sets as Adobe-GB1-0, Adobe-CNS1-0, and Adobe-Korea1-0, respectively. Adobe-GB1-1 and Adobe-Korea1-1 were also published during this first five-year period. See Adobe Tech Note #5079 (Adobe-GB1-5), Adobe Tech Note #5080 (Adobe-CNS1-6), and Adobe Tech Note #5093 (Adobe-Korea1-2) for more details.

Two other achievements during this first five years were the writing and typesetting (using Aldus PageMaker Version 4.0J) of my first book, Understanding Japanese Information Processing, which was published in 1993, followed by my PhD (linguistics) dissertation, entitled Prescriptive Kanji Simplification, that I wrote and successfully defended the following year. The Japanese translation of my book, aptly entitled 日本語情報処理 (nihongo jōhō shori), was published by SoftBank in 1995.

Years 6 through 10

One of the first things that took place was the moving of Adobe’s headquarters from Mountain View to downtown San José in early September of 1996.

This second five-year period gave me the opportunity to do the production for Adobe’s very first Adobe Originals typeface families for Japanese, the first one being Kozuka Mincho (小塚明朝), first deployed as sfnt-wrapped CIDFonts in 1997, followed by Kozuka Gothic (小塚ゴシック), first deployed as OpenType/CFF fonts in 2001. It was also during this period that OpenType was born, in April of 1997 to be exact. As an aside, I think that I have the largest collection of printed Kozuka Mincho specimen books.

I also wrote and typeset (using Adobe FrameMaker Version 5.5) my second book, CJKV Information Processing, which was published at the end of 1998.

In the year 2000, I published Adobe’s very first “Pro” Japanese glyph set, Adobe-Japan1-4, in February. The CD in the photo above shows evidence for Adobe-Japan1-4 CID+14106 (劍󠄁; aka Adobe-Japan1 IVS <528D E0101>). Related to CJK glyph sets during this period, I also published Supplements 2 through 4 of Adobe-GB1, Supplements 1 through 3 of Adobe-CNS1, and the still-current Adobe-Korea1-2.

Years 11 through 15

Shortly after the Adobe-Japan1-4 glyph set was released, the JIS X 0213:2000 standard was published. This ultimately led to the development of Adobe-Japan1-5, which was done in cooperation with our friends at Apple, and issued in 2002. Adobe-Japan1-6, whose primary purpose was to incorporate the remaining glyphs for the JIS X 0212-1990 standard, was subsequently issued in 2004. This resulted in the deprecation of the Adobe-Japan2-0 glyph set (see Adobe Technical Note #5097). See Adobe Technical Note #5078 for more information about Adobe-Japan1-6 and earlier Supplements. Also during this period, I published the still-current Adobe-GB1-5, along with Supplements 4 and 5 of Adobe-CNS1.

Oh, and Japanese and (Traditional) Chinese translations of my second book were published in 2002, entitled CJKV 日中韓越情報処理 and 中日韓越資訊處理, respectively.

Years 16 through 20

One of the most interesting and challenging typeface designs I helped to develop was Kazuraki (かづらき), whose active development began in 2006, and which was first released in 2009. Its typeface designer is the very talented Ryoko Nishizuka (西塚涼子) in the Tokyo branch of our team. In addition to being the very first fully-proportional Japanese font, it also represents the first broad deployment of the special-purpose Adobe-Identity-0 ROS. To learn more about the Kazuraki development process, please go through my ATypI Hong Kong 2012 presentation.

Related to Japanese and Unicode, my first IVD (Ideographic Variation Database) collection, Adobe-Japan1, was registered at the end of 2007. For those who are unaware, the IVD represents a mechanism for supporting unencoded ideograph (aka kanji) variants in “plain text” using variation sequences. Adobe-Japan1 is the most broadly implemented IVD collection, and is supported by hundreds of OpenType Japanese fonts. I also became the IVD Registrar during this period, meaning that I currently manage all aspects of the IVD. This article from earlier in the year provides some information about the current state of IVD support.
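As a rough illustration of the variation-sequence mechanism described above, the following Python sketch builds the Adobe-Japan1 IVS <528D E0101> mentioned earlier in this article. The function names `make_ivs` and `describe` are made up for this example, and the only rule it checks is that the selector falls in the U+E0100 through U+E01EF range (VS17 through VS256) that the IVD uses.

```python
# Variation selectors used by the IVD: VS17 (U+E0100) through VS256 (U+E01EF).
VS17 = 0xE0100
VS256 = 0xE01EF


def make_ivs(base: int, selector: int) -> str:
    """Return a two-code-point variation sequence: base ideograph + selector."""
    if not VS17 <= selector <= VS256:
        raise ValueError(f"U+{selector:04X} is not in the VS17..VS256 range")
    return chr(base) + chr(selector)


def describe(sequence: str) -> str:
    """Format a sequence in the <XXXX YYYYY> notation used in this article."""
    return "<" + " ".join(f"{ord(ch):04X}" for ch in sequence) + ">"


ivs = make_ivs(0x528D, 0xE0101)
print(describe(ivs))  # <528D E0101>
print(len(ivs))       # 2 code points, still "plain text"
```

Whether the variant glyph is actually displayed depends on the font supporting that sequence. Otherwise the base character is typically shown on its own.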

I wrote and typeset (using Adobe InDesign CS3-J) my third book, CJKV Information Processing, Second Edition, which was published at the end of 2008.

I also published the still-current Adobe-CNS1-6 glyph set, designed to accommodate Hong Kong SCS-2008.

My first and only patent, entitled Methods and apparatus for retrieving font data, which was originally filed on April 21st, 2006, was issued on May 3rd, 2011 as US Patent 7937658.

Lastly, I became directly involved in the development of the ISO/IEC 14496-28 standard, entitled Information technology—Coding of audio-visual objects—Part 28: Composite font representation. This standard was first published in 2012, and while it was originally designed to break the 64K-glyph barrier by defining CFR (Composite Font Representation) objects that reference one or more component fonts, it can also be used to define fallback fonts with very rich settings. I managed to successfully argue that this ISO standard be added to the freely-available ones. Be sure to read this article that is one of several CJK Type Blog articles about this standard.

Years 21 through 25

I had the opportunity to attend an ATypI (Association Typographique Internationale) conference for the first time, which was ATypI Hong Kong 2012 that took place in October of 2012. This was also the first time that this annual conference took place in East Asia, so it made sense for me to attend it, and to present, twice. I first presented Kazuraki: Under The Hood, and then presented the first two hours of a three-hour workshop, with my portion being entitled Manipulating CID-Keyed Fonts Using AFDKO Tools. My esteemed colleague, Masataka Hattori (服部正貴), delivered the last hour of the workshop, with his portion being entitled Turning CID-Keyed Fonts Into OpenType Fonts Using AFDKO. I very much enjoyed attending this conference, which gave me an opportunity to meet and speak with key people in the fields of type design and type development.

My biggest accomplishment over the past five years was the planning, development, and deployment of the Adobe-branded Source Han Sans and Google-branded Noto Sans CJK open source Pan-CJK typeface families. This project represents the culmination of an idea that I had way back in 1994, and in many ways represents a dream come true.

Speaking of open source, almost all of my font-related open source projects were released during the past five years. Ignoring Source Han Sans that was mentioned above, the following 16 additional open source projects were prepared and are maintained by me: Adobe Blank, Adobe Blank 2, Adobe NotDef, AGL & AGLFN, AGL Specification, CMap Resources, Command-line Perl Scripts, CSS Orientation Test, FDArray Test, IVS Test, Kenten Generic, LOCL Test, Mapping Resources for PDF, PanCJKV IVD Collection (UNREGISTERED), Tombo SP, and Width Test. Whew!

I also became Adobe’s primary representative to The Unicode Consortium in March of 2015 when Eric Muller left Adobe. I had been serving as Adobe’s alternate representative from January of 2006. I am now attending all four UTC (Unicode Technical Committee) meetings per year.

One of the last things that I will force readers to endure is a look at my current business card:

 

It certainly has been a long and exciting road, and I learned a lot. Naturally, I still have plenty to learn, and I am sure that more challenges lie ahead. I have worked with many highly-talented colleagues, and many of the achievements described above wouldn’t have been possible without their help and encouragement.

In closing, I would like to point out that while reaching this 25-year milestone is clearly an important personal achievement, the average tenure of the current eleven-person Adobe Type Development team (Nicole Miñoza and Steve Ross are still at Adobe, but no longer considered part of this team) is a remarkable 19 years and 10 months. As long as our headcount remains unchanged, we will reach the 20-year milestone in September. In terms of individual tenure, I rank third in the team, behind world-renowned typeface designer Robert Slimbach (1987) and David Lemon (1986), who has been my manager for more than half of my tenure.

#AdobeLife

🐡

]]>
“One Fish, Two Fish, Blowfish, Blue Fish” https://ccjktype.fonts.adobe.com/2016/04/blowfish.html Sun, 17 Apr 2016 20:23:24 +0000 http://blogs.adobe.com/CCJKType/?p=4747 Continue reading ]]>

That’s the title of the eleventh episode of the second season of The Simpsons which originally aired in early 1991.

This article will instead be about the history and evolution of the blowfish image that graces the cover of my books that were published by O’Reilly Media. The following is the first paragraph of the Colophon of CJKV Information Processing, Second Edition:



I first proposed a book to O’Reilly in the latter half of 1992, and part of the process involved visiting their headquarters in Sebastopol, California during which my editor, Peter Mui, and I perused the Dover Pictorial Archive for suitable blowfish images. If memory serves, we found three candidates. The responsibility of selecting an animal for a book cover is usually left to Edie Freedman, who also designed the book covers themselves. Sometimes, when an author makes a case for a particular animal, that animal is chosen. That was the case for Understanding Japanese Information Processing (日本語情報処理 nihongo jōhō shori in Japanese), which was published in 1993.

Why a blowfish?

Back in the early 1990s, localizing software for non-English regions involved not only translating the user interface and documentation, but also dealing with various character sets and encodings. For regions such as Japan, this meant dealing with three different encoding systems, ISO-2022-JP, Shift-JIS, and EUC-JP, along with a newcomer known as Unicode. There’s an analogy: If you don’t prepare blowfish properly, it will kill you. Likewise, if you don’t prepare your software to properly support Japanese, it will kill your market potential.

All three books, along with their Japanese (1995 and 2002) and Chinese (2002) translations, use the same blowfish image, because each subsequent book is either effectively or literally a new edition of the previous book. The first two books, published in 1993 and 1998, respectively, have the blowfish facing to the left. This was changed in the third and current book. Below are the cover images of all three books:

  

This blowfish image has become somewhat iconic, which is also true of the various animal images that appear on the cover of other books published by O’Reilly.

In closing, when Unicode announced its Adopt a Character campaign late last year, sponsoring U+1F421 🐡 BLOWFISH at the Silver level seemed like an appropriate thing to do, and it was for a good cause. And, did I mention that I really like my digital badge?

Edited on 2016-04-19 to add that Edie kindly pointed out on Twitter that T-shirts were made for my first book, and were quite popular. I even had a sweatshirt made. A former Adobe colleague, Lynn Shade, designed the T-shirt for my second book, and many were made, including a very small number of hooded sweatshirts. While I did prepare a T-shirt design for my latest book, only a handful of prototypes were made. The front and back are shown below:

🐡

]]>
The Passing of My Mentor https://ccjktype.fonts.adobe.com/2015/12/the-passing-of-my-mentor.html Thu, 10 Dec 2015 03:51:40 +0000 http://blogs.adobe.com/CCJKType/?p=4174 Continue reading ]]>

On Thursday, December 3rd of 2015, a great man—and a man of faith—passed from our world to the next. Professor Edward Daub, my mentor and youngest son’s namesake, passed away at the age of 91.

I first met Professor Daub when I was an undergraduate student at UW-Madison, having enrolled in his Technical Japanese 1 class in Fall of 1986. We used the Comprehending Technical Japanese (aka CTJ) textbook for that class, and for the subsequent Technical Japanese 2 class the following semester. Shortly after those classes ended, Professor Daub asked whether I would be interested in working for him, to assist him and Professors Inoue (RIP) and Bird on their Basic Technical Japanese (aka BTJ) book project. Professor Daub observed that I was able to notice subtle differences in the forms of kanji (Japanese ideographs), and I suspect that he felt that such a skill was useful for this project. The book was published in 1990. I convinced Professors Daub and Bird, along with UW Press, that I could typeset the book using the Japanese version of Aldus PageMaker, but that we needed to acquire an Apple LaserWriter II NTX-J in order to print the camera-ready copy for publishing. We did so, and ended up using Morisawa’s Ryumin and Gothic BBB typefaces for the layout. BTJ was the very first book that I typeset.

My work with Professor Daub contributed very much to the funding of my graduate studies at UW-Madison, which happened to be in a completely different field (linguistics). When my son was born in early 1990, Professor Daub served as a very appropriate namesake. I eventually left the Madison area in Summer of 1991 to start a career at Adobe, and I have been there ever since. When I returned to UW-Madison in May of 1994 to receive my PhD degree, I asked Professor Daub to accompany me when I accepted the degree. Given the extent of his influence in my life and career, this was very appropriate. It was a great honor. I stayed in touch with Professor Daub over the years, and made a point of visiting him whenever my travels brought me back to Wisconsin, where I was born. My father was the first to learn the news of his passing, which he conveyed to me on the day of his passing. I am very grateful for that.

Professor Daub’s funeral will take place in Madison, Wisconsin on the afternoon of Saturday, January 16th of 2016, and I plan to be there, along with my son, Edward, who happens to live and work in the same city.

RIP my friend, my mentor, and my son’s namesake. You will be missed, and I look forward to meeting you again on the other side.

Please join me in raising a glass in his memory and honor. 🍻

]]>
Never Say Never https://ccjktype.fonts.adobe.com/2012/04/never-say-never.html https://ccjktype.fonts.adobe.com/2012/04/never-say-never.html#comments Mon, 30 Apr 2012 21:40:48 +0000 http://blogs.adobe.com/CCJKType/?p=1562 Continue reading ]]>

In the realm of CJK Unified Ideographs, there is always talk about no more characters to encode, or that any new characters are simply unifiable variants. This is, in large part, merely wishful thinking.

In my experience, there are three important words to embrace: Never Say Never.

While perusing IWATA Corporation’s website, I came across the page about their extension to Kyodo News’ U-PRESS character set, which included a convenient PDF. I checked all of the characters, mainly to establish as many mappings to Adobe-Japan1-6 as possible, and found that 8 of the kanji were not in Unicode. This effort included checking the latest version of Extension E (aka IRG N1830), which is soon to become standardized. The image below highlights in yellow the 8 kanji that are not yet in Unicode:

What this demonstrates is simply that CJK Unified Ideographs are genuinely an open-ended script, and that there is always a possibility that new characters will be coined or discovered.

]]>
The All-Important Macron https://ccjktype.fonts.adobe.com/2012/04/macron.html https://ccjktype.fonts.adobe.com/2012/04/macron.html#comments Tue, 24 Apr 2012 17:05:01 +0000 http://blogs.adobe.com/CCJKType/?p=1491 Continue reading ]]>

There are three systems for transliterating Japanese text using Latin characters. Of these, the Hepburn system (ヘボン式 hebon shiki) is the most commonly used one, and differs in one important way: long vowels are represented with a macron (U+00AF MACRON or U+0304 COMBINING MACRON) diacritic. Almost all signage in Japan that includes transliterated text, such as in train and subway stations, uses the Hepburn system. However, if we look back to the 1990s and earlier, it was not common to include glyphs for macroned vowels in fonts, whether they were for Latin or Japanese use.

The two other systems, the Kunrei system (訓令式 kunrei shiki) and the Nippon system (日本式 nippon shiki), represent long vowels with a circumflex (U+005E CIRCUMFLEX ACCENT or U+0302 COMBINING CIRCUMFLEX ACCENT) diacritic. It was common for Latin fonts to include glyphs for circumflexed vowels, meaning U+00C2/U+00E2 (Ââ), U+00CA/U+00EA (Êê), U+00CE/U+00EE (Îî), U+00D4/U+00F4 (Ôô), and U+00DB/U+00FB (Ûû), by virtue of being included in ISO/IEC 8859-1 (aka Latin 1). However, due to limitations of Shift-JIS encoding, even Japanese fonts did not include glyphs for these characters.

I can think of three specific things that paved the way to broader use of macroned vowels:

First and foremost, Unicode and its de facto status for representing digital text was a key factor, and laid the foundation. These characters are encoded at U+0100/U+0101 (Āā), U+0112/U+0113 (Ēē), U+012A/U+012B (Īī), U+014C/U+014D (Ōō), and U+016A/U+016B (Ūū).

Second, to enable macroned vowels in mainstream Japanese fonts, a standard glyph set needed to include their glyphs. When I was developing Adobe-Japan1-4 in the late 1990s, glyphs for macroned vowels were early candidates, and eventually became CIDs 9361 through 9370. Of course, they are encoded according to Unicode in the Unicode CMap resources.

Third, mainstream Latin fonts began including glyphs for macroned vowels, mainly thanks to Unicode, and OpenType’s excellent support for Unicode. In terms of Adobe’s glyph sets, glyphs for macroned vowels are included in fonts that support Adobe Latin 3 (aka Adobe CE) or better.

Now, thanks to these efforts, it is relatively easy to transliterate 東京 using the more common Tōkyō, as opposed to the less common Tôkyô. The difference is shown below at a larger size:

Tōkyō versus Tôkyô
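For readers who want to experiment, here is a minimal Python sketch of how the circumflexed vowels relate to the macroned ones in Unicode terms. It assumes that every circumflex in the input stands for a long vowel, which holds for Kunrei- or Nippon-style transliterations but not for arbitrary text (French or Vietnamese, for example). The function name `circumflex_to_macron` is made up for this example.

```python
import unicodedata

COMBINING_CIRCUMFLEX = "\u0302"
COMBINING_MACRON = "\u0304"


def circumflex_to_macron(text: str) -> str:
    """Swap circumflexed vowels for macroned ones (assumes romanized Japanese)."""
    # NFD splits U+00F4 (ô) into "o" + U+0302 so the diacritic can be swapped.
    decomposed = unicodedata.normalize("NFD", text)
    swapped = decomposed.replace(COMBINING_CIRCUMFLEX, COMBINING_MACRON)
    # NFC recombines "o" + U+0304 into the precomposed U+014D (ō).
    return unicodedata.normalize("NFC", swapped)


hepburn = circumflex_to_macron("Tôkyô")
print(hepburn)
print([f"U+{ord(ch):04X}" for ch in hepburn])
```

The result uses the precomposed characters U+014D (ō) rather than a base letter followed by U+0304, which matches the code points listed earlier in this article.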

]]>
Adobe-Japan1-6 Radical/Stroke Database https://ccjktype.fonts.adobe.com/2012/04/aj16-radical-stroke-db.html Tue, 17 Apr 2012 20:19:17 +0000 http://blogs.adobe.com/CCJKType/?p=1456 Continue reading ]]>

I spent approximately two weeks in August of 2004 developing a radical/stroke database for the 14,664 kanji in Adobe-Japan1-6 (CIDs 656, 1125–7477, 7633–7886, 7961–8004, 8266, 8267, 8284, 8285, 8359–8717, 13320–15443, 16779–20316, and 21071–23057), which is available as a tab-delimited text file that is keyed by Adobe-Japan1-6 CIDs, and as a PDF file that is keyed by indexing radical, then by the number of strokes of the indexing radical instance, followed by the number of remaining strokes, and finally by Adobe-Japan1-6 CID.
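As a quick sanity check, the CID ranges listed above can be tallied with a few lines of Python. The ranges are copied from the text, and the names `KANJI_CID_RANGES`, `count_cids`, and `is_kanji_cid` are made up for this sketch.

```python
# Inclusive (first, last) CID ranges for the Adobe-Japan1-6 kanji, as listed above.
KANJI_CID_RANGES = [
    (656, 656),
    (1125, 7477),
    (7633, 7886),
    (7961, 8004),
    (8266, 8267),
    (8284, 8285),
    (8359, 8717),
    (13320, 15443),
    (16779, 20316),
    (21071, 23057),
]


def count_cids(ranges):
    """Count the CIDs covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


def is_kanji_cid(cid):
    """Return True if the CID falls inside one of the kanji ranges."""
    return any(first <= cid <= last for first, last in KANJI_CID_RANGES)


print(count_cids(KANJI_CID_RANGES))  # 14664
print(is_kanji_cid(656), is_kanji_cid(657))  # True False
```

The total agrees with the 14,664 kanji stated above.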

For each Adobe-Japan1-6 kanji, there is at least one radical/stroke record, which consists of the following three comma-separated fields: the indexing radical (1–214), the number of strokes of the indexing radical instance, and the number of remaining strokes. Of course, the total number of strokes can be calculated by simply adding the second and third field.

The excerpt below, taken from the PDF file, illustrates two features of this radical/stroke database:

First, it demonstrates the importance and usefulness of recording the number of strokes for the indexing radical instance. Radical #162, for example, can consist of three (辶), four (辶), or even seven (辵) strokes, depending on the glyph.

Second, it shows that some glyphs have additional radical/stroke records, which are shown under the Adobe-Japan1-6 CID. In the tab-delimited text file, multiple radical/stroke records are separated by a semicolon. 2,429 of the 14,664 kanji have two radical/stroke records, and 21 of them have three records.
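The record format described above is simple enough to parse in a few lines. The Python sketch below assumes that each line of the tab-delimited file holds a CID, a tab, and one or more records, with nothing else on the line; that layout is an assumption, so please check it against the actual file. The sample line is invented for illustration (its CID, 99999, is deliberately outside Adobe-Japan1-6), and the names `RadicalStroke` and `parse_line` are made up as well.

```python
from typing import List, NamedTuple, Tuple


class RadicalStroke(NamedTuple):
    radical: int            # indexing radical, 1 through 214
    radical_strokes: int    # strokes in this instance of the radical
    remaining_strokes: int  # strokes outside the radical

    @property
    def total_strokes(self) -> int:
        # Total is the sum of the second and third fields.
        return self.radical_strokes + self.remaining_strokes


def parse_line(line: str) -> Tuple[int, List[RadicalStroke]]:
    """Parse 'CID<TAB>record[;record...]' (assumed layout) into typed values."""
    cid_text, records_text = line.rstrip("\n").split("\t")
    records = []
    for record in records_text.split(";"):
        radical, radical_strokes, remaining = (int(f) for f in record.split(","))
        if not 1 <= radical <= 214:
            raise ValueError(f"radical {radical} is outside 1..214")
        records.append(RadicalStroke(radical, radical_strokes, remaining))
    return int(cid_text), records


# Invented sample line, not taken from the real database.
cid, records = parse_line("99999\t162,3,9;162,4,9")
for rec in records:
    print(cid, rec.radical, rec.total_strokes)
```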

Enjoy! I hope that others find these to be useful resources when working with Adobe-Japan1-6 kanji.

]]>
Advantages of Numeric Character References https://ccjktype.fonts.adobe.com/2012/04/ncr.html Fri, 06 Apr 2012 19:57:35 +0000 http://blogs.adobe.com/CCJKType/?p=1393 Continue reading ]]>

Unicode has become the preferred way in which to represent text in digital form, and for good reason. Its broad coverage of our planet’s scripts and languages is the single greatest reason why this has happened. All of the major OSes have embraced Unicode. In other words, if you develop a product that makes use of text data, and if it doesn’t support Unicode, you’re doing something wrong.

Unicode comes in a variety of representations called encoding forms. The three most basic Unicode encoding forms are UTF-8, UTF-16, and UTF-32. The latter two are also available in explicit little- or big-endian flavors: UTF-16LE, UTF-16BE, UTF-32LE, and UTF-32BE. These are covered in Chapter 4 of CJKV Information Processing, Second Edition. But, there are times when a bomb-proof way of representing Unicode characters is needed, or when an otherwise ASCII-only web document requires the occasional Unicode characters. For these purposes, and in the context of web documents, Numeric Character References (aka, NCRs) have great advantages. One of the advantages is their human-readability in terms of conveying an explicit Unicode code point. Another advantage is that only ASCII characters are used for this notation, which is its bomb-proof aspect.
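To make the encoding forms mentioned above a bit more concrete, this small Python sketch prints the bytes for a single character, U+4E00 (一), in each of them. The helper name `encoded_forms` is made up for this example, and the codec names are the ones that Python’s standard library uses.

```python
CODECS = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")


def encoded_forms(ch: str) -> dict:
    """Map each codec name to the character's bytes, shown as spaced hex."""
    return {
        codec: " ".join(f"{byte:02X}" for byte in ch.encode(codec))
        for codec in CODECS
    }


for codec, hex_bytes in encoded_forms("\u4E00").items():
    print(f"{codec:10} {hex_bytes}")
# utf-8      E4 B8 80
# utf-16-le  00 4E
# utf-16-be  4E 00
# utf-32-le  00 4E 00 00
# utf-32-be  00 00 4E 00
```

The same code point yields five different byte sequences, which is exactly the kind of ambiguity that an ASCII-only notation avoids.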

To briefly demonstrate how NCRs work, consider the following two renditions of U+4E00, which is the first CJK Unified Ideograph of the URO (Unified Repertoire & Ordering), and which means “one”: 一 (NCR representation: “&#x4E00;”) versus 一 (binary representation in one of the Unicode encoding forms). Both forms look the same, and they should. If you examine the HTML source of this page by using the appropriate display option of your browser, you’ll see the difference. All modern browsers support NCRs.

An NCR is composed of three parts. The first part consists of the three characters &#x. What follows is the character designator, which is best described as a Unicode scalar value in hexadecimal, such as U+4E00, but without the “U+” prefix, meaning 4E00. The last part is simply a trailing semicolon.
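The three parts described above map directly onto a short Python function. This is only a sketch: the name `to_ncr` is made up, and it leaves every ASCII character untouched, including “&”, so it is not a substitute for proper HTML escaping. Decoding uses `html.unescape` from Python’s standard library.

```python
import html


def to_ncr(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal NCR."""
    return "".join(
        ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};"  # "&#x" + scalar value + ";"
        for ch in text
    )


encoded = to_ncr("\u4E00")
print(encoded)                             # &#x4E00;
print(html.unescape(encoded) == "\u4E00")  # True
print(to_ncr("\u00A9 2012"))               # &#xA9; 2012
```

The output contains only ASCII characters, so it survives being saved or served in any ASCII-compatible encoding.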

Some appropriate uses of NCRs include the occasional use of a copyright or trademark symbol in what would otherwise be an ASCII-only web document, or for the occasional use of any non-ASCII character, such as 一.

]]>
Not One, But Three, IVD Code Charts https://ccjktype.fonts.adobe.com/2012/03/three-ivd-code-charts.html Fri, 23 Mar 2012 14:06:43 +0000 http://blogs.adobe.com/CCJKType/?p=1334 Continue reading ]]>

Thanks to an excellent suggestion from Taichi Kawabata (川幡太一), the 2012-03-02 version of the IVD (Ideographic Variation Database) includes three IVD Code Charts, which were released today. The two earlier versions of the IVD (2007-12-14 and 2010-11-14) included only one IVD Code Chart, named IVD_Charts.pdf.

One of the three IVD Code Charts is still named IVD_Charts.pdf, and represents the entire IVD for that particular version. The other ones, whose filenames include the name of each registered IVD collection, represent a subset of the IVD that is based on each registered IVD collection. For the 2012-03-02 version of the IVD, the two additional IVD Code Charts are thus named IVD_Charts_Adobe-Japan1.pdf and IVD_Charts_Hanyo-Denshi.pdf, for the Adobe-Japan1 and Hanyo-Denshi IVD Collections, respectively.

The IVD_Stats.txt datafile was also updated today, for all three versions of the IVD, and now reports additional statistics.

If you are interested in timely updates about the IVD, please consider following the IVD Registrar (@IVD_Registrar) on Twitter.

]]>
Genuine Han Unification https://ccjktype.fonts.adobe.com/2012/01/genuine-han-unification.html Wed, 04 Jan 2012 17:34:38 +0000 http://blogs.adobe.com/CCJKType/?p=540 Continue reading ]]>

I have been attending the Internationalization & Unicode Conference (aka, IUC) every year for the past several years, and I typically deliver a presentation (or two) during the two-day conference proper. I was given the opportunity to present about an intriguing and forward-looking topic at IUC35 last October that I entitled Genuine Han Unification (click on the title to view the presentation slides).

The primary premise of the presentation is that it is possible that the (sometimes subtle) differences between CJK Unified Ideographs as used by the various regions—such as China, Taiwan, Hong Kong, Japan, and the Koreas—may become irrelevant, and that there may be a movement to unify these differences so that a single shape or form becomes the norm. Of course, this cannot happen today, mainly due to the biases of the current generation, but I am suggesting that a future generation may be bold enough to take on such an initiative. After all, these characters are from the same script, and even with subtle differences, they are mutually intelligible, as evidenced by today’s mobile devices that often deliver a single glyph per CJK Unified Ideograph code point regardless of the language.

Below is Table 3-99 that was excerpted from page 174 of CJKV Information Processing (Second Edition) that provides examples of CJK Unified Ideographs whose shapes may be different depending on the locale or region.

Of course, the CJK Unified Ideograph U+4E00 (一) is present in the table because it serves as a prototypical example of a CJK Unified Ideograph that requires only a single glyph regardless of the locale or region.

Interestingly, the first real-world implementation of Genuine Han Unification is arguably GB 18030, which is a character set standard that was established by China, whose first version was published in 2000. A revised version was subsequently published in 2005. A relevant characteristic of GB 18030 is that it is code-point–compatible with all future versions of Unicode, meaning that all CJK Unified Ideographs have a corresponding GB 18030 code point, and that GB 18030 defines a single glyph per CJK Unified Ideograph code point. This, by definition, is Genuine Han Unification. In other words, regardless of whether a particular CJK Unified Ideograph is used in China, GB 18030 includes a glyph for it, designed according to the conventions set forth by China. This means that CJK Unified Ideographs that are specific to Japan or Korea may look inappropriate to those from those regions.
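The claim that every CJK Unified Ideograph has a corresponding GB 18030 code point can be spot-checked with the `gb18030` codec that ships with Python. This sketch only samples two characters, U+4E00 from the URO and U+20000 from Extension B, so it illustrates the idea rather than proving it, and it assumes that Python’s codec tracks the standard closely enough for this purpose. The helper name `gb18030_hex` is made up.

```python
def gb18030_hex(ch: str) -> str:
    """Return the GB 18030 bytes of a character as spaced hex."""
    return " ".join(f"{byte:02X}" for byte in ch.encode("gb18030"))


for ch in ("\u4E00", "\U00020000"):
    encoded = ch.encode("gb18030")
    # Round-tripping confirms that the mapping is lossless in both directions.
    assert encoded.decode("gb18030") == ch
    print(f"U+{ord(ch):04X} -> {gb18030_hex(ch)} ({len(encoded)} bytes)")
```

U+4E00 comes out as a two-byte code, while the Extension B character needs four bytes, which is how GB 18030 reaches code points beyond the BMP.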

In any case, I encourage those who are interested in this topic to peruse my IUC35 presentation whose link is provided earlier in this post. This topic is admittedly one that is likely to polarize people, meaning that some will vehemently disagree, and some will completely agree. Because it is a forward-looking topic, only time will tell what will happen.

]]>