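(Translation: Taro Yamamoto (山本太郎), Adobe Type team)
I recently devised a model that allows variable glyph widths while also handling Western text, whose glyphs must be rotated in vertical layout, and tate-chu-yoko (縦中横: horizontal elements set within a vertical line) in Japanese typesetting.
The purpose of this article is to draw attention to the open source font that I developed and to the description of its behavioral model. Both are intended for developers who implement application software and layout engines that support such fonts.
The test font is a two-axis Variable Font that maps 1,111,998 Unicode code points to one of the following two glyphs, based on the Unicode 12.1 version of the UAX #50 (Unicode Vertical Text Layout) data file, VerticalOrientation.txt:
Because the difference between upright and rotated orientation matters in vertical layout, the above mapping of code points to glyphs seems appropriate.
The test font introduced here contains 256 instances each of the Western and CJK glyphs, at GIDs 1 through 256 and GIDs 257 through 512, respectively. Their shapes and parameters are as follows:
Each glyph has 256 instances because this simplifies the Unicode mappings that are specified in the 'cmap' table, which matters when there are more than a million mappings.
GID+513 is an explicitly half-width glyph used for tate-chu-yoko, and it replaces the Western glyphs via the 'hwid' (Half Widths) GSUB feature.
The test font includes the already-registered 'wdth' (Width) design-variation axis and the unregistered 'VWID' (Vertical Width) design-variation axis. As for the name "Vertical Width," I am aware that "Vertical Advance" would be technically more accurate, but "Width" pairs better with the 'wdth' axis. For both axes, the default setting is 500, which corresponds to a 600-unit horizontal advance for the Western glyphs and a 1000×1000 square (full-width) body for the CJK glyphs. The minimum setting is 1 and the maximum is 1000, which correspond to a 25% narrower width and a 25% wider width, respectively.
The test font is available in both OpenType/CFF2 and TrueType formats from the open source Width & Vertical Width VF project on GitHub.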
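For horizontal layout, the model described in this article is relatively simple. The 'wdth' axis is used to narrow or widen glyphs along the X-axis as desired. The 'VWID' axis is fixed at its default setting, which corresponds to full-width, or 1000 units, for the CJK glyphs. In other words, the relative height of the glyphs does not change in horizontal layout.
The animated GIF below was created with the test font and shows variable-width horizontal layout (the character string used was actually "かなABC漢字", although this is not evident because only rectangular glyphs are displayed).
The glyph height does not change as the width narrows or widens, because the 'VWID' axis is fixed at its default setting. As for the animation timing, the default setting is shown for five seconds, each of the two extremes for two seconds, and each intermediate setting for one second.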
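For vertical layout, the handling of CJK glyphs that remain upright is relatively straightforward. The 'VWID' axis is used to narrow or widen glyphs along the vertical Y-axis as desired, and the 'wdth' axis is fixed at its default setting, which corresponds to full-width, or 1000 units, for the CJK glyphs. The following two cases require special handling:
The animated GIF below was created with the test font and shows variable-width vertical layout, including both rotated strings and tate-chu-yoko (as in the previous example, the character string used was "あAB漢字12国", although this is not evident because only rectangular glyphs are displayed).
Whether the rotated Western glyphs become narrower or wider, their relative height does not change. The relative height of the rotated Western glyphs is bound to the width of the CJK glyphs. Likewise, the tate-chu-yoko glyphs become shorter or taller, but their width is bound to the width of the CJK glyphs [when the tate-chu-yoko run is limited to two characters]. The animation timing is the same as in the horizontal example above.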
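The following table summarizes the model in terms of the settings and setting ranges of the two design-variation axes under the layout conditions described in the previous two sections:
Axis | Horizontal | Vertical (upright) | Vertical (rotated) | Vertical (tate-chu-yoko) |
---|---|---|---|---|
wdth | 1–1000 | 500 | 1–1000 | 500 |
VWID | 500 | 1–1000 | 500 | 1–1000 |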
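Of course, the actual settings and setting ranges are based on the test font that I developed, and other Variable Fonts that follow this model may use slightly different values. The essential point is that one of the axes is fixed at its default setting, which is 500 in the font that I developed.
The 'wdth' and 'VWID' design-variation axes are implemented as separate axes in a Variable Font, and one of them needs to be fixed as described above, but it would be appropriate for a UI to show only a single functional axis, named "Variable Width." What matters is that its setting is applied to the appropriate axis, depending on the layout direction (horizontal or vertical) and on whether the characters are rotated or used in a tate-chu-yoko context. Another possible approach is to fix one of the axes by default, but to allow that default behavior to be overridden so that the axis is unlocked.
In either case, how these axes are shown in an application's UI may depend largely on how sophisticated the application itself is. For less-sophisticated applications, showing only a single functional axis is preferable, and more-sophisticated applications could show both axes and allow the default behavior to be overridden, as described above.
If the model described in this article becomes generally accepted, the 'VWID' design-variation axis will need to be registered. It would then have an all-lowercase tag, such as 'vwid' or 'vadv'.
Finally, everything described in this article is still at the stage of a proposal for a model that I hope will become the standard way to implement CJK Variable Fonts that support variable widths (narrower, wider, or both) in horizontal and vertical layout.
If you have comments, please do let me know.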
]]>I recently came up with a Variable Font model to handle glyph compression and expansion in horizontal and vertical layout that includes support for characters whose glyphs rotate in vertical layout, such as the glyphs for Western characters, along with TCY (縦中横 tatechūyoko in Japanese, which literally means “horizontal in vertical”) support.
The purpose of this article is to call attention to the open source test font that I developed, along with a description of the model itself, which are intended to be used by developers to implement such support in apps and layout engines.
The test font is a two-axis Variable Font that maps all 1,111,998 Unicode code points to one of two glyphs, based on the Unicode 12.1 version of the UAX #50 (Unicode Vertical Text Layout) data file, VerticalOrientation.txt:
Given that upright versus rotated orientation plays an important role in vertical layout, the above mappings seemed quite appropriate.
The test font includes 256 instances of the Western and CJK glyphs, from GIDs 1 through 256 and GIDs 257 through 512, respectively, whose shapes and parameters are described as follows:
Why 256 instances of each glyph? It simplified the Unicode mappings that are specified in the 'cmap' table. This is important when dealing with over a million mappings.
GID+513 is an explicit half-width glyph that is intended to be used for TCY purposes, and is used to substitute the Western glyphs via the 'hwid' (Half Widths) GSUB feature.
The test font includes the registered 'wdth' (Width) and unregistered 'VWID' (Vertical Width; and yes, I am aware that “Vertical Advance” would be technically more correct, but the use of “Width” better pairs with the 'wdth' axis) design-variation axes. For both axes, the default setting is 500, which corresponds to a 600-unit horizontal advance for the Western glyphs, and a 1000×1000 box (aka full-width) for the CJK glyphs. The minimum and maximum axis settings are 1 and 1000, which correspond to 25% compression and 25% expansion, respectively.
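To make the axis arithmetic concrete, here is a minimal Python sketch. It assumes that advances interpolate linearly between the minimum, default, and maximum settings, which the article does not state explicitly, and the function names are hypothetical. The normalization step mirrors the default normalization of OpenType variation axes, ignoring any 'avar' remapping.

```python
# Axis limits taken from the article: minimum 1, default 500, maximum 1000.
AXIS_MIN, AXIS_DEFAULT, AXIS_MAX = 1, 500, 1000


def normalize(value: float) -> float:
    """Map a user-space axis setting to the normalized range [-1, 1]."""
    value = max(AXIS_MIN, min(AXIS_MAX, value))
    if value < AXIS_DEFAULT:
        return (value - AXIS_DEFAULT) / (AXIS_DEFAULT - AXIS_MIN)
    return (value - AXIS_DEFAULT) / (AXIS_MAX - AXIS_DEFAULT)


def scaled_advance(default_advance: float, setting: float) -> float:
    """Assumed linear model: -1 is 25% compression, +1 is 25% expansion."""
    return default_advance * (1 + 0.25 * normalize(setting))


if __name__ == "__main__":
    for setting in (1, 500, 1000):
        print(setting, scaled_advance(600, setting), scaled_advance(1000, setting))
```

Under these assumptions, the 600-unit Western advance ranges from 450 to 750 units, and the 1000-unit CJK advance ranges from 750 to 1250 units.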
The test font is available in the open source Width & Vertical Width VF project on GitHub in both OpenType/CFF2 and TrueType formats.
The model that is being described in this article is relatively simple when performing horizontal layout: the 'wdth' axis is used to compress or expand glyphs along the X-axis as desired, and the 'VWID' axis is constrained to its default setting, which corresponds to full-width or 1000 units for the CJK glyphs. In other words, the relative height of the glyphs remains unchanged in horizontal layout.
The animated GIF below was created using the test font, and illustrates compression and expansion in horizontal layout (although it’s not obvious, the character string that was used was “かなABC漢字”):
Note how the glyph height remains unchanged regardless of the compression or expansion, which is due to constraining the 'VWID' axis to its default setting. In terms of animation timing, the default setting is five seconds, the two extremes are two seconds, and the intermediate settings are one second.
When performing vertical layout, the handling of CJK glyphs that remain upright is relatively straightforward: the 'VWID' axis is used to compress or expand glyphs along the Y-axis as desired, and the 'wdth' axis is constrained to its default setting, which corresponds to full-width or 1000 units for the CJK glyphs. The following are the two cases that require special handling:
The animated GIF below was created using the test font, and illustrates compression and expansion in vertical layout, which includes both rotated and TCY strings (once again, it’s not obvious that the character string that was used was “あAB漢字12国”):
Note how the rotated Western glyphs become narrower or wider, but that their relative height remains unchanged. The relative height of the rotated Western glyphs is bound to the width of the CJK glyphs. Likewise, the height of the TCY glyphs becomes shorter or taller, but their widths are bound to the width of the CJK glyphs. The animation timing is identical to the horizontal one.
The following table summarizes the model, in terms of the settings and setting ranges for the two design-variation axes in the layout conditions that were described in the previous two sections of this article:
Axis | Horizontal | Vertical—Upright | Vertical—Rotated | Vertical—TCY |
---|---|---|---|---|
wdth | 1–1000 | 500 | 1–1000 | 500 |
VWID | 500 | 1–1000 | 500 | 1–1000 |
Of course, the actual settings and setting ranges are based on the test font that I developed, and other Variable Fonts that follow this particular model may use slightly different values. The main point is that one of the axes is constrained to its default setting, which is 500 in the test font that I developed.
Although the 'wdth' and 'VWID' design-variation axes are implemented as separate axes in a Variable Font, and given that one of the axes needs to be constrained per the descriptions above, UIs should expose only a single functional axis, named “Width,” whose setting is applied to the appropriate axis, depending on the layout direction—horizontal or vertical—and whether the characters are rotated or used in a TCY context. Another alternative is to lock one of the axes by default, but to permit unlocking to override the default behavior.
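The single "Width" control can be sketched as a small dispatch function in Python. This is only an illustration of the table in this article, not an implementation from any app or layout engine. The `Context` enum and `axis_settings` function are hypothetical names, and the values 1, 500, and 1000 are those of the test font.

```python
from enum import Enum

DEFAULT = 500  # default setting of both axes in the test font


class Context(Enum):
    HORIZONTAL = "horizontal"
    VERTICAL_UPRIGHT = "vertical-upright"
    VERTICAL_ROTATED = "vertical-rotated"
    VERTICAL_TCY = "vertical-tcy"


# Contexts in which the user-facing "Width" setting drives 'wdth';
# in the remaining contexts it drives 'VWID' instead.
WDTH_CONTEXTS = {Context.HORIZONTAL, Context.VERTICAL_ROTATED}


def axis_settings(context: Context, width: int) -> dict:
    """Apply one "Width" value to the functional axis; pin the other axis."""
    width = max(1, min(1000, width))  # clamp to the test font's axis range
    if context in WDTH_CONTEXTS:
        return {"wdth": width, "VWID": DEFAULT}
    return {"wdth": DEFAULT, "VWID": width}


if __name__ == "__main__":
    for ctx in Context:
        print(ctx.value, axis_settings(ctx, 250))
```

An app that allows the lock to be overridden could simply bypass this function and pass both axis values through unchanged.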
In any case, how these axes are exposed to users in app UIs may largely depend on the sophistication of the apps themselves. Less-sophisticated apps would benefit by exposing only a single functional axis, and more-sophisticated ones may allow the default behavior to be overridden by exposing both as described at the end of the previous paragraph.
If the model described in this article becomes generally accepted, the 'VWID' design-variation axis will need to be registered, which means that it would become an all-lowercase tag, such as 'vwid', 'vadv', or similar.
In closing, everything that is described in this article is a proposal for a model that I hope will become the standard way in which CJK Variable Fonts that support compression, expansion, or both, in horizontal and vertical layout, are implemented.
Of course, comments are welcome and encouraged.
]]>I spent the last couple of weeks developing a Variable Font version of the infamous Adobe Blank, and the open source project, named Adobe Blank VF & Friends, was released yesterday evening. But, before I detail what makes the Variable Font versions special, besides being Variable Fonts, let’s briefly go over the history of Adobe Blank and Adobe Blank 2.
First released in 2013 as open source, Adobe Blank simply maps all 1,111,998 Unicode code points to non-spacing and non-marking glyphs. What made the project interesting for me was to find the right balance between the number of glyphs and the size of the 'cmap' table. When mapping over a million code points, this becomes a valid concern. After some experimentation, I found that 2,049 glyphs was the sweet spot that resulted in 'CFF ' and 'cmap' tables of a relatively small size.
Adobe Blank 2, which was first released in 2015, is a two-glyph version of Adobe Blank that includes a Format 13 (Many-to-one range mappings) 'cmap' subtable that maps all 1,111,998 Unicode code points to GID+1. At the time, there was no convenient way to create a Format 13 subtable, so I used ttx, and supplied the actual hex values of the compiled subtable. The current version of ttx can successfully compile a Format 13 subtable by explicitly specifying all 1,111,998 mappings.
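For readers curious about what such a compiled subtable looks like, the following Python sketch packs a Format 13 subtable by hand with `struct`. It is not the procedure that was used for Adobe Blank 2. It assumes that the unmapped code points are the surrogates and the 66 noncharacters (an assumption that happens to reproduce the 1,111,998 figure), and it leaves the language field at zero.

```python
import struct


def mapped_ranges():
    """Yield (start, end) ranges of code points assumed to be mapped.

    Assumption: surrogates (U+D800..U+DFFF) and the 66 noncharacters
    (U+FDD0..U+FDEF, plus the last two code points of each plane) are skipped.
    """
    yield (0x0000, 0xD7FF)
    yield (0xE000, 0xFDCF)
    yield (0xFDF0, 0xFFFD)
    for plane in range(1, 17):
        base = plane << 16
        yield (base, base + 0xFFFD)


def build_format13(glyph_id: int = 1) -> bytes:
    """Pack a 'cmap' Format 13 subtable that maps every range to one glyph."""
    groups = list(mapped_ranges())
    length = 16 + 12 * len(groups)  # 16-byte header plus 12 bytes per group
    # Header: format, reserved, length, language, numGroups (big-endian).
    header = struct.pack(">HHIII", 13, 0, length, 0, len(groups))
    # Each group: startCharCode, endCharCode, glyphID.
    body = b"".join(struct.pack(">III", s, e, glyph_id) for s, e in groups)
    return header + body


if __name__ == "__main__":
    total = sum(end - start + 1 for start, end in mapped_ranges())
    print(total, len(build_format13()))
```

With these ranges, over a million mappings collapse into 19 groups, which illustrates why Format 13 suits a font in which every code point maps to the same glyph.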
That then brings us to the Variable Font versions…
Unlike Adobe Blank and Adobe Blank 2, which are CID-keyed and specify the special-purpose Adobe-Identity-0 ROS (Registry, Ordering, and Supplement), the Variable Font versions are not CID-keyed, because the name- versus CID-keyed distinction does not exist in the 'CFF2' table that is used for Variable Fonts.
Like Adobe Blank, Format 4 (Segment mapping to delta values) and Format 12 (Segmented coverage) 'cmap' subtables are included. The former subtable includes 63,454 mappings. The latter includes all 1,111,998 mappings.
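The two mapping counts can be reproduced with a short Python check. The exclusion rule below (surrogates and noncharacters) is an assumption on my part, not something stated in this article, but it yields exactly 63,454 mappings for the BMP and 1,111,998 overall.

```python
def is_mapped(cp: int) -> bool:
    """Assumed rule: every code point except surrogates and noncharacters."""
    if 0xD800 <= cp <= 0xDFFF:   # surrogates
        return False
    if 0xFDD0 <= cp <= 0xFDEF:   # noncharacters in the BMP
        return False
    if (cp & 0xFFFF) >= 0xFFFE:  # last two code points of every plane
        return False
    return True


# Format 4 can address only the BMP; Format 12 covers all 17 planes.
bmp_count = sum(is_mapped(cp) for cp in range(0x10000))
total_count = sum(is_mapped(cp) for cp in range(0x110000))

if __name__ == "__main__":
    print(bmp_count, total_count)
```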
I originally developed a marking version, named Adobe Black VF, in order to visually test its two design axes, 'wdth' (Width) and 'HGHT' (Height), and later figured that it serves as an excellent test font. The latter axis tag is all uppercase, because it is not yet registered. In keeping with the spirit of Adobe Blank in terms of being non-spacing, the default value of its two axes is zero (0). As the axis values increase, with 1000 being the maximum value, the horizontal or vertical advance also increases as appropriate.
What really makes these Variable Fonts special is the presence of the 'VVAR' (Vertical Metrics Variations) table, which is necessary to accommodate the variable metrics that are associated with the 'HGHT' axis. The AFDKO (Adobe Font Development Kit for OpenType) tools only recently started to support this table.
It is best to use the marking version, Adobe Black VF, to explore how the 'wdth' and 'HGHT' axes are expected to behave in horizontal and vertical writing modes:
Axis | Horizontal | Vertical |
---|---|---|
wdth | The glyph and its horizontal advance expand along the X-axis to the right from the horizontal origin as the value increases | The glyph expands along the X-axis from the center of the em-box to its left and right edges as the value increases |
HGHT | The glyph expands along the Y-axis from the center of the em-box to its top and bottom edges as the value increases | The glyph and its vertical advance expand along the Y-axis downward from the vertical origin as the value increases |
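The table can be restated as a rough geometric sketch in Python. Several things here are assumptions rather than facts about the fonts: that an axis value equals the glyph's extent in font units, that the em-box runs from 0 to 1000 on both axes with Y increasing upward, and that the vertical origin sits at the top edge. The `glyph_box` function is a hypothetical helper.

```python
EM = 1000  # assumed em-box size, matching the 1000x1000 boxes in the images


def glyph_box(wdth: float, hght: float, vertical: bool = False) -> dict:
    """Return the assumed glyph box (x_min, y_min, x_max, y_max) and advance."""
    half = EM / 2
    if vertical:
        # 'wdth' grows outward from the center; 'HGHT' grows downward from
        # the top edge and sets the vertical advance.
        x_min, x_max = half - wdth / 2, half + wdth / 2
        y_min, y_max = EM - hght, EM
        advance = hght
    else:
        # 'wdth' grows rightward from the origin and sets the horizontal
        # advance; 'HGHT' grows outward from the center.
        x_min, x_max = 0, wdth
        y_min, y_max = half - hght / 2, half + hght / 2
        advance = wdth
    return {"box": (x_min, y_min, x_max, y_max), "advance": advance}


if __name__ == "__main__":
    print(glyph_box(500, 1000))
    print(glyph_box(500, 1000, vertical=True))
```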
The five-frame animated image below shows how the two axes behave in horizontal writing mode, starting from axis values that are zero (0), alternately incrementing 'wdth' then 'HGHT' to 500 then to 1000 (surrounded by 1000×1000 cyan-colored boxes to better demonstrate the varying horizontal advances):
The five-frame animated image below illustrates the same, but in vertical writing mode, again starting from axis values that are zero (0), but alternately incrementing 'HGHT' then 'wdth' to 500 then to 1000:
Neat, eh?
Like Adobe Blank 2, a Format 13 'cmap' subtable is used to map all 1,111,998 Unicode code points to GID+1. Adobe Blank 2 VF and Adobe Black 2 VF are otherwise identical to Adobe Blank VF and Adobe Black VF, and should be used only in environments that support the Format 13 'cmap' subtable.
Enjoy!
]]>On this date last year, I published the Contextual Spacing GPOS Features article, and this briefer article serves as an update.
Two important steps toward implementation were taken during the past year:
With regard to the second step, the changes that were made, compared to Version 1.004, were 1) the @ColonExclamQuestion and @ColonExclamQuestionVert glyph classes that include the glyphs for the exclamation and question marks, along with left-justified forms of the colon and semicolon, were removed, as were their contexts; 2) quarter-width metrics adjustments were removed; 3) contexts that include centered punctuation followed by another centered punctuation or a full-width space were removed; and 4) contexts that include a justified period or comma followed by centered punctuation were added.
In closing, the animated image below is identical to the one in the previous article, but has been made using the latest font (none of the changes affected this particular text), and better illustrates the advantages of these GPOS features in environments with limited line-layout capabilities:
As usual, any and all feedback is greatly appreciated!
]]>This is a brief article to draw readers’ attention to my latest test font, which is a 12-font 65,535-glyph OpenType/CFF Collection that is intended to test how well an app or other font-consuming environment supports language tagging for East Asian text, to include the handling of localized strings, such as those for menu names in the 'name' table, and for named Stylistic Set 'GSUB' features.
The Variable Font Collection test fonts that were made available at the beginning of this month serve this purpose to some extent, but they also require an environment that supports not only Variable Fonts (aka OpenType/CFF2 fonts), but also Variable Font Collections (aka OpenType/CFF2 Collections). The main intent of this OpenType/CFF Collection is to remove the Variable Font baggage from the testing requirement. It also includes support for Macao SAR as a third form of Traditional Chinese, which was described in the previous article.
Please visit the open source Source LOCL Test project for more details, or to download the pre-built OpenType/CFF Collection binary from the Latest Release page.
Enjoy!
]]>Macao SAR (SAR stands for Special Administrative Region), written 澳門特別行政區 or 澳門特區, is in the process of standardizing MSCS (Macao Supplementary Character Set or 澳門增補字符集 in Chinese), which is a character set standard that is designed as a supplement to HKSCS (Hong Kong Supplementary Character Set), and by extension, as a supplement to Big Five. One reliable source told me that MSCS can be described as HKSCS plus approximately 150 additional characters.
There are a small number of components whose forms differ from HK conventions, in terms of ideographs that are common with HKSCS, to include Big Five. The preliminary information that I have seen indicates that the following components follow TW conventions: 次 (versus 次 for HK), 女 when used as a left-side component (such as in 好), 粵, 亦, 告 instead of 吿 (such as in 造), 香, 木 when used as a bottom component (such as in 榮), 䍃 (such as in 猺), 啚 (such as in 鄙), and 肉 when used as a bottom component (such as in 骨). Interestingly, it seems that the KR form of the component 关 is used (such as in 咲). The image below includes ideographs that contrast the MO (top) versus the HK (bottom) forms for most of the affected components:
In an effort to prepare to properly support Macao SAR as a third form of Traditional Chinese in the Source Han Sans and Source Han Serif typefaces, along with the Google-branded Noto CJK ones, the open source Variable Font Collection Test project includes three Variable Font Collections (aka OpenType/CFF2 Collections), along with their 12 component Variable Fonts (aka OpenType/CFF2 fonts), that support Macao SAR in two key ways: the 'name' table strings in the fonts whose default region is MO—the two-letter region code for Macao SAR—specify 0x1404 as the Language ID, and the 'locl' (Localized Forms) GSUB feature uses the soon-to-be-registered 'ZHTM' language tag.
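To put the 0x1404 Language ID in context, the Python sketch below lists it alongside the Windows language IDs of the other East Asian locales. Only 0x1404 comes from this article. The other values are the standard Windows LCIDs, and the locale keys and helper names are merely illustrative.

```python
# Windows language IDs as used in 'name' table records (platform ID 3).
WINDOWS_LANGUAGE_IDS = {
    "zh-CN": 0x0804,
    "zh-TW": 0x0404,
    "zh-HK": 0x0C04,
    "zh-MO": 0x1404,  # Macao SAR, the value cited in this article
    "ja-JP": 0x0411,
    "ko-KR": 0x0412,
}


def primary_language(lcid: int) -> int:
    """Low 10 bits: the primary language (0x04 is Chinese)."""
    return lcid & 0x3FF


def sublanguage(lcid: int) -> int:
    """Remaining high bits: the sublanguage, which selects the region."""
    return lcid >> 10


if __name__ == "__main__":
    for locale, lcid in WINDOWS_LANGUAGE_IDS.items():
        print(locale, hex(lcid), primary_language(lcid), sublanguage(lcid))
```

All four Chinese entries share the same primary language and differ only in the sublanguage, which is how Macao SAR can be distinguished from Hong Kong SAR and Taiwan in 'name' table strings.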
In closing, I need to clarify that there are no immediate plans to support Macao SAR as a third form of Traditional Chinese in the Adobe- and Google-branded open source Pan-CJK typefaces, but it is something that I eventually plan to do. Still, no action can be taken until the MSCS standard is published, at least in terms of font development. Font development will also involve supporting the 21 IVSes in the MSARG IVD Collection, along with the appropriate glyphs. With regard to 'name' table and 'locl' GSUB feature support, testing can start now, thanks to the open source test fonts that I made available.
P.S. In the process of writing this article, I learned that the currency used in Macao SAR is called the Macanese pataca, and instead of using a special symbol to represent it, the four characters “MOP$” are used.
P.P.S. The Source LOCL Test OpenType/CFF Collection test font serves as an ideal vehicle for testing support for Macao SAR in apps and other font-consuming environments.
]]>This is a short article that is simply meant to draw developers’ attention to three OpenType/CFF2 Collections (aka Variable Font Collections) that I built this week, which are now available in the open source Variable Font Collection Test project. As stated in the project, the purpose of these Variable Font Collections is to simulate the Source Han and Noto CJK fonts deployed as Variable Fonts, to help make sure that the infrastructure—OSes, apps, layout engines, libraries, and so on—will support them. Remember that it took several years for Microsoft to support OpenType/CFF Collections (OTCs), which finally happened on 2016-08-02. In other words, this is not trivial.
In addition to being collections, these Variable Font Collections are also meant to exhibit characteristics that may have been overlooked in environments that support Variable Fonts, such as multiple FDArray elements, and also include a large number of glyphs and Unicode mappings. These are the characteristics of Pan-CJK fonts, meaning that these Variable Font Collections accurately simulate genuine real-world use cases. In terms of Adobe’s own apps, I am pleased to state that these Variable Font Collections are responsible for a half-dozen bug reports. None of my colleagues went into shock after learning that these fonts broke our apps.
Why collections? Deploying Pan-CJK fonts as separate language-specific Variable Fonts doesn’t make much sense, mainly because it defeats one of the purposes of Variable Fonts, which is a reduced footprint.
BTW, I built what may have been the very first Variable Font Collection on 2019-01-08 using Variable Fonts that my colleague, Masataka Hattori (服部正貴), prepared. I then built what may have been the very first Variable Font with multiple FDArray elements on 2019-01-29. Good stuff.
In closing, I’d like to thank my colleague, Miguel Sousa, for preparing the CFF2 glyph data that I used as the basis for these Variable Font Collections. They should serve as excellent testing fodder for developers of font-consuming software.
]]>Something extraordinary happened today.
This extraordinary event provided to me an opportunity to revisit the open source LOCL Test OpenType/CFF test font that I introduced over two years ago. I improved the language declarations in the 'locl' (Localized Forms) GSUB feature definition, and also made other minor tweaks, two of which can be seen in the image above.
The version of Adobe InDesign CC that was released today during Adobe MAX, Version 14.0, now supports language-tagging for a fifth East Asian language: Traditional Chinese for Hong Kong. This new language-tagging option appears as “Chinese: Hong Kong” in the Character Styles and Paragraph Styles panels, and as the same in the Character panel.
For those who were not aware, OpenType has supported language-tagging for Hong Kong, a flavor of Traditional Chinese, for over 10 years via the three-letter language tag ZHH, which was introduced in Version 1.5 (May 2008) of the OpenType Specification. ZHS is the language tag for Simplified Chinese, and ZHT is the one for Traditional Chinese, but for Taiwan. For Japanese and Korean, JAN and KOR are their language tags, respectively. I am very pleased that Adobe InDesign finally supports all five of these OpenType language tags.
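As a small illustration, the five language tags can be kept in a lookup table. The Python sketch below uses BCP 47-style locale keys and a simple fallback rule, both of which are my own assumptions and not how InDesign or any layout engine necessarily resolves languages.

```python
# OpenType language system tags named in this article.
OPENTYPE_LANGUAGE_TAGS = {
    "zh-CN": "ZHS",  # Simplified Chinese
    "zh-TW": "ZHT",  # Traditional Chinese (Taiwan)
    "zh-HK": "ZHH",  # Traditional Chinese (Hong Kong)
    "ja": "JAN",     # Japanese
    "ko": "KOR",     # Korean
}


def opentype_language_tag(locale: str):
    """Return the tag for a locale, falling back to its language subtag."""
    if locale in OPENTYPE_LANGUAGE_TAGS:
        return OPENTYPE_LANGUAGE_TAGS[locale]
    language = locale.split("-")[0]
    return OPENTYPE_LANGUAGE_TAGS.get(language)  # None if unknown


if __name__ == "__main__":
    for locale in ("zh-HK", "ja-JP", "ko-KR", "fr-FR"):
        print(locale, opentype_language_tag(locale))
```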
The timing couldn’t have been better…
]]>(After realizing that the retargeting of Adobe-Japan1-7 to include only two glyphs, and with a fairly predictable release date range, exhibited characteristics of a pregnancy, I became inspired to write the text for the Adobe-Japan1-6 is Expecting! article while flying from SJC to ORD on the morning of 2018-07-20. I also prepared the article’s images while in-flight. The passenger sitting next to me was justifiably giving me funny looks. My flight to MSN, which was the final destination to attend my 35th high school class reunion in greater-metropolitan Mount Horeb, was delayed three hours, and this gave me an opportunity to publish the article while still on the ground at ORD.)
What do we know about Japan’s new era name? First and foremost, its announcement is unlikely to occur before 2019-02-25, because doing so would divert attention away from the 30th anniversary of the enthronement, 2019-02-24, but it may occur as late as 2019-05-01, which is the date on which the new era begins. That’s effectively a two-month window of uncertainty.
Interestingly, the date 2019-05-01 takes place not only during UTC #159, which will be hosted by me at Adobe, but also during Japan’s Golden Week (ゴールデンウィーク), which may begin early to prepare for the imperial transition.
It needs to be made absolutely clear that it is perfectly acceptable to represent Japan’s era names as two separate kanji, such as 平成 (Heisei) for the current era. However, the very first version of Unicode, Version 1.1 (1993), includes the two-kanji square ligature form of 平成 as U+337B ㍻ SQUARE ERA NAME HEISEI, along with those for the three previous era names. JIS X 0213:2000 was the first JIS standard in which these four characters appeared (JIS X 0221-1995 doesn’t count). This means that there is a precedent for applying the same treatment to Japan’s forthcoming new era name.
I predict that the two-kanji square ligature form of Japan’s new era name will be used more frequently than U+337B ㍻ for the current era, mainly because its use will be considered trendy. In addition, because it requires half the number of encoding units to represent, it may become popular or preferred in length-challenged environments, such as Twitter.
This also means that the JIS X 0213 standard, which was amended in 2004 and 2012, may be amended for a third time to include this new character. If that actually happens, which seems very likely, my best guess is that its Plane-Row-Cell value will be 1-13-63, which is the code point immediately before 1-13-64 (aka U+337B ㍻).
Adobe is already making preparations for its apps, fonts, and other pieces of infrastructure on which our customers and development partners depend.
To that end—and to minimize risk due to the fairly fixed timeline—I decided to define Adobe-Japan1-7 to add exactly two glyphs, CIDs 23058 and 23059, for the horizontal and vertical forms, respectively, of the two-kanji square ligature that will represent Japan’s forthcoming new era name. Furthermore, the code point that will represent the two-kanji square ligature form of Japan’s new era name, U+32FF ㋿, has been reserved by both the UTC (aka Unicode) and WG2 (aka ISO/IEC 10646). Given that the code point and CIDs are stable, I was able to release the Adobe-Japan1-7 versions of the CMap resources and “Adobe-Japan1-UCS2” ToUnicode mapping file to the CMap Resources and Mapping Resources for PDF open source projects, respectively, on 2018-07-30. This allows our apps to update to the Adobe-Japan1-7 versions. It also enables Japanese type foundries to prepare prototype Adobe-Japan1-7 fonts that use placeholder glyphs for CIDs 23058 and 23059. The updated ToUnicode mapping file, which maps Adobe-Japan1-7 CIDs to Unicode values, is particularly important for PDF workflows, especially for PDFs that do not include their own ToUnicode mapping table. Without the ToUnicode mapping file update, PDFs that include this glyph, but lack an embedded ToUnicode mapping table, will not be able to properly Copy&Paste U+32FF ㋿ into other apps.
The actual Adobe-Japan1-7 specification cannot be updated until shortly after the announcement, because representative glyphs are necessary. Its Wiki does describe Adobe-Japan1-7, and also indicates that any glyphs that were previously candidates for Adobe-Japan1-7 are now Adobe-Japan1-8 ones.
Adobe’s priority, in terms of updating key typeface families to include the glyphs for U+32FF ㋿, is Kozuka Mincho (小塚明朝), because its glyphs are needed as the representative glyphs for the glyph charts of the Adobe-Japan1-7 specification. Next will be the open source Source Han Sans (源ノ角ゴシック) and Noto Sans CJK Pan-CJK typeface families, primarily because our friends at Google will need the latter family’s fonts for their ecosystem. (I am planning to include the hooks for supporting U+32FF ㋿ in the Version 2.000 update, by including placeholder glyphs and their mappings, to make the already-planned dot-release much easier.) I will then turn my attention to the Kozuka Gothic (小塚ゴシック) typeface family.
I already built, for internal Adobe testing purposes, Adobe-Japan1-7 prototypes of the Kozuka fonts that use placeholder glyphs for CIDs 23058 and 23059. The 'cmap' table maps U+32FF ㋿ to CID+23058, and the 'vert' (Vertical Alternates) GSUB feature replaces CID+23058 with CID+23059 when in vertical writing mode. These fonts have already proven to be very useful. The JIS2004-savvy fonts include “Pr7N” in their names, and the JIS90-savvy ones include “Pr7” instead. I also used the opportunity to build two types of OpenType/CFF Collections. One type includes two fonts, specifically the Pr7N and Pr7 versions for each family and weight, which share the same CFF. The other one simply includes all 24 Kozuka fonts, and is a little less than 70MB. It’s not difficult to guess which one I am using.
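The glyph selection described above can be modeled in a few lines of Python. This is a toy model with plain dictionaries standing in for the 'cmap' table and the 'vert' feature, not code that reads a real font, and `glyph_for` is a hypothetical helper.

```python
# Stand-ins for the font tables, using the CIDs given in this article.
CMAP = {0x32FF: 23058}  # 'cmap': U+32FF maps to the horizontal form
VERT = {23058: 23059}   # 'vert': horizontal form is replaced by the vertical form


def glyph_for(code_point: int, vertical: bool = False) -> int:
    """Look up the CID for a code point, applying 'vert' in vertical mode."""
    cid = CMAP[code_point]
    if vertical:
        cid = VERT.get(cid, cid)  # glyphs without an alternate are unchanged
    return cid


if __name__ == "__main__":
    print(glyph_for(0x32FF), glyph_for(0x32FF, vertical=True))
```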
Although I cannot make the prototype Kozuka fonts available, because they are commercial fonts, I did build an open source font named Adobe Japan1 7 Heavy whose CIDs use full-width glyphs of the single digits that represent the Supplement to which they belong. For example, the glyphs for CIDs 23058 and 23059 are displayed as the digit for the number seven. I prepared test PDFs that were exported from Adobe InDesign and Adobe Illustrator that include five Japan era name characters—U+337E ㍾, U+337D ㍽, U+337C ㍼, U+337B ㍻, & U+32FF ㋿—in a horizontal and vertical text frame. (Note that the original Adobe InDesign or Adobe Illustrator files are attached to the respective PDFs for the purpose of repurposing.) The horizontal text displays as “00017” because the glyphs for the first three are in Supplement 0 (aka Adobe-Japan1-0), that of the fourth is in Supplement 1 (aka Adobe-Japan1-1), and that of the fifth is in Supplement 7 (aka Adobe-Japan1-7). The vertical text displays as “44447” because the glyphs for the first four are in Supplement 4 (aka Adobe-Japan1-4), and that of the fifth is in Supplement 7 (aka Adobe-Japan1-7). The InDesign-exported PDF file includes an embedded ToUnicode mapping table, whereas the Illustrator-exported one does not, and therefore depends on Adobe Acrobat’s “Adobe-Japan1-UCS2” ToUnicode mapping file to correctly Copy&Paste the glyphs for U+32FF ㋿.
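The digit that such a font displays for a given CID follows from the Supplement boundaries of the Adobe-Japan1 character collection. The Python sketch below uses the first CID of each Supplement as I understand the published glyph counts (8,284, 8,359, 8,720, 9,354, 15,444, 20,317, 23,058, and 23,060 glyphs through Supplements 0 to 7), and the `supplement` function is a hypothetical helper rather than part of the font's sources.

```python
from bisect import bisect_right

# First CID of Supplements 0 through 7 of Adobe-Japan1.
SUPPLEMENT_STARTS = [0, 8284, 8359, 8720, 9354, 15444, 20317, 23058]
LAST_CID = 23059  # final CID of Adobe-Japan1-7


def supplement(cid: int) -> int:
    """Return the Supplement (0-7) to which an Adobe-Japan1 CID belongs."""
    if not 0 <= cid <= LAST_CID:
        raise ValueError(f"CID {cid} is outside Adobe-Japan1-7")
    return bisect_right(SUPPLEMENT_STARTS, cid) - 1


if __name__ == "__main__":
    for cid in (0, 8283, 8284, 23057, 23058, 23059):
        print(cid, supplement(cid))
```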
Even if Japan’s new era name were to be announced as early as 2019-02-25, it would still be too late to include it in Unicode Version 12.0, which is scheduled to be released on 2019-03-05. The UTC has therefore decided to issue a dot-release, Version 12.1, shortly after the announcement, and it will include a single character, specifically U+32FF ㋿. This isn’t the first time that Unicode issued a dot-release that included only one character. Version 6.2, which was released in 2012, added only U+20BA ₺ TURKISH LIRA SIGN. Anyway, what makes this new character particularly problematic, in terms of not being able to include it in Version 12.0, is the fact that its character name cannot be established until the two kanji that comprise it are announced, and also that it requires a decomposition to be defined that results in the same two kanji. For example, U+337B ㍻ decomposes to 平成 according to NFKD (Normalization Form KD: Compatibility Decomposition) and NFKC (Normalization Form KC: Compatibility Decomposition, followed by Canonical Composition).
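The decomposition behavior of the existing era name character is easy to confirm with Python's standard `unicodedata` module. The sketch below checks only U+337B, because whether U+32FF decomposes depends on the Unicode version that the Python build ships with.

```python
import unicodedata

heisei = "\u337B"  # SQUARE ERA NAME HEISEI

# Compatibility normalization forms replace the ligature with two kanji.
nfkd = unicodedata.normalize("NFKD", heisei)
nfkc = unicodedata.normalize("NFKC", heisei)

# Canonical normalization leaves it alone, because the decomposition
# is a compatibility one.
nfd = unicodedata.normalize("NFD", heisei)

if __name__ == "__main__":
    print(nfkd, nfkc, nfd == heisei)
```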
Shortly after Unicode Version 12.1 is released, CLDR (Common Locale Data Repository) and ICU (International Components for Unicode) are expected to be updated to support the new era name. This is particularly important for updating calendar and date formats.
In closing, if your company develops products that may be affected by Japan’s era name change, I strongly encourage you to start taking action now, at least to the extent that is possible. If you are a font developer, hopefully the preparations that I have made thus far are helpful for your own efforts.
]]>Japanese line layout is very complex, and the first attempt to standardize its rules and principles was in the JIS X 4051 standard, which was first issued in 1993 with the title 日本語文書の行組版方法 (Line Composition Rules for Japanese Documents in English). There was a revision issued in 1995, and the latest version was issued in 2004 with the slightly different title 日本語文書の組版方法 (Formatting rules for Japanese documents). Another important document is the W3C Working Group Note JLREQ (Requirements for Japanese Text Layout), which provides much of what is described in JIS X 4051, but covers additional areas, and is tailored toward web technologies. Although still considered working drafts, W3C is also preparing similar documents for Chinese and Korean as CLREQ (Requirements for Chinese Text Layout) and KLREQ (Requirements for Hangul Text Layout and Typography), respectively.
This article is not about these standards per se, which are intended for apps and environments that implement sophisticated line layout. Rather, this article is about harsher “plain text” or comparable environments that generally do not need such treatment, yet still benefit from a modest amount of context-based spacing adjustment, particularly to get rid of unwanted space between full-width brackets and other punctuation whose glyphs generally fill half of the em-box. App menus, app dialogs, and simple text editors are examples of where such adjustments can improve text layout in these modest ways.
This all started when I met with Koji Ishii (石井宏治) of Google last Friday morning to discuss ways in which fonts can provide a modest amount of contextual metrics for these simpler environments. The existing kerning features were discussed as a possible vehicle for such metrics information, but there is an obvious conflict between fonts that are intended to supply genuine kerning values and fonts that supply contextual spacing values, which generally use fixed values and affect a much smaller number of glyphs. Some fonts may choose to support both: genuine kerning and associated metrics for more complex environments, and the contextual spacing that is described in this article, which is meant for simpler environments.
Our meeting ended with lunch at PHB, along with the idea to register two new OpenType GPOS (Glyph Positioning) features that would be tentatively tagged 'cspc' (“Contextual Spacing”) and 'vcsp' (“Vertical Contextual Spacing”), and which would behave like the 'kern' and 'vkrn' GPOS features in that they adjust inter-glyph spacing.
I used Friday afternoon to come up with a preliminary set of characters, which includes all of the full-width brackets in the U+3000 and U+FF00 blocks. I subsequently organized them into glyph classes. The vertical variants in the U+FE10 and U+FE30 blocks are also included because their glyphs are mapped from those code points, and they are also specified as vertical variants via the 'vert' GSUB feature.
Saturday morning was spent building an OpenType/CFF font with fewer than 100 glyphs that is based on Source Han Serif. This font is still considered Pan-CJK in that it supports region-specific forms for some punctuation and one ideograph. As a result, the hani, kana, and latn scripts are supported, as are the JAN, ZHS, and ZHT languages, and the appropriate declarations are specified in both the 'locl' (Localized Forms) and 'vert' GSUB features. In order to be able to test in environments that do not support arbitrary OpenType features, the example font includes the lookups for the 'cspc' and 'vcsp' GPOS features in the 'kern' and 'vkrn' GPOS features, respectively.
The image below, which was prepared using the example implementation, is a close re-creation of the seven example strings that are shown in Figure 3.13 of JLREQ. The right side shows the glyphs set solid with default full-width metrics, and the left side shows the result when the 'vcsp' GPOS feature is applied:
Not bad, right?
The latest example OpenType/CFF font, SourceHanSerifCSP-Heavy.otf Version 1.004 built on 2018-04-23, includes glyphs for ‘’“” 、。〈〉《》「」『』【】〔〕〖〗〘〙〚〛〝〟あ・好(),.!:;?[]{}⦅⦆ ← you can copy the characters from this paragraph. U+00B7 ·, U+2022 •, and U+2027 ‧ map to the glyph for U+30FB ・, and U+2329 〈 and U+232A 〉 map to the glyphs for U+3008 〈 and U+3009 〉, respectively. Vertical forms that are encoded in the U+FE10 and U+FE30 blocks are also mapped from those code points, but are also accessible via the 'vert' GSUB feature.
For those who are interested in this idea, please download and use the example font, and be sure to language-tag the text for languages other than Japanese, specifically Simplified Chinese and Traditional Chinese, as the forms of some punctuation and the single ideograph will change as appropriate.
Of course, these GPOS features are not meant as a substitute for implementing proper line layout support based on JIS X 4051 or JLREQ for apps that deserve to provide their customers such support. And, because these are separate GPOS features, it is possible for fonts that include them to continue to function in environments that support more sophisticated line layout, but also in simpler environments that would need to invoke only the 'cspc' or 'vcsp' GPOS feature, as appropriate. Feel free to examine the raw features file that uses Unicode-based glyph names. Any feedback is welcome.
One important caveat for font developers is that all four valuerecord values (xPlacement, yPlacement, xAdvance, and yAdvance) need to be explicitly supplied for each glyph or glyph class pair of the 'vcsp' GPOS feature as specified in the “features” file in order for the spacing adjustments to be applied along the Y-axis, not the X-axis. Most of the pairs use <0 0 0 -500> as the valuerecord, which removes 500 units from the vertical advance of the first glyph in the pair. The AFDKO makeotf tool treats the 'vkrn' GPOS feature specially in this regard, which is why only a single adjustment value needs to be supplied for that feature.
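To make the arithmetic concrete, here is a small Python sketch that models how a four-field valuerecord on the first glyph of a pair changes its vertical advance. It is only an illustration of the idea, not how a layout engine or makeotf works internally. The class names `close` and `other`, the `PAIRS` table, and the function name are hypothetical and are not taken from the example font's features file. The 1000-unit default advance assumes full-width glyphs, and the advance is treated as a simple magnitude.

```python
from typing import NamedTuple

class ValueRecord(NamedTuple):
    """The four valuerecord fields, in the order used in the text above."""
    x_placement: int
    y_placement: int
    x_advance: int
    y_advance: int

FULL_WIDTH = 1000  # assumed default vertical advance of a full-width glyph

# Hypothetical pair table: a closing bracket followed by another closing
# bracket gives up 500 units of its vertical advance.
PAIRS = {("close", "close"): ValueRecord(0, 0, 0, -500)}

def adjusted_vertical_advances(glyph_classes, pairs, default=FULL_WIDTH):
    """Apply each matching pair's y_advance to the first glyph of the pair."""
    advances = [default] * len(glyph_classes)
    for i in range(len(glyph_classes) - 1):
        record = pairs.get((glyph_classes[i], glyph_classes[i + 1]))
        if record is not None:
            advances[i] += record.y_advance
    return advances

result = adjusted_vertical_advances(["close", "close", "other"], PAIRS)
print(result)
```

Only `y_advance` is non-zero in the record, which mirrors the point that the adjustment must land on the Y-axis; a value placed in `x_advance` would leave the vertical advances in this model unchanged.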
]]>