Internationalization
The vast majority of the world's languages can be used with the Report Generator. Thanks to XML’s default character set of UTF-8, as
well as its ability to use “preferred” native character sets like Shift-JIS or EUC-KR, specifying the actual characters to display is not a
problem. When editing your XML, just remember to save the file in the correct encoding: UTF-8 unless you’ve specified otherwise.
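As a minimal sketch of saving in the correct encoding, the Java class below writes a document whose bytes match its UTF-8 declaration. The class name, the file name and the <body> and <p> elements in the sample text are assumptions made for illustration, not part of the Report Generator API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SaveReportXml {
    // Illustrative document only: the <body> and <p> elements are assumptions.
    // The Japanese text is written with Unicode escapes so that this source
    // file compiles the same way whatever encoding it is saved in.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<pdf lang=\"ja\"><body><p>\u65E5\u672C\u8A9E</p></body></pdf>\n";

    // Encode explicitly as UTF-8 rather than relying on the platform default,
    // so the bytes always agree with the encoding named in the declaration.
    static byte[] encode(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Write the encoded document to disk.
    static void save(Path target, String xml) {
        try {
            Files.write(target, encode(xml));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical output file name; pass a different path as the first argument.
        Path target = Paths.get(args.length > 0 ? args[0] : "report.xml");
        save(target, SAMPLE);
    }
}
```

If the declaration named a different encoding, such as Shift-JIS, the same approach would apply with the matching Charset passed to getBytes.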
When it comes to actually displaying the characters, the key is to use the right font. The standard fonts (Helvetica, Times and Courier)
will, as far as we know, display the following languages correctly:
English, French, German, Portuguese, Italian, Spanish, Dutch (no “ij” ligature), Danish, Swedish, Norwegian, Icelandic, Finnish,
Polish, Croatian, Czech, Hungarian, Romanian, Slovak, Slovenian, Latvian, Lithuanian, Estonian, Turkish, Catalan (although the “L
with dot” character is missing), Basque, Albanian, Rhaeto-Romance, Sorbian, Faroese, Irish, Scottish, Afrikaans, Swahili, Frisian,
Galician, Indonesian/Malay and Tagalog.
For Chinese, Japanese and Korean the obvious choice is to use the standard East Asian fonts like “hygothic”, “heiseimin” and
“mhei” (the full list is in the Fonts section).
For other languages like Czech, Slovenian, Russian or Hebrew that require characters not directly supported by the PDF specification,
the best method is to embed an appropriate OpenType or Type 1 font using the LINK element. Provided the font contains the
character, and the “embed” attribute is left at its default value of “true”, the characters should display correctly.
Right-to-left languages (Arabic, Hebrew, Syriac and Urdu) are supported. The “direction” attribute controls the overall flow of the text;
it defaults to “rtl” for these languages and can also be set manually. The “unicode-bidi” CSS property is not supported, so for
further control it is necessary to embed the appropriate Unicode directional formatting characters in the text, e.g. &#x202B; (U+202B,
RIGHT-TO-LEFT EMBEDDING).
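As a rough sketch of embedding directional control characters, the helper below wraps a run of text in the Unicode RIGHT-TO-LEFT EMBEDDING (U+202B) or LEFT-TO-RIGHT EMBEDDING (U+202A) character and closes it with POP DIRECTIONAL FORMATTING (U+202C). The class and method names are hypothetical; only the code points themselves come from the Unicode standard.

```java
final class BidiMarkup {
    static final char LRE = '\u202A'; // LEFT-TO-RIGHT EMBEDDING
    static final char RLE = '\u202B'; // RIGHT-TO-LEFT EMBEDDING
    static final char PDF = '\u202C'; // POP DIRECTIONAL FORMATTING (ends an embedding)

    // Force the given run to be laid out right-to-left.
    static String embedRtl(String text) {
        return RLE + text + PDF;
    }

    // Force the given run to be laid out left-to-right.
    static String embedLtr(String text) {
        return LRE + text + PDF;
    }

    // Render a control character as an XML numeric character reference,
    // which is easier to see in an editor than the invisible character.
    static String toXmlReference(char c) {
        return String.format("&#x%04X;", (int) c);
    }

    public static void main(String[] args) {
        String wrapped = embedRtl("abc");
        StringBuilder visible = new StringBuilder();
        for (char c : wrapped.toCharArray()) {
            // Show control characters as references and everything else as-is.
            visible.append(c == LRE || c == RLE || c == PDF ? toXmlReference(c) : String.valueOf(c));
        }
        System.out.println(visible);
    }
}
```

Writing the characters as numeric references in the XML source has the same effect as typing them directly and makes them visible when editing.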
Every element in the document can have a language set using the “lang” attribute, which defaults to the current locale of the PDF
generation process. This attribute affects several things: the style of quote substitution if the “requote” attribute is true, the
currency format used when a “currency()” formatter is used with graphs, the default text direction, the default font (if the language is
Chinese, Japanese or Korean) and the default page size. For en_US, en_CA and fr_CA the default page size is Letter; for all other
locales it is A4. Example values are “de” for German and “en_GB” for British English. Generally it is enough to set the “lang” attribute
of the <pdf> element, which sets the language for the entire document.
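The page size rule can be restated as a small lookup. The sketch below only mirrors the rule as described in this section; it is not the Report Generator's own code, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

final class DefaultPageSize {
    // The three locales this section names as defaulting to Letter.
    private static final Set<String> LETTER_LOCALES = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList("en_US", "en_CA", "fr_CA")));

    // Return the default page size name for a "lang" value such as "de" or "en_GB".
    static String forLang(String lang) {
        return LETTER_LOCALES.contains(lang) ? "Letter" : "A4";
    }

    public static void main(String[] args) {
        for (String lang : new String[] {"en_US", "fr_CA", "en_GB", "de"}) {
            System.out.println(lang + " -> " + forLang(lang));
        }
    }
}
```

Note that a bare language code such as “en” falls through to A4 in this sketch, since only the three full locale names are listed.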
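When creating documents from JSP pages, remember to set the character set of the page to match the encoding in the <?xml?>
declaration. Where either is omitted its default applies: ISO-8859-1 for the JSP page and UTF-8 for the XML declaration. This also
applies to pages included via the <jsp:include> method. The following examples are all valid: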
<?xml version="1.0" encoding="ISO-8859-1"?>
<%@ page language="java" contentType="text/xml"%>
<!-- document follows in ISO-8859-1 -->

<?xml version="1.0"?>
<%@ page language="java" contentType="text/xml; charset=UTF-8"%>
<!-- document follows in UTF-8 -->

<?xml version="1.0" encoding="ShiftJIS"?>
<%@ page language="java" contentType="text/xml; charset=ShiftJIS"%>
<!-- document follows in Shift JIS -->