Unicode and Charset


 

n2pdf has Unicode support (http://en.wikipedia.org/wiki/Unicode). This support is enabled by default and can be disabled if required with the option N2PDFOPTION_SYSTEM_UNICODE_MODE.

 

Unicode provides a basis for processing the contents of texts in different languages. This means, for example, that texts in different languages can be passed to n2pdf function calls.

 


Unicode only provides a basis for processing the contents of texts in different languages. Visual display of these texts depends on different techniques within each medium, e.g. font-embedding or CID fonts.

 

 

Unicode also makes it possible to create PDF files on systems whose "native language" (Codepage: http://en.wikipedia.org/wiki/Codepage) does not match the contents of the PDF file. For example, a computer running with Codepage 1252 (Latin) can also create a file for Codepage 932 (Japanese). This requires, however, that support for the relevant language is installed and that all technical requirements (e.g. required character sets) for creating the PDF file are in place.
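The mismatch between a text and a codepage can be illustrated with a short Python sketch. It is not part of n2pdf: the helper name fits_codepage is hypothetical, and the check relies only on the codecs shipped with Python. It simply tests whether every character of a text can be represented in a given Windows codepage.

```python
def fits_codepage(text: str, codepage: int) -> bool:
    """Return True if every character of text exists in the Windows codepage."""
    try:
        # Python names the Windows codepages cp1252, cp932, and so on.
        text.encode(f"cp{codepage}")
        return True
    except UnicodeEncodeError:
        return False


latin_text = "Grüße"
japanese_text = "日本語"

print(fits_codepage(latin_text, 1252))     # Latin text on a Latin codepage
print(fits_codepage(japanese_text, 1252))  # Japanese text on a Latin codepage
print(fits_codepage(japanese_text, 932))   # Japanese text on the Japanese codepage
```

Text that does not fit the system's own codepage is the case in which the character set of the PDF file has to be chosen deliberately.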

 


At present, only character sets with LTR (left-to-right) alignment are supported. It is therefore not possible to display character sets with RTL (right-to-left) alignment, such as Arabic or Hebrew.
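Because right-to-left text cannot be displayed, it can be useful to detect it before passing content to n2pdf. The following Python sketch is one possible pre-check and is not an n2pdf function: the helper name contains_rtl is hypothetical, and it assumes that a character counts as RTL when its Unicode bidirectional class is "R" (e.g. Hebrew letters) or "AL" (Arabic letters).

```python
import unicodedata

# Bidirectional classes of strongly right-to-left characters.
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if text contains at least one strongly RTL character."""
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)


print(contains_rtl("שלום"))         # Hebrew
print(contains_rtl("مرحبا"))        # Arabic
print(contains_rtl("Hello 日本語"))  # Latin and Japanese only
```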

 

In conjunction with Unicode, particular attention should be paid to N2PDFOPTION_PDF_CID_FONT_MODE. This setting has a major influence on the visual display of Unicode contents in a PDF document. You should therefore read the description of this parameter under "PDF settings".

 

 

Unicode restrictions

Passwords: User and owner passwords for the PDF file must not contain any Unicode characters. The same applies to any password set for compressing the created PDF file.

 

Filename for the PDF file: If the created PDF file is to be compressed into a ZIP file after creation, the filename must not contain any Unicode characters. This is a restriction imposed by the Zip file format. If the PDF file is not to be compressed, however, the filename may contain Unicode characters.

 

Templates: When defining text formatting templates, no Unicode characters can be used in the template designation. In addition, a template should not use any character sets whose names contain Unicode characters.

 

File linking: If file attachments are stored on a drive as files and are added to the PDF file as a link, the filenames (including folder) must not contain any Unicode characters. No Unicode characters can be used for embedding or importing file attachments.
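The password and filename restrictions above can be checked before a job is started. The Python sketch below is only an illustration and not part of n2pdf: the function names are hypothetical, and it assumes conservatively that "Unicode characters" means anything outside 7-bit ASCII, since the exact boundary is not defined here.

```python
from typing import List


def is_ascii_only(value: str) -> bool:
    """True if value contains only 7-bit ASCII characters (assumed to be safe)."""
    return value.isascii()


def find_unicode_problems(user_password: str, owner_password: str,
                          pdf_filename: str, zip_after_creation: bool) -> List[str]:
    """Collect the settings that violate the restrictions listed above."""
    problems = []
    if not is_ascii_only(user_password):
        problems.append("user password")
    if not is_ascii_only(owner_password):
        problems.append("owner password")
    # The filename restriction only applies if the PDF is zipped afterwards.
    if zip_after_creation and not is_ascii_only(pdf_filename):
        problems.append("filename")
    return problems


print(find_unicode_problems("secret", "pässword", "報告.pdf", zip_after_creation=True))
print(find_unicode_problems("secret", "admin", "報告.pdf", zip_after_creation=False))
```

The same is_ascii_only check could be applied to template names and to the paths of linked file attachments.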

 

Enabling Unicode support

Call N2PDFSetOption ( JobID, N2PDFOPTION_SYSTEM_UNICODE_MODE, N2PDFVALUE_TRUE, "" )

 

 

Codepage and Character Set (Charset)

With Unicode support enabled, when a PDF file is created the "Character Set" (http://en.wikipedia.org/wiki/Character_set) belonging to the current codepage (http://www.microsoft.com/globaldev/reference/WinCP.mspx) is used as the default (e.g. with Codepage 1251 (Cyrillic), Character Set 204 is enabled). The PDF file is therefore always based on the character set that is enabled on the computer at the time the PDF file is created.

 

Codepage of operating system        Assigned Character Set
1250 (Central Europe)               EASTEUROPE_CHARSET (238)
1252 (Latin I)                      DEFAULT_CHARSET (1)
1251 (Cyrillic)                     RUSSIAN_CHARSET (204)
1253 (Greek)                        GREEK_CHARSET (161)
1254 (Turkish)                      TURKISH_CHARSET (162)
1257 (Baltic)                       BALTIC_CHARSET (186)
1258 (Vietnam)                      VIETNAMESE_CHARSET (163)
874 (Thai)                          THAI_CHARSET (222)
932 (Japanese Shift-JIS)            SHIFTJIS_CHARSET (128)
936 (Simplified Chinese)            GB2312_CHARSET (134)
950 (Traditional Chinese Big5)      CHINESEBIG5_CHARSET (136)
949 (Korean)                        HANGEUL_CHARSET (129)
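For scripts that prepare n2pdf jobs, the assignment above can be kept as a simple lookup. The Python sketch below is only an illustration: the names CODEPAGE_TO_CHARSET and charset_for_codepage are hypothetical, and the values are copied from the table, not queried from n2pdf or the operating system.

```python
# Codepage -> (character set name, numeric value), as listed in the table above.
CODEPAGE_TO_CHARSET = {
    1250: ("EASTEUROPE_CHARSET", 238),
    1252: ("DEFAULT_CHARSET", 1),
    1251: ("RUSSIAN_CHARSET", 204),
    1253: ("GREEK_CHARSET", 161),
    1254: ("TURKISH_CHARSET", 162),
    1257: ("BALTIC_CHARSET", 186),
    1258: ("VIETNAMESE_CHARSET", 163),
    874: ("THAI_CHARSET", 222),
    932: ("SHIFTJIS_CHARSET", 128),
    936: ("GB2312_CHARSET", 134),
    950: ("CHINESEBIG5_CHARSET", 136),
    949: ("HANGEUL_CHARSET", 129),
}


def charset_for_codepage(codepage: int) -> int:
    """Return the numeric character set value for a codepage in the table."""
    name, value = CODEPAGE_TO_CHARSET[codepage]  # KeyError if not listed
    return value


print(charset_for_codepage(936))   # Simplified Chinese
print(charset_for_codepage(1251))  # Cyrillic
```

The returned number is the kind of value that is passed as the character set in the N2PDFOPTION_PDF_CHARSET option.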

 

You need to change this setting if you create a PDF file for a different character set, i.e. one that is not based on the computer's current character set. This can be necessary, for example, if you wish to create a PDF file with Chinese content on an English-language operating system. In this case, you need to set the character set of the PDF file explicitly. The following call alters the character set (it should be made as soon as possible after N2PDFInit).

 

Call N2PDFSetOption ( JobID, N2PDFOPTION_PDF_CHARSET, 134, "" )   ' 134 = GB2312_CHARSET (Simplified Chinese)

 

You will find further information about character sets (CharSets) in the "PDF settings" section.