Encodings

The following encodings are supported by Server (and Views):

  • US-ASCII

  • ISO-8859-1 (Latin1)

  • ISO-8859-2 (Latin2)

  • ISO-8859-3 (Latin3)

  • ISO-8859-4 (Latin4)

  • ISO-8859-5 (LatinCyrillic)

  • ISO-8859-6 (LatinArabic)

  • ISO-8859-7 (LatinGreek)

  • ISO-8859-8 (LatinHebrew)

  • ISO-8859-9 (Latin5)

  • ISO-8859-10 (Latin6)

  • ISO-8859-13 (Latin7)

  • ISO-8859-14 (Latin8)

  • ISO-8859-15 (Latin9)

  • EUC-JP

  • Shift_JIS

  • EUC-KR

  • GB2312

  • Big5

  • EUC-TW

  • hp-roman8

  • IBM850

  • windows-1250

  • windows-1251

  • windows-1252

  • windows-1253

  • windows-1254

  • windows-1255

  • windows-1256

  • windows-1257

  • windows-949

  • UTF-8

ISO-8859-1

Latin1 covers most West European languages, such as:

  • Afrikaans (af)

  • Albanian (sq)

  • Basque (eu)

  • Catalan (ca)

  • Danish (da)

  • Dutch (nl)

  • English (en)

  • Faroese (fo)

  • Finnish (fi)

  • French (fr)

  • Galician (gl)

  • German (de)

  • Icelandic (is)

  • Irish (ga)

  • Italian (it)

  • Norwegian (no)

  • Portuguese (pt)

  • Scottish Gaelic (gd)

  • Spanish (es)

  • Swedish (sv)

ISO-8859-2

Latin2 covers the languages of Central and Eastern Europe:

  • Croatian (hr)

  • Czech (cs)

  • Hungarian (hu)

  • Polish (pl)

  • Romanian (ro)

  • Slovak (sk)

  • Slovenian (sl)

ISO-8859-3

Latin3 is popular with authors of Esperanto (eo) and Maltese (mt). It also covered Turkish before the introduction of Latin5.

ISO-8859-4

Latin4 introduced letters for Estonian, the Baltic languages Latvian and Lithuanian, Greenlandic, and Lappish. It is an incomplete precursor of Latin6.

ISO-8859-5

With these Cyrillic letters you can type Bulgarian (bg), Byelorussian (be), Macedonian (mk), Russian (ru), Serbian (sr) and Ukrainian (uk).

ISO-8859-6

This is the Arabic (ar) alphabet.

Note

This version of Views does not support bidirectional text.

ISO-8859-7

This is modern Greek (el).

ISO-8859-8

This is Hebrew (iw).

Note

This version of Views does not support bidirectional text.

ISO-8859-9

Latin5 replaces the rarely needed Icelandic letters in Latin1 with the Turkish (tr) ones.
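
As a small illustration of this substitution, the following Python sketch decodes the six byte values at which Latin1 and Latin5 differ. It uses Python's built-in latin-1 and iso8859-9 codecs, which are independent of Server and Views and serve here only as a convenient reference; the name SWAPPED is chosen for this example.

```python
# The six byte values where Latin5 (ISO-8859-9) differs from Latin1 (ISO-8859-1).
SWAPPED = bytes([0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE])

icelandic = SWAPPED.decode("latin-1")    # Icelandic letters in Latin1
turkish = SWAPPED.decode("iso8859-9")    # Turkish letters in Latin5

# Show each byte with the character it maps to in both encodings.
for byte, old, new in zip(SWAPPED, icelandic, turkish):
    print(f"0x{byte:02X}: Latin1 {old!r} -> Latin5 {new!r}")
```

The same bytes therefore render as different letters depending on which of the two encodings is assumed when the text is read.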

ISO-8859-10

Latin6 rearranged Latin4, added the last missing Inuit (Greenlandic Eskimo) and non-Skolt Sami (Lappish) letters, and reintroduced the rarely needed Icelandic letters to cover the entire Nordic area:

  • Estonian (et)

  • Lapp

  • Latvian (lv)

  • Lithuanian (lt)

Skolt Sami still needs a few more accents.

ISO-8859-13

Latin7 covers the Baltic Rim. It re-establishes the Latvian (lv) support lost in Latin6 and adds the local quotation marks. It resembles WinBaltic, that is, windows-1257.

ISO-8859-14

Latin8 adds the last Gaelic and Welsh (cy) letters to Latin1 to cover all Celtic languages.

ISO-8859-15

Latin9, nicknamed Latin0, is similar to Latin1 but adds the euro sign and the oe ligature. It updates Latin1 by replacing the less needed symbols ¨´¸ with forgotten French and Finnish letters and by placing the euro sign (U+20AC) in cell 0xA4, formerly occupied by the international currency sign ¤.
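
To see exactly which cells differ, the following Python sketch compares the two tables byte by byte. It relies on Python's built-in latin-1 and iso8859-15 codecs rather than on anything in Server or Views, so treat it as an outside reference check; the variable name changed is chosen for this example.

```python
# Collect every byte value that Latin1 and Latin9 decode to different characters.
changed = [
    b for b in range(256)
    if bytes([b]).decode("latin-1") != bytes([b]).decode("iso8859-15")
]

# Print the old and new character for each changed cell.
for b in changed:
    old = bytes([b]).decode("latin-1")
    new = bytes([b]).decode("iso8859-15")
    print(f"0x{b:02X}: Latin1 {old!r} -> Latin9 {new!r}")
```

Only eight cells change, so text that avoids those cells reads identically under either encoding.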

EUC-JP

Extended UNIX Code for Japanese.

Standardized by OSF, UNIX International, and UNIX Systems Laboratories Pacific. Uses ISO 2022 rules to select:

  • code set 0: JIS Roman (a single 7-bit byte set)

  • code set 1: JIS X0208-1990 (a double 8-bit byte set) restricted to A0-FF in both bytes

  • code set 2: Half Width Katakana (a single 7-bit byte set) requiring SS2 as the character prefix

  • code set 3: JIS X0212-1990 (a double 7-bit byte set) restricted to A0-FF in both bytes requiring SS3 as the character prefix
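
The code set selection above can be sketched as a small Python routine that splits an EUC-JP byte string by inspecting the lead byte of each character. The function name split_euc_jp is hypothetical and not part of Server or Views. The sketch assumes SS2 is 0x8E and SS3 is 0x8F, and it accepts lead bytes A1 through FE for code set 1, the range commonly used in practice, which lies inside the A0-FF range stated above. The sample is produced with Python's built-in euc_jp codec.

```python
def split_euc_jp(data: bytes) -> list:
    """Split EUC-JP bytes into (code_set, raw_bytes) pairs by lead byte."""
    SS2, SS3 = 0x8E, 0x8F  # single-shift prefixes for code sets 2 and 3
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            code_set, width = 0, 1      # code set 0: one 7-bit byte
        elif lead == SS2:
            code_set, width = 2, 2      # code set 2: SS2 + one katakana byte
        elif lead == SS3:
            code_set, width = 3, 3      # code set 3: SS3 + two JIS X0212 bytes
        elif 0xA1 <= lead <= 0xFE:
            code_set, width = 1, 2      # code set 1: two JIS X0208 bytes
        else:
            raise ValueError(f"unexpected lead byte 0x{lead:02X} at offset {i}")
        chunk = data[i:i + width]
        if len(chunk) < width:
            raise ValueError(f"truncated sequence at offset {i}")
        out.append((code_set, chunk))
        i += width
    return out


# Latin letter A, hiragana A (U+3042), half-width katakana A (U+FF71).
sample = "A\u3042\uff71".encode("euc_jp")
parts = split_euc_jp(sample)
print(parts)
```

The routine only classifies and splits; it does not validate trail bytes or map them to characters.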

Shift_JIS

A Microsoft code that extends csHalfWidthKatakana to include kanji by adding a second byte when the value of the first byte is in the ranges 81-9F or E0-EF.
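
The lead-byte rule can be sketched in Python as follows. The helper names is_sjis_lead and split_shift_jis are hypothetical and not part of Server or Views; the ranges are exactly the ones given above, and the sample is produced with Python's built-in shift_jis codec.

```python
def is_sjis_lead(byte: int) -> bool:
    """True if the byte starts a two-byte Shift_JIS character."""
    return 0x81 <= byte <= 0x9F or 0xE0 <= byte <= 0xEF


def split_shift_jis(data: bytes) -> list:
    """Split Shift_JIS bytes into one bytes object per character."""
    out = []
    i = 0
    while i < len(data):
        width = 2 if is_sjis_lead(data[i]) else 1
        chunk = data[i:i + width]
        if len(chunk) < width:
            raise ValueError(f"truncated two-byte character at offset {i}")
        out.append(chunk)
        i += width
    return out


# Latin letter A, hiragana A (U+3042), half-width katakana A (U+FF71).
sample = "A\u3042\uff71".encode("shift_jis")
chars = split_shift_jis(sample)
print(chars)
```

Half-width katakana stay single-byte because their values (A1 through DF) fall outside both lead-byte ranges.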

EUC-KR (KS C 5861-1992)

Extended UNIX® Code for Korean.

GB2312

Multibyte encoding standardized by the People’s Republic of China.

Big5

Multibyte encoding standardized by Taiwan.

EUC-TW (cns11643)

Extended UNIX Code for Traditional Chinese.

hp-roman8

HP-specific encoding.

IBM850

IBM-specific encoding.

windows-1250

Windows 3.1 Eastern European languages.

windows-1251

Windows 3.1 Cyrillic

windows-1252

Windows 3.1 US (ANSI)

windows-1253

Windows 3.1 Greek

windows-1254

Windows 3.1 Turkish

windows-1255

Hebrew

Note

This version of Views does not support bidirectional text.

windows-1256

Arabic

Note

This version of Views does not support bidirectional text.

windows-1257

Baltic

windows-949

Korean (Wansung)

UTF-8

Unicode UTF-8