Luit -- locale and ISO 2022 support for Unicode terminals

This page was originally written in 2004. Since then, luit has been taken over -- twice. Once by the X.Org project, and once by Thomas Dickey. I am extremely grateful to both groups for keeping luit alive. Please direct any inquiries to them, not to me.

UTF-8 is a well-defined, simple, reliable, efficient and self-synchronising encoding for multilingual text. Hopefully, in a few years, all Unix-like systems will be using UTF-8 as the only encoding for plain text.

Many recent terminal emulators implement a UTF-8 mode; this includes the XFree86 version of XTerm, the Linux console, PuTTY, the latest versions of C-Kermit and Kermit-95, kterm and Gnome-terminal.

Unfortunately, most text-mode applications available today expect to speak to the terminal either in a locale-specific encoding, or using ISO 2022. While a number of localised and ISO 2022 terminal emulators do exist, none are as reliable as XTerm, and having multiple terminal emulators for multiple languages is madness.

Rather than implementing support for locale-specific encodings or (horror!) full ISO 2022 in XTerm, I have decided to write luit, a utility that can be run in any UTF-8 terminal, and will simulate locale-specific encodings as well as almost full ISO 2022 support.
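The core idea of translating a locale-specific byte stream into UTF-8 can be sketched in a few lines of Python. This is only an illustration of the principle, not luit's actual code (luit is written in C and also handles ISO 2022 state). The helper name `legacy_to_utf8` is made up for this example. It uses an incremental decoder because a multi-byte character may be split across two reads from the pty.

```python
import codecs

def legacy_to_utf8(chunks, encoding):
    """Yield UTF-8 bytes for an iterable of byte chunks in a legacy encoding.

    Hypothetical helper for illustration; not part of luit.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        # The decoder buffers an incomplete multi-byte character
        # until the rest of it arrives in a later chunk.
        text = decoder.decode(chunk)
        if text:
            yield text.encode("utf-8")
    # Flush whatever is left once the stream ends.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")

# A KOI8-R string delivered in one piece.
koi8 = "привет".encode("koi8_r")
print(b"".join(legacy_to_utf8([koi8], "koi8_r")).decode("utf-8"))

# An EUC-JP string delivered with a read boundary inside the first character.
eucjp = "日本".encode("euc_jp")
print(b"".join(legacy_to_utf8([eucjp[:1], eucjp[1:]], "euc_jp")).decode("utf-8"))
```

A real filter would run this in both directions, converting keyboard input from UTF-8 into the locale's encoding as well.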

Read the luit manual page. Also have a look at Pluto.

Luit has been integrated into XFree86 since version 4.2. Since then, Tomohiro Kubota has done some very useful work to make luit support irregular character sets (such as Shift-JIS and GBK), and has made XTerm invoke luit automatically when needed. If you need a standalone version, you can still download an obsolete version of luit.


Screenshots

Here luit is used to simulate a terminal emulator for a number of locales. The example shows luit's support for the ISO 8859 series and the EUC character sets, as well as for irregular character sets such as KOI8-R and Big 5.

locale demo


An example of an application that likes to speak to the terminal by switching the ISO 2022 state using ISO 6429 escape sequences is Emacs/MULE (the extensible, customizable, self-documenting real-time display editor / incomprehensible extension). Here, Emacs/MULE is running within luit, which is running within a UTF-8 XTerm. MULE changes the state of luit using ISO 6429 escape sequences, and luit converts Emacs' output into UTF-8. You may notice that luit doesn't yet support some of the charsets used by MULE (notably Vietnamese, Thai and MULE's nonstandard variant of Big 5). You may also note that the order of the Hebrew characters is incorrect; but that's MULE's fault, luit never claimed to implement BIDI.

(An interesting question is whether BIDI should be implemented at the application level or below it, at the terminal emulator level. My opinion is that BIDI definitely belongs above the terminal emulator, and within the application.)

Luit's command line in this case was:

$ luit -gr g2

and MULE was informed that

M-x set-terminal-coding-system iso-2022-8bit-ss2

MULE demo

Note that you may disable MULE by putting

(standard-display-european t)

in your .emacs file. Aah, what a relief, Emacs works normally once again.


Here's a demonstration of terminal input. The two ideographs were pasted into XTerm using UTF8_STRING. XTerm passed them to luit as UTF-8; luit converted them into EUC-JP and passed them to the pty. The inferior terminal driver echoed the characters back, which were converted back into UTF-8 by luit and displayed by XTerm.

The dump shows that luit used the control character SS3 (0x8F) to select JIS X 0212. (It also shows that luit chose to use GL codes for the input; for compatibility with ISO 2022-JP, recent versions of luit use GR instead.)

Input demo
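The round trip described above, and the role of the single-shift bytes in EUC-JP, can be reproduced with Python's built-in `euc_jp` codec. This is a sketch for illustration only: the function names `to_pty` and `from_pty` are invented here, and the sample characters are my own choice, not the ones in the screenshot. In EUC-JP, the byte 0x8E is SS2 (it introduces a half-width katakana from JIS X 0201) and 0x8F is SS3 (it introduces a character from JIS X 0212).

```python
SS2, SS3 = 0x8E, 0x8F

def to_pty(utf8_bytes):
    """Terminal -> application: UTF-8 in, EUC-JP out (hypothetical helper)."""
    return utf8_bytes.decode("utf-8").encode("euc_jp")

def from_pty(eucjp_bytes):
    """Application -> terminal: EUC-JP in, UTF-8 out (hypothetical helper)."""
    return eucjp_bytes.decode("euc_jp").encode("utf-8")

samples = {
    "JIS X 0208 kanji": "\u65e5",                # 日
    "JIS X 0201 half-width katakana": "\uff71",  # ｱ
    "JIS X 0212 kanji": "\u4e02",                # 丂
}

dumps = {}
for label, ch in samples.items():
    raw = to_pty(ch.encode("utf-8"))
    dumps[label] = raw
    hexdump = " ".join(f"{b:02x}" for b in raw)
    print(f"{label}: U+{ord(ch):04X} -> {hexdump}")
    # What the pty echoes back is converted to UTF-8 again for display.
    assert from_pty(raw).decode("utf-8") == ch
```

Only the characters outside JIS X 0208 need a single-shift prefix, which is why they take one more byte than ordinary kanji.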


From the manual page of luit:

None of this complexity should be necessary. Stateless UTF-8 throughout the system is the way to go.

Fortunately, this is the way Free Unix-like systems have chosen to go.


Juliusz Chroboczek