Utf-8 linux console

From SlackWiki

by Dive

A Quick Look at Using UTF-8 in the Linux Console (without X)

A few people have asked about this in ##slackware lately, so I thought I'd put down a few notes on how I got UTF-8 up and running. Feel free to add any further info.

In these days of high-powered X desktop environments with all their flashy composite bells and whistles, it may seem strange to worry about how the plain Linux console behaves. But there are some very nice applications that live in the console and can be just as productive as (or even more productive than) their X Window counterparts. I'm thinking of screen, irssi, mutt, elinks, vim/vi, mplayer (using svga), links, zgv and slrn, to name but a few. And of course we have to mention Slackware tools like pkgtool, slackpkg and sbopkg. Once you get hooked on using the console it is hard to go back. Anyway, here we go ...

The Console

The first thing to do is configure the console. There is a kernel parameter we can add to /etc/lilo.conf to do this:

append="vt.default_utf8=1"

Here's an example from mine:

 # Linux bootable partition config begins
 image = /boot/vmlinuz-2.6.27.7-custom
 root = /dev/hda1
 label = Linux-custom
 append="resume=/dev/hda2 vt.default_utf8=1"
 read-only
 # Linux bootable partition config ends

Once you have saved lilo.conf and run lilo, a quick reboot is needed for the new kernel parameter to take effect.
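
After the reboot you can check whether the parameter took effect. Here is a minimal sketch, assuming your kernel exposes the setting at /sys/module/vt/parameters/default_utf8. The function name check_default_utf8 is just something made up for illustration, not a standard tool.

```shell
# Hypothetical helper: report whether the kernel treats new consoles as UTF-8.
# Assumes the vt parameter is readable at the path below; pass another path to override.
check_default_utf8() {
    param=${1:-/sys/module/vt/parameters/default_utf8}
    if [ -r "$param" ] && [ "$(cat "$param")" = "1" ]; then
        echo "UTF-8 is the console default"
    else
        echo "UTF-8 is not the console default (or $param is not readable)"
    fi
}

check_default_utf8
```

If it reports that UTF-8 is not the default, double check that lilo was actually re-run after editing lilo.conf.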

LANG environment variable

Without a UTF-8 locale, I find that when I type a multibyte character like "ä" and erase it with backspace, more gets erased than I typed. It helps to set the LANG variable to en_US.UTF-8 or similar, preferably by editing /etc/profile.d/lang.sh (or lang.csh).
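
To see why backspace can get confused, it helps to count bytes: "ä" is one character but two bytes in UTF-8. A small sketch follows. The octal escapes are the UTF-8 bytes for ä, and the export line is the sort of thing I would expect in lang.sh, assuming the en_US.UTF-8 locale is installed on your system.

```shell
# "ä" (U+00E4) is encoded in UTF-8 as the two bytes 0xC3 0xA4 (octal 303 244).
bytes=$(printf '\303\244' | wc -c | tr -d ' ')
echo "bytes in ä: $bytes"

# The kind of line to put in /etc/profile.d/lang.sh (assumes this locale is installed).
export LANG=en_US.UTF-8
```

Once you have logged in again, running locale charmap should report UTF-8 if the locale was picked up.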

Fonts

After that you need a font that actually contains the characters you want to see. My 'font of the moment' is lat9w-16. This contains the British £ (pound) and € (euro) symbols that I need, plus a lot of useful accented characters. Just as important are the correct line-drawing characters, so that curses programs like pkgtool and sbopkg are drawn on the screen correctly. If you experiment with the setconsolefont command you will find that some fonts draw curses box borders as squares or question marks, so try out a few fonts and see which ones work and which don't. The setconsolefont command will put your chosen font into /etc/rc.d/rc.font so it will load at bootup.
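
A quick way to judge a font is to print a small test card and look at it. This is only a sketch: font_test is a made-up name, and the characters in it are just the ones I care about, so swap in your own.

```shell
# Hypothetical test card: box-drawing characters plus a few symbols and accents.
# With a suitable font the border is drawn as lines; otherwise expect squares or question marks.
font_test() {
    printf '%s\n' '┌───────────┐'
    printf '%s\n' '│ £ € è é ä │'
    printf '%s\n' '└───────────┘'
}

font_test
```

To try a font without making it permanent, something like setfont lat9w-16 followed by font_test should do the job on a real console.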


Keyboard

OK, great, you can now read characters correctly, but you need the correct keymap to be able to type them. That's a little harder. If you can find a UTF-8 keymap for your locale/hardware then that's fine. Personally I didn't find any of the installed keymaps suitable, so I looked around for an alternative. I found a uk-utf8 keymap on the intertubes, but it was unfortunately missing all the Ctrl+[a-z] keys, so I edited it and added my own. I also added some AltGr+[a-z] combinations (well, not all of them) for accented characters. You may find this keymap useful as a starting point to create your own if you cannot find a more suitable one. Have a look through the comments and the codes and you should pick up the method. To add your own combinations you will need to know the correct codes for the characters you need. You can find a table of UTF-8 characters here:

http://home.tiscali.nl/t876506/utf8tbl.html

Look up the character's code in that table and convert it to hex. For example, è is 232 in decimal, E8 in hex; é is 233 in decimal, E9 in hex, so the entry in your keymap should be:

 # key      = normal   shifted   AltGr+e   AltGr+shift+e
 keycode 18 = e        E         U+00E8    U+00E9
 # prints:    e        E         è         é  
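
If you don't fancy doing the decimal-to-hex conversion by hand, printf can do it for you. A minimal sketch, with to_keysym being a made-up helper name:

```shell
# Hypothetical helper: turn a decimal code point into the U+XXXX form used in keymaps.
to_keysym() {
    printf 'U+%04X\n' "$1"
}

to_keysym 232   # è, prints U+00E8
to_keysym 233   # é, prints U+00E9
```

The %04X pads the hex value to four digits, which matches the U+00E8 style used in the keymap entry above.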

You can find the keycodes with the showkey program. Once edited, you can load the map with the loadkeys program. As far as I know, loadkeys only changes the keymap for the running system, so to have the map loaded on bootup make sure /etc/rc.d/rc.keymap names it, and copy the keymap to /usr/share/kbd/keymaps/i386/qwerty/ so it is found. Here is a link to my updated uk-utf8 map:

http://www.unrealize.co.uk/source/uk-utf.map.gz

You can load this ungzipped but it's better to gzip it back after editing to keep things consistent.
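
The gzip-and-copy step can be wrapped up in a small function. This is a sketch under assumptions: install_keymap is a made-up name, and the default destination is simply the keymap directory mentioned above.

```shell
# Hypothetical helper: gzip an edited keymap and copy it into the keymap directory.
# Usage: install_keymap MAPFILE [DESTDIR]   (writing to the default DESTDIR needs root)
install_keymap() {
    map=$1
    dest=${2:-/usr/share/kbd/keymaps/i386/qwerty}
    gzip -c "$map" > "$map.gz" && cp "$map.gz" "$dest/"
}

# Example, run as root after editing uk-utf.map:
#   install_keymap uk-utf.map
#   loadkeys uk-utf.map
```

Using gzip -c leaves the ungzipped map in place, so you can keep editing and reloading it while you experiment.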

Conclusion

By now you should have a fully usable keyboard/console correctly printing Unicode characters. You will probably need to tell some programs like mutt and irssi that you are using a UTF-8 system.


Note: There exist two shell scripts, unicode_start and unicode_stop. Typing unicode_start [font] will load the required font and set up the keyboard correctly for Unicode input.


Bye for now,

Dive 02:33, 20 March 2009 (UTC)