In this section I'll describe how to customize various text editors to work with Cyrillic text. This doesn't cover word processors, which will be described later (see the section on word processing).
There are two versions of the Emacs editor: GNU Emacs and XEmacs. While they provide more or less the same functionality, some implementation details are significantly different. Cyrillic setup requires some low-level (in the Emacs Lisp sense) tweaking, and it differs a bit between the two versions.
NOTE: Apart from the setup described here, there is an alternative way to configure both versions of Emacs: MULE (MULtilingual Enhancement to GNU Emacs). That way is fairly complicated and (to the best of my knowledge) rarely used, so I don't discuss it here.
The minimal Cyrillic support in GNU Emacs (you don't have to do this for XEmacs) is set up by adding the following calls to one's .emacs (provided that Cyrillic character set support is installed for the console or X respectively):
    (standard-display-european t)
    (set-input-mode (car (current-input-mode))
                    (nth 1 (current-input-mode))
                    0)
This allows the user to view and input documents in Russian.
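If you share one .emacs between GNU Emacs and XEmacs, you may want to guard these calls so that they run only under GNU Emacs. The following is just a sketch: it relies on the common idiom of looking for the string "XEmacs" in emacs-version, which is my assumption about how you tell the two editors apart and is not part of the setup above.

```emacs-lisp
;; Assumption: the running editor is identified by searching emacs-version
;; for "XEmacs". Run the Cyrillic setup only under GNU Emacs, since XEmacs
;; does not need it.
(if (not (string-match "XEmacs" emacs-version))
    (progn
      ;; Display 8-bit characters as they are instead of octal escapes.
      (standard-display-european t)
      ;; Keep the current interrupt and flow-control settings, but accept
      ;; all 8 bits of keyboard input (the 8th bit is not treated as Meta).
      (set-input-mode (car (current-input-mode))
                      (nth 1 (current-input-mode))
                      0)))
```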
However, this minimal setup isn't enough. Emacs doesn't know yet that Cyrillic characters may constitute a word, let alone the upper/lower case conversion rules. In order to teach Emacs that, you have to modify its syntax and case tables:
    (require 'case-table)
    (let* (;; Uppercase Cyrillic letters (KOI8 codes, written as octal escapes)
           (ruc "\341\342\367\347\344\345\263\366\372\351\352\353\354\355\356\357\360\362\363\364\365\346\350\343\376\373\375\370\371\377\374\340\361")
           ;; The matching lowercase letters, in the same order
           (rlc "\301\302\327\307\304\305\243\326\332\311\312\313\314\315\316\317\320\322\323\324\325\306\310\303\336\333\335\330\331\337\334\300\321")
           (i 0)
           (len (length ruc)))
      (while (< i len)
        ;; Mark both letters as word constituents
        (modify-syntax-entry (elt ruc i) "w ")
        (modify-syntax-entry (elt rlc i) "w ")
        ;; Register the upper/lower case pair in the standard case table
        (set-case-syntax-pair (elt ruc i) (elt rlc i) (standard-case-table))
        (setq i (+ i 1))))
For this purpose I created the file rusup.el, which does this and also provides a couple of handy functions. You have to load it from your ~/.emacs.
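For example, assuming that you have copied rusup.el into a directory ~/elisp (a hypothetical location, pick whatever suits you), lines like the following in ~/.emacs should be enough:

```emacs-lisp
;; Assumption: rusup.el lives in ~/elisp. This directory name is only an
;; example; use the directory where you actually put the file.
(setq load-path (cons (expand-file-name "~/elisp") load-path))

;; Load rusup.el (or its byte-compiled version) from the load path.
(load "rusup")
```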
Finally, the russian.el package by Valery Alexeev (valery@math.uga.edu) allows the user to switch between Cyrillic and regular input modes and to translate the contents of a buffer from one Cyrillic coding standard to another (which is especially useful when reading texts imported from MS-DOS or Windows).
The vi editor (at least its clone vim, available in most Linux distributions) is aware of 8-bit characters. It will allow you to enter Cyrillic characters and will recognize word boundaries correctly. I don't know about the upper/lower case conversion rules, since I don't use vi much. If you know something about this, please inform me.
Joe requires the special -asis option to recognize 8-bit characters. You may either specify this option on the command line or put it in the ~/.joerc file (for personal use) or in /usr/lib/joerc (for a system-wide setup). If your program doesn't understand the -asis option, you have to upgrade to a newer version.
However, joe doesn't seem to recognize the boundaries of Cyrillic words correctly. I assume that the same applies to the case conversion rules.
The program I use to spell-check text is GNU ispell. It is very flexible and extensible, so it can be used to spell-check text in languages other than English by adding new spelling dictionaries.
Constantine Knizhnik has created a very good Russian dictionary for ispell. You may find it at his homepage. The distribution includes a handy incremental spelling script for emacs.
Ideally, if you already have ispell properly installed, you just have to step into the newly created directory and generate the dictionary using the commands provided in the Makefile. However, chances are quite high that you'll see a lot of complaints about ispell's unawareness of 8-bit data. This is because in most distributions ispell is compiled without 8-bit data support. In this case, you cannot avoid recompiling the ispell package.
Again, RedHat users will be delighted to know that I've rebuilt the ispell package with both Russian and German dictionaries. As usual, you may grab it from the RedHat FTP site.
Once you have everything installed, you may invoke the Russian spell-check by supplying the '-d russian' option to ispell.
Now, if you use Emacs, you may want to add a menu item for the Russian dictionary. I sent a proposed menu entry to the ispell.el maintainer, and he kindly agreed to include it in the next public release of the file. Meanwhile, you may do it yourself by adding the following code to your ~/.emacs (or to /usr/share/emacs/site-lisp/site-start.el for a system-wide setup):
    (setq ispell-dictionary-alist
          (append ispell-dictionary-alist
                  '(("russian"
                     "[\341\342\367\347\344\345\263\366\372\351\352\353\354\355\356\357\360\362\363\364\365\346\350\343\376\373\375\370\371\377\374\340\361\301\302\327\307\304\305\243\326\332\311\312\313\314\315\316\317\320\322\323\324\325\306\310\303\336\333\335\330\331\337\334\300\321]"
                     "[^\341\342\367\347\344\345\263\366\372\351\352\353\354\355\356\357\360\362\363\364\365\346\350\343\376\373\375\370\371\377\374\340\361\301\302\327\307\304\305\243\326\332\311\312\313\314\315\316\317\320\322\323\324\325\306\310\303\336\333\335\330\331\337\334\300\321]"
                     "[']" t ("-C" "-d" "russian") "~latin1"))))

    (define-key-after ispell-menu-map [ispell-select-russian]
      '("Select Russian (KOI-8)" .
        (lambda () (interactive)
          (ispell-change-dictionary "russian")))
      'british)
Unfortunately, this code won't work in XEmacs. I'll try to solve this problem later.