Unicode Polytonic Greek for the World Wide Web

Unicode Polytonic Greek
for the World Wide Web

Version 0.9.7

D R A F T

(This page is under development; various drafts are represented in the text side by side. Sooner or later this will be divided into two pages: Linux with XFree86 4.x and Linux without XFree86 4.x)

Linux Without XFree86 4+

The suggestions on this page were tested successfully with RedHat Linux 6.1 and 6.2. They may not work on other Linux distributions. If your distribution has XFree86 4.0 or higher, I would suggest starting out with the Linux With XFree86 4+ page (remember, both of these pages were only tested with RedHat).

My notes on using Unicode in the browser in Linux are still under development. I've managed to get Mozilla M18 and some later versions to display Unicode Greek in RedHat 6.2, both with combining diacriticals (though with artifacts) and with precomposed characters (the latter using Athena). Mozilla 0.8.0 (and presumably 0.8.1) on RedHat 7.0 displays Unicode Greek with combining diacriticals out of the box (thanks to XFree86 4.0), but there are artifacts that hurt readability: the default font of XFree86 isn't able to display combining diacriticals in the same character space as the characters they modify, so there are a lot of extra spaces when you read Greek texts with combining diacriticals in Mozilla on an XFree86 4.0-capable Linux distribution. (This may be due to bugs in Mozilla-for-Linux's implementation of CSS1 or CSS2 in a UTF-8 environment.) Still, the Greek is readable. There are other kinks, too.
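Since the results differ so much between combining diacriticals and precomposed characters, it helps to know which form a given Greek text actually uses. The following is only a minimal sketch in Python (not part of the setup tested on this page); the function name and the sample word are my own illustrations.

```python
import unicodedata

def combining_marks(text):
    """Return the combining code points found in text, in order."""
    # unicodedata.combining() gives the canonical combining class,
    # which is 0 for ordinary (non-combining) characters.
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.combining(ch)]

# The word "anthropos" written two ways (illustrative sample only).
# Precomposed: the first letter is the single code point U+1F04.
precomposed = "\u1f04\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2"
# Combining: alpha followed by combining psili (U+0313) and acute (U+0301).
combining = "\u03b1\u0313\u0301\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2"

print(combining_marks(precomposed))  # []
print(combining_marks(combining))    # ['U+0313', 'U+0301']
```

A text for which the list comes back empty depends only on precomposed characters, which is the case that displayed cleanly in my tests.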

The KDE web browser/file manager Konqueror reports full Unicode capabilities, within the limitations of the fonts. On a clean install of RedHat 7.0, installing the KDE 1.93 RPMs in the Previews folder of RH7 Disk 2, Konqueror read this page, with polytonic Greek, out of the box.

Another application providing Unicode support for Linux is Yudit, a Unicode text editor. Theoretically, the console and Lynx should be able to provide Unicode support; but I have no reports of any success with experiments in enabling Unicode polytonic Greek support for a terminal emulator or Lynx.

XFree86 and Unicode

Linux is an open source (GNU General Public License) operating system kernel which is compliant with the POSIX operating system definition and which is usually packaged with operating system libraries developed as part of the GNU project. In effect, this means that Linux is a free version of the Unix operating system, one created by collaborating programmers to provide an easily extensible, financially accessible alternative to the long-utilized, stable, but quite expensive commercial Unix implementations.

Linux users interact with programs through one of two interfaces. All Linux systems provide a terminal emulator or console with no graphical capabilities. While using the console, one is effectively limited to Lynx as a web browser (and therefore to not viewing graphics) and to programs like Elm and Pine as email clients. Most Linux systems also provide a graphical user interface based on X (sometimes annoyingly called X Windows). On Linux systems, this is usually XFree86, combined with one of the following: Gnome plus a window manager like Enlightenment, fvwm2, AfterStep, or Sawfish; the K Desktop Environment with the KDE window manager; Motif or a Motif clone like OpenMotif or Lesstif; or X alone with a window manager. Users of X with Gnome or KDE have a number of choices of web browser, including Netscape 4.76, Mozilla, Opera, Amaya, and, in KDE, Konqueror.

The console and various console-based applications (Lynx, Emacs, etc.) can be viewed in windows or frames within the X desktop.

The most recent release of the freely redistributable X server user environment for Linux and other Unixes is XFree86 4.0.1 (which is distributed under its own MIT-style license, not the GNU General Public License). This release of XFree86 implements a number of important features for Unicode users:

see the bibliography below for sources which will provide more information on the

(The name combines X, Free because it is freely redistributable, and 86 for the Intel x86 architecture for which it was originally designed; XFree86 now works on a number of other architectures.)

For issues with Unicode functionality in Linux (which should more properly be discussed in the Linux appendix, but since that remains to be written . . .), see Markus Kuhn's discussion of Unicode normalization forms and precomposed characters, and of their implementations in Linux and X respectively, in his FAQ for Unicode and UTF-8 in Linux. The relevant passages read:

Full Unicode functionality with all bells and whistles can only be expected from sophisticated multi-lingual word-processing packages. What Linux will use on a broad base to replace ASCII and the other 8-bit character sets is far simpler. Linux terminal emulators and command line tools will in the first step only switch to UTF-8. This means that only a Level 1 implementation of ISO 10646-1 is used (no combining characters), and only scripts such as Latin, Greek, Cyrillic, and many scientific symbols are supported that need no further processing support. At this level, UCS support is very comparable to ISO 8859 support and the only significant difference is that we have now thousands of different characters available, and that characters can be represented by multibyte sequences.

Combining characters will also be supported under Linux eventually, but even then the precomposed characters should be preferred over combining character sequences where available. More formally, the preferred way of encoding text in Unicode under Linux should be Normalization Form C as defined in Unicode Technical Report #15.

[. . . .] Combining characters: The X11 specification does not support combining characters in any way. The font information lacks the data necessary to perform high-quality automatic accent placement (as it is found for example in all TeX fonts). Various people have experimented with implementing simplest overstriking combining characters using zero-width characters with ink on the left side of the origin, but details of how to do this exactly are unspecified (e.g., are zero-width characters allowed in CharCell and Monospaced fonts?) and this is therefore not yet widely established practice.
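To make the recommendation of Normalization Form C in the passage above concrete, here is a small sketch in Python; it is my own illustration, not taken from Kuhn's FAQ. It composes an alpha with two combining diacriticals into the single precomposed polytonic character, and also shows the multibyte UTF-8 sequences involved. The variable and function names are arbitrary.

```python
import unicodedata

# Alpha (U+03B1) followed by combining psili (U+0313) and combining acute (U+0301).
decomposed = "\u03b1\u0313\u0301"

# NFC replaces the sequence with a precomposed code point where one exists.
composed = unicodedata.normalize("NFC", decomposed)

def codepoints(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

print(codepoints(decomposed))            # ['U+03B1', 'U+0313', 'U+0301']
print(codepoints(composed))              # ['U+1F04']

# Each form is stored in UTF-8 as a multibyte sequence.
print(len(decomposed.encode("utf-8")))   # 6 (three 2-byte characters)
print(len(composed.encode("utf-8")))     # 3 (one 3-byte character)
```

Text normalized this way needs only a font that contains the precomposed Greek Extended characters, and does not depend on X placing the accents.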

Unicode Polytonic Greek in KDE 2.0 (and 1.93)

The K Desktop Environment, version 2.0, was released in October 2000. KDE2 has support for Unicode polytonic Greek out of the box.

Linux (specifically, Red Hat 6.2 and 7.0 for i386)

1. Use XFS, the X font server, provided with XFree86 4.0.

Font De-Uglification Mini Howto: http://www.linux.org/docs/ldp/howto/mini/FDU.html

  1. Make a directory /usr/share/fonts/ttfonts. Obviously this exact location is not required, but I'd suggest sticking with the Mini-Howto's way of doing things so you can remember where you've kept things.
    mkdir /usr/share/fonts/ttfonts
  2. Next, copy the fonts to that directory. All filenames must be lowercase.
  3. Next, AS ROOT OR SU, change to that directory and type
    cd /usr/share/fonts/ttfonts
    ttmkfdir -o fonts.scale
    mkfontdir
  4. Next, edit the file /etc/X11/fs/config and add the line /usr/share/fonts/ttfonts to the catalogue list.
  5. Finally, change the FontPath line in /etc/X11/XF86Config to read FontPath "unix/:-1" or FontPath "unix/:7100"
  6. Now you have to add the 10646 definitions to the fonts.scale and fonts.dir files in the ttfonts directory.
  7. Edit the fonts.dir file. For each Unicode font, copy its ISO8859-1 font definition, paste the copy at the end of the list of font definitions, and replace ISO8859-1 in the copy with ISO10646-1. The first line of the file gives the number of definitions; increase it to match.
  8. Repeat the process with the fonts.scale file.
  9. Reboot your system (of course, stopping and starting xfs would be easier, but rebooting is more thorough).

For more background, see:

  1. http://www.linux.org/docs/ldp/howto/Font-HOWTO.html
  2. http://www.linux.org/docs/ldp/howto/Unicode-HOWTO.html
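The hand-editing of fonts.dir and fonts.scale described in the steps above could in principle be scripted. The following Python sketch is my own and was not part of the tested procedure: it takes the text of such a file, appends an ISO10646-1 copy of every ISO8859-1 definition, and rewrites the count on the first line. The function name and the sample font entry are hypothetical; back up the real files before trying anything like this.

```python
def add_unicode_entries(text):
    """Return fonts.dir/fonts.scale text with ISO10646-1 copies appended."""
    lines = text.splitlines()
    # The first line is the entry count; the rest are "file XLFD" entries.
    entries = [ln for ln in lines[1:] if ln.strip()]
    existing = set(entries)
    extra = []
    for entry in entries:
        if entry.lower().endswith("-iso8859-1"):
            copy = entry[: -len("iso8859-1")] + "iso10646-1"
            if copy not in existing:        # do not add the same entry twice
                extra.append(copy)
                existing.add(copy)
    result = entries + extra
    # Rewrite the count line so it matches the new number of entries.
    return "\n".join([str(len(result))] + result) + "\n"

# Hypothetical one-font fonts.dir, for illustration only.
SAMPLE = (
    "1\n"
    "athena.ttf -misc-athena-medium-r-normal--0-0-0-0-p-0-iso8859-1\n"
)

print(add_unicode_entries(SAMPLE), end="")
```

Running the function on its own output changes nothing, so it should be safe to apply again after rerunning ttmkfdir and mkfontdir, provided those tools have not rewritten the files in the meantime.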

I've managed to get Mozilla M18 to display Unicode Greek in RedHat 6.2, both with combining diacriticals (though with artifacts) and with precomposed characters (the latter using Athena). Mozilla M18 on RedHat 7.0 displays Unicode Greek with combining diacriticals out of the box (thanks to XFree86 4.0), but there are artifacts that hurt readability: XFree86 doesn't seem to be able to display combining diacriticals in the same character space as the characters they modify, so there are a lot of extra spaces when you read Greek texts with combining diacriticals in the browser on an XFree86 4.0-capable Linux distribution. This is still a major improvement over XFree86 3.x, though. There are a lot of other kinks, too.

As I've mentioned above in reference to browsers, the KDE web browser/file manager reports full Unicode capabilities, within the limitations of the fonts, but I have not yet tested it.

Unicode polytonic Greek has been difficult to set up in Linux browsers for several reasons:

  1. TrueType Unicode fonts are hard to use in X, even with XFSTT (X Font Server/True Type) or the Red Hat version of XFS (they might be easier to use in XFree86 version 4.0.1, but I don't have any reports on that).
  2. Netscape 4.x for Linux doesn't allow one to use any fonts except the so-called pseudo-fonts.
  3. Mozilla's Unicode font support has just recently come up to speed.
  4. Amaya hasn't finished its Unicode support yet.
  5. There's no Internet Explorer for Linux.

I've been keeping my eyes open for upgrades to Unicode support, and experimenting myself, and now have Unicode polytonic Greek working on my Linux box. The setup is RedHat 6.2, with the standard kernel and version of Enlightenment and Gnome and the S3 Virge X server, with the RedHat version of XFS running; Mozilla M17 (binary distribution, why build it yourself when you don't have to?); and a couple of TrueType Unicode fonts installed following the instructions in the TrueType Mini-Howto, with the addition that one must add the -10646-1 or -10646-2 definitions into the encoding definitions file. I'll explain what that means as soon as I can and add the relevant information to this page. I have screenshots of Unicode working in Mozilla on Perseus and on an earlier version of this page (with Athena Roman).

Linux (specifically, Red Hat 6.2 and 7.0 for i386)

My appendix on using Unicode in the browser in Linux is still under development. I've managed to get Mozilla M18 to display Unicode Greek in RedHat 6.2, both with combining diacriticals (though with artifacts) and with precomposed characters (the latter using Athena). Mozilla M18 on RedHat 7.0 displays Unicode Greek with combining diacriticals out of the box (thanks to XFree86 4.0), but there are artifacts that hurt readability: XFree86 isn't able to display combining diacriticals in the same character space as the characters they modify, so there are a lot of extra spaces when you read Greek texts with combining diacriticals in Mozilla on an XFree86 4.0-capable Linux distribution. (This may be due to bugs in Mozilla-for-Linux's implementation of CSS1 or CSS2 in a UTF-8 environment.) Still, the Greek is readable. There are other kinks, too.

The KDE web browser/file manager Konqueror reports full Unicode capabilities, within the limitations of the fonts. On a clean install of RedHat 7.0, installing the KDE 1.93 RPMs in the Previews folder of RH7 Disk 2, Konqueror read this page, with polytonic Greek, out of the box. I have not yet tested combining diacriticals, but as the flaw in combining diacriticals support is in X and not the window manager or GUI toolkit, it is unlikely that Konqueror supports them. I have contacted KDE to find out which font is used in the default install of Konqueror (I suspect it may be Titus Cyberbit Basic).


 Unicode Polytonic Greek for the World Wide Web Version 0.9.7
 Copyright © 1998-2002 Patrick Rourke. All rights reserved.
D R A F T - Under Development
 Please do not treat this as a published work until it is finished!