.\" automatically generated with docbook2mdoc fonts.xml
.Dd 16 March 2012
.Dt FONTS 7
.Os
.Sh NAME
.Nm fonts
.Nd Fonts in X11R6
.Sh INTRODUCTION
This document describes the support for fonts in X11R6.
.Sx Installing_fonts
is aimed at the
casual user wishing to install fonts in X11R6; the rest of the
document describes the font support in more detail.
.Pp
We assume some familiarity with digital fonts.
If anything is not
clear to you, please consult
.Sx Appendix_background_and_terminology
at the
end of this document for background information.
.Ss Two font systems
X11 includes two font systems: the original core X11 fonts
system, which is present in all implementations of X11, and the Xft
fonts system, which may not yet be distributed with implementations of
X11 that are not based on either XFree86 or X11R6.8 or later.
.Pp
The core X11 fonts system is directly derived from the fonts system
included with X11R1 in 1987, which could only use monochrome bitmap
fonts.
Over the years, it has been more or less happily coerced into
dealing with scalable fonts and rotated glyphs.
.Pp
Xft was designed from the start to provide good support for scalable
fonts, and to do so efficiently.
Unlike the core fonts system, it
supports features such as anti-aliasing and sub-pixel rasterisation.
Perhaps more importantly, it gives applications full control over the
way glyphs are rendered, making fine typesetting and WYSIWYG display
possible.
Finally, it allows applications to use fonts that are not
installed system-wide for displaying documents with embedded fonts.
.Pp
Xft is not compatible with the core fonts system: usage of Xft
requires fairly extensive changes to toolkits (user-interface
libraries). While X.Org will continue to maintain the core fonts
system, toolkit authors are encouraged to switch to Xft as soon as
possible.
.Sh INSTALLING FONTS
This section explains how to configure both Xft and the core fonts
system to access newly-installed fonts.
.Ss Configuring Xft
Xft has no configuration mechanism of its own; it relies upon the
.Lk http://www.fontconfig.org/ fontconfig
library to configure and customise fonts.
That library is
not specific to the X Window system, and does not rely on any
particular font output mechanism.
.Pp
.Sy Installing fonts in Xft
.Pp
Fontconfig looks for fonts in a set of well-known directories that
include all of X11R6's standard font directories
.Pf ( Pa /usr/share/fonts/X11/*
by default) as well as a
directory called
.Pa .fonts/
in the user's home directory.
Installing a font for use by Xft applications is as simple
as copying a font file into one of these directories.
.Bd -literal
$ cp lucbr.ttf ~/.fonts/
.Ed
.Pp
Fontconfig will notice the new font at the next opportunity and rebuild its
list of fonts.
If you want to trigger this update from the command
line, you may run the command
.Dq Nm fc-cache .
.Bd -literal
$ fc-cache
.Ed
.Pp
In order to globally update the system-wide Fontconfig information on
Unix systems, you will typically need to run this command as root:
.Bd -literal
$ su -c fc-cache
.Ed
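.Pp
As a minimal sketch of the steps above, the following POSIX shell function
copies a font file into a font directory and then refreshes the cache.
The function name
.Li install_user_font
and its arguments are hypothetical and not part of Fontconfig; only
.Nm fc-cache
is an existing command.
.Bd -literal
# install_user_font FILE [DIR]
# Copy FILE into DIR (default: ~/.fonts) and refresh the cache for DIR.
install_user_font() {
    font=$1
    dir=${2:-"$HOME/.fonts"}
    if [ ! -f "$font" ]; then
        echo "no such font file: $font" >&2
        return 1
    fi
    mkdir -p "$dir" || return 1
    cp "$font" "$dir/" || return 1
    # Refreshing is optional: Fontconfig also rescans on its own.
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache "$dir" || echo "fc-cache failed for $dir" >&2
    fi
    return 0
}
.Ed
.Pp
For example,
.Li install_user_font lucbr.ttf
would place the file in
.Pa ~/.fonts/ .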
.Pp
.Sy Fine-tuning Xft
.Pp
Fontconfig's behaviour is controlled by a set of configuration
files: a standard configuration file,
.Pa /etc/fonts/fonts.conf ,
a host-specific configuration file,
.Pa /etc/fonts/local.conf ,
and a user-specific file called
.Pa .fonts.conf
in the user's
home directory (this can be overridden with the
.Dq Ev FONTCONFIG_FILE
environment variable).
.Pp
Every Fontconfig configuration file must start with the following
boilerplate:
.Bd -literal
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
.Ed
.Pp
In addition, every Fontconfig configuration file must end with the
following line:
.Bd -literal
</fontconfig>
.Ed
.Pp
The default Fontconfig configuration file includes the directory
.Pa ~/.fonts/
in the list of directories searched for font
files, and this is where user-specific font files should be installed.
In the unlikely case that a new font directory needs to be added, this
can be done with the following syntax:
.Bd -literal
<dir>/usr/local/share/fonts/</dir>
.Ed
.Pp
Another useful option is the ability to disable anti-aliasing (font
smoothing) for selected fonts.
This can be done with the following
syntax:
.Bd -literal
<match target="font">
<test qual="any" name="family">
<string>Lucida Console</string>
</test>
<edit name="antialias" mode="assign">
<bool>false</bool>
</edit>
</match>
.Ed
.Pp
Anti-aliasing can be disabled for all fonts by the following incantation:
.Bd -literal
<match target="font">
<edit name="antialias" mode="assign">
<bool>false</bool>
</edit>
</match>
.Ed
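.Pp
The fragments above only work inside a complete file.
As a sketch, the shell function below writes a whole configuration file
that combines the required boilerplate with the per-family rule shown earlier.
The function name
.Li write_fonts_conf
is hypothetical, the family name is inserted verbatim (no XML escaping is
attempted), and any existing file at the given path is overwritten.
.Bd -literal
# write_fonts_conf FILE FAMILY
# Write a Fontconfig file that disables anti-aliasing for FAMILY.
write_fonts_conf() {
    conf=$1
    family=$2
    cat > "$conf" <<EOF
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <match target="font">
    <test qual="any" name="family">
      <string>$family</string>
    </test>
    <edit name="antialias" mode="assign">
      <bool>false</bool>
    </edit>
  </match>
</fontconfig>
EOF
}
.Ed
.Pp
Writing to a scratch path first, for instance
.Li write_fonts_conf /tmp/fonts.conf \(dqLucida Console\(dq ,
makes it easy to inspect the result before copying it over
.Pa ~/.fonts.conf .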
.Pp
Xft supports sub-pixel rasterisation on LCD displays.
X11R6 should
automatically enable this feature on laptops and when using an LCD
monitor connected with a DVI cable; you can check whether this was
done by typing
.Bd -literal
$ xdpyinfo -ext RENDER | grep sub-pixel
.Ed
.Pp
If this doesn't print anything, you will need to configure Render for
your particular LCD hardware manually; this is done with the following
syntax:
.Bd -literal
<match target="font">
<edit name="rgba" mode="assign">
<const>rgb</const>
</edit>
</match>
.Ed
.Pp
The string
.Dq Li rgb
within the
.Dq Li <const> \&... </const>
element
specifies the order of pixel components on your display, and should be
changed to match your hardware; it can be one of
.Dq Li rgb
(normal
LCD screen),
.Dq Li bgr
(backwards LCD screen),
.Dq Li vrgb
(LCD
screen rotated clockwise) or
.Dq Li vbgr
(LCD screen rotated
counterclockwise).
.Pp
.Sy Configuring applications
.Pp
A growing number of applications use Xft in preference to the core
fonts system.
Some applications, however, need to be explicitly
configured to use Xft.
.Pp
A case in point is XTerm, which can be set to use Xft by using the
.Dq Li -fa
command line option or by setting the
.Dq Li XTerm*faceName
resource:
.Bd -literal
XTerm*faceName: Courier
.Ed
.Pp
or
.Bd -literal
$ xterm -fa "Courier"
.Ed
.Pp
For KDE applications, you should select
.Dq Anti-alias fonts
in the
.Dq Fonts
panel of KDE's
.Dq Control Center .
Note that this option is
misnamed: it switches KDE to using Xft but doesn't enable
anti-aliasing in case it was disabled by your Xft configuration file.
.Pp
Gnome applications and Mozilla Firefox will use Xft by default.
.Ss Configuring the core X11 fonts system
Installing fonts in the core system is a two-step process.
First,
you need to create a
.Em font directory
that contains all the
relevant font files as well as some index files.
You then need to
inform the X server of the existence of this new directory by
including it in the
.Em font path .
.Pp
.Sy Installing bitmap fonts
.Pp
The X11R6 server can use bitmap fonts in both the cross-platform
BDF format and the somewhat more efficient binary PCF format.
(X11R6 also supports the obsolete SNF format.)
.Pp
Bitmap fonts are normally distributed in the BDF format.
Before
installing such fonts, it is desirable (but not absolutely necessary)
to convert the font files to the PCF format.
This is done by using the
command
.Dq Nm bdftopcf ,
.Em e.g.
.Bd -literal
$ bdftopcf -o courier12.pcf courier12.bdf
.Ed
.Pp
You may then want to compress the resulting PCF font files:
.Bd -literal
$ gzip courier12.pcf
.Ed
.Pp
After the fonts have been converted, you should copy all the font
files that you wish to make available into an arbitrary directory, say
.Pa /usr/local/share/fonts/bitmap/ .
You should then create the
index file
.Pa fonts.dir
by running the command
.Dq Nm mkfontdir
(please see the
.Lk mkfontdir.1.html mkfontdir(1)
manual page for more information):
.Bd -literal
$ mkdir /usr/local/share/fonts/bitmap/
$ cp *.pcf.gz /usr/local/share/fonts/bitmap/
$ mkfontdir /usr/local/share/fonts/bitmap/
.Ed
.Pp
All that remains is to tell the X server about the existence of the
new font directory; see
.Sx Setting_the_servers_font_path
below.
.Pp
.Sy Installing scalable fonts
.Pp
The X11R6 server supports scalable fonts in multiple
formats, including Type\ 1, TrueType, and OpenType/CFF.
(Earlier versions of X11 also included support for the Speedo and
CID scalable font formats, but that is not included in current releases.)
.Pp
Installing scalable fonts is very similar to installing bitmap fonts:
you create a directory with the font files, and run
.Dq Nm mkfontdir
to create an index file called
.Pa fonts.dir .
.Pp
There is, however, a big difference:
.Dq Nm mkfontdir
cannot
automatically recognise scalable font files.
For that reason, you
must first index all the font files in a file called
.Pa fonts.scale .
While this can be done by hand, it is best done
by using the
.Dq Nm mkfontscale
utility.
.Bd -literal
$ mkfontscale /usr/local/share/fonts/Type1/
$ mkfontdir /usr/local/share/fonts/Type1/
.Ed
.Pp
Under some circumstances, it may be necessary to modify the
.Pa fonts.scale
file generated by
.Nm mkfontscale ;
for more
information, please see the
.Lk mkfontdir.1.html mkfontdir(1)
and
.Lk mkfontscale.1.html mkfontscale(1)
manual pages and
.Sx Core_fonts_and_internationalisation
later in this document.
.Pp
.Sy CID-keyed fonts
.Pp
The CID-keyed font format was designed by Adobe Systems for fonts
with large character sets.
The CID-keyed format is obsolete: it
has been superseded by other formats such as OpenType/CFF, and
support for CID-keyed fonts has been removed from X11.
.Pp
.Sy Setting the server's font path
.Pp
The list of directories where the server looks for fonts is known
as the
.Em font path .
Informing the server of the existence of a new
font directory consists of putting it on the font path.
.Pp
The font path is an ordered list; if a client's request matches
multiple fonts, the first one in the font path is the one that gets
used.
When matching fonts, the server makes two passes over the font
path: during the first pass, it searches for an exact match; during
the second, it searches for fonts suitable for scaling.
.Pp
For best results, scalable fonts should appear in the font path before
the bitmap fonts; this way, the server will prefer bitmap fonts to
scalable fonts when an exact match is possible, but will avoid scaling
bitmap fonts when a scalable font can be used.
(The
.Dq Li :unscaled
hack, while still supported, should no longer be necessary in X11R6.)
.Pp
You may check the font path of the running server by typing the command
.Bd -literal
$ xset q
.Ed
.Pp
.Sy Font path catalogue directories
.Pp
You can specify a special kind of font path directory in the form
.Pa catalogue:<dir> .
The directory specified after the
.Pa catalogue:
prefix will be scanned for symlinks and each symlink destination will be
added as a local font path entry.
.Pp
The symlink can be suffixed by attributes such as
.Ql unscaled ,
which will be passed through
to the underlying font path entry.
The only exception is the newly
introduced
.Ql pri
attribute, which will be
used for ordering the font paths specified by the symlinks.
.Pp
An example configuration:
.Bd -literal
75dpi:unscaled:pri=20 -> /usr/share/X11/fonts/75dpi
ghostscript:pri=60 -> /usr/share/fonts/default/ghostscript
misc:unscaled:pri=10 -> /usr/share/X11/fonts/misc
type1:pri=40 -> /usr/share/X11/fonts/Type1
type1:pri=50 -> /usr/share/fonts/default/Type1
.Ed
.Pp
This will add
.Pa /usr/share/X11/fonts/misc
as the
first font path entry with the attribute
.Ql unscaled .
This is functionally equivalent to
setting the following font path:
.Bd -literal
/usr/share/X11/fonts/misc:unscaled,
/usr/share/X11/fonts/75dpi:unscaled,
/usr/share/X11/fonts/Type1,
/usr/share/fonts/default/Type1,
/usr/share/fonts/default/ghostscript
.Ed
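.Pp
The ordering in this example can be reproduced with a short shell sketch
that sorts symlink names by their
.Ql pri
attribute, lowest value first.
The function name
.Li order_catalogue
is hypothetical, and names without a
.Ql pri
attribute are simply skipped here, since this document does not say how the
server orders them.
.Bd -literal
# order_catalogue NAME...
# Print catalogue symlink names sorted by ascending pri= value.
order_catalogue() {
    for name in "$@"; do
        case $name in
            *:pri=*)
                pri=${name##*:pri=}   # drop everything up to "pri="
                pri=${pri%%:*}        # drop any attribute that follows
                ;;
            *)
                continue              # no pri= attribute: not handled
                ;;
        esac
        echo "$pri $name"
    done | sort -n | cut -d ' ' -f 2-
}
.Ed
.Pp
Called with the five symlink names above, it lists
.Li misc:unscaled:pri=10
first and
.Li ghostscript:pri=60
last, matching the equivalent font path shown.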
.Pp
.Sy Temporary modification of the font path
.Pp
The
.Dq Nm xset
utility may be used to modify the font path for the
current session.
The font path is set with the command
.Nm xset fp ;
a new element is added to the front with
.Nm xset +fp ,
and added to
the end with
.Nm xset fp+ .
For example,
.Bd -literal
$ xset +fp /usr/local/fonts/Type1
$ xset fp+ /usr/local/fonts/bitmap
.Ed
.Pp
Conversely, an element may be removed from the front of the font path
with
.Dq Nm xset -fp ,
and removed from the end with
.Dq Nm xset fp- .
You may reset the font path to its default value with
.Dq Nm xset fp default .
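.Pp
When the whole font path is replaced at once,
.Nm xset
expects the directories as a single comma-separated argument after
.Li fp= .
The helper below is a small sketch that builds such an argument; its name,
.Li join_font_path ,
is hypothetical.
.Bd -literal
# join_font_path DIR...
# Print the directories joined with commas, in the order given.
join_font_path() {
    path=
    for dir in "$@"; do
        path=${path:+$path,}$dir
    done
    echo "$path"
}

# Possible use (needs a running X server, so it is left commented out):
# xset fp= "$(join_font_path /usr/local/fonts/Type1 /usr/local/fonts/bitmap)"
.Ed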
.Pp
For more information, please consult the
.Lk xset.1.html xset(1)
manual page.
.Pp
.Sy Permanent modification of the font path
.Pp
The default font path (the one used just after server startup or
after
.Dq Nm xset fp default )
may be specified in the
X server's
.Pa xorg.conf
file.
It is computed by appending all the
directories mentioned in the
.Dq Li FontPath
entries of the
.Dq Li Files
section in the order in which they appear.
If no font path is specified in a config file, the server uses a default
value specified when it was built.
.Bd -literal
FontPath "/usr/local/fonts/Type1"
\&...
FontPath "/usr/local/fonts/bitmap"
.Ed
.Pp
For more information, please consult the
.Lk xorg.conf.5.html xorg.conf(5)
manual page.
.Pp
.Sy Troubleshooting
.Pp
If you seem to be unable to use some of the fonts you have
installed, the first thing to check is that the
.Pa fonts.dir
files
are correct and that they are readable by the server (the X server
usually runs as root; beware of NFS-mounted font directories). If
this doesn't help, it is quite possible that you are trying to use a
font in a format that is not supported by your server.
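.Pp
A first sanity check of a font directory can be scripted.
The sketch below assumes that the first line of
.Pa fonts.dir
holds the number of entries and that each following line starts with a
font file name.
The function name
.Li check_fonts_dir
is hypothetical, and the check runs with your own permissions, which may
differ from those of the X server.
.Bd -literal
# check_fonts_dir DIR
# Fail if DIR/fonts.dir is unreadable or its count line is wrong;
# warn about indexed font files that cannot be read.
check_fonts_dir() {
    dir=$1
    index=$dir/fonts.dir
    if [ ! -r "$index" ]; then
        echo "$index: missing or unreadable" >&2
        return 1
    fi
    declared=$(head -n 1 "$index")
    # count the non-empty lines after the header
    actual=$(tail -n +2 "$index" | awk 'NF { n++ } END { print n + 0 }')
    if [ "$declared" != "$actual" ]; then
        echo "$index: header says $declared, found $actual" >&2
        return 1
    fi
    tail -n +2 "$index" | while read -r file xlfd; do
        [ -r "$dir/$file" ] || echo "$dir/$file: unreadable" >&2
    done
    return 0
}
.Ed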
.Pp
X11R6 supports the BDF, PCF, SNF, Type 1, TrueType, and OpenType
font formats.
However, not all X11R6 servers
come with all the font backends configured in.
.Pp
On most platforms, the X11R6 servers no longer use font
backends from modules that are loaded at runtime.
The built-in
font support corresponds to the functionality formerly provided by
these modules:
.Bl -bullet
.It
.Ql \(dqbitmap\(dq :
bitmap fonts
.Pf ( Pa *.bdf ,
.Pa *.pcf
and
.Pa *.snf ) ;
.It
.Ql \(dqfreetype\(dq :
TrueType fonts
.Pf ( Pa *.ttf
and
.Pa *.ttc ) ,
OpenType fonts
.Pf ( Pa *.otf
and
.Pa *.otc )
and
Type\ 1 fonts
.Pf ( Pa *.pfa
and
.Pa *.pfb ) .
.El
.Sh FONTS INCLUDED WITH X11R6
.Ss Standard bitmap fonts
The Sample Implementation of X11 (SI) comes with a large number of
bitmap fonts, including the
.Dq Li fixed
family, and bitmap versions
of Courier, Times, Helvetica and some members of the Lucida family.
.Pp
In X11R6, a number of these fonts are now provided in Unicode-encoded
font files.
At build time, these fonts are split into font
files encoded according to legacy encodings, a process which allows
us to provide the standard fonts in a number of regional encodings
with no duplication of work.
.Pp
For example, the font file
.Bd -literal
/usr/share/fonts/X11/misc/6x13.bdf
.Ed
.Pp
with XLFD
.Bd -literal
-misc-fixed-medium-r-semicondensed--13-120-75-75-c-60-iso10646-1
.Ed
.Pp
is a Unicode-encoded version of the standard
.Dq Li fixed
font with
added support for the Latin, Greek, Cyrillic, Georgian, Armenian, IPA
and other scripts plus numerous technical symbols.
It contains over
2800 glyphs, covering all characters of ISO\ 8859 parts 1-5,
7-10, 13-15, as well as all European IBM and Microsoft code pages,
KOI8, WGL4, and the repertoires of many other character sets.
.Pp
This font is used at build time for generating the font files
.Bd -literal
6x13-ISO8859-1.bdf
6x13-ISO8859-2.bdf
\&...
6x13-ISO8859-15.bdf
6x13-KOI8-R.bdf
.Ed
.Pp
with respective XLFDs
.Bd -literal
-misc-fixed-medium-r-normal--13-120-75-75-c-60-iso8859-1
\&...
-misc-fixed-medium-r-normal--13-120-75-75-c-60-iso8859-15
-misc-fixed-medium-r-normal--13-120-75-75-c-60-koi8-r
.Ed
.Pp
The standard short name
.Dq Li fixed
is normally an alias for
.Bd -literal
-misc-fixed-medium-r-normal--13-120-75-75-c-60-iso8859-1
.Ed
.Ss The ClearlyU Unicode font family
The ClearlyU family of fonts provides a set of 12\ pt,
100\ dpi proportional fonts with many of the glyphs needed for
Unicode text.
Together, the fonts contain approximately 7500 glyphs.
.Pp
The main ClearlyU font has the XLFD
.Bd -literal
-mutt-clearlyu-medium-r-normal--17-120-100-100-p-101-iso10646-1
.Ed
.Pp
and resides in the font file
.Bd -literal
/usr/share/fonts/X11/misc/cu12.pcf.gz
.Ed
.Pp
Additional ClearlyU fonts include
.Bd -literal
-mutt-clearlyu alternate glyphs-medium-r-normal--17-120-100-100-p-91-iso10646-1
-mutt-clearlyu pua-medium-r-normal--17-120-100-100-p-111-iso10646-1
-mutt-clearlyu arabic extra-medium-r-normal--17-120-100-100-p-103-fontspecific-0
-mutt-clearlyu ligature-medium-r-normal--17-120-100-100-p-141-fontspecific-0
.Ed
.Pp
The
.Em Alternate Glyphs
font contains additional glyph shapes that
are needed for certain languages.
A second alternate glyph font will
be provided later for cases where a character has more than one
commonly used alternate shape
.Pf ( Em e.g.
the Urdu heh).
.Pp
The
.Em PUA
font contains extra glyphs that are useful for certain
rendering purposes.
.Pp
The
.Em Arabic Extra
font contains the glyphs necessary for
characters that don't have all of their possible shapes encoded in
ISO\ 10646.
The glyphs are roughly ordered according to the order
of the characters in the ISO\ 10646 standard.
.Pp
The
.Em Ligature
font contains ligatures for various scripts that
may be useful for improved presentation of text.
.Ss Standard scalable fonts
X11R6 includes all the scalable fonts distributed with X11R6.
.Pp
.Sy Standard Type\ 1 fonts
.Pp
The IBM Courier set of fonts covers ISO\ 8859-1 and
ISO\ 8859-2 as well as Adobe Standard Encoding.
These fonts have
XLFD
.Bd -literal
-adobe-courier-medium-*-*--0-0-0-0-m-0-*-*
.Ed
.Pp
and reside in the font files
.Bd -literal
/usr/share/fonts/X11/Type1/cour*.pfa
.Ed
.Pp
The Adobe Utopia set of fonts covers only ISO\ 8859-1 as well as
Adobe Standard Encoding.
These fonts have XLFD
.Bd -literal
-adobe-utopia-*-*-normal--0-0-0-0-p-0-iso8859-1
.Ed
.Pp
and reside in the font files
.Bd -literal
/usr/share/fonts/X11/Type1/UT*.pfa
.Ed
.Pp
Finally, X11R6 also comes with Type\ 1 versions of Bitstream
Courier and Charter.
These fonts have XLFD
.Bd -literal
-bitstream-courier-*-*-normal--0-0-0-0-m-0-iso8859-1
-bitstream-charter-*-*-normal--0-0-0-0-p-0-iso8859-1
.Ed
.Pp
and reside in the font files
.Bd -literal
/usr/share/fonts/X11/Type1/c*bt_.pfb
.Ed
.Ss The Bigelow & Holmes Luxi family
X11R6 includes the
.Em Luxi
family of scalable fonts, in both
TrueType and Type\ 1 format.
This family consists of the fonts
.Em Luxi Serif ,
with XLFD
.Bd -literal
-b&h-luxi serif-medium-*-normal--*-*-*-*-p-*-*-*
.Ed
.Pp
.Em Luxi Sans ,
with XLFD
.Bd -literal
-b&h-luxi sans-medium-*-normal--*-*-*-*-p-*-*-*
.Ed
.Pp
and
.Em Luxi Mono ,
with XLFD
.Bd -literal
-b&h-luxi mono-medium-*-normal--*-*-*-*-m-*-*-*
.Ed
.Pp
Each of these fonts comes in roman, oblique, bold and bold oblique variants.
The TrueType versions have glyphs covering the basic ASCII Unicode
range, the Latin\ 1 range, as well as the
.Em Extended Latin
range and some additional punctuation characters.
In particular,
these fonts include all the glyphs needed for ISO\ 8859 parts 1,
2, 3, 4, 9, 13 and 15, as well as all the glyphs in the Adobe Standard
encoding and the Windows 3.1 character set.
.Pp
The glyph coverage of the Type\ 1 versions is somewhat reduced,
and only covers ISO\ 8859 parts 1, 2 and 15 as well as the Adobe
Standard encoding.
.Pp
The Luxi fonts are original designs by Kris Holmes and Charles
Bigelow.
Luxi fonts include seriffed, sans serif, and monospaced
styles, in roman and oblique, and normal and bold weights.
The fonts
share stem weight, x-height, capital height, ascent and descent, for
graphical harmony.
.Pp
The character width metrics of Luxi roman and bold fonts match those
of core fonts bundled with popular operating and window systems.
.Pp
The license terms for the Luxi fonts are included in the file
.Pa COPYRIGHT.BH ,
as well as in the
.Lk License "License document"
.Pq Bigelow_Holmes_Inc_and_URW_GmbH_Luxi_font_license .
.Pp
Charles Bigelow and Kris Holmes from Bigelow and Holmes Inc.
developed the Luxi typeface designs in Ikarus digital format.
.Pp
URW++ Design and Development GmbH converted the Ikarus format fonts
to TrueType and Type\ 1 font programs and implemented the grid-fitting
"hints" and kerning tables in the Luxi fonts.
.Pp
For more information, please contact
.Aq Mt design@bigelowandholmes.com
or
.Aq Mt info@urwpp.de ,
or consult
.Lk http://www.urwpp.de "the URW++ web site" .
.Pp
An earlier version of the Luxi fonts was made available under the
name
.Em Lucidux .
This name should no longer be used due to
trademark uncertainties, and all traces of the
.Em Lucidux
name have been removed from X11R6.
.Sh MORE ABOUT CORE FONTS
This section describes XFree86-created enhancements to the core
X11 fonts system that were adopted by X.Org.
.Ss Core fonts and internationalisation
The scalable font backends (Type\ 1 and TrueType) can
automatically re-encode fonts to the encoding specified in the
XLFD in
.Pa fonts.dir .
For example, a
.Pa fonts.dir
file can
contain entries for the Type\ 1 Courier font such as
.Bd -literal
cour.pfa -adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-1
cour.pfa -adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-2
.Ed
.Pp
which will lead to the font being recoded to ISO\ 8859-1 and
ISO\ 8859-2 respectively.
.Pp
.Sy The fontenc layer
.Pp
Two of the scalable backends (Type\ 1 and the
.Em FreeType
TrueType backend) use a common
.Em fontenc
layer for
font re-encoding.
This allows these backends to share their encoding
data, and allows simple configuration of new locales independently of
font type.
.Pp
.Em Please note:
the X-TrueType (X-TT) backend is not included
in X11R6. That functionality has been merged into the FreeType
backend.
.Pp
In the
.Em fontenc
layer, an encoding is defined by a name (such as
.Ql iso8859-1 ) ,
possibly a number of aliases (alternate names), and
an ordered collection of mappings.
A mapping defines the way the
encoding can be mapped into one of the
.Em target encodings
known to
.Em fontenc ;
currently, these consist of Unicode, Adobe glyph names,
and arbitrary TrueType
.Dq cmap Ns s.
.Pp
A number of encodings are hardwired into
.Em fontenc ,
and are
therefore always available; the hardcoded encodings cannot easily be
redefined.
These include:
.Bl -bullet
.It
.Ql iso10646-1 :
Unicode;
.It
.Ql iso8859-1 :
ISO\ Latin-1 (Western Europe);
.It
.Ql iso8859-2 :
ISO\ Latin-2 (Eastern Europe);
.It
.Ql iso8859-3 :
ISO\ Latin-3 (Southern Europe);
.It
.Ql iso8859-4 :
ISO\ Latin-4 (Northern Europe);
.It
.Ql iso8859-5 :
ISO\ Cyrillic;
.It
.Ql iso8859-6 :
ISO\ Arabic;
.It
.Ql iso8859-7 :
ISO\ Greek;
.It
.Ql iso8859-8 :
ISO\ Hebrew;
.It
.Ql iso8859-9 :
ISO\ Latin-5 (Turkish);
.It
.Ql iso8859-10 :
ISO\ Latin-6 (Nordic);
.It
.Ql iso8859-15 :
ISO\ Latin-9, or Latin-0 (Revised
Western-European);
.It
.Ql koi8-r :
KOI8 Russian;
.It
.Ql koi8-u :
KOI8 Ukrainian (see RFC 2319);
.It
.Ql koi8-ru :
KOI8 Russian/Ukrainian;
.It
.Ql koi8-uni :
KOI8
.Dq Unified
(Russian, Ukrainian, and
Byelorussian);
.It
.Ql koi8-e :
KOI8
.Dq European ,
ISO-IR-111, or ECMA-Cyrillic;
.It
.Ql microsoft-symbol
and
.Ql apple-roman :
these are only
likely to be useful with TrueType symbol fonts.
.El
.Pp
Additional encodings can be added by defining
.Em encoding files .
When a font encoding is requested that the
.Em fontenc
layer doesn't
know about, the backend checks the directory in which the font file
resides (not necessarily the directory with
.Pa fonts.dir ! )
for a
file named
.Pa encodings.dir .
If found, this file is scanned for
the requested encoding, and the relevant encoding definition file is
read in.
The
.Dq Nm mkfontdir
utility, when invoked with the
.Dq Li -e
option followed by the name of a directory containing
encoding files, can be used to automatically build
.Pa encodings.dir
files.
Please see the
.Lk mkfontdir.1.html mkfontdir(1)
manual page for more details.
.Pp
A number of encoding files for common encodings are included with
X11R6. Information on writing new encoding files can be found in
.Sx Format_of_encoding_directory_files
and
.Sx Format_of_encoding_files
later in this document.
.Pp
.Sy Backend-specific notes about fontenc
.Pp
.Sy The FreeType backend
.Pp
For TrueType and OpenType fonts, the FreeType backend scans the
mappings in order.
Mappings with a target of PostScript are ignored;
mappings with a TrueType or Unicode target are checked against all the
cmaps in the file.
The first applicable mapping is used.
.Pp
For Type\ 1 fonts, the FreeType backend first searches for a
mapping with a target of PostScript.
If one is found, it is used.
Otherwise, the backend searches for a mapping with target Unicode,
which is then composed with a built-in table mapping codes to glyph
names.
Note that this table only covers part of the Unicode code
points that have been assigned names by Adobe.
.Pp
Specifying an encoding value of
.Ql adobe-fontspecific
for a
Type\ 1 font disables the encoding mechanism.
This is useful with
symbol and incorrectly encoded fonts (see
.Sx Hints_about_using_badly_encoded_fonts
below).
.Pp
If a suitable mapping is not found, the FreeType backend defaults to
ISO\ 8859-1.
.Pp
.Sy Format of encoding directory files
.Pp
In order to use a font in an encoding that the font backend does
not know about, you need to have an
.Pa encodings.dir
file either
in the same directory as the font file used or in a system-wide
location
.Pf ( Pa /usr/share/fonts/X11/encodings/
by default).
.Pp
The
.Pa encodings.dir
file has a similar format to
.Pa fonts.dir .
Its first line specifies the number of encodings,
while every successive line has two columns, the name of the encoding,
and the name of the encoding file; this can be relative to the current
directory, or absolute.
Every encoding name should agree with the
encoding name defined in the encoding file.
For example,
.Bd -literal
3
mulearabic-0 /usr/share/fonts/X11/encodings/mulearabic-0.enc
mulearabic-1 /usr/share/fonts/X11/encodings/mulearabic-1.enc
mulearabic-2 /usr/share/fonts/X11/encodings/mulearabic-2.enc
.Ed
.Pp
The name of an encoding
.Em must
be specified in the encoding file's
.Dq Li STARTENCODING
or
.Dq Li ALIAS
line.
It is not enough to create
an
.Pa encodings.dir
entry.
.Pp
If your platform supports it (it probably does), encoding files may be
compressed or gzipped.
.Pp
The
.Pa encodings.dir
files are best maintained by the
.Dq Nm mkfontdir
utility.
Please see the
.Lk mkfontdir.1.html mkfontdir(1)
manual page for more information.
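.Pp
To make the file format concrete, the following sketch builds an
.Pa encodings.dir
by hand from the
.Dq Li STARTENCODING
lines of the uncompressed
.Pa *.enc
files in a directory.
It is an illustration only, not a replacement for
.Nm mkfontdir :
the function name
.Li make_encodings_dir
is hypothetical,
.Dq Li ALIAS
lines and compressed files are ignored, and file names containing
whitespace are not supported.
.Bd -literal
# make_encodings_dir DIR
# Write DIR/encodings.dir: a count line, then "name file" pairs.
make_encodings_dir() {
    dir=$1
    tmp=$(mktemp) || return 1
    for f in "$dir"/*.enc; do
        [ -f "$f" ] || continue
        # keywords are case-insensitive, so compare in upper case
        name=$(awk 'toupper($1) == "STARTENCODING" { print $2; exit }' "$f")
        if [ -n "$name" ]; then
            echo "$name $f" >> "$tmp"
        fi
    done
    {
        wc -l < "$tmp" | tr -d ' '
        cat "$tmp"
    } > "$dir/encodings.dir"
    rm -f "$tmp"
}
.Ed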
.Pp
.Sy Format of encoding files
.Pp
The encoding files are
.Dq free form ,
.Em i.e.
any string of
whitespace is equivalent to a single space.
Keywords are parsed in a
non-case-sensitive manner, meaning that
.Dq Li size ,
.Dq Li SIZE ,
and
.Dq Li SiZE
all parse as the same keyword; on the other hand, case is
significant in glyph names.
.Pp
Numbers can be written in decimal, as in
.Dq Li 256 ,
in hexadecimal,
as in
.Dq Li 0x100 ,
or in octal, as in
.Dq Li 0400 .
.Pp
Comments are introduced by a hash sign
.Dq Li # .
A
.Dq Li #
may
appear at any point in a line, and all characters following the
.Dq Li #
are ignored, up to the end of the line.
.Pp
The encoding file starts with the definition of the name of the
encoding, and possibly its alternate names (aliases):
.Bd -literal
STARTENCODING mulearabic-0
ALIAS arabic-0
.Ed
.Pp
The name of the encoding and its aliases should be suitable for use in
an XLFD font name, and therefore contain exactly one dash
.Dq Li - .
.Pp
The encoding file may then optionally declare the size of the
encoding.
For a linear encoding (such as ISO\ 8859-1), the SIZE
line specifies the maximum code plus one:
.Bd -literal
SIZE 0x2B
.Ed
.Pp
For a matrix encoding, it should specify two numbers.
The first is
the number of the last row plus one, the other, the highest column
number plus one.
In the case of
.Dq Li jisx0208.1990-0
(JIS\ X\ 0208(1990), double-byte encoding, high bit clear), it
should be
.Bd -literal
SIZE 0x75 0x80
.Ed
.Pp
In the case of a matrix encoding, a
.Dq Li FIRSTINDEX
line may be
included to specify the minimum glyph index in an encoding.
The
keyword
.Dq Li FIRSTINDEX
is followed by two integers, the minimum row
number followed by the minimum column number:
.Bd -literal
FIRSTINDEX 0x20 0x20
.Ed
.Pp
In the case of a linear encoding, a
.Dq Li FIRSTINDEX
line is not very
useful.
If, for some reason, you choose to include one, it should
be followed by a single integer.
.Pp
Note that in most font backends inclusion of a
.Dq Li FIRSTINDEX
line
has the side effect of disabling default glyph generation, and this
keyword should therefore be avoided unless absolutely necessary.
.Pp
Codes outside the region defined by the
.Dq Li SIZE
and
.Dq Li FIRSTINDEX
lines are understood to be undefined.
Encodings
default to linear encoding with a size of 256 (0x100). This means
that you must declare the size of all 16 bit encodings.
.Pp
What follows is one or more mapping sections.
A mapping section
starts with a
.Dq Li STARTMAPPING
line stating the target of the mapping.
The target may be one of:
.Bl -bullet
.It
Unicode (ISO\ 10646):
.Bd -literal
STARTMAPPING unicode
.Ed
.It
a given TrueType
.Dq cmap :
.Bd -literal
STARTMAPPING cmap 3 1
.Ed
.It
PostScript glyph names:
.Bd -literal
STARTMAPPING postscript
.Ed
.El
.Pp
Every line in a mapping section maps one code from the encoding being
defined to the target of the mapping.
In mappings with a Unicode or
TrueType target, codes are mapped to codes:
.Bd -literal
0x21 0x0660
0x22 0x0661
\&...
.Ed
.Pp
As an abbreviation, it is possible to map a contiguous range of codes
in a single line.
A line consisting of three integers
.Bd -literal
.Em start Em end Em target
.Ed
.Pp
is an abbreviation for the range of lines
.Bd -literal
.Em start Em target
.Em start Ns +1 Em target Ns +1
\&...
.Em end Em target Ns Pf + Em end Ns Pf - Em start
.Ed
.Pp
For example, the line
.Bd -literal
0x2121 0x215F 0x8140
.Ed
.Pp
is an abbreviation for
.Bd -literal
0x2121 0x8140
0x2122 0x8141
\&...
0x215F 0x817E
.Ed
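.Pp
The expansion rule can be checked with a few lines of shell arithmetic.
The sketch below prints the individual mapping lines that a three-integer
range line stands for; the function name
.Li expand_range
is hypothetical, and the four-digit hexadecimal output format is merely a
convention chosen for this example.
.Bd -literal
# expand_range START END TARGET
# Print one "code target" line for every code from START to END.
expand_range() {
    start=$(($1))
    end=$(($2))
    target=$(($3))
    code=$start
    while [ "$code" -le "$end" ]; do
        printf '0x%04X 0x%04X' "$code" $((target + code - start))
        echo
        code=$((code + 1))
    done
}
.Ed
.Pp
Running
.Li expand_range 0x2121 0x215F 0x8140
produces 63 lines, from
.Li 0x2121 0x8140
to
.Li 0x215F 0x817E .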
.Pp
Codes not listed are assumed to map through the identity
.Pf ( Em i.e.
to
the same numerical value). In order to override this default mapping,
you may specify a range of codes to be undefined by using an
.Dq Li UNDEFINE
line:
.Bd -literal
UNDEFINE 0x00 0x2A
.Ed
.Pp
or, for a single code,
.Bd -literal
UNDEFINE 0x1234
.Ed
.Pp
PostScript mappings are different.
Every line in a PostScript mapping
maps a code to a glyph name
.Bd -literal
0x41 A
0x42 B
\&...
.Ed
.Pp
and codes not explicitly listed are undefined.
.Pp
A mapping section ends with an
.Ql ENDMAPPING
line
.Bd -literal
ENDMAPPING
.Ed
.Pp
After all the mappings have been defined, the file ends with an
.Ql ENDENCODING
line
.Bd -literal
ENDENCODING
.Ed
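.Pp
Putting the pieces together, the sketch below writes a small but complete
encoding file.
The encoding
.Ql example-0 ,
its alias
.Ql sample-0
and the function name
.Li write_example_encoding
are all hypothetical; the mapping covers only the ten codes 0x21 to 0x2A.
.Bd -literal
# write_example_encoding FILE
# Write a minimal linear encoding with a single Unicode mapping.
write_example_encoding() {
    cat > "$1" <<'EOF'
# example-0: hypothetical encoding, codes 0x21-0x2A only
STARTENCODING example-0
ALIAS sample-0
SIZE 0x2B                 # highest code (0x2A) plus one
STARTMAPPING unicode
UNDEFINE 0x00 0x20        # override the identity default below 0x21
0x21 0x2A 0x0660          # range: 0x21 -> U+0660 ... 0x2A -> U+0669
ENDMAPPING
ENDENCODING
EOF
}
.Ed
.Pp
The
.Dq Li UNDEFINE
line is there because codes not listed in a Unicode mapping would
otherwise map through the identity.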
.Pp
In order to make future extensions to the format possible, lines
starting with an unknown keyword are silently ignored, as are mapping
sections with an unknown target.
.Pp
.Sy Using symbol fonts
.Pp
Type\ 1 symbol fonts should be installed using the
.Ql adobe-fontspecific
encoding.
.Pp
In an ideal world, all TrueType symbol fonts would be installed using
one of the
.Ql microsoft-symbol
and
.Ql apple-roman
encodings.
A
number of symbol fonts, however, are not marked as such; such fonts
should be installed using
.Ql microsoft-cp1252 ,
or, for older fonts,
.Ql microsoft-win3.1 .
.Pp
In order to guarantee consistent results (especially between
Type\ 1 and TrueType versions of the same font), it is possible to
define a special encoding for a given font.
This has already been done
for the
.Ql ZapfDingbats
font; see the file
.Pa encodings/adobe-dingbats.enc .
.Pp
.Sy Hints about using badly encoded fonts
.Pp
A number of text fonts are incorrectly encoded.
Incorrect encoding
is sometimes done by design, in order to make a font for an exotic
script appear like an ordinary Western text font on systems which are
not easily extended with new locale data.
It is often the result of
the font designer's laziness or incompetence; for some reason, most
people seem to find it easier to invent idiosyncratic glyph names
rather than follow the Adobe glyph list.
.Pp
There are two ways of dealing with such fonts: using them with the
encoding they were designed for, and creating an
.Em ad hoc
encoding
file.
.Pp
.Sy Using fonts with the designer's encoding
.Pp
In the case of Type\ 1 fonts, the font designer can specify a
default encoding; this encoding is requested by using the
.Dq Li adobe-fontspecific
encoding in the XLFD name.
Sometimes, the
font designer omitted to specify a reasonable default encoding, in
which case you should experiment with
.Dq Li adobe-standard ,
.Dq Li iso8859-1 ,
.Dq Li microsoft-cp1252 ,
and
.Dq Li microsoft-win3.1 .
(The encoding
.Dq Li microsoft-symbol
doesn't
make sense for Type\ 1 fonts).
.Pp
TrueType fonts do not have a default encoding.
However, most TrueType
fonts are designed with either Microsoft or Apple platforms in mind,
so one of
.Dq Li microsoft-symbol ,
.Dq Li microsoft-cp1252 ,
.Dq Li microsoft-win3.1 ,
or
.Dq Li apple-roman
should yield reasonable
results.
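.Pp
As a sketch, a hypothetical TrueType file
.Pa foo.ttf
(the file name, foundry and family are placeholders) could be listed in
.Pa fonts.dir
under two candidate encodings, so that the results can be compared:
.Bd -literal
foo.ttf -hypothetical-foo-medium-r-normal--0-0-0-0-p-0-microsoft-cp1252
foo.ttf -hypothetical-foo-medium-r-normal--0-0-0-0-p-0-apple-roman
.Ed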
.Pp
.Sy Specifying an ad hoc encoding file
.Pp
It is always possible to define an encoding file to put the glyphs
in a font in any desired order.
Again, see the
.Pa encodings/adobe-dingbats.enc
file to see how this is done.
.Pp
.Sy Specifying font aliases
.Pp
By following the directions above, you will find yourself with a
number of fonts with unusual names --- with encodings such as
.Dq Li adobe-fontspecific ,
.Dq Li microsoft-win3.1
.Em etc .
In order
to use these fonts with standard applications, it may be useful to
remap them to their proper names.
.Pp
This is done by writing a
.Pa fonts.alias
file.
The format of this file
is very simple: it consists of a series of lines each mapping an alias
name to a font name.
A
.Pa fonts.alias
file might look as follows:
.Bd -literal
"-ogonki-alamakota-medium-r-normal--0-0-0-0-p-0-iso8859-2" \e
"-ogonki-alamakota-medium-r-normal--0-0-0-0-p-0-adobe-fontspecific"
.Ed
.Pp
(both XLFD names on a single line). The syntax of the
.Pa fonts.alias
file is more precisely described in the
.Lk mkfontdir.1.html mkfontdir(1)
manual page.
.Ss Additional notes about scalable core fonts
.Sy About the FreeType backend
.Pp
The
.Em FreeType
backend (formerly
.Em xfsft )
is a backend based on version 2 of the FreeType library (see
.Lk http://www.freetype.org/ "the FreeType web site" )
and has
the X-TT functionalities for CJKV support provided by the After X-TT
Project (see
.Lk http://x-tt.sourceforge.jp/ "the After X-TT Project web site" ) .
The
.Em FreeType
backend has support for the
.Dq fontenc
style of internationalisation (see
.Sx The_fontenc_layer ) .
This backend supports TrueType font files
.Pf ( Pa *.ttf ) ,
OpenType font files
.Pf ( Pa *.otf ) ,
TrueType Collections
.Pf ( Pa *.ttc ) ,
OpenType Collections
.Pf ( Pa *.otc )
and Type 1 font
files
.Pf ( Pa *.pfa
and
.Pa *.pfb ) .
.Pp
In order to access the faces in a TrueType Collection file, the face
number must be specified in the
.Pa fonts.dir
file before the filename, within a pair of colons, or by setting the
.Dq Li fn
TTCap option.
For example,
.Bd -literal
:1:mincho.ttc -misc-pmincho-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0
.Ed
.Pp
refers to face 1 in the
.Pa mincho.ttc
TrueType Collection file.
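.Pp
Assuming that the
.Dq Li fn
option takes the face number as its value and is written before the
filename like other TTCap options, the same face could presumably also
be selected as follows:
.Bd -literal
fn=1:mincho.ttc -misc-pmincho-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0
.Ed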
.Pp
The new
.Em FreeType
backend supports the extended
.Pa fonts.dir
syntax introduced by X-TrueType with a number
of options, collectively known as
.Dq TTCap .
A
.Dq TTCap
entry follows the
general syntax
.Bd -literal
option=value:
.Ed
.Pp
and should be specified before the filename.
The new
.Em FreeType
backend supports TTCap options almost completely, in a manner compatible
with X-TT 1.4.
The Automatic Italic
.Pf ( Dq Ns Li ai ) ,
Double Strike
.Pf ( Dq Ns Li ds )
and
Bounding box Width
.Pf ( Dq Ns Li bw )
options are indispensable for CJKV fonts.
For example,
.Bd -literal
mincho.ttc -misc-mincho-medium-r-normal--0-0-0-0-c-0-jisx0208.1990-0
ds=y:mincho.ttc -misc-mincho-bold-r-normal--0-0-0-0-c-0-jisx0208.1990-0
ai=0.2:mincho.ttc -misc-mincho-medium-i-normal--0-0-0-0-c-0-jisx0208.1990-0
ds=y:ai=0.2:mincho.ttc -misc-mincho-bold-i-normal--0-0-0-0-c-0-jisx0208.1990-0
bw=0.5:mincho.ttc -misc-mincho-medium-r-normal--0-0-0-0-c-0-jisx0201.1976-0
bw=0.5:ds=y:mincho.ttc -misc-mincho-bold-r-normal--0-0-0-0-c-0-jisx0201.1976-0
bw=0.5:ai=0.2:mincho.ttc -misc-mincho-medium-i-normal--0-0-0-0-c-0-jisx0201.1976-0
bw=0.5:ds=y:ai=0.2:mincho.ttc -misc-mincho-bold-i-normal--0-0-0-0-c-0-jisx0201.1976-0
.Ed
.Pp
sets up the complete combination of jisx0208 and jisx0201 using
.Pa mincho.ttc
only.
More information on the TTCap syntax is found on
.Lk http://x-tt.sourceforge.jp/ "the After X-TT Project page" .
.Pp
The
.Em FreeType
backend uses the
.Em fontenc
layer in order to support
recoding of fonts; this was described in
.Sx The_fontenc_layer
and especially
.Sx The_FreeType_backend
earlier in this document.
.Pp
.Sy Delayed glyph rasterisation
.Pp
When loading a proportional font which contains a huge number of glyphs,
the old
.Em FreeType
delayed glyph rasterisation until the time at which
the glyph was first used.
The new FreeType (libfreetype-xtt2) has an
improved
.Dq very lazy
metric calculation method to speed up the process when
loading TrueType or OpenType fonts.
Although the
.Em X-TT
module also
has this method, the
.Pf \(dq Ql vl=y Ns \(dq
TTCap option must be set if you want to
use it.
This is the default method for
.Em FreeType
when it loads
multi-byte fonts.
Even if you use a Unicode font which has tens of
thousands of glyphs, this delay will not be worrisome as long as you use
the new
.Em FreeType
backend -- its
.Dq very lazy
method is super-fast.
.Pp
The maximum error of bitmap position using the
.Dq very lazy
method is 1 pixel,
and is the same as that of a character-cell spacing.
When the X-TT
backend is used with the
.Dq Li vl=y
option, a chipped bitmap is displayed
with certain fonts.
However, the new FreeType backend has few problems
with this, since it corrects left- and right-side bearings using
.Dq italicAngle
in the TrueType/OpenType post table, and automatically
corrects bitmap positions during rasterisation so that chipped bitmaps
are not displayed.
Nevertheless, if you don't want to use the
.Dq very lazy
method when using multi-byte fonts, set
.Dq Li vl=n
in the TTCap option to
disable it:
.Bd -literal
vl=n:luxirr.ttf -b&h-Luxi Serif-medium-r-normal--0-0-0-0-p-0-iso10646-1
.Ed
.Pp
Of course, both backends also support an optimisation for character-cell
fonts (fonts with all glyph metrics equal, or terminal fonts).
For a font with an XLFD specifying a character-cell spacing
.Dq Li c ,
as in
.Bd -literal
-misc-mincho-medium-r-normal--0-0-0-0-c-0-jisx0208.1990-0
.Ed
.Pp
or
.Bd -literal
fs=c:mincho.ttc -misc-mincho-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0
.Ed
.Pp
the backend will not compute the metric for each glyph, but will instead
trust the font to be a character-cell font.
You are
encouraged to make use of this optimisation when useful, but be warned
that not all monospaced fonts are character-cell fonts.
.Sh APPENDIX: BACKGROUND AND TERMINOLOGY
.Ss Characters and glyphs
A computer text-processing system inputs keystrokes and outputs
.Em glyphs ,
small pictures that are assembled on paper or on a
computer screen.
Keystrokes and glyphs do not, in general, coincide:
for example, if the system generates ligatures, then the
sequence of two keystrokes
.Pf < Ql f Ns > Ns < Ns Ql i Ns >
will typically
correspond to a single glyph.
Similarly, if the system shapes Arabic
glyphs in a vaguely reasonable manner, then multiple different glyphs
may correspond to a single keystroke.
.Pp
The complex transformation rules from keystrokes to glyphs are usually
factored into two simpler transformations, from keystrokes to
.Em characters
and from characters to glyphs.
You may want to think
of characters as the basic unit of text that is stored
.Em e.g.
in
the buffer of your text editor.
While the definition of a character
is intrinsically application-specific, a number of standardised
collections of characters have been defined.
.Pp
A
.Em coded character set
is a set of characters together with a
mapping from integer codes --- known as
.Em codepoints
--- to
characters.
Examples of coded character sets include US-ASCII,
ISO\ 8859-1, KOI8-R, and JIS\ X\ 0208(1990).
.Pp
A coded character set need not use 8-bit integers to index characters.
Many early systems used 6-bit character sets, while 16-bit (or more)
character sets are necessary for ideographic writing systems.
.Ss Font files, fonts, and XLFD
Traditionally, typographers speak about
.Em typefaces
and
.Em founts .
A typeface is a particular style or design, such as
Times Italic, while a fount is a molten-lead incarnation of a given
typeface at a given size.
.Pp
Digital fonts come in
.Em font files .
A font file contains the
information necessary for generating glyphs of a given typeface, and
applications using font files may access glyph information in an
arbitrary order.
.Pp
Digital fonts may consist of bitmap data, in which case they are said
to be
.Em bitmap fonts .
They may also consist of a mathematical
description of glyph shapes, in which case they are said to be
.Em scalable fonts .
Common formats for scalable font files are
.Em Type\ 1
(sometimes incorrectly called
.Em ATM fonts
or
.Em PostScript fonts ) ,
.Em TrueType
and
.Em OpenType .
.Pp
The glyph data in a digital font needs to be indexed somehow.
How
this is done depends on the font file format.
In the case of
Type\ 1 fonts, glyphs are identified by
.Em glyph names .
In the
case of TrueType fonts, glyphs are indexed by integers corresponding
to one of a number of indexing schemes (usually Unicode --- see below).
.Pp
The X11 core fonts system uses the data in a font file to generate
.Em font instances ,
which are collections of glyphs at a given size
indexed according to a given encoding.
.Pp
X11 core font instances are usually specified using a notation known
as the
.Em X Logical Font Description
(XLFD). An XLFD starts with a
dash
.Dq Li - ,
and consists of fourteen fields separated by dashes,
for example:
.Bd -literal
-adobe-courier-medium-r-normal--12-120-75-75-m-70-iso8859-1
.Ed
.Pp
Of particular interest are the last two fields
.Dq Li iso8859-1 ,
which
specify the font instance's encoding.
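.Pp
Read field by field, the example above breaks down as shown in the
following sketch.
The field names are those used in the X Logical Font Description
document and are given here only as a reading aid:
.Bd -literal
FOUNDRY           adobe
FAMILY_NAME       courier
WEIGHT_NAME       medium
SLANT             r
SETWIDTH_NAME     normal
ADD_STYLE_NAME    (empty)
PIXEL_SIZE        12
POINT_SIZE        120  (tenths of a point)
RESOLUTION_X      75
RESOLUTION_Y      75
SPACING           m
AVERAGE_WIDTH     70   (tenths of a pixel)
CHARSET_REGISTRY  iso8859
CHARSET_ENCODING  1
.Ed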
.Pp
A scalable font is specified by an XLFD which contains zeroes instead
of some fields:
.Bd -literal
-adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-1
.Ed
.Pp
X11 font instances may also be specified by short name.
Unlike an
XLFD, a short name has no structure and is simply a conventional name
for a font instance.
Two short names are of particular interest, as
the server will not start if font instances with these names cannot be
opened.
These are
.Dq Li fixed ,
which specifies the fallback font to
use when the requested font cannot be opened, and
.Dq Li cursor ,
which
specifies the set of glyphs to be used by the mouse pointer.
.Pp
Short names are usually implemented as aliases to XLFDs; the
standard
.Dq Li fixed
and
.Dq Li cursor
aliases are defined in
.Bd -literal
/usr/share/font/X11/misc/fonts.alias
.Ed
.Ss Unicode
Unicode
.Pf ( Lk http://www.unicode.org http://www.unicode.org )
is a coded character
set with the goal of uniquely identifying all characters for all
scripts, current and historical.
While Unicode was explicitly not
designed as a glyph encoding scheme, it is often possible to use it as
such.
.Pp
Unicode is an
.Em open
character set, meaning that codepoint
assignments may be added to Unicode at any time (once specified,
though, an assignment can never be changed). For this reason, a
Unicode font will be
.Em sparse ,
meaning that it only defines glyphs
for a subset of the character registry of Unicode.
.Pp
The Unicode standard is defined in parallel with the international
standard ISO\ 10646.
Assignments in the two standards are always
equivalent, and we often use the terms
.Em Unicode
and
.Em ISO\ 10646
interchangeably.
.Pp
When used in the X11 core fonts system, Unicode-encoded fonts should
have the last two fields of their XLFD set to
.Dq Li iso10646-1 .
.Sh REFERENCES
X11R6 comes with extensive documentation in the form of manual
pages and typeset documents.
Before installing fonts, you really should
read the
.Lk fontconfig.3.html fontconfig(3)
and
.Lk mkfontdir.1.html mkfontdir(1)
manual pages; other
manual pages of interest include
.Lk X.7.html X(7) ,
.Lk Xserver.1.html Xserver(1) ,
.Lk xset.1.html xset(1) ,
.Lk Xft.3.html Xft(3) ,
.Lk xlsfonts.1.html xlsfonts(1)
and
.Lk showfont.1.html showfont(1) .
In addition, you may want to read the
.Lk xlfd "X Logical Font Description document"
.Pq xlfd
by Jim Flowers.
.Pp
The
.Lk http://www.faqs.org/faqs/by-newsgroup/comp/comp.fonts.html "comp.fonts FAQ" ,
which is unfortunately no longer being maintained, contains a wealth
of information about digital fonts.
.Pp
Xft and Fontconfig are described on
.Lk http://www.fontconfig.org "the Fontconfig site" .
.Pp
The
.Lk http://www.dcs.ed.ac.uk/home/jec/programs/xfsft/ "xfsft home page"
has been superseded by this document, and is now obsolete; you may
however still find some of the information that it contains useful.
.Lk http://www.joerg-pommnitz.de/TrueType/xfsft.html "Joerg Pommnitz' xfsft page"
is the canonical source for the
.Dq Nm ttmkfdir
utility, which is the
ancestor of
.Nm mkfontscale .
.Pp
.Lk http://www.pps.jussieu.fr/~jch/software/ "The author's software pages"
might or might not contain related scribbles and development versions
of software.
.Pp
The documentation of
.Em X-TrueType
is available from
.Lk http://x-tt.sourceforge.jp/ "the After X-TT Project page" .
.Pp
While the
.Lk http://www.unicode.org "Unicode consortium site"
may be of interest, you are more likely to find what you need in
Markus Kuhn's
.Lk http://www.cl.cam.ac.uk/~mgk25/unicode.html "UTF-8 and Unicode FAQ" .
.Pp
The IETF RFC documents, available from a number of sites throughout
the world, often provide interesting information about character set
issues; see for example
.Lk https://datatracker.ietf.org/doc/rfc373/ "RFC\ 373" .
.Sh AUTHORS
.An -nosplit
X Version 11, Release 6
.An Juliusz Chroboczek Aq Mt jch@freedesktop.org