XEROX
PALO ALTO RESEARCH CENTER
Computer Sciences Laboratory
March 5, 1977

For Xerox Internal Use Only

To: File

From: Bob Sproull
Subject: Font Representations and Formats
Filed on:

This report presents the various standard and device-dependent font formats in use at PARC.

1. Introduction

A font is a collection of character descriptions, indexed by a character code. These
descriptions represent, in one fashion or another, the appearance of the character. The
ultimate purpose of maintaining a font is for use when generating a raster-scanned image of
a document. This image may be created on a display and used for interactive purposes, or it
may be generated by a printing service as part of a "hard copy" function. In both cases, for
purposes of space and device independence, the document itself does not normally contain
the character representations, but only codes used to identify the characters that comprise
the document.

It is important to distinguish font representations from font formats.

We use two generically different representations for character shapes. The first, loosely
termed "splines" or "spline fonts," represents the outline of each character shape with a
series of parametric cubic spline curves (see Figure 1). This representation is handy because
it is independent of the particular output device and its resolution: the outlines describe the
desired appearance of the character. The second representation we use is a raster
(sometimes loosely termed a "bit map"), as shown in Figure 2. This representation records,
in some way, a two-dimensional (binary) occupancy map: it tells where the character lies on
a two-dimensional grid. This representation is handy for actually building raster images of
documents: the occupancy map is combined with color information, often at very high
speed, to generate a larger raster image of the document. The raster character description is
in effect merged into the page raster at the proper position.
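
As a rough illustration of merging a raster character description into a page raster, the following Python sketch ORs a small occupancy map into a larger one. It is not taken from any PARC program: the function name merge_glyph, the list-of-lists layout, the top-down row order, and the sample glyph are all assumptions made for the example, and color information is ignored.

```python
def merge_glyph(page, glyph, x, y):
    """OR a glyph's occupancy map into the page raster.

    Assumption for this sketch: rasters are lists of rows, row 0 is the
    top row, and (x, y) is where the glyph's top-left cell lands.
    """
    for row, bits in enumerate(glyph):
        for col, bit in enumerate(bits):
            if bit:
                page[y + row][x + col] = 1
    return page


# A hypothetical 3x3 glyph (a small plus sign) and a blank 8x5 page.
glyph = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]
page = [[0] * 8 for _ in range(5)]
merge_glyph(page, glyph, x=2, y=1)

for row in page:
    print("".join("#" if bit else "." for bit in row))
```

Real printing hardware would do this a word at a time rather than a bit at a time, which is one reason the details of the encoding matter.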

When characters are recorded in font files, we choose a particular format for the file; quite
a number of different formats have emerged. This is because there are many ways to encode
digitally the information in either an outline or raster representation of a character. The
details of the encoding are often of vital concern when making a particular piece of
hardware or software generate page rasters rapidly.

Fortunately, we can write conversion programs that are able to generate the various
specialized formats from standard formats. When an artist (or a needy user) devotes a
large amount of effort to designing and debugging a font, it should be recorded and
disseminated in one of the standard formats. Clients can then easily convert to one of the
subsidiary formats, or to their own private format.

Widths
An important adjunct to the font descriptions themselves is the "widths file," which
summarizes the dimensions of all characters in the font data base. This summary must be
available to a text editor when it formats a document for hard copy: the widths are used to
determine how many characters will fit on a line and to perform justification calculations.
Because the information in this file can be independent of any particular output device, the
hard-copy file produced by the editor can be printed on any of a number of printing
devices.
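
The two uses of the widths summary mentioned above (line filling and justification) can be sketched in Python as follows. This is only an illustration: the function names, the dictionary used as a width table, and the width values themselves are hypothetical, and the units are arbitrary.

```python
def chars_that_fit(text, widths, line_width):
    """Return how many leading characters of text fit within line_width."""
    used = 0
    for count, ch in enumerate(text):
        advance = widths[ch]
        if used + advance > line_width:
            return count
        used += advance
    return len(text)


def justification_slack(text, widths, line_width):
    """Return the space left over on the line, to be shared among the gaps."""
    return line_width - sum(widths[ch] for ch in text)


# Hypothetical x widths in arbitrary device-independent units.
widths = {"a": 5, "b": 6, " ": 3}

print(chars_that_fit("ab ab ab", widths, line_width=30))
print(justification_slack("ab ab", widths, line_width=30))
```

A formatter would typically divide the slack among the spaces on the line; because only the width table is consulted, the same calculation serves any output device.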

The widths summary is, in effect, extracted from information recorded in the standard
formats of the relevant fonts.

Software

The PARC font descriptions are supported by a reasonably full set of software:

FRED: Interactive program for building outline font representations. Documentation
is on Fred.Press. The program is on Fred.Dm.

PREPRESS: Interactive program for building standard raster font representations. The
program also contains numerous options for converting from standard to subsidiary
formats. Documentation is on PrePress.Press. The program is on
.PrePress.Run.

COMPRESS: A program that converts .CU format to EARS (.EP and .EL) formats. The
program is on Compress.Run.

The reader is invited to consult PrePress documentation for miscellaneous lore relating to
fonts and for "standard operating procedures" for maintaining font files.

People

This document is simply a convenient summary of formats and techniques developed by a
large number of individuals. The people behind the formats include Patrick Baudelaire,
Peter Deutsch, Diana Merry, Ron Rider, Bob Sproull, Larry Tesler, and Chuck Thacker.

2. Terminology

The terminology that has developed around fonts is hopelessly inconsistent. This section is
intended to serve as a glossary for the descriptions in the remainder of this document. Be
forewarned that terminology used elsewhere may not match.

2.1 Characters

Family is the term given to a particular design of characters. Examples of families are
"Times Roman," or "Helvetica."

Point size of a character refers to size measurements used in the printing industry. If text is
n points high, this means that closely-spaced lines of text will fall n/72 inches apart on the
page. Note that the point size does not relate in any consistent way to the geometry of
characters, e.g., to the height of an upper case A.
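
The relation between point size and line spacing (n points gives n/72 inches) is simple enough to state as a one-line Python function. The function name is invented for this sketch.

```python
def line_spacing_inches(point_size):
    """Spacing of closely-spaced lines, using 72 points to the inch."""
    return point_size / 72


print(line_spacing_inches(36))  # 36-point text: lines half an inch apart
```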

Face denotes a number of attributes of a particular font: italic, bold, light, condensed,
expanded are all attributes of the font. Sometimes this is called a "style." Sometimes the
face is defined with a three-letter code: the first letter is L for light, M for medium, or B for
bold; the second is R for regular or I for italic; the third is C for condensed, R for regular or
E for expanded.
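
A decoder for the three-letter face code might look like the following Python sketch. The letter assignments come from the description above; the function name decode_face, the table names, and the choice to raise ValueError on an unknown code are assumptions of the example.

```python
WEIGHT = {"L": "light", "M": "medium", "B": "bold"}
SLOPE = {"R": "regular", "I": "italic"}
EXPANSION = {"C": "condensed", "R": "regular", "E": "expanded"}


def decode_face(code):
    """Split a three-letter face code into (weight, slope, expansion)."""
    if len(code) != 3:
        raise ValueError("face code must have exactly three letters: %r" % code)
    try:
        return WEIGHT[code[0]], SLOPE[code[1]], EXPANSION[code[2]]
    except KeyError as exc:
        raise ValueError("unknown letter in face code %r" % code) from exc


print(decode_face("MRR"))
print(decode_face("BIC"))
```

Note that R means "regular" in both the second and third positions, so the meaning of a letter depends on where it appears in the code.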

Rotation refers to the orientation of the character. If a string of characters is intended to be
horizontal, it has rotation zero; if a string runs vertically upward, it has a rotation of 90
degrees.

Font, as we use the term, refers to a collection of characters of the same family, the same
size, the same rotation, and the same face attributes.

Character code refers to a number (usually only 8 bits) that identifies a character. All our
fonts use standard ASCII conventions, when the conventions are meaningful. For special-
character fonts (e.g., mathematics, logic design), another mapping must generally be devised.

Origin of a character (sometimes called "the (0,0) point") is conceptually a reference mark
that is used to describe a character's location on a page or display. Thus a directive to
"display an A at x=103, y=204" is interpreted to mean "place an instance of the symbol A on
the display so that the character origin coincides with the coordinate x = 103, y = 204." Figures
1 and 2 show the origin of a sample character.

Width of a character is a two-dimensional vector that represents the incremental translation
that should take place to determine the placement of the origin of the next character to be
displayed in a (conventionally aligned) string of characters. In the example of Figure 3, if
we assume the x direction points to the right and the y direction up, we see that the width
vector has a zero y component.

In all our font representations, we associate the width vector with each character code. If this width
vector is used for character positioning, the spacing between the origin of an A (say) and the origin of
the next character is independent of that next character. This is not always desirable: because of the
different shapes of characters, the spacing between particular pairs may need to be adjusted slightly to make
the text line appear more pleasing.
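
The positioning rule described above, in which each character's width vector is added to the current origin to find the origin of the next character, can be sketched in Python as follows. The function name and the width values are hypothetical; the starting point reuses the x=103, y=204 example from the definition of the origin.

```python
def place_string(text, widths, start):
    """Return the origin of each character in a conventionally aligned string.

    widths maps a character to its (dx, dy) width vector.
    """
    x, y = start
    origins = []
    for ch in text:
        origins.append((x, y))
        dx, dy = widths[ch]
        x, y = x + dx, y + dy
    return origins


# Hypothetical width vectors with a zero y component (horizontal text).
widths = {"A": (12, 0), "V": (11, 0)}

print(place_string("AVA", widths, start=(103, 204)))
```

Because the table is indexed by a single character code, the step from A to V is the same as the step from A to any other character, which is exactly the limitation noted above.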

Bounding box is the term for a rectangle that just barely surrounds the character (see Figure
3). It is characterized by its width and height, and by a two-dimensional vector that
specifies where the lower-left corner of the bounding box is with respect to the origin of the
character inside. These four numbers are named (in this document) BBdx, BBdy, BBox, and
BBoy.

The font bounding box is a bounding box that applies to all characters in the font. That is,
if all the characters in the font were placed with their origins coincident, the smallest
rectangle that encloses every part is the font bounding box. The four parameters of the font
bounding box are named (in this document) FBBdx, FBBdy, FBBox, and FBBoy.
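
The font bounding box can be computed from the character bounding boxes by taking the smallest rectangle that covers them all, as in the Python sketch below. It assumes that the four parameters are ordered as in the text (dx = width, dy = height, ox and oy = offset of the lower-left corner from the origin); the function name and the sample boxes are invented for the example.

```python
def font_bounding_box(boxes):
    """Combine (BBdx, BBdy, BBox, BBoy) tuples into (FBBdx, FBBdy, FBBox, FBBoy).

    Assumption for this sketch: dx/dy are width and height, ox/oy locate the
    lower-left corner relative to the character origin, and y points up.
    """
    left = min(ox for dx, dy, ox, oy in boxes)
    bottom = min(oy for dx, dy, ox, oy in boxes)
    right = max(ox + dx for dx, dy, ox, oy in boxes)
    top = max(oy + dy for dx, dy, ox, oy in boxes)
    return (right - left, top - bottom, left, bottom)


# Hypothetical character boxes: a capital, a descender, and a narrow
# character that extends to the left of its origin.
boxes = [
    (10, 14, 0, 0),
    (8, 13, 1, -4),
    (4, 17, -1, -4),
]

print(font_bounding_box(boxes))
```

With these sample values the left edge comes from the third box, the right and top edges from the first, and the bottom edge from the descenders, so no single character determines the font bounding box.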

The coordinate system assumed for this document is that x points to the right on a
(portrait-oriented) page, and y points up. A mica is a unit of measure, equal to 10