mf2pt3 (c) Copyright 1998 Apostolos Syropoulos
[[apostolo@obelix.ee.duth.gr]]
This is [[mf2pt3]], a Perl script that generates a PostScript
Type 3 font that corresponds to a METAFONT font description. To
achieve its goal the program relies on another program: mfplain
(METAPOST with the mfplain base preloaded), which generates an EPSF
file for each character. This document assumes that the reader is
familiar with Type 3 font terminology and structure. For
more information one should consult the ``PostScript User Manual'',
or any good book on PostScript such as ``PostScript by Example'',
by Henry McGilton and Mary Campione, published by Addison-Wesley.
@ We now describe the way the program operates. First of all, we
generate an EPSF file for each character of a METAFONT font. Next
we collect the [[BoundingBox]] information for each character, as this
piece of information is vital to the construction of the Type 3 font.
Now we can proceed with the construction of the font. Finally, we
delete some unnecessary files and we output the line the user must
add to the [[psfonts.map]] file in order to use the Type 3 font
when creating a PostScript file from a DVI file.
<<*>>=
#!/usr/bin/perl
#
#(c) Copyright 1998 Apostolos Syropoulos
# apostolo@obelix.ee.duth.gr
#
<>
<>
<>
<>
<>
print "\n$MFfile $MFfile <$MFfile.pt3\n";
@ Since we don't know on what system the program will be used, we
must make sure it calls the GhostScript and METAPOST programs in the proper
way. Moreover, we supply to each command the proper command line switches.
The magnification is set to 100 as the usual design size is 10 pt.
The [[BoundingBox]] information is kept in a compact format in an array.
<>=
$mfplain="mfplain \'\\mode=localfont; \\batchmode; ";
<>
<>
@ Since Perl does not provide record structures, we use the [[pack]]
function to create a structure which will contain the [[BoundingBox]]
information. Each [[BoundingBox]] corresponds to four numbers: [[llx]],
[[lly]], [[urx]], and [[ury]]. If any of the 256 character slots is
undefined, each of these four numbers is set to zero. For efficiency reasons
each [[BoundingBox]] structure contains one more piece of information: an
ASCII character, which indicates whether the corresponding character is
defined ("d") or undefined ("u"). All the [[BoundingBox]] information is
kept in an array which initially contains only undefined characters.
<>=
$notdef=pack("ai4","u",0,0,0,0);
for($i=0; $i<=255; $i++){ $BoundingBox[$i]=$notdef }
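@ As a standalone illustration of the record layout described above, the
following sketch packs and unpacks one [[BoundingBox]] record. The helper
[[make_bbox]] and the sample box for character 65 are assumptions made for
the example; they are not part of the program.

```perl
use strict;
use warnings;

# Pack a flag ("d" = defined, "u" = undefined) and four integers
# (llx, lly, urx, ury) into one scalar, using the template "ai4".
sub make_bbox {
    my ($flag, @box) = @_;
    return pack("ai4", $flag, @box);
}

# Start with 256 undefined slots.
my @BoundingBox = (make_bbox("u", 0, 0, 0, 0)) x 256;

# Hypothetical data: pretend character 65 has the box 0 -1 6 6.
$BoundingBox[65] = make_bbox("d", 0, -1, 6, 6);

my ($flag, $llx, $lly, $urx, $ury) = unpack("ai4", $BoundingBox[65]);
print "$flag $llx $lly $urx $ury\n";
```

Unpacking with the same template returns the flag first, so a caller that
needs only the flag can take the first element of the result.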
@ The encoding vector is a vital part of a PostScript font. The internal
name of each character is completely irrelevant to the final output. So,
one can choose any names one pleases.
<>=
@Encoding = ("/_a0", "/_a1", "/_a2", "/_a3", "/_a4",
"/_a5", "/_a6", "/_a7", "/_a8",
"/_a9", "/_a10", "/_a11", "/_a12",
"/_a13", "/_a14", "/_a15", "/_a16",
"/_a17", "/_a18", "/_a19", "/_a20",
"/_a21", "/_a22", "/_a23", "/_a24",
"/_a25", "/_a26", "/_a27", "/_a28",
"/_a29", "/_a30", "/_a31", "/_a32",
"/_a33", "/_a34", "/_a35", "/_a36",
"/_a37", "/_a38", "/_a39", "/_a40",
"/_a41", "/_a42", "/_a43", "/_a44",
"/_a45", "/_a46", "/_a47", "/_a48",
"/_a49", "/_a50", "/_a51", "/_a52",
"/_a53", "/_a54", "/_a55", "/_a56",
"/_a57", "/_a58", "/_a59", "/_a60",
"/_a61", "/_a62", "/_a63", "/_a64",
"/_a65", "/_a66", "/_a67", "/_a68",
"/_a69", "/_a70", "/_a71", "/_a72",
"/_a73", "/_a74", "/_a75", "/_a76",
"/_a77", "/_a78", "/_a79", "/_a80",
"/_a81", "/_a82", "/_a83", "/_a84",
"/_a85", "/_a86", "/_a87", "/_a88",
"/_a89", "/_a90", "/_a91", "/_a92",
"/_a93", "/_a94", "/_a95", "/_a96",
"/_a97", "/_a98", "/_a99", "/_a100",
"/_a101", "/_a102", "/_a103", "/_a104",
"/_a105", "/_a106", "/_a107", "/_a108",
"/_a109", "/_a110", "/_a111", "/_a112",
"/_a113", "/_a114", "/_a115", "/_a116",
"/_a117", "/_a118", "/_a119", "/_a120",
"/_a121", "/_a122", "/_a123", "/_a124",
"/_a125", "/_a126", "/_a127", "/_a128",
"/_a129", "/_a130", "/_a131", "/_a132",
"/_a133", "/_a134", "/_a135", "/_a136",
"/_a137", "/_a138", "/_a139", "/_a140",
"/_a141", "/_a142", "/_a143", "/_a144",
"/_a145", "/_a146", "/_a147", "/_a148",
"/_a149", "/_a150", "/_a151", "/_a152",
"/_a153", "/_a154", "/_a155", "/_a156",
"/_a157", "/_a158", "/_a159", "/_a160",
"/_a161", "/_a162", "/_a163", "/_a164",
"/_a165", "/_a166", "/_a167", "/_a168",
"/_a169", "/_a170", "/_a171", "/_a172",
"/_a173", "/_a174", "/_a175", "/_a176",
"/_a177", "/_a178", "/_a179", "/_a180",
"/_a181", "/_a182", "/_a183", "/_a184",
"/_a185", "/_a186", "/_a187", "/_a188",
"/_a189", "/_a190", "/_a191", "/_a192",
"/_a193", "/_a194", "/_a195", "/_a196",
"/_a197", "/_a198", "/_a199", "/_a200",
"/_a201", "/_a202", "/_a203", "/_a204",
"/_a205", "/_a206", "/_a207", "/_a208",
"/_a209", "/_a210", "/_a211", "/_a212",
"/_a213", "/_a214", "/_a215", "/_a216",
"/_a217", "/_a218", "/_a219", "/_a220",
"/_a221", "/_a222", "/_a223", "/_a224",
"/_a225", "/_a226", "/_a227", "/_a228",
"/_a229", "/_a230", "/_a231", "/_a232",
"/_a233", "/_a234", "/_a235", "/_a236",
"/_a237", "/_a238", "/_a239", "/_a240",
"/_a241", "/_a242", "/_a243", "/_a244",
"/_a245", "/_a246", "/_a247", "/_a248",
"/_a249", "/_a250", "/_a251", "/_a252",
"/_a253", "/_a254", "/_a255");
@ The program accepts at most five command line arguments:
- [[-dsize]]: explicitly specify the [[$design_size]]; the number
follows the switch without a space, e.g., [[-d12]].
- [[-nodel]]: suppress deleting of intermediate files,
such as [[foo.tfm]] or [[foo.ddd]].
- [[-eofill]]: ignore fill commands that do not come
immediately after gsave. When this switch is given,
eofill is put at the end of each character.
- [[-Inumber]]: the UniqueID, i.e., a unique font identity number. If it
is not specified, the program automatically generates one.
- The name of the METAFONT file, with or without extension.
If the program is invoked without any command line arguments,
it prints usage information. To use the program properly,
one has to provide only the font name.
@ Command line processing is done in a relatively standard way.
A while loop goes through each command line argument, checks its form
and sets special global variables. In the case of the UniqueID argument
we make sure it lies within the valid range, i.e., it is a number
greater than or equal to 4,000,000 and less than 5,000,000.
@ Once we have finished scanning the command line arguments
we must further process certain pieces of information.
<>=
$argc = @ARGV;
$design_size = -1;
$nodel = 0;
$eofill = 0;
$noID = 1;
SWITCHES: while($_ = $ARGV[0], /^-/)
{
shift;
if(/^-d(\d+)/)
{
$design_size = $1;
}
elsif(/^-nodel$/)
{
$nodel = 1;
}
elsif(/^-eofill$/)
{
$eofill = 1;
}
elsif(/^-I(\d+)$/)
{
die "UniqueID must lie in the range 4,000,000...4,999,999\n"
if ($1 > 4999999 || $1 < 4000000);
$UniqueID = $1;
$noID = 0;
}
elsif (!@ARGV)
{
last SWITCHES;
}
}
if (!@ARGV)
{
print <
Usage
exit(0);
}
else
{
$MFfile = $ARGV[0];
}
<>
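@ The scanning loop above can also be sketched as a function that works on
a plain list instead of [[@ARGV]], which makes it easy to try out. The
function name [[parse_args]], the hash keys, and the decision to reject
unknown switches are assumptions of this sketch; the program itself sets
global variables.

```perl
use strict;
use warnings;

# Parse a list of arguments in the same order of tests as the loop above
# and return the settings as a hash. Illustrative only.
sub parse_args {
    my @args = @_;
    my %opt = (design_size => -1, nodel => 0, eofill => 0, UniqueID => undef);
    while (@args && $args[0] =~ /^-/) {
        local $_ = shift @args;
        if    (/^-d(\d+)$/) { $opt{design_size} = $1 }
        elsif (/^-nodel$/)  { $opt{nodel} = 1 }
        elsif (/^-eofill$/) { $opt{eofill} = 1 }
        elsif (/^-I(\d+)$/) {
            die "UniqueID must lie in the range 4,000,000...4,999,999\n"
                if $1 < 4000000 || $1 > 4999999;
            $opt{UniqueID} = $1;
        }
        else { die "unknown option $_\n" }   # assumption: reject, do not ignore
    }
    $opt{MFfile} = shift @args;   # undef when no font name was given
    return %opt;
}

my %opt = parse_args(qw(-d12 -nodel -I4000123 cmr12));
print join(", ", map { "$_=" . ($opt{$_} // "none") } sort keys %opt), "\n";
```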
@ If the user has not specified a Unique Font Identity number, the program
must generate one. Moreover, if the user hasn't explicitly specified a
design size, the program must extract it from the font name. Finally, we
have to remove the file name extension in order to get the name of
the new PostScript font.
<>=
if ($noID)
{
<>
}
<>
<>
@ In case the user hasn't specified a UniqueID we must generate one.
For this purpose we use function [[rand]], after seeding the random
number generator with [[srand]], in order to ensure some sort of... randomness.
Since [[rand]] produces a number in
the range 0...1 (1 excluded), we multiply the output of [[rand]] by 999999 and
truncate it, so that we have a number in the range 0...999998, and we add to this
the number 4000000 so that the final random number is in the expected
range.
<>=
srand();
$UniqueID = int(999999*rand())+4000000;
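@ A quick way to convince oneself that the arithmetic stays inside the
accepted range is to draw many values and check each one. The function name
[[random_unique_id]] is an assumption of this sketch.

```perl
use strict;
use warnings;

# Draw a UniqueID with the same expression as the chunk above.
sub random_unique_id {
    return int(999999 * rand()) + 4000000;
}

# Every draw must pass the same range test that the -I switch applies.
for (1 .. 1000) {
    my $id = random_unique_id();
    die "out of range: $id\n" if $id < 4000000 || $id > 4999999;
}
print "all generated IDs lie in 4000000..4999999\n";
```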
@ As is well known, a METAFONT font name consists of two parts: a symbolic
acronym, specifying its characteristics, and a number specifying its
design size. The number can be either a two-digit or a four-digit number (compare
cmr17 with ecsx1095). In both cases we extract the number from the
font name; in the first case we divide it by 10 and in the second
case we divide it by 1000 to get the magnification factor. Finally, we
set the magnification. The number 100 is chosen because font data
must be integers greater than 100. Of course, in case the user has already
given the design size, there is no reason to extract it from the font name.
Still, we must process it in order to ensure the generation of valid
output.
<>=
if ($design_size == -1)
{
if ($MFfile =~ /\D+(\d+)$/)
{
$design_size=$1;
}
else
{
die "$MFfile must be a PostScript font name: there is no design size.\n";
}
}
if($design_size >100)
{
$mag_factor=$design_size/1000;
}
else
{
$mag_factor=$design_size/10;
}
$mag = 100 /$mag_factor;
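@ The computation can be followed on a few font names. The sketch below
wraps the same rules in a function; the name [[magnification]] is an
assumption of the example. For cmr10 the factor is 1 and the
magnification is therefore 100.

```perl
use strict;
use warnings;

# Return the magnification for a font name, following the rules above:
# a trailing number above 100 is divided by 1000, otherwise by 10.
sub magnification {
    my ($name) = @_;
    my ($size) = $name =~ /\D+(\d+)$/
        or die "$name: no design size in font name\n";
    my $mag_factor = $size > 100 ? $size / 1000 : $size / 10;
    return 100 / $mag_factor;
}

printf "%-9s %.3f\n", $_, magnification($_) for qw(cmr10 cmr17 ecsx1095);
```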
@ If the user supplies the METAFONT file name with an extension
we simply chop it off. This is done quite simply by employing Perl's
fantastic regular-expression mechanism.
<>=
$MFfile = $1 if $MFfile =~ /(\w+)\.\w*/;
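@ The effect of this substitution on a few names, as a small sketch. The
function name [[base_name]] is an assumption of the example.

```perl
use strict;
use warnings;

# Strip an extension such as ".mf" and keep the base name; a name
# without a dot is returned unchanged.
sub base_name {
    my ($name) = @_;
    $name = $1 if $name =~ /(\w+)\.\w*/;
    return $name;
}

print base_name($_), "\n" for qw(cmr10.mf cmr10 ecsx1095.mf);
```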
@ We proceed now to the generation of the EPSF files. This task
is performed by METAPOST. Initially, we create the EPSF files by executing
[[mfplain]]. If for some reason there is no TFM file, the program
stops and prints an error message. (It is most likely that the user has
typed the name of a nonexistent METAFONT font.) The [[$mfplain]] command
is augmented by the magnification factor and the input part.
<>=
$mfplain .= "mag=$mag; input $MFfile \'";
system($mfplain);
if (!(-e "$MFfile.tfm"))
{
$nodel || unlink "mpout.log";
die "$MFfile: no such font in system\n";
}
@ Since the various EPSF files have been generated successfully, we
can now start collecting the [[BoundingBox]] information. First, we
get the names of all EPSF files. Next, we must open each EPSF file,
and find the line that contains the [[BoundingBox]] information.
This is easy, since in an EPSF file the line that contains this
piece of information looks like the following one:
[[%%BoundingBox: 0 -1 6 6]]. While doing this we must get the
[[FontBBox]] information. We use four variables for this purpose.
The next step is to produce the first part of the font, i.e., the
metrics section and the encoding information.
<>=
<>
$Min_llx = $Min_lly = $Max_urx = $Max_ury = 0;
<>
<>
@ It would be easy to get the file names by a simple pipe, but
since this program may be used on operating systems other than Unix, we prefer
to do it in a more portable way: we simply open the directory
and store all the file names that consist of [[$MFfile]] followed
by a numeric extension.
<>=
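opendir(Dir, ".") || die "Can't read the current directory\n";
# Anchor the pattern so that only names of the form $MFfile.NNN match.
$pattern = "^" . quotemeta($MFfile) . "\\.\\d+\$";
@EPSFs = grep(/$pattern/, readdir(Dir));
closedir Dir;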
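@ The selection of file names can be tried without touching the file
system. The sketch below filters a hypothetical directory listing with an
anchored pattern, so that names such as [[xcmr10.3]] or [[cmr10.600pk]]
are rejected; the listing itself is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical directory contents.
my @names  = qw(cmr10.65 cmr10.log cmr10.tfm cmr10.97 xcmr10.3 cmr10.600pk);
my $MFfile = "cmr10";

# Keep only names of the form "<font>.<digits>"; \Q...\E quotes any
# regular-expression metacharacters in the font name.
my @EPSFs = grep { /^\Q$MFfile\E\.\d+$/ } @names;

print "@EPSFs\n";
```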
@ In order to get the [[BoundingBox]] information we open each
EPSF file and we get the line that contains this information. For
this we simply employ the pattern matching capabilities of Perl. Then
we store this information in a compact way at the appropriate index
in the [[BoundingBox]] array. (Readers not familiar with regular
expressions should consult the Perl manual.) As a side effect we store
the total number of characters that the font will provide in variable
[[$total_chars]] (plus one as there is always the [[/.notdef]] character).
<>=
$total_chars = @EPSFs+1;
foreach $file (@EPSFs)
{
open(EPSF_FILE,"$file")||die "Can't open file $file\n";
while (<EPSF_FILE>)
{
$BBox = pack("ai4","d",$1,$2,$3,$4)
if /%%BoundingBox: (-?\d+) (-?\d+) (-?\d+) (-?\d+)/;
}
close EPSF_FILE;
$_=$file;
/$MFfile\.(\d+)/;
$BoundingBox[$1] = $BBox;
($_, $llx, $lly, $urx, $ury) = unpack("ai4", $BBox);
$Min_llx = $llx if $llx < $Min_llx;
$Min_lly = $lly if $lly < $Min_lly;
$Max_urx = $urx if $urx > $Max_urx;
$Max_ury = $ury if $ury > $Max_ury;
}
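@ The matching and the accumulation of the [[FontBBox]] can be sketched on
in-memory strings instead of files. The two sample lines are hypothetical;
the variable names follow the chunk above.

```perl
use strict;
use warnings;

# Hypothetical %%BoundingBox lines, keyed by character code.
my %sample = (
    65 => "%%BoundingBox: 0 -1 6 6",
    66 => "%%BoundingBox: -2 0 9 7",
);

my ($Min_llx, $Min_lly, $Max_urx, $Max_ury) = (0, 0, 0, 0);
for my $code (sort { $a <=> $b } keys %sample) {
    my ($llx, $lly, $urx, $ury) =
        $sample{$code} =~ /%%BoundingBox: (-?\d+) (-?\d+) (-?\d+) (-?\d+)/
        or next;
    # Widen the font-wide box so that it encloses this character.
    $Min_llx = $llx if $llx < $Min_llx;
    $Min_lly = $lly if $lly < $Min_lly;
    $Max_urx = $urx if $urx > $Max_urx;
    $Max_ury = $ury if $ury > $Max_ury;
}
print "FontBBox: $Min_llx $Min_lly $Max_urx $Max_ury\n";
```

Because the four variables start at zero, the resulting box always
contains the origin, as it does in the program.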
@ We now have all the information we need in order to generate the complete
Type 3 font. We first create the font file and then we print
to it some pretty standard information, such as the [[/FontType]],
etc. Next, we let PostScript know which characters the font will provide,
and then we generate the [[BoundingBox]] dictionary, the [[Metrics]]
dictionary, and the [[CharProcs]] dictionary. Finally, we generate the
[[BuildGlyph]] procedure and we define the font.
<>=
open(TYPE3, ">$MFfile.pt3")||die "Can't create file $MFfile.pt3\n";
$date = localtime;
print TYPE3 <>
<>
<>
<>
<>
@ After initializing the Encoding vector, we must make all those
assignments so that PostScript will know which characters are in this font.
This is trivial: we simply scan the [[BoundingBox]] array and for
each defined character we print a line of the form [[Encoding N Name put]],
where [[N]] is the number of the character and [[Name]] the [[N]]th
entry in the Encoding array.
<>=
for($i=0; $i<=255; $i++)
{
$_ = unpack("ai4", $BoundingBox[$i]);
print TYPE3 "Encoding $i $Encoding[$i] put\n" if $_ eq "d";
}
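@ The following sketch produces the same kind of lines for a small,
hypothetical font with a single defined character. It builds the 256 names
with [[map]] rather than a literal list, which is an alternative chosen
for the example and not the way the program defines [[@Encoding]].

```perl
use strict;
use warnings;

# The names /_a0 ... /_a255, generated instead of written out.
my @Encoding = map { "/_a$_" } 0 .. 255;

# 256 undefined records, then one hypothetical defined character.
my @BoundingBox = (pack("ai4", "u", 0, 0, 0, 0)) x 256;
$BoundingBox[65] = pack("ai4", "d", 0, -1, 6, 6);

# Collect one "Encoding N Name put" line per defined character.
my @lines;
for my $i (0 .. $#BoundingBox) {
    my ($flag) = unpack("a", $BoundingBox[$i]);
    push @lines, "Encoding $i $Encoding[$i] put" if $flag eq "d";
}
print "$_\n" for @lines;
```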
@ In general the bounding boxes dictionary consists of entries
like the following one: [[/A [ 0 -100 600 700 ] def]]. Since the numbers
are stored in the [[BoundingBox]] array our task is simple. However, we must
print some data that let PostScript know the size of the dictionary.
<>=
print TYPE3 "/BoundingBoxes $total_chars dict def\n";
print TYPE3 "BoundingBoxes begin\n";
print TYPE3 "/.notdef { 0 0 0 0 } def\n";
for($i=0; $i<=255; $i++)
{
($_, $llx, $lly, $urx, $ury) = unpack("ai4",$BoundingBox[$i]);
print TYPE3 "$Encoding[$i] [ $llx $lly $urx $ury ] def\n"
if $_ eq "d";
}
print TYPE3 "end %BoundingBoxes\n";
@ The metrics dictionary is created in a way similar to the bounding box
dictionary. Generally it consists of lines of the form [[/A 600 def]],
where the number is the difference urx-llx. As usual we must first
output some book-keeping information.
<>=
print TYPE3 "/Metrics $total_chars dict def\n";
print TYPE3 "Metrics begin\n";
print TYPE3 "/.notdef 0 def\n";
for($i=0; $i<=255; $i++)
{
($_, $llx, $lly, $urx, $ury) = unpack("ai4",$BoundingBox[$i]);
$diff = $urx - $llx;
print TYPE3 "$Encoding[$i] $diff def\n" if $_ eq "d";
}
print TYPE3 "end %Metrics\n";
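@ For one character, the entries of the two dictionaries can be derived
from its packed record as follows. The function name [[dict_entries]] and
the sample record are assumptions of this sketch.

```perl
use strict;
use warnings;

# Return the BoundingBoxes entry and the Metrics entry for one record,
# or an empty list when the character is undefined.
sub dict_entries {
    my ($name, $record) = @_;
    my ($flag, $llx, $lly, $urx, $ury) = unpack("ai4", $record);
    return () unless $flag eq "d";
    my $width = $urx - $llx;   # the metric is the difference urx-llx
    return ("$name [ $llx $lly $urx $ury ] def", "$name $width def");
}

# Hypothetical record: character /_a65 with the box 0 -100 600 700.
print "$_\n" for dict_entries("/_a65", pack("ai4", "d", 0, -100, 600, 700));
```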
@ Generating the CharProcs dictionary involves the extraction of
the PostScript code from the various EPSF files. Moreover, we must
be careful to avoid extracting comments and to delete the keyword
[[showpage]] from the source code. Both operations are done
by making use of the regular expression facilities and operators that
Perl provides. Apart from that, the generation process is similar to
the previous ones. First, we generate some standard code and then
the PostScript code for each character. Please note that after
initial experimentation I have found that METAPOST generates
a lot of strange [[setgray]] commands which actually confuse a
PostScript interpreter, so the program eliminates all such commands.
<>=
print TYPE3 "/CharProcs $total_chars dict def\n";
print TYPE3 "CharProcs begin\n";
print TYPE3 "/.notdef { } def\n";
for($i=0; $i<=255; $i++)
{
$_ = unpack("ai4", $BoundingBox[$i]);
if ($_ eq "d")
{
open(CHAR, "$MFfile.$i")||
die "Can't open file $MFfile.$i\n";
$code = ""; #"100 100 scale\n";
while (<CHAR>)
{
$code .= $_ if $_ !~ /^%/;
}
close CHAR;
$code =~ s/showpage\n*//mg; #eliminate showpage
$code =~ s/\d setgray//mg; #eliminate setgray
$code =~ s/(\d*)\.\d+/$1/mg; #chop decimal digits
print TYPE3 $Encoding[$i], " { %\n";
if ($eofill)
{
<>
}
else
{
print TYPE3 $code;
}
print TYPE3 "} bind def\n";
}
}
print TYPE3 "end %CharProcs\n";
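@ The clean-up steps can be observed on a small piece of text. The EPSF
fragment below is made up for the example and the function name
[[clean_epsf]] is an assumption; the three substitutions are the ones used
in the chunk above.

```perl
use strict;
use warnings;

# Apply the clean-up steps to the text of one character.
sub clean_epsf {
    my ($text) = @_;
    # Drop comment lines, i.e. lines that start with "%".
    my $code = join "", grep { !/^%/ } split /^/, $text;
    $code =~ s/showpage\n*//g;     # eliminate showpage
    $code =~ s/\d setgray//g;      # eliminate setgray
    $code =~ s/(\d*)\.\d+/$1/g;    # chop decimal digits
    return $code;
}

# Hypothetical EPSF fragment.
my $sample = "%!PS\n%%BoundingBox: 0 -1 6 6\n 0 setgray\n"
           . "newpath 1.5 2.25 moveto\n3.75 4 lineto closepath fill\n"
           . "showpage\n%%EOF\n";
print clean_epsf($sample);
```

Note that chopping the decimal digits truncates the coordinates; it does
not round them.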
@ In the [[eofill]] mode, we eliminate all newpath calls and
all fill calls that do not follow gsave.
At the end of the character we call eofill.
<>=
$code =~ s/gsave fill/GSAVE FILL/mg; # save for soon restore
$code =~ s/fill//mg; # eliminate fill
$code =~ s/newpath//mg; # eliminate newpath
$code =~ s/GSAVE FILL/gsave fill/mg; # restore
print TYPE3 $code, "eofill\n";
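@ The protect, delete, and restore sequence is easier to follow on a
concrete string. In the sketch below the function name [[to_eofill]] and
the sample character code are assumptions of the example.

```perl
use strict;
use warnings;

# Keep "gsave fill" pairs, drop every other fill and every newpath,
# then close the character with a single eofill.
sub to_eofill {
    my ($code) = @_;
    $code =~ s/gsave fill/GSAVE FILL/g;   # protect the pairs to keep
    $code =~ s/fill//g;                   # eliminate the remaining fills
    $code =~ s/newpath//g;                # eliminate newpath
    $code =~ s/GSAVE FILL/gsave fill/g;   # restore the protected pairs
    return $code . "eofill\n";
}

# Hypothetical character code.
print to_eofill("newpath 0 0 moveto 5 5 lineto closepath fill\n"
              . "gsave fill grestore\n");
```

The upper-case placeholder works only because PostScript operators are
lower case, so the text GSAVE FILL cannot already occur in the code.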
@ Procedure [[BuildGlyph]] is very important as it defines the way
PostScript will be able to use the font. The generated code is
pretty standard.
<>=
print TYPE3 <>=
if (!$nodel)
{
unlink @EPSFs;
unlink "$MFfile.log", "$MFfile.tfm";
}
@ Acknowledgments
John Hobby (the creator of METAPOST) and Yotam Medini
helped me in the development phase and the debugging phase,
respectively. Many thanks to both of you!