On the embedding of fonts in documents
Embedding of fonts in documents and websites is not an easy matter. Font licenses are not equipped to deal with the complex new ways in which fonts can be used. Just going along with the flow isn't a good idea: some nasty scenarios are possible. So how to deal with it?
As a micro type foundry, LettError is concerned with the current developments of font embedding. In the rush to get to the web, large companies might be forgetting some issues like copyrights and the protection of proprietary font material.
In some cases outline-based technology is superior to technology based on bitmaps. But the current proposals do not do enough to protect proprietary font information in documents. Nor do they propose an alternative system of paying for type, or look at alternatives to embedding. Therefore, everyone involved in making type should consider the risks font embedding will pose to their work.
Although the companies that develop embedding schemes might be willing to sacrifice their own libraries to a bigger marketing goal, it can damage small type manufacturers.
The following text discusses, amongst others, the proposals of Adobe, BitStream and Microsoft, the risks of embedding, and some of the things that can be done alternatively, from the point of view of a type designer.
On Embedding
All discussions about 'fonts on the web' seem to have two recurring themes: 'it will happen' and 'yeah, we should do something about these copyright issues', without actually addressing either one of them in depth. Type designers and type manufacturers have an interest in seeing their copyrights protected in the best possible way. No scheme has yet been proposed in which embedded typefaces are protected from illegal use and copying. Nor have the alternatives been studied sufficiently. There are alternatives to embedded fonts that give publishers and users the same degree of typographic freedom, without putting proprietary outlines in jeopardy. This document discusses the problems with various existing proposals, as well as several alternatives.
Action
The fonts on the web discussion needs to be fuelled with real arguments about protection and outline safety. Talk about micro payments sounds good, but it is still quite far off, and the embeddable fonts discussion is acute and current. Font embedding will be implemented without any security for the fonts, or a system that allows for payment.
Decisions about how fonts will live on the web are being made now by several organisations and companies that want to be involved in internet technology, but do not fully understand the effects of their actions. But these are not planetary movements, nor drifting continents: something can be done about it. Only a couple of companies are involved, and only a handful of people are making the decisions.
Adobe's PDF
PDF files can contain the complete font in PostScript form when the author wants it. The Type 1 data can be located quite easily and extracted from the file. On some platforms it is not even necessary to open an application. Commercial applications like Metamorphosis or Fontographer can then be used to generate bitmaps or convert the font to whatever format the user wants. In the case of Adobe fonts, additional kerning information can be obtained from the Adobe web site. Although the author can choose not to include the original proprietary outlines in the document, a large percentage of PDF files will still contain copyrighted outline material and expose it.
Embedding is not covered in a lot of end-user licenses. Adobe urged type manufacturers to allow font users to include the fonts for viewing and printing. Some big foundries now allow embedding in PDF, but they made their decision a couple of years ago, based on the relatively small use of PDF technology at the time. Now that PDF is used on the net, the exposure the fonts get is exponentially bigger. Would these foundries have allowed embedding had they been aware of the risks of PDF on the web? Probably not.
Many of the smaller foundries do not allow their fonts to be embedded at all. Users who do so break the license agreement, and use the fonts illegally.
BitStream's TrueDoc
TrueDoc appears to use several TrueType characteristics to embed fonts in a 'TrueDoc' document. The embedded character outlines would already have their hints applied to them, which means they are distorted in such a way that they can be rasterised without having to interpret the hints a second time. This might cause fonts that are used in relatively small sizes to look really bad when they are resurrected as fonts and used at a different size. But this does not work as a safety feature, because it relies on the user's discretion not to take the font from text set in large enough sizes, which are not bothered by hinting distortion at all.
A quote from a message posted to the w3 fonts discussion, from Glen Rippel, a person at BitStream who's involved with these matters:
"2. Type 1 fonts from foundries are sold with licensing models that are all over the map. Some allow for embedding of fonts for viewing only, some allow for viewing and printing, and some do not allow either. The point is that there is no embedding status data contained in Type1 fonts that can be used by authoring and viewing software to make these important legal decisions. 3. Subsetting a font to include only the necessary characters is a good idea only if done carefully. Removing characters and maintaining the original font format, name, and hint data clearly violates most font foundries software licenses. The only legal method we know is to allow the font to be executed by the font scaler, refit the curves, and recast the data into a neutral non-infringing data type."
Legal? Perhaps in the United States, where only the font software is protected by copyright. TrueDoc ignores the embedding bit set in TrueType fonts. Although this bit is not much of a protection, it is a signal from the manufacturer that there are restrictions on embedding the font. Just embedding the outlines is legal zigzagging that's only there to muddy the issue.
They argue that by taking the outlines out of the context of the original font, they would not be infringing on copyright. That holds only under American copyright law. This approach can get them into serious legal problems in Europe and other places outside the US, where character outlines are protected by various forms of copyright, and embedding them this way can be judged illegal.
Pre-hinted outlines also have a limited use: fix the outlines in a document for 72 dpi, it will look good on screen (if you are using professionally hinted fonts), but it will print lousy. The other way around: fix the outlines for a printing resolution, and there won't be any hints left for screen, making it look lousy. Now what were the reasons to include outlines again?
Although the TrueDoc file format is proprietary, chances are it will be made public in order to support wider use of the format. Like the other proposals, BitStream's system only includes the outlines of those characters that are actually used in the text, thinking that this would make the font practically unusable for other documents.
Any scheme that embeds only the used characters can be foiled by automation. On the internet it is quite easy to build and operate an application that searches the web for documents that contain a specific font, and collects all the characters it needs. If not all characters are present in one document, it will continue to look in another document until it has found all, or enough, characters. This might be too much work for a human web user, but it is a perfect task for a program. If a file format that includes fonts gets used widely in a networked environment, it means that a fair number of complete character sets can be collected at any time.
Building a 'font crawler' is not a major engineering task and there will be enough incentive to build one. Including proprietary outlines in documents on the web will create the environment that makes wholesale, massive font theft possible. A Font Crawler provides instant fonts!
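The arithmetic behind this argument can be illustrated without any network code. The following is a minimal sketch, assuming that each document exposes exactly the characters that occur in its text; the sample strings and the function name are invented for illustration. It only shows how quickly the union of several subsets approaches a full character set.

```python
import string

def covered_after_each(documents, charset):
    """Return how many characters of charset have been seen after each document."""
    seen = set()
    progress = []
    for text in documents:
        # A subsetted document exposes only the characters it actually uses.
        seen |= set(text) & charset
        progress.append(len(seen))
    return progress

# Hypothetical sample texts standing in for documents that embed a subset.
DOCUMENTS = [
    "Embedding of fonts in documents",
    "The quick brown fox jumps over the lazy dog",
    "PACK MY BOX WITH FIVE DOZEN LIQUOR JUGS",
]
CHARSET = set(string.ascii_letters)

progress = covered_after_each(DOCUMENTS, CHARSET)
print(progress, "of", len(CHARSET), "letters")
```

With these three short texts every upper and lower case letter is already covered, which is the point of the paragraph above: subsetting per document does not protect the character set as a whole.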
Microsoft's free fonts
.. for use on the web are not a solution either. Providing users with a method of building portable documents is a good alternative, since the author can be sure the reader has access to the fonts. But it is naive to assume all web authors would use only these particular fonts, and it signals to users that it is all right to include fonts and distribute them freely.
Furthermore, type plays an important role in building a distinct typographic identity and applying the right typefaces to the right job is a profession all by itself. If anything, the web will increase the need for individualised typography and special fonts.
SoftQuad
(publishers of HotMetal, html authoring software) have an approach that makes it relatively easy to buy fonts on-line. Although this scheme does not directly include outlines, it does shed light on the webfonts discussion. They appear to have a scheme in which browser-server interaction determines the fonts needed in a webpage, and compares that to the fonts present at the client side. If the document uses fonts that the client does not have, the SoftQuad browser allows the user to go to SoftQuad's font site to buy the font. Although this could be an interesting way to sell fonts on the net, it does not solve the problem of how to use fonts in webpages and provide consistent typography. There are some faults in their reasoning.
First, it seems unlikely that this will arouse the interest of major font foundries. This scheme could give SoftQuad a virtual monopoly on typefaces: foundries would be forced to sell their fonts through SoftQuad in order to provide them to web users, since the SoftQuad browser would of course only link to their own font site. Unless SoftQuad states otherwise, this could very well be their goal. If a reference to a font in HTML is allowed to contain a URL to the appropriate shop of fonts, more libraries might be interested. This would also be a good idea to allow people to buy outline versions of bitmap fonts.
The second fault in their reasoning is to assume that web consumers are willing to pay for the license to use specific fonts. Publishers have something to gain by designing their information well, but advertisers have even more reason to make sure their stuff looks the way they want it. Can a web advertiser expect their users to pay for the fonts used on a page? Of course not. The page will not be visited, because people will not accept a $10 charge for a font in order to look at advertising. SoftQuad might have argued that if it is easier to buy a font than to steal one, people will be inclined to buy fonts. But nobody is willing to steal the right fonts to see an advertisement correctly either.
The third fault is to assume that the typefaces included in a page would be for sale. For instance, when a new movie is launched, the author of the website for this movie might like to use characters from a custom designed typeface, without having to offer it for sale via SoftQuad. In fact, it is very likely the author of such a website would not want SoftQuad to be involved at all, for all sorts of reasons. The web author might also consider using fonts that are not supported by SoftQuad's library.
The only way right now to prevent the worldwide web from becoming a massive instant font-supermarket is to devise a scheme in which no outline information is needed at the www client's side. Only the strongest RSA encryption will prevent the outlines from being accessed and opened by the users, precluding their usefulness, since applications cannot get to them either. For typography's sake we should look at the alternatives.
There is clear need to be able to use fonts on webpages. The current typographic capabilities of HTML are limited, and need to be improved drastically in order to be able to present information and data in ways designers and users have grown to expect from print media as well as CD-ROM and application interfaces.
A global need for fonts in webpages illustrates the value of fonts; it is not an argument to make them free.
How can the need for type and typography be combined with the safe absence of outline fonts? A well designed interactive website would be more like an application than a picture of a piece of paper. Developments like Sun's Java and Apple's Cyberdog increase interactivity by adding chunks of code and interaction. A webpage is thus really an interface for one or several applications on screen, with perishable and context sensitive information, and should be designed accordingly.
There is a huge difference between design for screen and page layout. The last decade of desktop publishing has trained a lot of people to look at the image of a paper page, on screen. But previewing a page that is going to print is not the same as looking at a screen which is designed to be a screen. Information on screen needs to look different from information on paper, because they are different physical things. Drop one of each on your feet and find out.
Web page or document?
The different physical properties make it virtually impossible to have one document format that looks good in both worlds, except by making massive compromises to shape and layout, and that was never the way to successful design. The web publisher should identify the various classes of information in the material they want to offer and format them accordingly. Certain things can benefit from being printed, others do not.
The web publisher wants to apply layout, type and structure to webpages. This process is called typography and is a time-proven procedure for making information more accessible. A well designed document allows users to find things quicker, it helps understanding, and it is aesthetically pleasing. Users might not agree, and change layout, fonts and design once the document arrives at the browser. But this will be a small group: once adequate design by the publisher is possible, quality will go up dramatically, and the need to personalise will decrease.
Machines
The web publisher should be allowed to make some assumptions about the machine the webpages will be viewed on.
It is necessary to have an idea of the basic suite of capabilities of a user's machine in order to decide what features to include and what to avoid. This is normal within application development as well: on one hand you want to reach as many people as possible, on the other you want to use the newest technology, excluding users with older machines. Somewhere in the middle, and also based on the aim of the application, a minimum configuration of software and hardware is found. With the minimal machine defined, users whose requirements differ from the minimal machine can be identified and helped if necessary. For instance, websites that include lots of graphics might choose to have alternative pages for people with text only browsers. Users with special visual needs, for instance extra large letters, already have the capability to select their own fonts in larger sizes, and view webpages in that manner. Otherwise a server based rasterizer would be sufficient.
Like it or not, this minimal machine already exists: most websites are built for a 640 by 480 pixel, 256 colour screen. The minimal browser is capable of showing in-line gif and jpeg images and a variety of text styles. The minimal machine does not include a printer. Although most users might have one, there is nothing in HTML that defines anything about a printer, and therefore this is an issue that should be left open to the users themselves.
The fonts-in-html discussion is part of the definition of future versions of the HTML standard. When extending HTML with new functionality, one must assume that everything that is part of the existing definition is already implemented in browsers.
The three main arguments to embed outline fonts in documents are scalability, antialiasing and printing. Each can be argued about, but none of them is persuasive, and each can easily be matched by some implementation of bitmap fonts.
Scalability ?
Supporters of outline fonts in web pages emphasise the different resolutions of computer screens. A web document should display on a 100 dpi screen as well as on a 72 dpi screen. If a document is tuned for a 72 dpi screen it will be 'too small' on the hi-res monitor if displayed on a 1 to 1 pixel ratio. Including outline fonts will allow the 100 dpi screen to look 'better'. The fact that all other graphics on the page, images, buttons, icons, as well as embedded movies and shockwave presentations, are _not_ scalable does not seem to matter. Is it more appropriate to see characters on a screen at exactly the same size, i.e. a cap H is 7 millimetres high on a lo-res as well as a hi-res monitor, thus changing the layout, or to see the document the way it was designed? It seems a matter of preference rather than a technological requirement.
There are ways to deal with the various resolutions of screens. Let some browser-server interaction determine the resolution of the screen and let the server generate the appropriate bitmap fonts for one particular user. A smart caching scheme at the server side will prove that there aren't really that many different resolutions after all, so server speed won't be much of an issue. One single scale/resolution value will handle all people with special needs in the large type sector, as well as all those with 150 dpi monitors. To prevent people from using the server as a hi-res rasterizer, a maximum resolution can be set at the server, for instance 150 dpi (which can of course be increased if the average screen resolution increases). This scheme protects the outlines, and still pleases the hi-res monitor crowd.
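The server-side scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an existing implementation: the function names, the `BitmapFont` record and the font name are invented, the rasterizer itself is a placeholder, and only the 150 dpi cap comes from the text.

```python
from functools import lru_cache
from typing import NamedTuple

MAX_DPI = 150  # cap from the text; the server owner could raise it later

class BitmapFont(NamedTuple):
    font_name: str
    dpi: int
    pixel_height: int

def clamp_dpi(requested_dpi):
    """Never rasterise above the server's maximum resolution."""
    return max(1, min(int(requested_dpi), MAX_DPI))

def pixel_height(point_size, dpi):
    """Convert a point size to pixels (72 points per inch)."""
    return round(point_size * dpi / 72)

@lru_cache(maxsize=256)
def render_bitmap_font(font_name, point_size, dpi):
    # Placeholder for a real rasterizer: the outlines stay on the server,
    # only the resulting bitmaps would be sent to the browser.
    return BitmapFont(font_name, dpi, pixel_height(point_size, dpi))

def serve_font(font_name, point_size, requested_dpi):
    """Handle one browser request; identical requests are answered from the cache."""
    return render_bitmap_font(font_name, point_size, clamp_dpi(requested_dpi))

print(serve_font("ExampleSans", 12, 72))
print(serve_font("ExampleSans", 12, 600))  # clamped to MAX_DPI
```

Because requests are clamped before they reach the cache, every request above the cap maps to the same cached entry, which is why a small number of distinct resolutions is enough in this sketch.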
Antialiasing ?
Bitmapped fonts can be anti-aliased, or the browser can create an anti-aliased character from a bitmap at a larger size, using whatever rasterizer, font format and antialiasing scheme is appropriate.
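Creating an anti-aliased character from a larger bitmap amounts to averaging blocks of pixels. The following is a minimal sketch, assuming a black and white bitmap stored as rows of 0 and 1 and a simple box filter; the function name and the sample bitmap are invented for illustration, and other filters would work as well.

```python
def downsample_to_grey(bitmap, factor):
    """Box-filter a 1-bit bitmap (rows of 0/1) down by factor.

    Returns rows of ink coverage values from 0 (no ink) to 255 (full ink).
    """
    height = len(bitmap) // factor
    width = len(bitmap[0]) // factor
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Count the inked pixels in one factor-by-factor block.
            ink = sum(
                bitmap[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            row.append(round(255 * ink / (factor * factor)))
        result.append(row)
    return result

# A 4 by 4 sample with a diagonal edge, reduced to 2 by 2 grey values.
LARGE = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 1],
]
print(downsample_to_grey(LARGE, 2))
```

Blocks that are only partly covered by ink come out as intermediate grey values, which is what smooths the edge at the smaller size.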
Printing ?
There are no figures about the number of people that actually print webpages on paper, but generally there seems to be no sense in printing most bits of perishable information on the web, since newer and fresher stuff can be found when needed. When a document contains information that is suitable for printing, for instance long technical documentation, the publisher can recognise this and provide a printable version of the document as well. This can be an enhanced document with indexes, appropriate layout and adjusted font usage, for instance only public domain typefaces.
Furthermore, a more basic question, how printable should a web document be? If you include outline fonts, the images will look pixeled. So include hi-resolution images as well? 100dpi? 300 dpi? and then only RGB or better, CMYK with colour correction?
The more interactive a webpage gets, the less likely it is the page needs to be printed since the information on it is more likely to change. A web publisher's concern should therefore mainly be to get an optimal representation of information on screen. Should users want to print they are free to do so, but not necessarily in copyrighted fonts. High quality typefaces are and will be available commercially to everybody. If you don't like the type, buy one yourself!
Colour !
GifWrap is not a real solution as far as the implementation goes, but it shows that bitmap fonts allow coloured characters to be created and put in a font format. Note the four colour typeface that would be incredibly complicated to create with outline technology.
Bitmap fonts also allow the creation of image-databases. Buttons can be stored in a font and used when necessary, they can even be created in such a way. Dingbats, icons and small images can all be stored and appropriated in a simple font.
Dumb Browsers !
Another thing to keep in mind is that outline based technology assumes that all people use heavy multimedia machines filled with RAM and multiple processors. But at the same time web browsing technology is being built into television sets, PDAs, telephones, etc. Do we expect TVs and set-top boxes to be able to deal with PostScript? Surely not. A simple bitmap format might be all they can deal with.
Typography !
One of the basic ideas of HTML is to divide information into different classes, and to leave the typographic display of these classes up to the browser. The biggest flaw of this idea is that it ignores the fact that the form and position of information also contain information. For instance, a group of numbers is meaningless, but when they are put in rows and columns, they form a structure that can be understood. The same is true for text. One solid column of text is difficult to get into. When hierarchy and structure are applied to the text, the information is made accessible.
If HTML wants to evolve according to the principles it started from, support of basic typographic techniques is necessary. The beginning is already there. Table support allows basic layout, (for instance this page itself is a table), why stop at basic support for fonts?
Economy !
No matter how well outline based documents compress, they will become massive documents, for the very simple reason that it's so easy. Plus the fact that the companies supporting this technology will be handing out browsers, updates, rasterizers and whatnot to make sure their system gets the most users. What percentage of net traffic is the mass of software that people download to use the net?
Public opinion on the matter of embeddable fonts is deceptive. Most users would not protest if they were given scalable outline fonts in public webpages. Nobody would resist a free gift; in fact, there are users flatly demanding that fonts be free. There are alternatives to giving away fonts that still guarantee typographic quality and income.
There is need for a system that allows publishers to apply typographic design to a web page whilst protecting proprietary outlines from exposure. This is possible.
Bitmap fonts, either black and white or anti-aliased, pre-rasterised or done on the fly at a server, can provide excellent type on screen, and leave the document text intact should users want to reformat the document in any special personal way.
Support for bitmap fonts within HTML can be pretty straightforward. A font can function as one single image that contains all letters, including some information about size, original name and widths. Together with some HTML tags, and browser support, web pages can contain any typeface in any size and colour. During the design of a website, proprietary outline fonts can be used to generate bitmap fonts that can be sent along with the document, and cached or generated at the browser. Although the end result might not be optimal for all users, it will be good enough for most. Furthermore it will be free and legal, and it will do until there are systems that allow free distribution of fonts and still guarantee that their usage is paid for.
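The idea of a font as one single image with some extra information can be made concrete. The following is a minimal sketch, assuming the image is a horizontal strip of glyphs stored as rows of text ('#' for ink, '.' for background); the class, field names and the tiny two-letter sample font are invented for illustration and do not describe an existing format.

```python
from dataclasses import dataclass

@dataclass
class StripFont:
    name: str     # original name of the typeface
    size: int     # nominal size in pixels
    chars: str    # characters in the order they appear in the strip
    widths: list  # width in pixels of each character, same order as chars
    rows: list    # the strip image: equal-length strings, '#' = ink

    def glyph(self, ch):
        """Cut the columns belonging to one character out of the strip."""
        index = self.chars.index(ch)
        start = sum(self.widths[:index])
        end = start + self.widths[index]
        return [row[start:end] for row in self.rows]

    def text_width(self, text):
        """Total advance width of a line of text, in pixels."""
        return sum(self.widths[self.chars.index(ch)] for ch in text)

# Hypothetical 3 pixel high font containing only 'I' and 'L'.
TINY = StripFont(
    name="ExampleTiny",
    size=3,
    chars="IL",
    widths=[1, 3],
    rows=["##..", "##..", "####"],
)

print("\n".join(TINY.glyph("L")))
print(TINY.text_width("LIL"))
```

The widths are all a browser needs to set a line of text; the image supplies the shapes, and no outline data is involved at any point.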
Further reading on ideas on implementing bitmap font formats in HTML can be found at http://reality.sgi.com/grafica/webfonts/. Bitmap fonts for webpages only need to live within the browser at the time a page is viewed. Although there might be some advantages for web authors, there is no real need for bitmap webfonts to resemble local system bitmap formats, or to be compatible with them. All switching of fonts on and off can be taken care of by the browser. Compared to having to have some proprietary rasterizer installed next to the browser, implementing a bitmap format is pretty straightforward. There won't be that much extra traffic due to fonts either: in practice the data for the bitmap font could come in the place of imagery that contains headlines, buttons, etc.
In practice this is how it could work. The browser sends a request for a page to some server. HTML gets sent to the browser, which parses the data. Links to images are sought out and requests for these are sent to the server. Next to a request for an image there is a request for a named font. The browser requests the font just like an image. The browser receives the font data, and starts formatting the text and the images. The font is cached, but this does not interfere with locally installed fonts, and the cached font is not accessible to other applications. It might be possible to extract the bitmap font and convert it to some local bitmap format, but that does not matter that much.
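The sequence of requests can be simulated without a network. The following is a minimal sketch, assuming a dictionary in place of a server and an invented `fontsrc` attribute in place of whatever HTML tag would eventually be defined; none of these names come from an existing browser or standard.

```python
import re

# Hypothetical server contents. The fontsrc attribute is invented for this sketch.
SERVER = {
    "/index.html": '<h1 fontsrc="/fonts/headline.bmf">News</h1><img src="/logo.gif">',
    "/fonts/headline.bmf": b"bitmap font data",
    "/logo.gif": b"image data",
}

class Browser:
    def __init__(self, server):
        self.server = server
        self.font_cache = {}  # private to the browser, never installed in the system
        self.requests = []    # log of everything asked of the server

    def fetch(self, url):
        self.requests.append(url)
        return self.server[url]

    def load_page(self, url):
        html = self.fetch(url)
        # Images are requested as usual.
        images = {src: self.fetch(src) for src in re.findall(r'<img src="([^"]+)"', html)}
        # Fonts are requested just like images, but only if not cached yet.
        for src in re.findall(r'fontsrc="([^"]+)"', html):
            if src not in self.font_cache:
                self.font_cache[src] = self.fetch(src)
        return html, images

browser = Browser(SERVER)
browser.load_page("/index.html")
browser.load_page("/index.html")  # second visit: font comes from the cache
print(browser.requests)
```

On the second visit the font is not requested again, and since the cache lives inside the browser object it never touches the fonts installed on the system.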
One step more complex could be an optimisation: when a line of text is set in big type, but only a few characters are needed, the font can contain only those images. Although this resembles the font-protection attempt mentioned earlier, the only reason to do so here is to save some bytes. It does require more administration, since the browser not only has to remember what fonts it downloaded, but also what characters.
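The extra administration is modest. The following is a minimal sketch, assuming the browser keeps a table of glyph bitmaps per font name and asks the server only for the characters it does not have yet; the class and method names are invented for illustration.

```python
class GlyphCache:
    """Remembers, per font, which character bitmaps have been downloaded."""

    def __init__(self):
        self.glyphs = {}  # font name -> {character: bitmap data}

    def missing(self, font_name, text):
        """Characters of text that still have to be requested, in sorted order."""
        have = self.glyphs.get(font_name, {})
        return sorted(set(text) - set(have))

    def store(self, font_name, new_glyphs):
        """Add freshly downloaded glyphs to the table for this font."""
        self.glyphs.setdefault(font_name, {}).update(new_glyphs)

cache = GlyphCache()
needed = cache.missing("ExampleHeadline", "Hello")
# Pretend the server answered with a placeholder bitmap for each character.
cache.store("ExampleHeadline", {ch: b"bitmap" for ch in needed})
print(needed, cache.missing("ExampleHeadline", "Help"))
```

A later headline in the same font then costs only the characters that were not used before, in this case a single letter.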
A bitmap font format can also be used to contain icons, dingbats and whole buttons, or fonts where groups of characters form button shaped objects.
A bitmap based font system allows web publishers and designers to use commercially available fonts, or their own, without the danger of exposing outline information. Just like desktop publishing, the people that publish information are the ones that have a good reason to pay for their fonts, to make their pages look good. The user of a page only gets the result of a font, one particular bitmap, but when the typography of a screen is good, one is all you need.
The ability to include bitmaps can be added to the standard repertoire of font tricks most browsers seem to have mastered by now. That allows the publisher to do more with type without much cost, and it allows the user to do the same as what he can do currently: view the document with his own fonts in case it's no good. It can only get better.
This document was put together after trying to understand the various font embedding schemes proposed by the parties involved. Should any part of this information be factually wrong, this was unintentional, and if the arguments are good enough we are willing to change it.
LettError is not involved with any of the document making companies.
This document can be copied or linked to, but keep the document intact, with names and addresses, as well as version numbers. Check this page regularly for updates.
Printing this page does not invalidate its points.