In forty-something BC, when Julius Caesar was Rome's capo di tutti capi, Cicero was the empire's best wordsmith.  Much of Cicero's work has survived intact for more than 2000 years, and 500 years ago some fractured text drawn from Cicero's writing emerged as the classic block of Greeked type called lorem ipsum or lipsum.  The words of today's orators may not enjoy the same longevity, because while we can preserve the material, a war is about to break out over just how to do it.

Cicero and Julius Caesar tolerated each other but didn't always agree, and this seems to have started a tradition among Caesars.  In living memory, Sid Caesar built his empire with the help of greats like Mel Brooks, Carl Reiner, Woody Allen and Neil Simon.  The relationship between the modern Caesar and his team of writers was as highly charged as the old one, but never as deadly.  Cicero outlived Julius Caesar, who was stabbed to death on a publisher's birthday.  When the dust cleared in the forum, Mark Antony came to power and he didn't get along too well with Cicero, either.  Cicero soon became much quieter, particularly after his head was removed and put on display.

Addressing in Julius Caesar's salad days

Sid Caesar was dangerous in his own way, accidentally killing a horse with a deliberate punch in New York's Central Park, an act immortalized by reference in Mel Brooks' film Blazing Saddles, but he never killed a writer, or at least never killed one anybody complained about missing.  Still, writers are jealous creatures and would never rat on anyone who killed a rival writer, so we'll never know for sure.

As improbable as all these Caesar stories may be, we know they are true, or approximately true, because they have been passed to us in writing or other media that we can understand through our own senses.  That is not necessarily the case with tales told today, including this one.  You may be reading words drawn on a screen, printed on paper, or played to you as a stream of sound, but the underlying medium preserving the work before you is not directly available.  It's a string of bits recorded in a memory device somewhere, and it is only available to you because some contraption has made it appear in a form you can understand.  The fact that a few hundred million people with equipment similar to the gadgets you are using can also see the text does not make it more durable any more than a vast mayfly hatch filling the sky over a stream promises longevity for a particular emerger.  A plethora of data recording and storage formats may enable vast bodies of text to survive, like some species of mayfly, but says nothing about the durability of a particular document.  If anything, a Babel of purported archives may make it easier for important works to be lost, because it fosters an unjustified faith that not merely one but probably many copies of a particular item have been safely stashed away.

What is true about the prospects for survival of this insignificant essay and others like it is also true of many things of greater importance, including the data trails of governments, artists, doctors, and scientists.  We've gone digital, but we can't put our digits on anything anymore.

Unlike Cicero's words, copied by scribes two millennia ago and reproduced by printers 500 years ago, today's data are recorded in media far removed from human senses.  To bring the data back to life, you need some particular equipment and software.  Under the very best circumstances, people can access the data using technology and techniques that are widely available.  Internet technologies provide one widely used set of tools for storing, moving, and displaying documents in human-readable form.  There are alternatives, notably one standard that rivals the Internet in availability and popularity, the Portable Document Format, PDF, developed by Adobe Systems.  Unlike the HTML code commonly used to format web pages, which usually adjusts the appearance of a document based on a number of factors, PDF formatting tries to preserve the appearance as well as the content of a document.  PDF technology is closely related to another Adobe development, PostScript, a language used to govern page printers.

Sid Caesar
Writers, not horses, in his stable

Now there's going to be another entrant in the portable document standards race, something from Microsoft called Metro, named after the transit system that performs the dual function of moving people around greater Washington and redistributing the wealth of iPod owners.  Metro will have some of the characteristics that make PDF popular, such as information that guides the appearance of printed documents, and it will use some technologies plucked from the Internet universe, notably XML, which can carry data about the data, such as links and associations.  But its most important ingredient will be its provenance.

The Metro concept hasn't rattled Adobe, but the fact that the idea is coming from Microsoft sure has.  Every player in the software business remembers the browser wars.  Netscape had talent and glamour and great ideas, plus a pretty nice pile of money, but in the end all that was not enough.  Microsoft won by including a good product in its operating system packages and burying the cost of the browser.  Netscape had no way to hook its browser to a big revenue stream, and in the end AOL picked up the pieces, only to give them away.  Before the browser wars, there were the productivity suite wars, which ended with Microsoft Office on top, while Lotus, which pioneered the productivity package concept, ended up in IBM's arthritic hands.

While the wrestling for winners among Windows applications was widely publicized, there was a lot of pressure in the parallel world of Macintosh computing, too.  This played to Adobe's advantage, and in 1994, as Microsoft readied Windows 95 and Office 95, Adobe absorbed a company that had a very sophisticated word processing program called PageMaker; that company was called Aldus.  Aldus was named after Aldus Manutius, who was arguably the most important literary scholar, printer, publisher, and typographer in the Venice of his day, which was around 1500.  Then, as now, publishing technology was in constant flux, but the contributions of its outstanding minds have been preserved; by contrast, the Aldine press, as a going business, lasted only two generations.

The life work of Aldus Manutius was in some ways the exact opposite of that of the Aldus Corporation and later Adobe and Microsoft.  Modern printing, characterized by the use of movable type rather than full page engravings, came to life in Europe around 1450.  (It appears that similar technology was independently developed in China before it arose in Germany.) The book that is recognized as the first classic published with this technology is the Gutenberg Bible.  While bibles published by Johannes Gutenberg were large, bulky, and expensive compared to what would soon come, they were revolutionary in 1455.  Before Gutenberg, bibles were copied by hand, and making one copy could take 20 years; it might have been the life work of a monk.  Gutenberg wasn't able to make a fortune off his technology, and probably made a living printing documents used in the sale of indulgences, a practice that later helped foster the Protestant Reformation when the sale of these blessed privileges was soundly criticized by the chronically constipated theologian, Martin Luther.  By the time Luther published his 95 Theses in 1517, the printing industry was in full swing, and copies of Luther's work quickly swept across Christendom.

Aldus Manutius
Put the best face on things

Gutenberg's technology helped turn his city, Mainz, into the industrial center of printing in northern Europe.  Further south, Lyons, another bustling city, became the capital of printing in France.  At the same time Venice was leading Italy into the Renaissance, and emerged as another center of printing and publishing, even as rival Florence was blossoming into the artistic and intellectual capital of Italy under the patronage of the Medici.

What Aldus did to earn his place in history, and to push Venice to the forefront of the printing industry, was to create what has come to be called Italic type, the somewhat cursive alternative to the vertical Roman type that until then had dominated publishing.  Aldus also created modern typography, using white space and simple layouts to break the last tenuous ties calligraphers and illustrators had on the printed word.  By passing power to the words, Aldus brought the mind of the writer into closer contact with the mind of the reader than any prior publisher.  He also revolutionized the whole printing business by developing smaller, less costly books.  It is no wonder that scholars believe the best edition of the Florentine Dante's Divine Comedy is the one printed by Aldus.  In his own time, however, Aldus was less well known for his Dante than for his affordable editions of classics, printed not in Italian or Latin, but in Greek.  In fact, many of the people who worked for Aldus were educated Greeks, and this helped his publishing company maintain very high standards of accuracy.  It is quite possible that love of the Greek alphabet, with its own cursive qualities, inspired Aldus' vision of books using the Roman alphabet printed entirely in an Italic typeface.

In Aldus' time, as today, typographers and printers promoted their capabilities with the help of typeset samples.  To enable customers to focus on the appearance of the printed word without distraction from the content, the printing industry used what is called Greeked type, meaning a block of type that looks like it has meaning but in fact does not.  Somehow, somewhere, but probably in Lyons around 1500, a particular chunk of fractured Latin that begins "lorem ipsum" became the benchmark used to compare various typefaces and printing presses.  Like Gutenberg's Bible, like the beautiful editions of the Aldine press, examples of the old standard block of Greeked type persist in their original Renaissance form.  Belatedly, Microsoft added the phrase to the Help section of its Word software beginning with the 2002 edition of the program.

Johannes Gutenberg
Printers with a Mainz frame

Incidentally, while you may say that an incomprehensible phrase is Greek to you, and the same phrase in print is called Greeked type, the Greeks say that something that they can't understand is Chinese to them, an inadvertent reference to the culture that beat Europe to the technology of movable type and, of course, pasta.

And then there's the unsung hero who combined the concepts and invented the noodles used in alphabet soup.  Today, we need the opposite, because in computing, it's often more important to be able to uncombine concepts than to combine them.  And maybe, just maybe, Microsoft will do this with Metro in a way that's more accessible than how it's done in a PDF file.  Microsoft has already done some of this in Internet Explorer 6 and will probably do an even better job in the next release of IE.  Without various deconstructive processes operating behind the scenes, the content of documents and web pages would be a lot less useful.

Web browser architecture, which increasingly adheres to standards developed and published by W3C, provides a relatively simple example.  Each web page is a document.  The document has content, such as words and graphics and maybe sound.  It also has structure that allows for headlines, paragraphs, tables, lists and a lot of other bits and pieces.  The structure of a typical web page is described in HTML.  The details of its appearance, such as the size of letters or the typeface used for a headline, can be hooked into the HTML but are most likely put in a separate file called a style sheet.  The whole thing is probably tied together by a general set of rules called a Document Type Definition, or DTD.  The HTML and style sheet are interpreted according to the rules of the DTD and, where there are gaps, flaws, or uncertainties in the recipe for a web page, by assumptions built into the browser program.  Additional instructions for presenting the information might be included to account for various uses of the web page; for instance, the page coding can specify how the printed appearance of a document should differ from its screen appearance.
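To make those ingredients concrete, here is a minimal sketch in Python, using only the standard library's html.parser module.  The sample page, its doctype line, and the style sheet name aldine.css are invented for illustration, and the class name IngredientLister is hypothetical.  The sketch simply reports which DTD a page declares, which style sheet it links to, and which structural tags it uses; a real browser does a great deal more.

```python
from html.parser import HTMLParser

# Hypothetical page: the doctype, file name, and text are invented for illustration.
PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Festina lente</title>
<link rel="stylesheet" type="text/css" href="aldine.css">
</head>
<body>
<h1>Make haste slowly</h1>
<p>Aldus printed small, affordable books.</p>
</body>
</html>"""


class IngredientLister(HTMLParser):
    """Collects the doctype, linked style sheets, and structural tags."""

    def __init__(self):
        super().__init__()
        self.doctype = None      # text of the <!DOCTYPE ...> declaration
        self.style_sheets = []   # href values of linked style sheets
        self.structure = []      # tag names in the order they open

    def handle_decl(self, decl):
        # Called for declarations such as <!DOCTYPE ...>
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        self.structure.append(tag)
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet":
            self.style_sheets.append(attributes.get("href"))


lister = IngredientLister()
lister.feed(PAGE)
lister.close()

print("DTD declaration:", lister.doctype)
print("Style sheets:   ", lister.style_sheets)
print("Structure:      ", lister.structure)
```

Note that the words of the page never enter into it.  The recipe for the page and the content of the page can be examined separately.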

Inside the browser, the web page is taken apart, the content separated from formatting advice.  Each element of the page has been organized into a structure based on what is called the Document Object Model.  The browser can deal with all the pieces of the page individually and also in various groupings.  The upshot is that a web page, even one that is just loaded with formatting instructions, can easily be searched based on its content.  It can also be searched based on its formatting, but there is little demand for that from Internet users.  But none of this information, however well organized it may be, is available to a human being until it has been displayed or printed.  It is only through the use of published standards and the developments that have made web browsers free that it is possible for people to get access to web documents at all.  Take away free browsers, unplug the computer or snip the network cable and it's all over.  The public can no longer get to the information.  And if the public can't get to the information when software, hardware or communications craps out, how are people going to get at it in, say, 2,000 years?  They may have an easier time reading 4,000-year-old essays in Latin by Cicero than much more recent documents describing the rise of genetic engineering or the fall of the New York Yankees.
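The taking apart described above can be sketched in a few lines of Python.  This is a toy, not the W3C Document Object Model: the Node and TreeBuilder classes and the find_paragraphs function are hypothetical names, the sample markup is invented, and the code assumes every tag is properly opened and closed.  It builds a simple tree of elements, then searches paragraphs by their words alone, ignoring the formatting tags scattered among them.

```python
from html.parser import HTMLParser

# Hypothetical markup, heavy on formatting, invented for illustration.
PAGE = (
    '<body><h1 style="font-size:200%">Aldus <i>Manutius</i></h1>'
    '<p class="lead">He printed <b>Dante</b> in <i>Italic</i> type.</p>'
    '<p>Gutenberg worked in Mainz.</p></body>'
)


class Node:
    """One element in a simplified, DOM-like tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects
        self.text = []      # text fragments found anywhere beneath this node


class TreeBuilder(HTMLParser):
    """Builds the tree; assumes well-nested tags with explicit end tags."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Formatting attributes (style, class) are deliberately dropped here
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        # Record the text on the current element and on every ancestor
        node = self.current
        while node is not None:
            node.text.append(data)
            node = node.parent


def find_paragraphs(node, phrase):
    """Return the plain text of every <p> whose content contains phrase."""
    hits = []
    content = "".join(node.text)
    if node.tag == "p" and phrase.lower() in content.lower():
        hits.append(content)
    for child in node.children:
        hits.extend(find_paragraphs(child, phrase))
    return hits


builder = TreeBuilder()
builder.feed(PAGE)
builder.close()
print(find_paragraphs(builder.root, "italic type"))
```

The phrase "italic type" is split by a formatting tag in the markup, yet the search still finds the paragraph, because the tree keeps the words apart from the advice about how to draw them.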

The Aldine Logo
A speedy dolphin paired with an anchor

PDF files, for now at least, are even tougher nuts to crack.  Adobe does not want you to get at the content in a PDF file without using its tools.  There is some open source software, called Ghostscript, that can create PostScript and PDF files, and which also can help extract content from formatted PostScript and PDF documents.  But Ghostscript is a poor cousin of the commercial products offered by Adobe.  It has never been given the kind of funding and support that other open source packages, such as Firefox or Linux, have received.  Still, even in its present state of development, it does provide tools that enable the public to access documents archived in PDF.

The latest version of Open Office can turn documents into PDF files, but it does not have a rich set of tools for reversing the process.  Nonetheless, like Ghostscript, it does suggest that PDF technology can become less proprietary.  But putting more PDF technology into the public domain makes it harder for Adobe to sell its costly PDF creation software, and Adobe might not be too happy with that.  If free or cheap software for working with PDF files becomes good enough, Adobe won't be able to sell too many copies of its well regarded PDF tools.  Its name will be mud.

Microsoft isn't too big on giving away its technology either, but in order to get Metro out the door it will have to publish quite a bit of information about the internal structure of Metro documents.  It will also have to embark on a campaign to make its concept a standard accepted outside its own world.  If it doesn't, all the company's efforts may yield nothing.  Users of its future operating systems and productivity suites, which are expected to have Metro technology woven into them, will stick to Internet or PDF standards and confine Metro to a small role.  On the other hand, if Microsoft does succeed in promoting Metro, the issues surrounding the preservation and transfer of documents, already made very difficult by existing technologies, will become even more complex and confusing.

What may be shaping up is a situation in which the only viable option available to Adobe will be to put a lot of its PDF tools in the public domain much the way AOL put its Netscape code into the hands of the Mozilla project.  Adobe might not have to do this but it seems to be preparing for the possibility that one of its revenue streams will soon dry up.  Adobe acquired Macromedia, which is the owner of the Flash technology used to animate web pages along with a large collection of related products including ColdFusion web server technology.  There is some overlap in the two firms' product lines, but the result, even if some of the combined companies' products are terminated, will be an outfit that's bigger and stronger and safer than either component company seems to be standing on its own.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.  Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.  Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur heshoruti.  Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Fractured but longlasting

If some or all of Adobe's PDF technology did become publicly available, Microsoft's Metro might get derailed even before it left the depot.  Alternatively, Microsoft might have to follow the same route with Metro that it did with Internet Explorer, making it free right from the start and also making it available to people who are not using the latest Microsoft software.

In one sense, it almost doesn't matter how a war over document interchange standards plays out.  What is most important is that the consolidation of document formatting that has been fostered by open Internet standards and the de facto acceptance of PDF by so many individuals and organizations could suffer a setback when Microsoft unleashes Metro.  It could become more difficult to share the written word (and other kinds of documents) with contemporaries and with posterity.  In addition to confusing many people, any new standards movement will divert energy from ongoing standards efforts, which already need all the resources they can get.

At the extreme, if Microsoft wins over the world with its Metro scheme, PDF could become a dead end.  Many documents that have been preserved in PDF could, in just a few years, become inaccessible.  PDF files could be lost to our descendants, or most of them; they could become digital Dead Sea scrolls.  Only a fraction of what is now in PDF would be brought forward to a new interchange standard and the old documents would shed their value like obsolete banknotes.

Maybe the key players, Adobe and Microsoft, will act wisely, if not in concert at least with the public in mind.  But they have to start moving pretty soon to avert chaos.  They would be wise to adopt the motto of Aldus Manutius, "festina lente," which means "make haste slowly."

— Hesh Wiener May 2005

