Looking for Trustworthy Alternatives to Adobe PDFs
There was a day when PDFs were the safe, portable alternative to Microsoft Word documents. There was no chance of macro-virus infections, and emails to Spaf with PDFs didn’t bounce back as they did if you sent him a Word document. Since then, it has become clear that PDFs have adopted mixed loyalties by locking features down and phoning home. Embedded content has caused security issues in PDF viewers (CVE-2007-0047, CVE-2007-0046, CVE-2007-0045, CVE-2005-1306, CVE-2004-1598, CVE-2004-0194, CVE-2003-0434), including a virus that used JavaScript as a distribution vector (CVE-2003-0284). Can you call a document viewer safe when it stands in the company of Skype, Mozilla Firefox, Thunderbird, Netscape Navigator, Microsoft Outlook, and Microsoft Outlook Express [1] as a product affected by a vulnerability with a CVSS score above 9 (CVE-2007-5020)? How about PDFs that can dynamically retrieve Yahoo ads over the internet [2], when Yahoo has recently been tricked into distributing trojans in advertisements [3]? Fully functional PDF viewers are now about as safe and loyal (under your control) as your web browser with full scripting enabled. That may be good enough for some people, but it clearly falls short for risk-averse industries. It is not enough to fix vulnerabilities quickly, and people who say that there is no bug-free software are also missing the point. The point is that it is desirable to have a conservative but sufficiently functional document viewer that does not have a bullseye painted on it by attempting to do too much and be everything to everyone. This can be stated succinctly as “avoid unnecessary complexity” and “be loyal to the computer owner”.
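To make the concern about embedded content concrete, here is a minimal sketch in Python of how one might flag a PDF that declares active or externally fetched content before opening it. It assumes a naive scan of the raw bytes for a handful of PDF name tokens (/JavaScript, /JS, /OpenAction, /AA, /Launch, /URI, /EmbeddedFile); the token list and the function names are illustrative choices of mine, not an established tool. The scan will miss names hidden inside compressed object streams or written with hex escapes, so a clean result is a hint and not a proof of safety.

```python
import re

# PDF name tokens commonly associated with active or externally fetched content.
# This list is an illustrative assumption, not an exhaustive inventory.
RISKY_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA",
               b"/Launch", b"/URI", b"/EmbeddedFile")


def count_risky_names(data: bytes) -> dict:
    """Count occurrences of each risky name token in raw PDF bytes."""
    counts = {}
    for name in RISKY_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # token, so that /JS does not also match a longer name such as /JSx.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


def looks_passive(data: bytes) -> bool:
    """True if none of the listed tokens appear in the raw bytes."""
    return not any(count_risky_names(data).values())
```

A caller would read the file in binary mode and pass the bytes in, for example looks_passive(open(path, "rb").read()), and route anything that is not passive to a more careful inspection.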
It might be possible to use a PDF viewer with limited functionality that does not support these attack vectors, but the format has become tainted: in the future, more and more people will require you to be able to read their flashy PDF, just as some webmasters now deny you access if you don’t have JavaScript enabled. Adobe holds patents on PDF and is intent on keeping control of the format and enforcing conformance to its specifications. For example, Apple’s MacOS X PDF viewer (“Preview”) initially allowed printing of secured PDFs to unsecured PDFs [4]. That was quickly fixed, for obvious reasons. This is as it should be, but it highlights that you are not free to make just any application that manipulates PDFs.
Last year Adobe forced Microsoft to pull PDF creation support from Office 2007 under the threat of a lawsuit while asking them to “charge more” for Office [5]. What stops Adobe from interfering with OpenOffice? In January 2007 Adobe announced that it would release the full PDF (Portable Document Format) specification in order to make PDF an ISO standard [6]. People believe: “Anyone may create applications that read and write PDF files without having to pay royalties to Adobe Systems”, but that’s not quite true. These applications must conform to the specification as decided by Adobe. Applications that are too permissive or somehow irk Adobe could possibly be made illegal, including open source ones, at least in the US. It is unclear how much control Adobe still has (obviously enough for the Yahoo deal) and how much it will retain if and when PDF becomes an ISO standard. Being an ISO standard does not necessarily make PDF compatible with free software. If part of the point of free software is to be able to change it so that it is fully loyal to you, then isn’t it a contradiction for free software to implement standards that mandate and enforce mixed loyalties?
Finally, my purchase of the full version of Adobe Acrobat for MacOS X was a usability disaster; you’ll need to apply duress to make me use Acrobat again. I say it’s time to move on to safer ground, from security, legal, and code quality perspectives, ISO standard or not.
How then can we safely transmit and receive documents that are more than plain text? HTML, PostScript, and Rich Text Format (RTF) are alternatives that have fallen into disuse in favor of PDF for various reasons which I will not analyze here. Two alternatives seemed promising, DVI files and Microsoft XPS, but a bit of research shows that they both have significant shortcomings.
TeX (DVI): TeX is a typesetting system used to produce DVI (device independent file format) files. TeX is used mostly in academia, by computer scientists, mathematicians, or UNIX enthusiasts. There are many TeX editors with various levels of sophistication; for example, OpenOffice can export documents to .tex files, so you can even use a common WYSIWYG text editor. TeX files can be created and managed on Windows [7], MacOS X, and Linux. TeX files do not include images but have tags referencing them as separate files; you have to manage them separately. Windows has DVI viewers, such as YAP and DVIWIN.
However, in my tests OpenOffice lost references to embedded images, producing TeX tags containing errors (“[Warning: Image not found]”). The PDF export on the same file worked perfectly. Even if the TeX export worked, you would still have a bunch of files instead of a single document to send. You then need to produce a DVI file in a second step, using some other program.
Even if OpenOffice’s support for DVI were better, there are other problems. I have found many downloadable DVI documents that could not be displayed in Ubuntu using “evince”; they produced the error “Unable to open document: DVI document has incorrect format”. After I installed the “advi” program (which may have installed some fonts as well), some became viewable using both evince and advi. DVI files do not support embedded fonts; if the end user does not have the correct fonts, your document will not be displayed properly.
Another issue is that of orphaned images. Images are missing from some of the DVI downloads I found; at some point they were available as a separate download, but they no longer are. This is a significant shortcoming, which is side-stepped by converting DVI documents to PDF; however, this defeats our purpose.
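Because images travel as separate files, a sender can at least check for orphans before distributing a document. The following Python sketch looks for \includegraphics tags in a TeX source and reports the referenced files that cannot be found next to it. The list of extensions to try and the function names are my own assumptions, and the regular expression does not skip TeX comments or handle images pulled in by other macros, so treat it as a starting point.

```python
import re
from pathlib import Path

# Matches \includegraphics{name} and \includegraphics[options]{name}.
INCLUDE_RE = re.compile(r"\\includegraphics\*?(?:\[[^\]]*\])?\{([^}]+)\}")

# Extensions tried when the TeX source omits one (an assumption; adjust as needed).
CANDIDATE_EXTS = ("", ".eps", ".ps", ".png", ".jpg", ".pdf")


def referenced_images(tex_source: str) -> list:
    """Return the image names referenced by \\includegraphics, in order."""
    return [name.strip() for name in INCLUDE_RE.findall(tex_source)]


def missing_images(tex_source: str, base_dir: Path) -> list:
    """Return referenced image names with no matching file under base_dir."""
    missing = []
    for name in referenced_images(tex_source):
        candidates = (base_dir / (name + ext) for ext in CANDIDATE_EXTS)
        if not any(path.is_file() for path in candidates):
            missing.append(name)
    return missing
```

Running missing_images on the text of a .tex file and the directory that contains it gives a list that should be empty before the bundle is sent anywhere.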
Microsoft XPS: XPS (XML Paper Specification) documents embed all the fonts used, so XPS documents will behave more predictably than DVI ones. XPS also has the advantage that
“it is a safe format. Unlike Word documents and PDF files, which can contain macros and JavaScript respectively, XPS files are fixed and do not support any embedded code. The inability to make documents that can literally change their own content makes this a preferable archive format for industries where regulation and compliance is a way of life” [8].
Although XPS is an open specification, there is no support for it yet on Linux. Visiting Microsoft’s XPS web site and clicking on the “get an XPS viewer” link results in the message “This OS is not supported”.
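Even without a viewer, the container itself can be examined on any platform. An XPS document is packaged as a ZIP archive following the Open Packaging Conventions (this comes from the specification, not from my tests), so a short Python sketch can list what a document carries and check that fonts are actually embedded. The set of font extensions and the function names below are assumptions made for illustration; this counts parts by file extension and does not validate the document.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath

# File extensions treated as embedded fonts (an illustrative assumption;
# .odttf is the obfuscated font form commonly seen in XPS packages).
FONT_EXTS = {".odttf", ".ttf", ".otf", ".ttc"}


def summarize_package(path_or_file) -> Counter:
    """Count the parts of a ZIP-based package by lowercase file extension."""
    with zipfile.ZipFile(path_or_file) as package:
        return Counter(
            PurePosixPath(name).suffix.lower()
            for name in package.namelist()
            if not name.endswith("/")  # skip directory entries
        )


def embedded_font_count(summary: Counter) -> int:
    """Number of parts whose extension is in FONT_EXTS."""
    return sum(count for ext, count in summary.items() if ext in FONT_EXTS)
```

Calling summarize_package on a file path (or an open binary file object) returns the per-extension counts, and embedded_font_count on that result tells you whether any font parts are present at all.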
It seems, however, that Microsoft may be just as intent on keeping control of XPS as Adobe is of PDF; the “community promise for XPS” contains an implicit threat should your software not comply “with all of the required parts of the mandatory provisions of the XPS Document Format” [9]. These attached strings negate some of the advantages that XPS might have had over PDF.
For XPS to become competitive, it must be supported on alternative operating systems such as Linux and the BSDs. This may not happen, simply because Microsoft is actively antagonizing Linux and open source developers with vague and threatening patent claims, and is antagonizing people interested in open standards with shady lobbying moves and “voting operations” [10] at standards organizations (Microsoft: you need public support and goodwill for XPS to “win” this one). The advantages of XPS may also not be evident to users comfortable in a world of TeX, PostScript, and no-charge PDF tools. The confusion about open formats versus open standards, and about exactly how much control Adobe has now and will have if and when PDF becomes an ISO standard, does not help. Companies offering XPS products are also limiting their possibilities by not offering Linux versions, at least of the viewers, even if unsupported.
In conclusion, PDF viewers have become risky examples of mixed-loyalty software. It is my personal opinion that risk-averse industries and free software enthusiasts should steer clear of the PDF standard, but there are currently no practical replacements. XPS faces extreme adoption problems, not simply due to the PDF installed base, but also due to the ill will generated by Microsoft’s tactics. I wish that DVI were enhanced with embedded fonts and images, better portability, and better integration within tools like OpenOffice, and that this became an often-requested feature for the OpenOffice folks. I don’t expect DVI handlers to be absolutely perfect (e.g., CVE-2002-0836), but the reduced feature set and absence of certain attack vectors should mean less complexity, fewer risks, and greater loyalty to the computer owner.
1. ISS, Multiple vendor products URI handling command execution, October 2007. http://www.iss.net/threats/276.html
2. Robert Daniel, Adobe-Yahoo plan places ads on PDF documents, November 2007. http://www.marketwatch.com/news/story/adobe-yahoo-partner-place-ads/story.aspx?guid=%7B903F1845-0B05-4741-8633-C6D72EE11F9A%7D
3. Bogdan Popa, Yahoo Infects Users’ Computers with Trojans - Using a simple advert distributed by Right Media, September 2007. http://news.softpedia.com/news/Yahoo-Infects-Users-039-Computers-With-Trojans-65202.shtml
4. Kurt Foss, Web site editor illustrates how Mac OS X can circumvent PDF security, March 2002. http://www.planetpdf.com/mainpage.asp?webpageid=1976
5. Nate Mook, Microsoft to Drop PDF Support in Office, June 2006. http://www.betanews.com/article/Microsoft_to_Drop_PDF_Support_in_Office/1149284222
6. Adobe Press release, Adobe to Release PDF for Industry Standardization, January 2007. http://www.adobe.com/aboutadobe/pressroom/pressreleases/200701/012907OpenPDFAIIM.html
7. Eric Schechter, Free TeX software available for Windows computers, November 2007. http://www.math.vanderbilt.edu/~schectex/wincd/list_tex.htm
8. Jonathan Allen, The wide ranging impact of the XML Paper Specification, November 2006. http://www.infoq.com/news/2006/11/XPS-Released
9. Microsoft, Community Promise for XPS, January 2007. http://www.microsoft.com/whdc/xps/xpscommunitypromise.mspx
10. Kim Haverblad, Microsoft buys the Swedish vote on OOXML, August 2007. http://www.os2world.com/content/view/14868/1/
on Monday, December 3, 2007 at 11:40 PM