Monday, October 18, 2010

A comparison of DjVu and JPEG2000 in PDF

Some time ago, I stumbled upon DjVu, a document archiving format. I had previously scanned several printed documents into JPEG files and bundled them in a ZIP archive. That method is sub-optimal: it produces large JPEG files, and the pages cannot be viewed without first extracting the archive. Placing the JPEG files in a PDF makes them viewable but increases the file size further.

DjVu presented a much better alternative. Its wavelet-based compression achieves the same quality as the JPEG files at half the file size, and the compressed DjVu pages can be compiled into a single DjVu document. With the file size halved and the scanned pages viewable like a PDF document, I migrated to DjVu for archiving my paper documents.

Then recently, I discovered that PDF can use JPEG2000 compression, which it has supported since version 1.5 (Acrobat 6.0). JPEG2000 uses wavelet compression similar to DjVu's, producing images of similar quality at a similar file size. Because PDF files are easy to comment on, I have decided to migrate to JPEG2000 in PDF. Below is a comparison of the two formats:

Creation tools
  DjVu: Free and open source command-line tools, fi_c44 and djvm. Typing two commands converts PNG files into a DjVu document. (Better in terms of cost and keystrokes/mouse input needed)
  JPEG2000 in PDF: Adobe Acrobat, with compression for imported PNG set to JPEG2000. Two steps to create a PDF from PNG files: create the first page using Create File from Image, then add subsequent pages using Insert Page from Image.

Viewing tools
  DjVu: WinDjView (free). Loads extremely fast; remembers the last page position viewed; smooth scrolling. (Better in terms of speed)
  JPEG2000 in PDF: PDF-XChange Viewer (free). Loads a little slower than WinDjView; remembers the last page position viewed; smooth scrolling.

Annotating / commenting tools
  DjVu: DjVu Solo (free). Very primitive commenting, limited to highlighting and hyperlinking.
  JPEG2000 in PDF: PDF-XChange Viewer (free). Rich set of commenting tools. Add text, highlight and draw easily.

Editing tools
  DjVu: WinDjView (free). Exports a page into various formats for editing in an external program. To place the page back, the process of creating the DjVu page has to be repeated. (Better in terms of cost)
  JPEG2000 in PDF: Adobe Acrobat. The scanned image is selectable; choosing to edit the image launches Photoshop. (Better in terms of requiring fewer steps)

Exchanging documents
  DjVu: A DjVu viewer is required. Most people do not have one installed.
  JPEG2000 in PDF: A PDF viewer is required. Adobe Reader is installed on most computers.
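
The two-command DjVu workflow mentioned under creation tools could be scripted roughly as below. This is only a sketch: it builds the command lines without running them. The `djvm -c` form for bundling pages comes from DjVuLibre. The way fi_c44 is invoked here (one call per page, input file followed by output file, as DjVuLibre's c44 does) is an assumption, and the function name `build_djvu_commands` and the file names are made up for illustration.

```python
from pathlib import Path
from typing import List, Sequence


def build_djvu_commands(
    png_pages: Sequence[str],
    output_doc: str,
    encoder: str = "fi_c44",
) -> List[List[str]]:
    """Return the command lines that encode each page and bundle the result.

    Assumption: the encoder is called as `<encoder> input output`, one page
    at a time, in the style of DjVuLibre's c44. Adjust if fi_c44 differs.
    """
    if not png_pages:
        raise ValueError("at least one page is required")

    commands: List[List[str]] = []
    page_files: List[str] = []
    for png in png_pages:
        # page1.png -> page1.djvu (single-page DjVu file)
        djvu_page = str(Path(png).with_suffix(".djvu"))
        commands.append([encoder, png, djvu_page])
        page_files.append(djvu_page)

    # djvm -c bundles the single-page files into one multi-page document.
    commands.append(["djvm", "-c", output_doc, *page_files])
    return commands


if __name__ == "__main__":
    for cmd in build_djvu_commands(["page1.png", "page2.png"], "scan.djvu"):
        print(" ".join(cmd))
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` once the encoder's actual arguments have been confirmed.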


Pramudito said...

I doubt that a JPEG2000-compressed PDF can be the same size as DjVu.
Tested on a 500-page, 4 MB DjVu file. Email me, and try it for yourself.

Jeow Li Huan said...

The open source tools out there only use the very basic DjVu features for compression, with no foreground extraction etc.
Thus I ended up with about the same size as JPEG2000. However, if you have a tool that can utilize the advanced features in DjVu, then DjVu is definitely smaller.