Monday, October 18, 2010

A comparison of DjVu and JPEG2000 in PDF

Some time ago, I stumbled upon DjVu, a document archiving format. I had previously scanned several printed documents into JPEG files and bundled them in a ZIP archive. That method is sub-optimal: it produces large JPEG files, and the pages cannot be viewed until the archive is uncompressed. Placing the JPEG files in a PDF makes them viewable, but increases the file size further.

DjVu presented a much better alternative. Its wavelet-based compression is able to match the quality of the JPEG files at half the file size. Several compressed DjVu pages can be compiled into a single DjVu document. With the file size halved and the scanned pages viewable like a PDF document, I migrated to DjVu for archiving my paper documents.

Then recently, I discovered that PDF is able to make use of JPEG2000 compression. PDF has supported JPEG2000 since version 1.5/Acrobat 6.0. JPEG2000 uses wavelet compression similar to DjVu's, producing images of similar quality at similar file sizes. Because PDF files are easy to comment on, I have decided to migrate to JPEG2000 in PDF. Below is a comparison of the two formats:

Creation tools
  DjVu: Free and open source command-line tools, fi_c44 and djvm. Typing two commands converts PNG files to a DjVu document. (Better in terms of cost and keystrokes/mouse input needed)
  JPEG2000 in PDF: Adobe Acrobat, with compression for imported PNG set to JPEG2000. Two steps to create a PDF from PNG files: create the first page using Create File from Image, then add subsequent pages using Insert Page from Image.

Viewing tools
  DjVu: WinDjView (free). Loads extremely fast; remembers the last page viewed; smooth scrolling. (Better in terms of speed)
  JPEG2000 in PDF: PDF-XChange Viewer (free). Loads a little slower than WinDjView; remembers the last page viewed; smooth scrolling.

Annotating / commenting tools
  DjVu: DjVu Solo (free). Very primitive commenting, limited to highlighting and hyperlinking.
  JPEG2000 in PDF: PDF-XChange Viewer (free). Rich set of commenting tools. Add text, highlight and draw easily.

Editing tools
  DjVu: WinDjView (free). Exports a page into various formats for editing in an external program. To place the page back, the process of creating the DjVu page has to be repeated. (Better in terms of cost)
  JPEG2000 in PDF: Adobe Acrobat. The scanned image is selectable. Choosing to edit the image launches Photoshop. (Better in terms of requiring fewer steps)

Exchanging documents
  DjVu: A DjVu viewer is required. Most people do not have one installed.
  JPEG2000 in PDF: A PDF viewer is required. Adobe Reader is installed on most computers.

Friday, September 03, 2010

Downloading Facebook Profile Pictures to Outlook Contacts

Microsoft Office 2010 introduced the Social Connectors. With the Facebook Connector, emails now come with faces, thanks to the connector downloading profile pictures from Facebook. Opening contact items shows faces as well.

The next logical step is to use that contact item’s profile picture as the business card picture. However, no matter how I click-and-drag the picture over to the placeholder, nothing happens. Searching the Internet turned up a program called OutSync, but it managed to download only 3 profile pictures into my Outlook. Looking at its source code revealed the reason: it compares the full name on Facebook with the full name in Outlook. Since most of my Facebook friends registered only with their first names or initials, OutSync failed to match them to my Outlook contacts.

So now I had a problem that seemed easily solvable. I thought I could change OutSync so that it downloads the email addresses of my friends and matches them against my Outlook contacts. However, Facebook always returns null for emails.

Another approach was necessary. In the end, I wrote a program, Outbook, that matches the email addresses of Facebook and Outlook. It does so by running multiple search requests, downloading the information of the search results, then downloading the photos.
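The difference between the two matching strategies can be sketched in a few lines of Python. This is only an illustration, not Outbook's or OutSync's actual code: the sample data is made up, and `fake_search` is a hypothetical stand-in for a Facebook search request that returns the profile registered with a given email address.

```python
def normalize(text):
    """Lower-case and trim a name or email address so comparisons ignore case and spacing."""
    return text.strip().lower()


def match_by_full_name(contacts, profiles):
    """Full-name comparison: pair a contact with a profile only if the names are identical."""
    by_name = {normalize(p["name"]): p for p in profiles}
    return {c["name"]: by_name[normalize(c["name"])]["photo"]
            for c in contacts if normalize(c["name"]) in by_name}


def match_by_email(contacts, search_by_email):
    """Email comparison: look up each of a contact's addresses through a search callable."""
    photos = {}
    for contact in contacts:
        for email in contact["emails"]:
            profile = search_by_email(normalize(email))  # hypothetical search request
            if profile is not None:
                photos[contact["name"]] = profile["photo"]
                break  # the first hit is enough for this contact
    return photos


# Made-up sample data, for illustration only.
CONTACTS = [
    {"name": "Alice Tan", "emails": ["Alice.Tan@example.com"]},
    {"name": "Bob Lee", "emails": ["bob@example.org", "blee@example.com"]},
]
PROFILES = [
    {"name": "Alice", "email": "alice.tan@example.com", "photo": "alice.jpg"},
    {"name": "B. Lee", "email": "blee@example.com", "photo": "bob.jpg"},
]


def fake_search(email):
    """Stand-in for a search request; returns the profile registered with this email."""
    return next((p for p in PROFILES if p["email"] == email), None)


print(match_by_full_name(CONTACTS, PROFILES))  # {}
print(match_by_email(CONTACTS, fake_search))   # {'Alice Tan': 'alice.jpg', 'Bob Lee': 'bob.jpg'}
```

With first names and initials on the Facebook side, the full-name comparison finds nothing, while the email lookup pairs both contacts.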

PS: As I’m running on Windows 7 64-bit and Office 2010 64-bit, I cannot promise that it will work on other OS and Office versions.

Download the application here: https://www.facebook.com/apps/application.php?id=143431545697835

Monday, August 30, 2010

Restoring Windows 7 Image Backup to any partition

With my C: running out of space, I deleted the recovery partition on my hard disk. However, as the recovery partition occupied the space before C:, Windows was unable to expand C: to take the space. To move C:, I used a GParted Live USB. However, GParted only created the partition, totally wrecking the data, and Windows could no longer boot (WinRE could not even recognize that I had Windows 7 installed). Luckily, I had heeded the advice to back up my computer before running GParted.

Recovering from the system image is not so straightforward. Firstly, the “repartition and format drive” option is checked and cannot be unchecked, so there was no way for me to restore Windows to the larger partition I wanted. Secondly, I had a Fedora partition which I suspected would be wiped out by the formatting. I needed an alternative that got around these restrictive options.

After looking at two websites, “Howto: Duplicate any Windows installation to a new hard disk using only a Vista DVD (!)” and “How to restore VHD file backup?”, inspiration came to me. I could combine both sets of instructions to recover my system image, which is a VHD, to any partition with no restrictions.

Firstly, I booted off the Windows 7 DVD and selected “Repair Windows”. Cancelling the wizard brought me to the advanced options. Selecting “Command Prompt”, I mounted the backup file (on E:) as F: by entering the following:

diskpart
select vdisk file="E:\…\Backup Set…\Backup….vhd"
attach vdisk

Then, following the instructions on duplicating a Windows installation, I typed:

ROBOCOPY F:\ C:\ /e /efsraw /copyall /dcopy:t /r:0
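For reference, the switches in the command above can be unpacked as follows. This is a small Python sketch that only assembles and prints the command; it does not run it. The helper names are my own, and the one-line descriptions paraphrase Robocopy's built-in help (`ROBOCOPY /?`).

```python
# Each switch from the command above, with a paraphrase of its meaning.
ROBOCOPY_SWITCHES = [
    ("/e", "copy subdirectories, including empty ones"),
    ("/efsraw", "copy encrypted files in EFS RAW mode"),
    ("/copyall", "copy all file info: data, attributes, timestamps, security, owner, auditing"),
    ("/dcopy:t", "copy directory timestamps as well"),
    ("/r:0", "do not retry a file that fails to copy"),
]


def build_robocopy_command(source, target):
    """Return the argument list for copying source to target with the switches above."""
    return ["ROBOCOPY", source, target] + [switch for switch, _ in ROBOCOPY_SWITCHES]


def robocopy_failed(exit_code):
    """Robocopy exit codes are bit flags; 8 or above means at least one copy failed."""
    return exit_code >= 8


command = build_robocopy_command("F:\\", "C:\\")
print(" ".join(command))  # ROBOCOPY F:\ C:\ /e /efsraw /copyall /dcopy:t /r:0
for switch, meaning in ROBOCOPY_SWITCHES:
    print(f"{switch:10} {meaning}")
```

Note that a non-zero exit code from Robocopy is not necessarily an error: values below 8 only report what was copied or skipped.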

It took an entire night for the copy to complete. Once done, I booted off the Windows 7 DVD and selected “Repair Windows” again. This time, the wizard detected a problem with booting and did the appropriate repairs.

I was overjoyed when I could boot into Windows again. However, Avast! antivirus prompted me for a new license key. Entering an old one made the dialog go away. Another program that I found broken was Microsoft Outlook, which responded with “Not Implemented” pop-ups when many of the ribbon buttons were clicked. Repairing Microsoft Office through Control Panel > Programs and Features solved the problem.

Monday, March 29, 2010

Red Black Tree Tutorial

On examining the red-black tree used for the SortedDictionary class, I realized Microsoft did not use any recursion for insertion and deletion. This seemed weird, because I had never come across a top-down tree balancing method. Then I found the Eternally Confuzzled - Red Black Tree Tutorial, which clearly explains how it is done. Thanks Julienne!
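To give a feel for what "no recursion" looks like, here is a minimal Python sketch of top-down insertion. It follows the general shape of the approach described in the tutorial as I understand it; it is not the SortedDictionary source, and the names (`Node`, `rotate_single`, `black_height`) are my own. The loop walks down from the root once, flipping colours and rotating as it goes, so nothing needs to be fixed on the way back up.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.red = True            # new nodes start out red
        self.link = [None, None]   # link[0] = left child, link[1] = right child


def is_red(node):
    return node is not None and node.red


def rotate_single(root, direction):
    """Rotate the subtree at root toward direction; return the new subtree root."""
    save = root.link[1 - direction]
    root.link[1 - direction] = save.link[direction]
    save.link[direction] = root
    root.red, save.red = True, False
    return save


def rotate_double(root, direction):
    root.link[1 - direction] = rotate_single(root.link[1 - direction], 1 - direction)
    return rotate_single(root, direction)


def insert(root, key):
    """Insert key with a single loop from the root downward; return the new root."""
    if root is None:
        root = Node(key)
    else:
        head = Node(None)            # false root, so the real root can be rotated too
        head.red = False
        t, g, p = head, None, None   # great-grandparent, grandparent, parent
        q = head.link[1] = root      # current node
        direction = last = 0
        while True:
            if q is None:
                q = p.link[direction] = Node(key)    # reached the bottom: attach here
            elif is_red(q.link[0]) and is_red(q.link[1]):
                q.red = True                          # colour flip on the way down
                q.link[0].red = q.link[1].red = False
            if is_red(q) and is_red(p):               # red violation: rotate at grandparent
                side = 1 if t.link[1] is g else 0
                if q is p.link[last]:
                    t.link[side] = rotate_single(g, 1 - last)
                else:
                    t.link[side] = rotate_double(g, 1 - last)
            if q.key == key:                          # inserted, or key already present
                break
            last = direction
            direction = 1 if q.key < key else 0
            if g is not None:
                t = g
            g, p = p, q
            q = q.link[direction]
        root = head.link[1]
    root.red = False                                  # the root is always black
    return root


def black_height(node):
    """Return the subtree's black height, raising ValueError if a red-black rule is broken."""
    if node is None:
        return 1
    left, right = node.link
    if node.red and (is_red(left) or is_red(right)):
        raise ValueError("red node with a red child")
    height = black_height(left)
    if height != black_height(right):
        raise ValueError("unequal black heights")
    return height + (0 if node.red else 1)


root = None
for key in [5, 1, 9, 3, 7, 2, 8, 4, 6, 0]:
    root = insert(root, key)
black_height(root)  # raises if the tree is not a valid red-black tree
```

The checker at the end is recursive only for brevity; the insertion itself never calls itself.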

Sunday, March 28, 2010

Runtime Complexity of .NET Generic Collection

I had to implement some data structures for my computational geometry class. Deciding whether to implement the data structures myself or use the built-in classes turned out to be hard, as the runtime complexity of each operation is documented only on the page for that method, if at all. So I went ahead and consolidated all the information in one table, then looked at the source code in Reflector to verify it. Below is my result.

| Collection | Internal implementation | Add/Insert | Add beyond capacity | Enqueue/Push | Dequeue/Pop/Peek | Remove/RemoveAt | Item[index]/ElementAt(index) | GetEnumerator | Contains(value)/IndexOf/ContainsValue/Find |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| List | Array | O(1) to add, O(n) to insert | O(n) | - | - | O(n) | O(1) | O(1) | O(n) |
| LinkedList | Doubly linked list | O(1), before/after given node | O(1) | O(1) | O(1) | O(1), before/after given node | O(n) | O(1) | O(n) |
| Stack | Array | O(1) | O(n) | O(1) | O(1) | - | - | O(1) | O(n) |
| Queue | Array | O(1) | O(n) | O(1) | O(1) | - | - | O(1) | O(n) |
| Dictionary | Hashtable with links to another array index for collision | O(1), O(n) if collision | O(n) | - | - | O(1), O(n) if collision | O(1), O(n) if collision | O(1) | O(n) |
| HashSet | Hashtable with links to another array index for collision | O(1), O(n) if collision | O(n) | - | - | O(1), O(n) if collision | O(1), O(n) if collision | O(1) | - |
| SortedDictionary | Red-black tree | O(log n) | O(log n) | - | - | O(log n) | O(log n) | O(log n) | O(n) |
| SortedList | Array | O(n), O(log n) if added to end of list | O(n) | - | - | O(n) | O(log n) | O(1) | O(n) |
| SortedSet | Red-black tree | O(log n) | O(log n) | - | - | O(log n) | O(log n) | O(log n) | - |

Note:
Dictionary: Add, Remove and Item[i] have expected O(1) running time.
HashSet: Add, Remove and Item[i] have expected O(1) running time.
Update 25 April 2010: Added SortedSet
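
The "Add beyond capacity" column for the array-backed collections is worth a small illustration. The Python sketch below is not the .NET implementation; it assumes a simple policy of doubling the backing array when it is full, and the class name and `copies` counter are hypothetical. It shows that one add past capacity costs O(n) element copies, while the total copying over many adds stays proportional to the number of adds.

```python
class GrowableArray:
    """Array-backed list that doubles its capacity when full (an assumed growth policy)."""

    def __init__(self, capacity=4):
        self.items = [None] * capacity   # capacity is assumed to be at least 1
        self.count = 0
        self.copies = 0                  # total elements moved during resizes

    def add(self, value):
        if self.count == len(self.items):            # add beyond capacity: O(n)
            bigger = [None] * (2 * len(self.items))
            for i in range(self.count):               # copy every existing element
                bigger[i] = self.items[i]
            self.copies += self.count
            self.items = bigger
        self.items[self.count] = value                # the ordinary O(1) append
        self.count += 1


array = GrowableArray()
for n in range(1000):
    array.add(n)
print(array.copies)  # 4 + 8 + ... + 512 = 1020 element copies for 1000 adds
```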