Generate Thumbnail Images from PDF Documents in .NET

Introduction

This article presents VB.NET code to create thumbnail images from a directory of Adobe Acrobat PDF documents.

Thumbnail images

Often when looking for documents it is much easier to find what you want visually, for example by recognising the cover of a document.

The application was written for a website that I was developing that needed to display links to PDF documents. Instead of just showing a little PDF icon next to each document we wanted to display the front page of the actual document.

As shown below, this makes the listings look better and also lets users find a document more quickly if they recognise its cover.

PDF icons vs. custom icons

Note: please ignore the strange text; lorem ipsum is simply dummy text used for this example.

Hopefully people will agree that having the actual front cover displayed next to the hyperlink works better than the generic PDF icon.

Background

The web site was a Content Management System (CMS) so new PDF documents were uploaded to the site by the users. We then had this application scheduled as a batch service to run every 5 minutes and check for new files.

In the backend system the documents have metadata stored in a SQL Server 2000 database. Once a thumbnail had been created we would write a flag to record that, and when we generated the HTML content for a page request in ASP/ASP.NET we would return the IMG tag with the appropriate source.

Using the Acrobat SDK also meant we could programmatically read the PDF metadata and retrieve the number of pages in the document, which could then be displayed as well. Although the end users could have entered that information, reading it automatically meant less work for them and a better overall impression of the web site. Another advantage was that many users relied on the number of pages to judge how large a document was, rather than the more technical KB/MB value.
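As a rough illustration of reading the page count, here is a minimal sketch using late-bound COM calls. It assumes the full version of Acrobat is installed so that AcroExch.PDDoc can be created, and that its Open, GetNumPages and Close methods behave as I understand them from the Acrobat SDK. The module and function names are hypothetical and are not taken from the application itself.

```vbnet
Option Strict Off ' late binding against the Acrobat COM objects

Imports System
Imports Microsoft.VisualBasic

Module PageCountSketch
    ' Returns the number of pages in the PDF, or -1 if it could not be opened.
    ' pdfPath should be a full path; this sketch assumes Acrobat is installed.
    Function GetPageCount(ByVal pdfPath As String) As Integer
        Dim pdfDoc As Object = CreateObject("AcroExch.PDDoc")
        If Not CBool(pdfDoc.Open(pdfPath)) Then
            Return -1
        End If
        Try
            Return CInt(pdfDoc.GetNumPages())
        Finally
            ' Always release the document, even if the call above fails.
            pdfDoc.Close()
        End Try
    End Function
End Module
```

The value returned could then be stored alongside the other document metadata and shown next to the link.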

Approach

To generate the thumbnail image for each document I used the Adobe Acrobat 5.0 SDK and the Microsoft .NET 1.1 Framework.

Note: do not confuse the thumbnails that are part of a PDF document with the .png files this application generates.

The Acrobat SDK combined with the full version of Adobe Acrobat (sadly the free reader does not expose the COM interfaces) exposes a COM library of objects that can be used to manipulate and access PDF information.

So using these COM objects via COM Interop, we can load the PDF document, get the first page and render that page to the clipboard. Then using the .NET Framework we can copy this to a bitmap, scale and combine that image and then save the result as a .gif or .png file.

Original rendered PDF page scaled down
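The render step described above can be sketched roughly as follows. This is an illustration under assumptions rather than the finished application: it uses late binding, assumes the Acrobat COM objects AcroExch.PDDoc and AcroExch.Rect are registered, and assumes CopyToClipboard takes a rectangle, an x and y origin and a zoom percentage, as I understand the Acrobat SDK. The names RenderSketch and RenderFirstPage are hypothetical.

```vbnet
Option Strict Off ' late binding against the Acrobat COM objects

Imports System
Imports System.Drawing
Imports System.Drawing.Imaging
Imports System.Windows.Forms
Imports Microsoft.VisualBasic

Module RenderSketch
    ' Renders the first page at 100% zoom via the clipboard.
    ' Returns Nothing if the document cannot be opened or no bitmap arrives.
    Function RenderFirstPage(ByVal pdfPath As String) As Bitmap
        Dim pdfDoc As Object = CreateObject("AcroExch.PDDoc")
        If Not CBool(pdfDoc.Open(pdfPath)) Then
            Return Nothing
        End If
        Try
            Dim pdfPage As Object = pdfDoc.AcquirePage(0) ' pages are zero-based
            Dim pageSize As Object = pdfPage.GetSize()

            ' Rectangle covering the whole page.
            Dim pdfRect As Object = CreateObject("AcroExch.Rect")
            pdfRect.Left = 0
            pdfRect.Top = 0
            pdfRect.Right = pageSize.x
            pdfRect.Bottom = pageSize.y

            pdfPage.CopyToClipboard(pdfRect, 0, 0, 100)

            Dim data As IDataObject = Clipboard.GetDataObject()
            If Not data Is Nothing AndAlso data.GetDataPresent(DataFormats.Bitmap) Then
                Return CType(data.GetData(DataFormats.Bitmap), Bitmap)
            End If
            Return Nothing
        Finally
            pdfDoc.Close()
        End Try
    End Function

    ' The clipboard requires a single-threaded apartment.
    <STAThread()> _
    Sub Main(ByVal args() As String)
        If args.Length < 2 Then
            Console.WriteLine("Usage: RenderSketch input.pdf output.png")
            Return
        End If
        Dim page As Bitmap = RenderFirstPage(args(0))
        If page Is Nothing Then
            Console.WriteLine("Could not render " & args(0))
        Else
            page.Save(args(1), ImageFormat.Png)
            page.Dispose()
        End If
    End Sub
End Module
```

Note that going through the clipboard overwrites whatever the user had copied, which is one reason this approach suits an unattended batch process better than a desktop tool.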

At first I just saved the scaled down image, but then decided to “fancy” up the thumbnail with a drop-shadow and folded corner. To achieve this effect I created a transparent .gif, called pdftemplate_portrait.gif, using Macromedia Fireworks MX where the main body of the page template was transparent.

By making the bottom-left pixel transparent too we can easily set the transparent colour for a bitmap in .NET.

I keep the top-right of the image white where the corner folds over. That means I can combine the images simply by drawing the transparent template directly over the PDF image to achieve the final look.

Compositing the template and rendered image together
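A minimal sketch of that compositing step, using only System.Drawing, might look like this. The function name ComposeThumbnail is hypothetical, and the sketch makes a simplifying assumption: the rendered page is scaled to fill the whole template, whereas a real template would probably need the page inset slightly to leave room for the drop-shadow.

```vbnet
Imports System
Imports System.Drawing
Imports System.Drawing.Drawing2D

Module CompositeSketch
    ' Scales the rendered page to the template size, then draws the
    ' template (shadow and folded corner) over the top of it.
    ' Note: the template bitmap passed in is modified by MakeTransparent.
    Function ComposeThumbnail(ByVal page As Image, ByVal template As Bitmap) As Bitmap
        ' Use the colour of the bottom-left pixel as the transparent colour.
        Dim key As Color = template.GetPixel(0, template.Height - 1)
        template.MakeTransparent(key)

        Dim result As New Bitmap(template.Width, template.Height)
        Dim g As Graphics = Graphics.FromImage(result)
        Try
            g.Clear(Color.White)
            g.InterpolationMode = InterpolationMode.HighQualityBicubic
            ' Assumption: the page fills the whole thumbnail area.
            g.DrawImage(page, 0, 0, result.Width, result.Height)
            ' Opaque parts of the template cover the page; the rest shows through.
            g.DrawImage(template, 0, 0, template.Width, template.Height)
        Finally
            g.Dispose()
        End Try
        Return result
    End Function
End Module
```

The result can then be saved with Bitmap.Save, passing ImageFormat.Png or ImageFormat.Gif from System.Drawing.Imaging.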

Pre-requisites

You will need the full version of Adobe Acrobat, which exposes a COM library of objects to manipulate and access PDF information (the free reader does not expose the COM interfaces).

You will also need the Adobe Acrobat 5.0 SDK, which is a free download from the Adobe Solutions Network website (note: the site requires registration). The latest SDK for Acrobat 6.0 requires paid membership, so we will use the previous SDK version.

Link on Adobe website for the Acrobat 5.0 SDK

To quickly see if you have the full version of Adobe Acrobat installed, use regedit.exe and look under HKEY_CLASSES_ROOT for an entry called AcroExch.PDDoc.

Check for AcroExch.PDDoc
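The same check can be made from code rather than by hand. The following is a small sketch using the Microsoft.Win32 registry classes; the module and function names are hypothetical. It only confirms that the AcroExch.PDDoc entry exists under HKEY_CLASSES_ROOT, not that Acrobat will actually start.

```vbnet
Imports System
Imports Microsoft.Win32

Module AcrobatCheckSketch
    ' True if an AcroExch.PDDoc entry exists under HKEY_CLASSES_ROOT.
    Function IsAcrobatComRegistered() As Boolean
        Dim key As RegistryKey = Registry.ClassesRoot.OpenSubKey("AcroExch.PDDoc")
        If key Is Nothing Then
            Return False
        End If
        key.Close()
        Return True
    End Function

    Sub Main()
        If IsAcrobatComRegistered() Then
            Console.WriteLine("AcroExch.PDDoc is registered.")
        Else
            Console.WriteLine("AcroExch.PDDoc was not found; the full version of Acrobat may not be installed.")
        End If
    End Sub
End Module
```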

You'll also need the .NET 1.1 Framework and some PDF files to test the solution.

The code was written in VB.NET using the .NET 1.1 Framework and Visual Studio .NET 2003 on Windows XP, but I see no reason why it wouldn't work on Windows NT/2000 or .NET 1.0.
