Generate Thumbnail Images from PDF Documents in .NET

Using the Code

The code is quite simple, with a try/catch around the main body. It is purposely written as one large block so that it is easy to see what is happening and to step through with the debugger.

First we create an instance of AcroExch.PDDoc using late binding. The referenced Adobe Acrobat 5.0 Type Library (Acrobat.tlb from C:\Program Files\Adobe\Acrobat 5.0 SDK\InterAppCommunicationSupport\Headers) does not expose a COM class that can be created using early binding. Referencing the type library still gives us IntelliSense and strong typing for the other Acrobat objects.

Pass the filename of the PDF document to the Open method of the PDDoc object. Once the document is open, the object can be queried for metadata: GetNumPages() for the page count and GetInfo() for custom document properties.

' Create the document (Can only create the AcroExch.PDDoc object using
' late-binding)
pdfDoc = CreateObject("AcroExch.PDDoc")

' Open the document
ret = pdfDoc.Open(inputFile)

If ret = False Then
    Throw New FileNotFoundException
End If

' Get the number of pages
pageCount = pdfDoc.GetNumPages()
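
As a minimal sketch of the metadata calls mentioned above, the following console program opens a PDF and prints its page count and title. It assumes the full Adobe Acrobat product is installed so that AcroExch.PDDoc is registered. The module name PdfInfoDemo and the choice of the "Title" key are illustrative and not taken from the original code.

```vbnet
Option Strict Off ' late binding on the Acrobat objects requires this

Imports System
Imports System.Runtime.InteropServices
Imports Microsoft.VisualBasic

Module PdfInfoDemo
    Sub Main(ByVal args As String())
        If args.Length = 0 Then
            Console.WriteLine("Usage: PdfInfoDemo <file.pdf>")
            Return
        End If

        ' Late-bound, as in the article
        Dim pdfDoc As Object = CreateObject("AcroExch.PDDoc")
        Try
            If Not CBool(pdfDoc.Open(args(0))) Then
                Console.WriteLine("Could not open {0}", args(0))
                Return
            End If

            ' Page count and one document property (assumed key: "Title")
            Console.WriteLine("Pages: {0}", pdfDoc.GetNumPages())
            Console.WriteLine("Title: {0}", pdfDoc.GetInfo("Title"))
            pdfDoc.Close()
        Finally
            ' Release the COM reference even if something above failed
            Marshal.ReleaseComObject(pdfDoc)
        End Try
    End Sub
End Module
```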

Set a reference to the first page of the document as pdfPage, which is of type Acrobat.CAcroPDPage. From this we can get a rectangle object of the actual page dimensions. One strange point to notice here is that the Adobe Acrobat SDK documentation seems incorrect, as the PDFRect that is returned from the GetSize() method has IDispatch properties x, y but the PDFRect we need to supply to CopyToClipboard must have left, right, top, bottom.

Finally we render the PDF page to the clipboard at full size. We could have Acrobat scale the image down for us by a percentage, but we can get better visual results using the .NET scaling algorithms of the Bitmap class.

It would have been more efficient to render directly to an off-screen bitmap, which would also avoid overwriting whatever was previously on the clipboard, but I found the clipboard method the most stable way to get a rendered bitmap of the page using Acrobat.

Although the pdfPage object appears to have a DrawEx method that can take an HDC, I couldn't get the method to work consistently. Calling DrawEx in the paint event of a Windows Forms application did work, but it still wouldn't write to an off-screen bitmap directly. The clipboard method is therefore used; if the process runs on a batch server, overwriting the clipboard shouldn't cause much worry.

Note: the Draw method is deprecated, as it only works on Win16 systems where hWnd was unique to Windows and not to each process as on NT.

' Get the first page
pdfPage = pdfDoc.AcquirePage(0)

' Get the size of the page
' This is a really strange bug/documentation problem:
' The PDFRect you get back from GetSize has properties
' x and y, but the PDFRect you have to supply CopyToClipboard
' has left, right, top, bottom
pdfRectTemp = pdfPage.GetSize

' Create PDFRect to hold dimensions of the page
pdfRect = CreateObject("AcroExch.Rect")

pdfRect.Left = 0
pdfRect.Right = pdfRectTemp.x
pdfRect.Top = 0
pdfRect.Bottom = pdfRectTemp.y

' Render to clipboard, scaled by 100 percent (i.e. original size)
' Even though we want a smaller image, it is better for us to scale in .NET
' than in Acrobat, as Acrobat would greek out small text
' see http://www.adobe.com/support/techdocs/1dd72.htm
Call pdfPage.CopyToClipboard(pdfRect, 0, 0, 100)

Dim clipboardData As IDataObject = Clipboard.GetDataObject()
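
Reading from the clipboard assumes that CopyToClipboard actually placed a bitmap there. A defensive version of that read might look like the sketch below. TryGetClipboardBitmap is a hypothetical helper name, not part of the original code. Note also that the .NET Clipboard class is documented to work only on threads in single-threaded apartment (STA) mode.

```vbnet
Imports System.Drawing
Imports System.Windows.Forms

Module ClipboardHelper
    ' Hypothetical helper: returns the bitmap currently on the clipboard,
    ' or Nothing if the clipboard is empty or holds something else.
    Function TryGetClipboardBitmap() As Bitmap
        Dim data As IDataObject = Clipboard.GetDataObject()
        If data Is Nothing Then Return Nothing
        If Not data.GetDataPresent(DataFormats.Bitmap) Then Return Nothing
        Return TryCast(data.GetData(DataFormats.Bitmap), Bitmap)
    End Function
End Module
```

The caller can then test the result against Nothing and skip or report the file instead of failing later when the bitmap is scaled.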

Grab the rendered page bitmap from the clipboard and, based on the pdfRectTemp object, determine whether it is a portrait or landscape document. Set the correct file to load as the template and, if the page is landscape, swap the thumbnail width and height.

Dim pdfBitmap As Bitmap = clipboardData.GetData(DataFormats.Bitmap)

' Size of generated thumbnail in pixels
Dim thumbnailWidth As Integer = 38
Dim thumbnailHeight As Integer = 52
Dim templateFile As String
' Switch between portrait and landscape
If (pdfRectTemp.x < pdfRectTemp.y) Then
    templateFile = templatePortraitFile
Else
    templateFile = templateLandscapeFile
    ' Swap width and height (little trick not using third temp variable)
    thumbnailWidth = thumbnailWidth Xor thumbnailHeight
    thumbnailHeight = thumbnailWidth Xor thumbnailHeight
    thumbnailWidth = thumbnailWidth Xor thumbnailHeight
End If
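
The Xor swap works for integers, but a plain swap through a temporary variable is easier to read and behaves the same way. The sketch below uses a hypothetical Swap helper and the thumbnail dimensions from the code above.

```vbnet
Imports System

Module SwapDemo
    ' Hypothetical helper: exchanges two integers using a temporary variable
    Sub Swap(ByRef a As Integer, ByRef b As Integer)
        Dim temp As Integer = a
        a = b
        b = temp
    End Sub

    Sub Main()
        Dim thumbnailWidth As Integer = 38
        Dim thumbnailHeight As Integer = 52

        ' Landscape page: exchange width and height
        Swap(thumbnailWidth, thumbnailHeight)

        Console.WriteLine("{0} x {1}", thumbnailWidth, thumbnailHeight) ' prints "52 x 38"
    End Sub
End Module
```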

Load the template file both as a Bitmap and as an Image. We use both because the Bitmap class supports MakeTransparent and the Image can easily be passed to the Graphics.DrawImage() method. (Bitmap derives from Image, so a Bitmap can be passed to DrawImage() as well.) It is slightly inefficient, but speed isn't the primary objective for this application.

Scale the rendered page down to pdfImage using the GetThumbnailImage() method of the .NET Framework Bitmap class, which provides a very smooth scaled version of the image.

Next, create a blank bitmap with room for the template border. Calling MakeTransparent() on templateBitmap makes the colour of the bottom-left pixel of the image the transparency colour. See the article on Chris Sells' website for more on transparencies in .NET.

Using the new blank bitmap, draw the rendered PDF page image to it and then draw the template, with transparency, directly over the top. Because the main area of the template is transparent, the page image will still show through it.

Finally, save the composited image back as a .png or .gif file, although .png does look better.

' Load the template graphic
Dim templateBitmap As Bitmap = New Bitmap(templateFile)

Dim templateImage As Image = Image.FromFile(templateFile)

' Render to small image using the bitmap class
Dim pdfImage As Image = pdfBitmap.GetThumbnailImage(thumbnailWidth, _
                                                    thumbnailHeight, _
                                                    Nothing, Nothing)

' Create new blank bitmap (+ 7 for template border)
Dim thumbnailBitmap As Bitmap = New Bitmap(thumbnailWidth + 7, _
                                           thumbnailHeight + 7, _
                                           Imaging.PixelFormat.Format32bppArgb)

' To overlay the template on the image, we need to set the transparency
' http://www.sellsbrothers.com/writing/default.aspx?content=dotnetimagerecoloring.htm
templateBitmap.MakeTransparent()

Dim thumbnailGraphics As Graphics = Graphics.FromImage(thumbnailBitmap)

' Draw rendered pdf image to new blank bitmap
thumbnailGraphics.DrawImage(pdfImage, 2, 2, thumbnailWidth, thumbnailHeight)


' Draw template outline over the bitmap (PDF will show through the
' transparent area). templateBitmap is drawn here because MakeTransparent
' was applied to it, not to templateImage
thumbnailGraphics.DrawImage(templateBitmap, 0, 0)

' Save as .png file
thumbnailBitmap.Save(outputFile, Imaging.ImageFormat.Png)
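
GetThumbnailImage() is convenient, but the scaling can also be done explicitly with a Graphics object, which makes the interpolation mode visible and adjustable. The following is a sketch under that assumption. ScaleBitmap is a hypothetical helper and is not what the original code uses.

```vbnet
Imports System.Drawing
Imports System.Drawing.Drawing2D
Imports System.Drawing.Imaging

Module ScaleHelper
    ' Hypothetical helper: returns a new bitmap of the requested size
    ' containing a bicubic-resampled copy of the source image.
    Function ScaleBitmap(ByVal source As Image, ByVal width As Integer, _
                         ByVal height As Integer) As Bitmap
        Dim result As New Bitmap(width, height, PixelFormat.Format32bppArgb)
        Using g As Graphics = Graphics.FromImage(result)
            g.InterpolationMode = InterpolationMode.HighQualityBicubic
            g.DrawImage(source, 0, 0, width, height)
        End Using
        Return result
    End Function
End Module
```

It could be called as ScaleBitmap(pdfBitmap, thumbnailWidth, thumbnailHeight) in place of the GetThumbnailImage() call.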

Write some feedback to the console as we work through each of the files.

Then actively release the references to the COM objects, as Acrobat isn't the best-suited application for opening and closing multiple PDF documents without falling over. Luckily the code doesn't cause Acrobat to display any UI that might cause the process to hang waiting for user interaction.

Console.WriteLine("Generated thumbnail... {0}", outputFile)

thumbnailGraphics.Dispose()

pdfDoc.Close()
Marshal.ReleaseComObject(pdfPage)
Marshal.ReleaseComObject(pdfRect)
Marshal.ReleaseComObject(pdfDoc)
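
If an exception is thrown part-way through, the release calls above are skipped. One way to guard against that, sketched here with a hypothetical ReleaseIfCom helper, is to do the cleanup in a Finally block so that it runs whether or not the rendering succeeded.

```vbnet
Imports System.Runtime.InteropServices

Module ComCleanup
    ' Hypothetical helper: releases obj if it is a COM object,
    ' and does nothing when obj is Nothing.
    Sub ReleaseIfCom(ByVal obj As Object)
        If obj IsNot Nothing AndAlso Marshal.IsComObject(obj) Then
            Marshal.ReleaseComObject(obj)
        End If
    End Sub
End Module

' Usage sketch, with pdfDoc, pdfPage and pdfRect declared before the Try:
'
' Try
'     ' ... open, render and save as shown in the article ...
' Finally
'     ReleaseIfCom(pdfPage)
'     ReleaseIfCom(pdfRect)
'     ReleaseIfCom(pdfDoc)
' End Try
```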

Comments

  1. 04 Nov 2006 at 12:20

    Hi,

    This is very useful and works exactly as described, but I can't use it for an online application. Is there any way to create thumbnails for PDFs in an online application? Or is there any possibility of calling the existing console application from an online application?

    Thanks a lot

  2. 10 Jan 2006 at 08:42

    I do not seem to be able to download the v5 SDK; the only one at the link provided is the UNIX SDK.

Jonathan Hodgson