KennyG
Jun 10, 2015

Automatic PDF thumbnails when uploading to EPiServer

Like any other developer or content editor, I'm lazy. Not generally lazy, just when it comes to repetitive tasks. Creating and adding thumbnails for PDF files is one of these cases. I could manually create a thumbnail, upload it, and associate it with the PDF, but there has got to be a better way. After all, the system already creates thumbnails from full-size images.

Override Thumbnail property

So I did some reading and found that thumbnails are stored as binary data associated with IContentMedia using the Blob property type. I took Johan Bjornfot's article and adapted it to store a thumbnail for the PDF.

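    using System;
    using System.ComponentModel.DataAnnotations;
    using EPiServer.Core;
    using EPiServer.DataAbstraction;
    using EPiServer.DataAnnotations;
    using EPiServer.Framework.Blobs;
    using EPiServer.Framework.DataAnnotations;

    [MediaDescriptor(ExtensionString = "pdf")]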
    public class PdfFile : MediaData
    {

        [CultureSpecific]
        [Editable(true)]
        [Display(Name = "Description", Description = "PDF description", GroupName = SystemTabNames.Content, Order = 1)]
        public virtual String Description { get; set; }

        public override Blob Thumbnail
        {
            get { return base.Thumbnail; }
            set { base.Thumbnail = value; }
        }

        [Editable(false)]
        public virtual bool ProcessedThumb { get; set; }

    }

Now that I had a place to store it I needed to create it.

Ghostscript.NET to the rescue

I tried many different Ghostscript wrapper packages without much luck. Finally I stumbled on Ghostscript.NET, which luckily is available via NuGet. You will also need to have the native Ghostscript library installed on your server. I adapted the rasterizer sample to fit in the IInitializableModule example from Johan's article.

    using System;
    using System.Linq;
    using EPiServer.Framework;
    using EPiServer.Framework.Initialization;
    using EPiServer.ServiceLocation;
    using EPiServer.Core;
    using Models.Media;
    using EPiServer;
    using EPiServer.Framework.Blobs;
    using EPiServer.Web.Routing;
    using Ghostscript.NET;
    using Ghostscript.NET.Rasterizer;
    using EPiServer.DataAccess;
    using EPiServer.Security;
    using System.IO;

    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class PDFThumbCreatorModule : IInitializableModule
    {
        public void Initialize(InitializationEngine context)
        {
            // IContentEvents exposes the publishing events subscribed to below.
            var contentEvents = context.Locate.Advanced.GetInstance<IContentEvents>();

            contentEvents.PublishingContent += (sender, args) =>
            {
                var page = args.Content as PdfFile;
                if (page != null && !page.ProcessedThumb)
                {
                    args.Items["ProcessThumb"] = true;
                }
            };
            contentEvents.PublishedContent += (sender, args) =>
            {
                var page = args.Content as PdfFile;
                if (page != null && args.Items["ProcessThumb"] != null)
                {
                    context.Locate.Advanced.GetInstance<PDFThumbCreator>().CreateThumb(page);
                }
            };
        }

        public void Preload(string[] parameters) { }

        public void Uninitialize(InitializationEngine context)
        {
            //Add uninitialization logic
        }
    }
    public class PDFThumbCreator
    {
        private IContentRepository _contentRepository;
        private BlobFactory _blobFactory;
        private UrlResolver _urlResolver;

        private GhostscriptVersionInfo _lastInstalledVersion = null;
        private GhostscriptRasterizer _rasterizer = null;

        public PDFThumbCreator(IContentRepository contentRepository, BlobFactory blobFactory, UrlResolver urlResolver)
        {
            _contentRepository = contentRepository;
            _blobFactory = blobFactory;
            _urlResolver = urlResolver;
        }

        public virtual void CreateThumb(PdfFile pdf)
        {

            pdf = pdf.CreateWritableClone() as PdfFile;
            pdf.Thumbnail = _blobFactory.CreateBlob(Blob.GetContainerIdentifier(pdf.ContentGuid), ".png");

            System.Drawing.Image img = null;

            using (var stream = pdf.BinaryData.OpenRead())
            {
                int desired_x_dpi = 96;
                int desired_y_dpi = 96;

                _lastInstalledVersion =
                    GhostscriptVersionInfo.GetLastInstalledVersion(
                            GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                            GhostscriptLicense.GPL);

                _rasterizer = new GhostscriptRasterizer();

                try
                {
                    _rasterizer.Open(stream, _lastInstalledVersion, false);

                    // Rasterize only the first page (page numbers start at 1).
                    img = _rasterizer.GetPage(desired_x_dpi, desired_y_dpi, 1);
                }
                finally
                {
                    // Always close the rasterizer, even if rasterizing fails;
                    // otherwise no further thumbnails can be created until the process is killed.
                    _rasterizer.Close();
                }
            }

            using (var writeStream = pdf.Thumbnail.OpenWrite())
            {
                var imgbytes = ImageToByte2(img);
                writeStream.Write(imgbytes, 0, imgbytes.Length);
            }
            pdf.ProcessedThumb = true;
            _contentRepository.Save(pdf, SaveAction.Publish | SaveAction.ForceCurrentVersion | SaveAction.SkipValidation, AccessLevel.NoAccess);
        }

        public static byte[] ImageToByte2(System.Drawing.Image img)
        {
            // Encode the image as PNG in memory and return the raw bytes.
            using (MemoryStream stream = new MemoryStream())
            {
                img.Save(stream, System.Drawing.Imaging.ImageFormat.Png);
                return stream.ToArray();
            }
        }
    }

A few things I learned the hard way. The first was to have it rasterize only the first page! The second was that I needed to close the rasterizer after it created the image, which the sample code didn't seem to do. Otherwise, I couldn't create any more thumbnails until I killed the process. I also learned that merely uploading a PDF counts as publishing the file. The thumbnail blob then gets created, the reference is added back to the PDF object, and the file is published again. I got caught in an endless loop where it generated another thumbnail every time it updated the blob reference. This is what the ProcessedThumb flag solves.
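
To see why the flag stops the loop, here is a minimal, self-contained sketch of the same guard with no EPiServer dependency. FakePdf, FakeRepository and GuardDemo are hypothetical stand-ins invented for illustration. The sketch also collapses the two publish events into one handler, so it shows only the idea, not the module above.

```csharp
using System;

// Hypothetical stand-in for the PdfFile content type; only the guard flag matters here.
public class FakePdf
{
    public bool ProcessedThumb { get; set; }
    public int ThumbnailsGenerated { get; set; }
}

// Hypothetical stand-in for the content repository: every Save raises Published,
// much like saving with SaveAction.Publish fires the publish events again.
public class FakeRepository
{
    public event Action<FakePdf> Published;

    public void Save(FakePdf pdf)
    {
        var handler = Published;
        if (handler != null) handler(pdf);
    }
}

public static class GuardDemo
{
    // Simulates one upload and returns how many thumbnails were generated.
    public static int Run()
    {
        var repository = new FakeRepository();
        var pdf = new FakePdf();

        repository.Published += item =>
        {
            // Without this check, the Save below would re-enter the handler forever.
            if (item.ProcessedThumb) return;

            item.ThumbnailsGenerated++;
            item.ProcessedThumb = true;  // set the flag before saving again
            repository.Save(item);       // second publish: the handler exits early
        };

        repository.Save(pdf);            // the initial upload counts as a publish
        return pdf.ThumbnailsGenerated;
    }
}
```

Calling GuardDemo.Run() generates exactly one thumbnail. Removing the ProcessedThumb check makes the handler call Save recursively until the stack overflows, which is the endless loop described above.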

Now I've got a thumbnail saved, how do I get to it?

You can route directly to a blob property on a content instance by appending the blob property name to the content URL. In this case that would be /Thumbnail. I used that in my view like so:

    @{
        var thumbnailUrl = String.Format("{0}/Thumbnail", UrlResolver.Current.GetUrl(item.ContentLink));
    }

    <img src="@thumbnailUrl" alt="@item.Name" title="@item.Name" class="image-file" />
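
If the same URL is needed in several views, the string building could be pulled into a small helper. The sketch below is only an illustration: BlobUrls and ForProperty are hypothetical names, not part of EPiServer. It assumes the resolver may hand back null or an empty string for content it cannot route, and returns an empty string in that case.

```csharp
using System;

public static class BlobUrls
{
    // Appends a blob property name to a content URL,
    // e.g. "/globalassets/docs/manual.pdf" + "/Thumbnail".
    public static string ForProperty(string contentUrl, string propertyName)
    {
        if (string.IsNullOrEmpty(contentUrl))
        {
            // Assumption: nothing could be resolved, so return no URL
            // rather than a bare "/Thumbnail".
            return string.Empty;
        }

        // Trim a trailing slash so the result never contains "//".
        return contentUrl.TrimEnd('/') + "/" + propertyName;
    }
}
```

In the view, this would be called as BlobUrls.ForProperty(UrlResolver.Current.GetUrl(item.ContentLink), "Thumbnail").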

I hope you found this post helpful and maybe learned something from my mistakes.

Comments

valdis Jun 10, 2015 07:39 PM

This approach looks interesting. I'm most probably even lazier than you :) I would look for some 3rd party solution that does this, for instance PdfRenderer for ImageResizer. This guy should generate thumbnails on the fly ;)
