"Remove Abandoned BLOBs" scheduled job doesn't remove all abandoned blobs

Member since: 2012

We found that our blob storage keeps growing even though the media section does not contain that many files. We run the "Remove Abandoned BLOBs" scheduled job regularly, but it does not seem to remove all abandoned blobs. It looks like the job does not remove a file if there is no reference to it in the DB; it only removes files that are unused but still have a reference in the DB. I do not know why there are files without a reference in the DB, but I assume that Episerver deletes the reference from the DB and then fails to delete the file.

Has anyone else experienced the same behavior? Is there a way to remove files that do not have a reference in the DB?

#179211 Jun 05, 2017 10:56
  • Member since: 2007

    Run at your own risk!

    using System;
    using System.Collections.Generic;
    using System.IO;
    using EPiServer.Framework.Blobs;
    using EPiServer.PlugIn;
    using EPiServer.ServiceLocation;
    using EPiServer.Web;

    namespace Gosso.Episerver.ScheduledJobs
    {
        [ScheduledPlugIn(DisplayName = "CleanUpFilesJob", Description = "This job iterates over all blobs on disk and removes the ones that do not exist in the CMS", SortIndex = 10105)]
        public class CleanUpFilesJob : EPiServer.Scheduler.ScheduledJobBase
        {
            public override string Execute()
            {
                // Takes the path from a freshly constructed FileBlobProvider;
                // check that this matches the blob folder your site actually uses.
                var blobpath = new FileBlobProvider().Path;
                string[] dirs = Directory.GetDirectories(blobpath, "*", SearchOption.AllDirectories);
                var tobeRemoved = GetBlobsToRemove(dirs);
                Remove(tobeRemoved);
                return "Removed blobs: " + tobeRemoved.Count;
            }

            // Deletes each orphaned blob directory together with its files.
            private void Remove(IReadOnlyList<string> tobeRemoved)
            {
                foreach (var blobPath in tobeRemoved)
                {
                    // The directory may already be gone if a parent was deleted earlier.
                    if (Directory.Exists(blobPath))
                    {
                        Directory.Delete(blobPath, true);
                    }
                }
            }

            // Collects the directories whose GUID name has no permanent link mapping in the CMS.
            private IReadOnlyList<string> GetBlobsToRemove(IReadOnlyList<string> allBlobPaths)
            {
                var blobsToRemove = new List<string>();
                var mapper = ServiceLocator.Current.GetInstance<IPermanentLinkMapper>();

                foreach (var blobPath in allBlobPaths)
                {
                    // Skip directories whose name is not a GUID instead of throwing.
                    Guid guid;
                    if (!Guid.TryParse(Path.GetFileName(blobPath), out guid))
                    {
                        continue;
                    }

                    var map = mapper.Find(guid);
                    if (null == map)
                    {
                        blobsToRemove.Add(blobPath);
                    }
                }

                return blobsToRemove;
            }
        }
    }
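
    Because the job above deletes directories irreversibly, a dry run that only lists the candidates may be worth doing first. What follows is a minimal sketch under assumptions, not a tested Episerver component: `OrphanedBlobFinder`, `FindCandidates` and the `existsInCms` delegate are hypothetical names introduced here for illustration, and the helper uses only standard .NET calls.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;

    // Hypothetical helper: lists orphan candidates without deleting anything.
    public static class OrphanedBlobFinder
    {
        // blobRoot:    folder that contains one GUID-named directory per blob container.
        // existsInCms: caller-supplied lookup that returns true when the GUID is known to the CMS.
        public static IReadOnlyList<string> FindCandidates(string blobRoot, Func<Guid, bool> existsInCms)
        {
            var candidates = new List<string>();

            foreach (var dir in Directory.GetDirectories(blobRoot))
            {
                // Ignore directories whose name is not a GUID.
                Guid id;
                if (!Guid.TryParse(Path.GetFileName(dir), out id))
                {
                    continue;
                }

                if (!existsInCms(id))
                {
                    candidates.Add(dir);
                }
            }

            return candidates;
        }
    }
    ```

    Inside the job it could be called as `OrphanedBlobFinder.FindCandidates(blobpath, id => mapper.Find(id) != null)`, with the result written to the job's return message or a log so it can be reviewed before anything is deleted.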

    #179338 Jun 08, 2017 14:23
  • Member since: 2012

    Thanks, Luc! This looks great!

    It would be nice if Episerver included this scheduled job :)

    #179339 Jun 08, 2017 14:26