
Problem working with media and blobs programmatically (orphaned files)


Hi,

This is on Episerver CMS v9.12.3.

We have been working quite a bit with saving media programmatically in one of our projects, and pretty much every blog post, documentation page and code snippet around suggests this method for saving a file programmatically:

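// ImageFile is a placeholder for your own media content type;
// GetDefault needs a type argument to compile.
var media = repo.GetDefault<ImageFile>(somecontentlink);
// Create a new blob in the media item's container and write the bytes to it.
var blob = blobFactory.CreateBlob(media.BinaryDataContainer, ".jpg");
using (var stream = blob.OpenWrite())
{
    stream.Write(blobData, 0, blobData.Length);
}
// Point the media item at the new blob and publish it.
media.BinaryData = blob;
repo.Save(media, SaveAction.Publish);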

Now, this is all fine and dandy for creating a media entity. Next, if you call this same code again on an existing media entity (so instead of GetDefault you call CreateWritableClone()), a new blob is created and saved for the new version. Excellent.

However, this is where the problems occur: if I delete the old version using the Episerver UI, the blob for the old version still remains in the blobs folder. None of the admin mode jobs will remove this orphaned file, as they rely on the content item being deleted, but it is only the old version that is deleted, not the whole content item.

Over time, this adds up to quite a lot of files. We noticed because one of the media entities we updated regularly was a 1 GB zip file...

Is this intended functionality, seeing as you cannot replace the actual binary data for a media entity from the UI?

If the answer to that is yes, this could in my opinion at least be mentioned as a warning on the example code page for blobs and blob providers (http://world.episerver.com/documentation/Items/Developers-Guide/Episerver-CMS/9/blob-storage-and-providers/blob-storage-and-providers/).

Apart from that, does anyone have any experience with this and how to circumvent it? You could just use "blobFactory.GetBlob(media.BinaryData.ID)" to get the existing blob for an existing media item, but in this case we are working with image versions, and apart from this orphaning, different binary data for different versions works perfectly fine.

I was exploring a way to run a cleanup job, but there is no "list all blobs" functionality from what I can see. Once I delete the version holding the blob link, the link is probably gone forever, so the only option would be to look through the entire blob folder. That would work for the file provider, but not for, e.g., the Azure blob provider.
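A minimal sketch of what such a folder scan could look like for the file provider. It assumes that each blob is stored on disk at a path mirroring its ID (container folder, then file name) and that the set of still-referenced relative paths has already been collected from all content versions; both are assumptions, and OrphanedBlobScanner and FindOrphans are hypothetical names, not part of the Episerver API:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not an Episerver class.
public static class OrphanedBlobScanner
{
    // Returns the full paths of all files under blobRoot whose path relative
    // to blobRoot is not in referencedRelativePaths (e.g. "container/file.jpg").
    public static List<string> FindOrphans(
        string blobRoot, IEnumerable<string> referencedRelativePaths)
    {
        var referenced = new HashSet<string>(
            referencedRelativePaths.Select(Normalize),
            StringComparer.OrdinalIgnoreCase);

        // Strip any trailing separator so the Substring below yields the relative part.
        var rootFull = Path.GetFullPath(blobRoot)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        return Directory
            .EnumerateFiles(rootFull, "*", SearchOption.AllDirectories)
            .Where(file => !referenced.Contains(Normalize(file.Substring(rootFull.Length))))
            .ToList();
    }

    // Use forward slashes and drop leading/trailing separators so that
    // disk paths and blob ID paths compare equal.
    private static string Normalize(string path)
    {
        return path.Replace('\\', '/').Trim('/');
    }
}
```

The scan only reports candidates. Deleting them is left to the caller, since a file written between collecting the references and running the scan would wrongly look orphaned.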

Another option would be to delete the blob from the provider when a version is deleted, after checking that no other version of the same content item references that specific blob.
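The check itself could be sketched roughly as below. This only models the decision, with the versions of one content item passed in as a plain dictionary from version ID to blob ID; how that dictionary is built and where the check is hooked into version deletion are not shown, and VersionBlobCleanup and CanDeleteBlob are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not an Episerver class.
public static class VersionBlobCleanup
{
    // versionBlobs maps every version ID of ONE content item to the ID of the
    // blob that version points at (null if the version has no binary data).
    // Returns true only if the blob of the version being deleted is not used
    // by any other version of the same item.
    public static bool CanDeleteBlob(
        IDictionary<int, Uri> versionBlobs, int deletedVersionId)
    {
        Uri blobId;
        if (!versionBlobs.TryGetValue(deletedVersionId, out blobId) || blobId == null)
        {
            return false; // unknown version or no blob: nothing to delete
        }

        return !versionBlobs.Any(v => v.Key != deletedVersionId && blobId.Equals(v.Value));
    }
}
```

With three versions where version 1 has its own blob and versions 2 and 3 share one, deleting version 1 allows its blob to be removed, while deleting version 2 does not, because version 3 still references the shared blob.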

#174127
Edited, Jan 19, 2017 12:05

Did you ever find a solution for this? We are having the same problem over here. Even if we save a single version and don't create multiple versions of the image in Episerver, we still end up with multiple copies in our blob storage. I agree I could probably do something to enumerate the directory, but what do we do when we migrate to Azure Blob Storage?

Any help would be appreciated.

Thanks

#190056
Mar 29, 2018 22:43