Timeout when deleting CatalogNodes from a large catalog

Vote:
 

Hi Episerver

When we receive a new catalog from the InRiver PIM system, we have a pre-processer that deletes CatalogNodes and CatalogEntries not present in the complete catalog from PIM. The default functionality is to update existing data, but not delete data not present in the Catalog.xml. That is ok, because we can handle the deletion ourself. Or so we hoped :)

We have this line of code:

catalogSystem.DeleteCatalogNode(node.Id);

This deletes a CatalogNode from the Commerce Catalog. However once in a while we are faced with timeout exceptions like the stack trace below. As you can see, we have injected our own method somewhere in the call stack to increase the timeout to 1 hour, but we still get timeouts. We can of course increase the timeout further, but we think that one hour for deleting a catalog node is a very long time allready.

As far as I can get from the stacktrace, then Episerver tries to get a recursive list of all CatalogEntries that live below the CatalogNode we are changing, and this can take a lot of time to find out.

Is it somehow possible to optimize this method? Add a database index or something else?

Stacktrace below

---

System.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader()
   at Mediachase.Data.Provider.SqlDataProvider.<>c__DisplayClass19_1.b__0()
   at EPiServer.Data.Providers.SqlTransientErrorsRetryPolicy.Execute[TResult](Func`1 method)
   at Mediachase.Data.Provider.SqlDataProvider.WithRetry[TResult](Func`1 action)
   at Mediachase.Data.Provider.SqlDataProvider.LoadReader(DataCommand command)
   at SolarB2bEpiserver.Core.Commerce.SqlDataProvider.NcSqlDataProvider.LoadReader(DataCommand command)
   at Mediachase.Data.Provider.DataService.LoadReader(DataCommand command)
   at Mediachase.Commerce.Catalog.Data.CatalogStoredProcedures.GetAllEntriesInNodes(IEnumerable`1 catalogNodeIds)
   at Mediachase.Commerce.Catalog.Data.CatalogNodeAdmin.GetCatalogEntryChanges(CatalogNodeDto catalogNodeDto)
   at Mediachase.Commerce.Catalog.Data.CatalogNodeAdmin.Save()
   at Mediachase.Commerce.Catalog.Managers.CatalogNodeManager.SaveCatalogNode(CatalogNodeDto dataset)
   at Mediachase.Commerce.Catalog.Managers.CatalogNodeManager.DeleteCatalogNode(Int32 catalogNodeId, Int32 catalogId)
   at Mediachase.Commerce.Catalog.Managers.CatalogNodeManager.DeleteCatalogNode(Int32 catalogNodeId)
   at Mediachase.Commerce.Catalog.Impl.CatalogContextImpl.DeleteCatalogNode(Int32 catalogNodeId)
   at SolarB2bEpiserver.Integrations.Pim.InRiverDeltaImportHandler.UpdateEpiNodes(EpiCommenceDbRepository epiCommerceRepository, ICatalogSystem catalogSystem, String catalogName, XElement inRiverCatalogElement)
   at SolarB2bEpiserver.Integrations.Pim.InRiverDeltaImportHandler.PreImport(XDocument catalog)

#174991
Feb 09, 2017 9:59
Vote:
 

How many items do you have in that category? When a node is deleted (without deleting the entries), the entries must still be updated as their relations changed.

#174993
Feb 09, 2017 10:31
Vote:
 

Bit more code but deleting the Childern(Products/Variants) under a node first, resolved for me. (Catalog was of size 250000+ variants), EPiServer Version was 8+

/K

#174995
Feb 09, 2017 11:08
Vote:
 

Hi K and Q  :)

We have around 300.000 products(variants) in our catalog. Around 200.000 of these products exist more than one place in the catalog. So we can't just delete an item, because it may still exist, but just in another branch.

Our catalog is constructed so that one branch of the catalog contains the full 300.000 products. The other branches contain the same products, but structured differently (they are targeted towards different customer segments)

However we have just got a new ERROR log in our import. When update a CatalogNode, the SaveCatalogNode methods seems to call the same stored procedure, and then we get the same timeout.

So I think we need to find out how to optimize this line of code:

   at Mediachase.Commerce.Catalog.Data.CatalogStoredProcedures.GetAllEntriesInNodes(IEnumerable`1 catalogNodeIds)

Quan: Should I create a support case for this, or can you see a quick fix somehow?

I have taken a look at the database, and found the stored procedure called [dbo].[ecf_CatalogNode_GetAllChildNodes]. This seems to be the stored procedure that looksup nodes in the hierarcy.

I have taken the code out of this and tried to execute it directly, but after 30 minutes I stopped it. Asking SQL Server Management Studio to get the Estimated Execution Plan, it suggests that you create an index on CatalogNode.ParentNodeId. I'm not sure that is wise from a general perspective, but here I probably need imput from your experts (which again suggests a support case :))

Regards

Anders

#174999
Feb 09, 2017 11:28
Vote:
 

You can always file a support case - if you want to guaranteed answer. I however can't promise that on this forums.

As I see it, the problem is that the DeleteCatalogNode is executed in a transaction, so it's all or nothing. One workaround is to write the code to get all entries in the node (and probably the sub-nodes - if you have any). Then update their NodeEntryRelation to remove the relation to the node. After that you can delete the Node with easy.

It's more or less same work, but if you are not constrained by the transaction scope and with some smart batching - you can avoid the time out. 

Regards,

/Q

#175000
Feb 09, 2017 11:37
Vote:
 

Hi Quan

I took the liberty of adding the index that was suggested by the Estimated Execution Plan:

CREATE NONCLUSTERED INDEX [IX_NC_ParentNodeId]

ON [dbo].[CatalogNode] ([ParentNodeId])

GO

I have added it to our test environment were we are facing the issues at the moment (trying to import and update production size catalogs)

Take a look at the query below. You may recognize it as a combination of the two stored procedures [dbo].[ecf_CatalogNode_GetAllChildEntries] and ecf_CatalogNode_GetAllChildNodes, but where I have hardcoded the CatalogNodeId that I'm testing with.

Before I added the index, this query didn't complete in 30 minutes. Now it runs in 4 seconds and returns 309.000 rows corresponding to the catalog branch with all products. The hierarchy CTE itself returns 4700 CatalogNodes with the hardcoded CatalogNodeId. 

I forgot to mention, that we actually have 4 different catalogs with between 100.000 and 300.000 products. Of course they are stored in the same database tables, so if we somewhere hit a full-table scan, it can take a long time to complete.

I'm trying to import the catalog with 300.000 products again, to see if the index above have solved my problem. I will know more soon :)

I don't want to add a new index to our production database before I have heard something from Episerver. Is it possible for you to discuss this issue with some of your database experts, to get their feedback? Perhaps it could be something that you could add in a future version, if it realy solves the problem with updating catalog nodes with many items?

---

    with all_node_relations as
    (
        select ParentNodeId, CatalogNodeId as ChildNodeId from CatalogNode
        where ParentNodeId > 0
        union
        select ParentNodeId, ChildNodeId from CatalogNodeRelation
    ),
    hierarchy as
    (
        select
            n.CatalogNodeId,
            '|' + CAST(n.CatalogNodeId as nvarchar(4000)) + '|' as CyclePrevention
        from (SELECT CatalogNodeId FROM CatalogNode WHERE CatalogNodeId = 166828) n
        union all
        select
            children.ChildNodeId as CatalogNodeId,
            parent.CyclePrevention + CAST(children.ChildNodeId as nvarchar(4000)) + '|' as CyclePrevention
        from hierarchy parent
        join all_node_relations children on parent.CatalogNodeId = children.ParentNodeId
        where CHARINDEX('|' + CAST(children.ChildNodeId as nvarchar(4000)) + '|', parent.CyclePrevention) = 0
    )
 select distinct ce.CatalogEntryId, ce.ApplicationId, ce.Code
    from CatalogEntry ce
    join NodeEntryRelation ner on ce.CatalogEntryId = ner.CatalogEntryId
    where ner.CatalogNodeId in (select CatalogNodeId from hierarchy)

#175004
Edited, Feb 09, 2017 12:27
Vote:
 

Hi,

That's nice improvement!

The official response:

We will need to look into this and consider all factors before including this change in our upcoming releases.

My personal take:

Adding an index is (most of the case) harmless - the performance for insert and delete can be affected, but I do not expect that to happen a lot in that table. If you find the index fixes the problem for you then I would not see any problem for not adding it.

#175006
Feb 09, 2017 12:45
* You are NOT allowed to include any hyperlinks in the post because your account hasn't associated to your company. User profile should be updated.