Deadlock and timeout

Vote:
 

Hi

I'm having a problem with that when I try to save pages in EPiServer i get either that the transaction was deadlocked or that it Times out on the sql-server. The same code has worked fine before but once we started to get more sites these problems started happening. The code is rather complex as we update our sites dynamicaly with xml-messages.
Ever since these problems started appearing we also started to get sql-errors while surfing the site.

Our webportal consists of 16 sites (different languages) and most of these also have 2-6 different languages depending on which region in the world they are aimed at (for example, the german site .de has german, english, polish, hungarian, russian and czech language)  

We use EPiServer 4.62B. Our EPiServer database is separated from the webserver itself.
Is there some kind of hardware requirement for a certain amount of sites, can this problem be caused by the fact that we have all these sites?
Or is it more likely to be the code which cause these problems?

#33434
Oct 13, 2009 13:31
Vote:
 
I've also experienced this in an enterprise solution we have up'n'running. We only experience this problem when doing a product import via a schedule task though. It has worked fine for almost two years, but now it we have to run it ~5 times to make it work.
#33435
Oct 13, 2009 13:53
Vote:
 

This kind of makes me wonder how optimized the 4.62B is for large sites with multiple languages. But the real problems are that neither will our products be updated on the pages since it becomes deadlocked or timed-out and therefore won't be saved, but also when these "Server Error in '/' Application" pages start to be visible for the customers that visit.
I don't experience the same kind of errors in my (very small) developing environment, but in my own I only have 3 sites with multiple languages which is no where near the real site.

So, are there any recomended hardware requirements for large web-portals, or is it back to try and optimize the code?
Since errors can occur even while just surfing and no other activity is done on the webportal, I really can't see that it is only the fault of the code-behind. 

#33441
Oct 13, 2009 15:09
jo
Vote:
 

Hi Marcus!

Could you please post a stacktrace of one of the deadlock exceptions?

Thanks,
Johan

#33443
Oct 13, 2009 15:22
Vote:
 

The Deadlock exception: 

Exception: System.Data.SqlClient.SqlException
Message: Transaction (Process ID 150) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Source:    at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
   at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(DbAsyncResult result, String methodName, Boolean sendToPipe)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at EPiServer.DataAccess.AbstractCommand.ExecuteNonQuery()
   at EPiServer.DataAccess.PageSaveDB.PublishPageVersionInternal(Int32 currentSid, Int32 workPageID)
   at EPiServer.DataAccess.PageSaveDB.Save(PageData page, SaveAction action, Int32 currentSid)
   at EPiServer.DataFactory.Save(PageData page, SaveAction action, AccessLevel access)
   at EPiServerSample.MessageHandlers.Product.UpdatePage(PageData Page, String languageBranch)
   at EPiServerSample.MessageHandlers.Product.UpdatePages(PageDataCollection pages, String languageBranch)
   at EPiServerSample.MessageHandlers.Product.Populate_FindPages()
   at EPiServerSample.MessageHandlers.Product.Populate(XmlDocument doc)
   at EPiServerSample.MessageHandlers.Product.HandleMessage(Object sender, MessageEventArgs e)
   at SYSteam.Messaging.NewMessageEventHandler.Invoke(Object Sender, MessageEventArgs e)
   at EPiServerSample.Plugin.MSMQScheduler.OnNewMessage(Object source, MessageEventArgs e)
   at EPiServerSample.Plugin.MSMQScheduler.HandleMessage(Message Message)
   at EPiServerSample.Plugin.MSMQScheduler.ProcessQue(MessageQueue Queue)

####################################################

The Timeout exception:

Exception: System.Data.SqlClient.SqlException
Message: Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
Warning: Null value is eliminated by an aggregate or other SET operation.
Source:    at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
   at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(DbAsyncResult result, String methodName, Boolean sendToPipe)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at EPiServer.DataAccess.AbstractCommand.ExecuteNonQuery()
   at EPiServer.DataAccess.PageSaveDB.PublishPageVersionInternal(Int32 currentSid, Int32 workPageID)
   at EPiServer.DataAccess.PageSaveDB.Save(PageData page, SaveAction action, Int32 currentSid)
   at EPiServer.DataFactory.Save(PageData page, SaveAction action, AccessLevel access)
   at EPiServerSample.MessageHandlers.Product.UpdatePage(PageData Page, String languageBranch)
   at EPiServerSample.MessageHandlers.Product.UpdatePages(PageDataCollection pages, String languageBranch)
   at EPiServerSample.MessageHandlers.Product.Populate_FindPages()
   at EPiServerSample.MessageHandlers.Product.Populate(XmlDocument doc)
   at EPiServerSample.MessageHandlers.Product.HandleMessage(Object sender, MessageEventArgs e)
   at SYSteam.Messaging.NewMessageEventHandler.Invoke(Object Sender, MessageEventArgs e)
   at EPiServerSample.Plugin.MSMQScheduler.OnNewMessage(Object source, MessageEventArgs e)
   at EPiServerSample.Plugin.MSMQScheduler.HandleMessage(Message Message)
   at EPiServerSample.Plugin.MSMQScheduler.ProcessQue(MessageQueue Queue)

As you can see, they both originate from a Function called UpdatePage which uses the Global.EPDataFactory.Save(Page, EPiServer.DataAccess.SaveAction.Publish,0) to save the information of the page.

#33451
Oct 13, 2009 15:50
Vote:
 

Just for fun, could you do a select count(*) from tblscheduleditemlog(correct spelling???)

We've seen that if it gets large, ie. exeeding 1 000 000 rows wierdness starts to happen... Like timeouts...

//Morten

#33457
Oct 13, 2009 21:32
Vote:
 

I did a count and the result was 303, so its nowhere near that much. Though I do think I cleaned it up some time ago after I read somewhere that too many rows in this table could slow down the database.

#33459
Oct 14, 2009 8:14
jo
Vote:
 

Hm, I recall a very similar situation being reported as a bug (#0045821) in 4.x, dont know however whether it got fixed for 4.62B or CMS5.

If I remember correctly it collided with the code that extracted keywords (to make them searchable), and one workaround could possibly be to mark the properties on the corresponding pagetype's as "non searchable".

If you are using SQL Server 2005 (or later) there is a sweet trace option (1222) that can be setup to pinpoint more exactly what resources are involved in the deadlocked.

See this Bart Duncans blogpost for example on its usage:
http://blogs.msdn.com/bartd/archive/2006/09/09/Deadlock-Troubleshooting_2C00_-Part-1.aspx

Regards,
Johan

#33460
Oct 14, 2009 8:47
Vote:
 

Johan! Thank you for the explanation, it solved my problem! Anyway, I hope it solved my problem, but no deadlock and the job ran ONE hour faster than before. (Was 1h23m, now 23m..)

#34134
Oct 29, 2009 14:34
Vote:
 

In versions prior to Episerver 5 R2, this might caused by a large number of page versions in combination with the SP netQuickSearchByExternalUrl. The exact problem seems to be cross-table joins with no indexes in large tables. And this happens in every request (see "Disable the simple address" in http://labs.episerver.com/en/Blogs/Fredrik-Karlsson/Dates/2008/9/A-few-quick-performance-fixes-without-changing-the-code/).

Possible fixes to this problem:

  • Create manual index on the affected tables
  • Upgrade to later Epiversion (where indexes are included)
  • Disable simple address in web.config
  • Delete obsolete page versions (use SP editDeleteObsoletePageVersions)

To list page ids with version count, you can run this SQL:

SELECT fkPageID, COUNT(*) FROM tblWorkPage
GROUP BY fkPageId
ORDER BY 2 DESC
#45496
Nov 09, 2010 13:55