Umbraco Clear Old Document Versions To Decrease Database Size And Improve Performance

Recently I had a problem with an Umbraco 4.7.1 installation that had became 10GB in SQL Server and had about 20000 published pages. Since there is no way to do something like this from the umbraco administration panel I had to create a T-SQL to clear the database.

DECLARE @createdDate Datetime = '2012-03-01'

DELETE FROM cmsPropertyData WHERE 
	versionId NOT IN (SELECT versionId FROM cmsDocument WHERE updateDate > @createdDate OR published = 1 OR newest = 1) AND 
	contentNodeId IN (SELECT DISTINCT nodeID FROM cmsDocument)
DELETE FROM cmsPreviewXml WHERE 
	versionId NOT IN (SELECT versionId FROM cmsDocument WHERE updateDate > @createdDate OR published = 1 OR newest = 1) AND 
	nodeId IN (SELECT DISTINCT nodeID FROM cmsDocument)
DELETE FROM cmsContentVersion WHERE 
	versionId NOT IN (SELECT versionId FROM cmsDocument WHERE updateDate > @createdDate OR published = 1 OR newest = 1) AND 
	ContentId  IN (SELECT DISTINCT nodeID FROM cmsDocument)
DELETE FROM cmsDocument WHERE 
	versionId NOT IN (SELECT versionId FROM cmsDocument WHERE updateDate > @createdDate OR published = 1 OR newest = 1) AND 
	nodeId IN (SELECT DISTINCT nodeID FROM cmsDocument)

Warning if you do not have any TSQL experience be careful because you could easily delete all the data in your database. Always take a backup before making any changes this way.

6 Comments

  1. Hey,

    Nice work on figure this out. Thought you might like to know about this package thought, UnVersion (http://our.umbraco.org/projects/website-utilities/unversion). It lets you set the maximum number of versions of a document to keep, so the system will perform automatic cleanup over time, making it less of a manual process.

    Keep up the good work.

    Matt

    • Nice… I’ll definitely install this in couple of my projects!

    • Hello Matt,

      I installed your package, what does “automatic cleanup over time” mean? Will it start working straight after installation?

      regards

      • pawel: if you look at the source, you’ll see that this code is tied to the “publish” trigger and checks for the DocType of the published document.

        So, if you had a doctype of “news” – all “news” documents would get their versions cleaned up when the next “news” document was published. But any other doctypes wouldn’t be touched at that point.

        There is also an “$_ALL” type that seems to run whenever any type is published.

  2. Too bad UnVersion package is not compatible U6.X.X.
    Is anyopne knows how to implement this into U6?

    • We have some SQL scripts to fix this, if you need help let us know we are located just north of Chicago and have multiple certified Umbraco developers on staff at Fyin.com.
      -David
      LinkedIn

Leave a Reply

Your email address will not be published. Required fields are marked *