Saturday, February 06, 2010

Scripted Mass Updates

This article describes a method to generate scripts to perform mass-updates of data. The purpose is to document the updates that are done, for future reference.

Let's say you have a table of customers, and each is assigned to a region based on the state being east or west of the Mississippi River. Therefore, Illinois is assigned to the East. However, in order to balance the workload, the users have asked us to move all Illinois cities to the West region. The table looks like this:

ID int identity
Customer_Name nvarchar(50)
ST char(2)
Region nvarchar(50)

You could write a simple update query like this:

UPDATE Customer
SET Region = 'West'
WHERE ST = 'IL'

However there will be no record of exactly what was updated; if new Illinois cities are added they might be incorrectly placed in the old region, and we might wonder if the update failed for that city. Or, the users might ask which cities had been moved. The solution is to build a SELECT statement that generates one update statement for each record, like this:


SELECT
'UPDATE Customer
SET Region = "West"
WHERE ID = ' +
CAST(ID AS varchar) +
' AND ST = "IL"
AND Region = "East" '
FROM
Customer
WHERE
ST = 'IL'
AND Region = 'East'

..when you run this query, you will get one "record" per city to be updated; each line being similar to this:

UPDATE Customer SET Region = "West" WHERE ID = 2266403 AND ST = "IL" AND Region = "East"

Now, you would copy the output from the results window into a new query window, replace the double-quotes " with single-quotes ' and you can execute the update & save the code to have a permanent record of precisely what was changed.

This technique can be applied to more complex scenarios. For example, you might export a spreadsheet so users can easily back fill missing data, and then import that back into the database. In that situation, you would not be able to reconstruct the changes made unless you keep a copy of the imported data in an extra table somewhere. By scripting the updates you can document precisely what changed.

One final note - this technique will work reliably only if you execute the "SELECT" query to the results window; if you select Results to Text there is a limit to how many characters are displayed, so longer statements may be truncated.