Mitchell Data Base: November 2009

Monday, November 09, 2009

SQL Data Scripts

A standard practice in SQL Server development is to make any changes to objects or data using a script and apply it to the DEV system; once verified, it is applied to QA; once verified, it is finally applied to production.

You can take this one step further, and write your scripts to be deterministic - i.e. if the script is run multiple times, the outcome will be the same.

For example, let's say you have a lookup table tblSettings like this:

ID int IDENTITY (1,1),
ItemText nvarchar(50)

the table holds two records with ID's of 1 and 2, and you wish to add a third. You could simply write

INSERT tblSettings ( ItemText ) SELECT 'application author'

but if you execute that script more than one time, you get multiple records with that same text.

Now, this is a back-end table, and you have verified that the next ID will be 3, using this:

SELECT IDENT_CURRENT('tblSettings') + IDENT_INCR('tblSettings')

The solution is to test if that record exists before you insert the record, like this:

IF (SELECT 1 FROM tblSettings WHERE ID = 3 ) IS NULL
BEGIN
INSERT tblSettings ( ItemText ) SELECT 'application author'
END

in this way, the script can be run multiple times in DEV, QA or even production, and the outcome will always be that the table holds exactly those three records.

Now, let's say that you need to change the record just added to say 'application developer'. You could simply write

UPDATE tblSettings
SET ItemText = 'application developer'
WHERE ID = 3

but why "update" a record to the same value? The solution here is simply to include the old value in the criteria, like this:

UPDATE tblSettings
SET ItemText = 'application developer'
WHERE ID = 3
AND ItemText = 'application author'

The benefit of these methods is to touch the data only where necessary, and to ensure that the outcome is precisely what is expected.

Wednesday, November 04, 2009

Reusable code

My definition of reusable code: modular code that is flexible enough that it can be used in more than one way.

For example, let's say that you have a table Orders like this:

ID int identity,
OrderDate datetime,
Quantity int

and you want the total orders for one year. Fine. All you need is something like this:

CREATE PROC Test
(
@Year int
)
AS
SELECT SUM(Quantity) AS Total
FROM Orders
WHERE
YEAR(OrderDate) = @Year

which answers the immediate need.

But if you write it like this, you can execute the sp without a parameter & get the current year:

CREATE PROC Test
(
@Year int = 0
)
AS
SELECT SUM(Quantity) AS Total
FROM Orders
WHERE
YEAR(OrderDate) =
CASE
WHEN @Year = 0
THEN YEAR( GETDATE() )
ELSE @Year END

Another approach is to add a 2nd parameter for the month, like this:

CREATE PROC Test
(
@Year int = 0,
@Month int = 0
)
AS
SELECT SUM(Quantity) AS Total
FROM Orders
WHERE
YEAR(OrderDate) =
CASE
WHEN @Year = 0
THEN YEAR( GETDATE() )
ELSE @Year END
AND
(
MONTH(OrderDate) = @Month
OR
@Month = 0
)

which allows you specify a single month, or use a zero to get the whole year.

Granted, these are very basic examples, but if you build in little hooks like these at the outset, you can save time later on.