AdventureWorksCI

AdventureworksCI Step 5 Adding tSQLt to the solution

  • Posted on: 22 July 2015
  • By: Ed Elliott

This is part of a series on how to take the Adventureworks database and bring it in line with modern standards. How do we put a legacy SQL Server database into source control? What issues do we face and how hard is it? Then how do we build a continuous integration pipeline and possibly move towards continuous deployment. This series takes a look at how we achieve this whilst also implementing some coding standards and making an imperfect database a little bit better.

See the introduction: https://the.agilesql.club/Blog/Ed-Elliott/A-reference-for-creating-a-con...

Add Unit Testing

Ok so this is going swimmingly, we have our code and static data out of the database and pushed to github, wowsers.

To build a CI process we really need unit tests, in fact we need unit tests with or without a CI process but that is another matter!

I am going to show how to add tSQLt to an SSDT project, but first let's talk about why tSQLt and why not some other framework.

Why tSQLt?

tSQLt is awesome. It has loads of benefits, but there is one feature that sets it apart, and if this was all it did I would still recommend it: the ability to fake tables, which makes unit testing so much simpler.

What is faking?

It doesn't sound like much but it makes a massive difference: faking renames the existing table and creates a new one with the exact same name, without any foreign key constraints and optionally without any constraints at all.

That doesn't sound cool, that sounds the opposite of what I want!

Consider a stored procedure that uses the SalesOrderDetail table (an actual table in Adventureworks), takes a SalesOrderId and aggregates all the LineTotal values. Without being able to fake the table, to test the procedure you would need to create a SalesOrderId, one or more ProductIds and whatever those rows depend on in turn, so even before we have run our stored procedure we would need to write a whole load of setup code.

In a relational database you could easily spend most of your time writing inserts for related tables rather than just inserting the data we need into the SalesOrderDetail table, namely the columns that the stored procedure uses, which is likely SalesOrderId and LineTotal.
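
To make that concrete, here is a rough sketch of what a tSQLt test for that procedure could look like. The procedure dbo.get_order_total and its parameters are made up for illustration - only Sales.SalesOrderDetail is a real Adventureworks table - but the shape of fake, insert, exercise, assert is the important bit:

exec tSQLt.NewTestClass 'OrderTotalTests';
go
create procedure OrderTotalTests.[test get_order_total sums LineTotal for a single order]
as
begin
-- fake the table: same name, no constraints, so we only need the columns the procedure reads
exec tSQLt.FakeTable 'Sales.SalesOrderDetail';

insert into Sales.SalesOrderDetail (SalesOrderID, LineTotal) values (43659, 10.00);
insert into Sales.SalesOrderDetail (SalesOrderID, LineTotal) values (43659, 15.50);
insert into Sales.SalesOrderDetail (SalesOrderID, LineTotal) values (99999, 100.00); -- noise belonging to another order

declare @expected numeric(38, 6) = 25.50;
declare @actual numeric(38, 6);
exec dbo.get_order_total @sales_order_id = 43659, @total = @actual output;

exec tSQLt.AssertEquals @Expected = @expected, @Actual = @actual;
end;
go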

So is it you being lazy?

Yes. No. Well, sort of. No it isn't, it is the correct way to do unit testing: I set up the exact data I need for the test, exercise it and then check the results - being quicker is a nice side effect.

What if a constraint is really, really important?

You can get tSQLt to keep a constraint, or just not fake the table, but I would question what and how you are unit testing.
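
For the rare case where a single constraint genuinely is part of the behaviour under test, tSQLt can put an individual constraint back onto a faked table with tSQLt.ApplyConstraint. A small sketch (assuming Adventureworks' check constraint on OrderQty is named CK_SalesOrderDetail_OrderQty, substitute whatever you actually need):

exec tSQLt.FakeTable 'Sales.SalesOrderDetail';
-- re-apply just the one constraint the test is interested in, everything else stays faked
exec tSQLt.ApplyConstraint @TableName = 'Sales.SalesOrderDetail', @ConstraintName = 'CK_SalesOrderDetail_OrderQty';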

OK, I'm convinced

Awesome, leave your money in the hat and move on.

How do I get tSQLt into the SSDT solution?

tSQLt is distributed as a set of T-SQL script files; grab them from http://sourceforge.net/projects/tsqlt/, create a new database called tSQLt and deploy those scripts to it. Now the foolhardy amongst you may well think you can just export the dacpac from SSMS - and you can - but it won't work: tSQLt references master, so you need to add a reference from the tSQLt project to the master dacpac, and SSMS will export the dacpac without adding that reference.

Instead, create a dummy tSQLt project, import the code from the database, build and either save the project or throw it away. The next time you will need it is when you upgrade tSQLt and you may as well just create a new project and throw that away again.

When you have imported tSQLt it will still fail to build so add a reference to the system database "master" and then it should build and create a dacpac you can use in your actual project.

So where are we?

We created a database, deployed the tSQLt code to it and then created a dacpac from a temporary SSDT project. We will reference this dacpac from our test project so it gets deployed along with our tests.

We could have imported the scripts into our main SSDT project, but we don't want to modify the tSQLt code, we just want to use it, and putting it in a dacpac and referencing that is the perfect way to do it.

Once you have added the reference it still won't build: drop the tSQLt CLR assembly from the project and re-add it as a script from the tSQLt deployment scripts and everything shall be fine.

Now add it to the project

In the solution I have created a test folder and a new SSDT project called AdventureworksCI.UnitTests, and I have also added the tSQLt dacpac to the solution - ideally tSQLt would be a NuGet package but there isn't one, so we need to add it to the solution manually to share it easily.

Once I have added the dacpac to the solution I add a database reference from the unit test project to tSQLt.dacpac:

add db reference to tSQLt dacpac

The reference is added as "Same Database" so they get deployed together - don't worry, they will never be deployed with the actual project. The final step is to add a database reference from the AdventureworksCI.UnitTests project to the AdventureworksCI project, again marking the reference as "Same Database".

When I added the tSQLt reference to the unit test project it could no longer build as it needed a reference to master, so what we end up with is this:

add db reference to tSQLt dacpac

Let's deploy

At this stage we should check that we can deploy the test project, and when we do we end up with a database that contains all of AdventureworksCI and tSQLt. Firstly we will need to go to the properties of the UnitTests project and set the database settings to enable Trustworthy and also enable the CLR in SQL Server. You could add an sp_configure / reconfigure to a pre/post deploy script if you want to be sure it will always be enabled.
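
If you do go the script route, a minimal sketch of what that could look like in a pre or post deploy script (assuming the login doing the deployment is allowed to change the server configuration and the database options) is:

-- enable CLR at the server level so the tSQLt assembly can run
exec sp_configure 'clr enabled', 1;
reconfigure;
go
-- tSQLt also needs the database to be trustworthy (unless you sign the assembly)
alter database [$(DatabaseName)] set trustworthy on;
go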

If you right click on the AdventureworksCI.UnitTests project and choose "Publish...", enter the connection details and click "Publish", that should publish and, as well as the AdventureworksCI objects, you should have tSQLt. Run "exec tSQLt.RunAll" - there are no tests to run yet but it will ensure the framework is basically working:

add db reference to tSQLt dacpac

Done

So that is it: we now have a solution with a unit test project and a framework that deploys, and it is checked into github. To get this specific version go to:

https://github.com/GoEddie/AdventureworksCI/tree/v0.2

To get the latest version with everything in (so far) go to:

https://github.com/GoEddie/AdventureworksCI/

AdventureworksCI Step 4 Pushing to GitHub

  • Posted on: 22 July 2015
  • By: Ed Elliott

This is part of a series on how to take the Adventureworks database and bring it in line with modern standards. How do we put a legacy SQL Server database into source control? What issues do we face and how hard is it? Then how do we build a continuous integration pipeline and possibly move towards continuous deployment. This series takes a look at how we achieve this whilst also implementing some coding standards and making an imperfect database a little bit better.

See the introduction: https://the.agilesql.club/Blog/Ed-Elliott/A-reference-for-creating-a-con...

Getting a github repo

If you have been following along the series, by now you should have an SSDT project with the schema and code of the database and possibly some or all of your reference data in a post deployment script. It is at this point, before we make any actual changes (or earlier), that we should check the project into source control. It is likely that the database will contain invalid code that we will need to change or delete, and source control gives us the confidence to do that safe in the knowledge we can get it back.

I am going to set this up on github; you can do basically the same thing with any source control system, the only difference is the commands that you run. If anyone wants a particular walkthrough, Visual Studio Online or SVN for example, then please contact me and I will put one up if there is enough interest.

Git

The first step is to create a github account, they are free and if you are a little wary, don't be, it is a great experience - plus you don't have to grow a beard and wear joggers to work, you can shave and wear a suit. As an aside, most people who work with SQL Server have a set of scripts that they have written over the years to help manage and troubleshoot SQL; if you are new to source control and you have these scripts why don't you put them on github and start using it to get some real benefit?

Getting a github account couldn't be easier, just go to https://github.com and sign up.

There are a few choices on how to get code from github: you can just download the source code from the website without ever registering, but that means you can't contribute and that would be no good!

Other than using the web site there are a few clients, including Visual Studio 2013+, but for the walkthrough we will use my favorite, "git bash", which is the command line version and really simple, at least for what we want to do today. Go to the git download page:

https://git-scm.com/downloads

If you are on Windows click the "Windows" logo and it will download the Windows installer:

download and install git


My personal preference is just to use the command line version of git and not clutter up my Windows Explorer context menu, so I untick everything on the install screen:

git install remove shell bloat


There is then a bit of a scary warning about taking over everything on your computer; just choose the first option "Use Git from Git Bash only", finish off the installation and you should be ready to go.

If you now start "git bash" you should get an odd looking command prompt. What you have is MinGW and a bash session, which is like you are running on linux. Don't be put off, it isn't too difficult: instead of cd c:\blah you just do cd /c/blah and then everything else is basically the same!

As another aside, if you don't even want to install git then take a look at https://www.spoon.net/hub/git/git as they have a virtual instance of the git client you can run which saves you having to install git at all.

Github repo

Now we have git we need somewhere to check the code into - a repo! Go back to github and click "+ New repository", then enter a name for the repo; I will use "AdventureworksCI". I will make it a public repository; if it is your own code you should probably make it private. If you do not have a paid github account you can create private git repositories using Visual Studio Online for free.

I will initialize it with a readme and a license. I will use the MIT license as I believe it gives everyone the most choice, but if anyone wants a different license then I accept pull requests with whatever license they want.

I also add a .gitignore for "VisualStudio".

Once the repo is created it should contain a few files. To get the repo onto our machines in a way that we can usefully use it, we clone it, so in git bash change to a directory where you want to download the code to. I tend to use c:\dev so would do "cd /c/dev/" and then:

git clone https://github.com/GoEddie/AdventureworksCI

If we break this down we have:

git - awesome
clone - download into something we can check into and do a get latest on
https - the protocol, we could use the git protocol but we would have to setup private keys and what not which is for another day
github.com - the place where the repo is hosted
GoEddie - my username, any repos you create will be under your username so we can all use the same project name
AdventureWorksCI - the project name, only has to be unique to your username

We could have also initialized an existing directory but I think it is easier to clone one and copy your code into it.

git clone fun

We put the code in

If you visit the folder in Windows Explorer you see that in spite of our foreign bash session it is just Windows under the hood:

git add new folder through windows explorer

Now we have a repo we need to put our code in it. If you have already created an SSDT project you can copy it in, or you can have Visual Studio create a new project in the new repo. I will create a sub directory for the source called src and create a new project:

create new visual studio project

I then do an import of the code into the new SSDT project and then check it in by going back to git bash, changing to the root directory and doing:

git add .
git commit -m "initial code of AdventureworksCI - untouched!"

What is this?

"git add ." means stage or get ready to check-in all the changed files in the current directory and all sub directories
"git commit -m .." does a commit of the files stages previously - you can keep staging files and commit in one go or commit as you go along.

When you are ready to share your code you can do a "git push" - this takes all your commits and sends them to the shared repo.
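
Because we cloned the repo from github, the remote is already set up (named "origin") and the branch is "master", so when you do want to share the commits it is just:

git push origin master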

That is it, the code is now in source control:

https://github.com/GoEddie/AdventureworksCI

Whoop whoop.

AdventureworksCI Step 3 Handling data with SSDT

  • Posted on: 7 July 2015
  • By: Ed Elliott

This is part of a series on how to take the Adventureworks database and bring it in line with modern standards. How do we put a legacy SQL Server database into source control? What issues do we face and how hard is it? Then how do we build a continuous integration pipeline and possibly move towards continuous deployment. This series takes a look at how we achieve this whilst also implementing some coding standards and making an imperfect database a little bit better.

See the introduction: https://the.agilesql.club/Blog/Ed-Elliott/A-reference-for-creating-a-con...

There is an elephant in the room and he is asking what to do about static data

SSDT is great, really great at working with schema objects such as stored procedures, tables, functions etc but there isn't any in-built capability to handle static data that is needed to support a database.

The SSDT Elephant in the ROOM

What is static data?

Just to be clear, we are not talking about the actual data that is the whole raison d'etre of a database. Instead we are talking about the reference data, static data, lookup data - whatever you want to call it - the data that other tables reference to enforce referential integrity:

create table customer_type
(
customer_type_id int not null,
customer_type_description varchar(32) not null,
constraint [pk_customer_type] primary key clustered (customer_type_id)
);

which is referenced by:

create table customer
(
customer_id int identity(1,1) not null,
customer_type_id int not null,
constraint [fk_customer__customer_type] foreign key (customer_type_id) references customer_type(customer_type_id)
)

The data in the customer table should be backed up and managed using your backup/restore policies - it is what some people call the useful part - whereas the static data should be part of source control and deployable!

Where does static data come from?

There are two main sources of static data; the first is the database itself. Often when defining static data someone has a description and the application makes all requests using that description; internally the database code takes the description, looks up the id and uses that from then on - here is a classic example:

create procedure get_customers_by_type( @customer_type varchar(32))
as
declare @customer_type_id int;
select @customer_type_id = customer_type_id from customer_type where customer_type_description = @customer_type;

if @customer_type_id is null
return;

select customer_id from customer where customer_type_id = @customer_type_id;

In cases like this where only the database needs to know the id it is common to define the column as an identity and use those values.

The other source is when ids are used by applications, for example in C#:

public enum customer_type {

...
}

In this case the database could still keep its own internal references and translate them when making calls and creating responses, or the applications and databases can standardize on the id. The benefit of using the same id / value pair everywhere is that a value is always the same id no matter where you see it, which cannot be overstated.

I am not quite sure why, but for some reason databases that use this sort of system still sometimes declare the id columns with identity, making identity inserts necessary when setting up your data, but that is a whinge for another day.

So how do you manage this in SSDT?

SSDT manages objects like tables and stored procedures very well but doesn't do much with data, so we are on our own a little bit. The good thing about SSDT is that there is the flexibility to do other "stuff": we get a pre and a post deployment script into which we can put any T-SQL we like.

My suggested route to managing your static data in SSDT is to put it into a post deployment script.

Idemposomethingorother

When you deploy an SSDT project the pre and/or post deploy scripts are run every time, so you need to ensure they are re-runnable, or idempotent. To do this, by far the easiest and, I personally believe, the most elegant way is to use merge statements to ensure your static data tables are always correct.
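
To show what that looks like, here is a sketch of a post deploy merge for the customer_type table from earlier; the rows themselves are made up, the point is that however many times it runs the table ends up containing exactly these rows:

-- keep customer_type in sync with the list below on every deployment
merge into customer_type as target
using (values
(1, 'Retail'),
(2, 'Wholesale'),
(3, 'Internal')
) as source (customer_type_id, customer_type_description)
on target.customer_type_id = source.customer_type_id
when matched and target.customer_type_description <> source.customer_type_description then
update set customer_type_description = source.customer_type_description
when not matched by target then
insert (customer_type_id, customer_type_description)
values (source.customer_type_id, source.customer_type_description)
when not matched by source then
delete;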

If you do not want to use or cannot use merge statements (they were only added in SQL Server 2008) then the second best is to have insert statements that join against the live table and only insert missing rows. When you need to change a value rather than insert it you can just do an update against the old and new values to keep it current.
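
A sketch of the insert-only flavour, again with made up rows; any rows already in the table are skipped, and an update keyed on customer_type_id would sit alongside it for when a description has to change:

-- insert any reference rows that are not already in the live table
insert into customer_type (customer_type_id, customer_type_description)
select source.customer_type_id, source.customer_type_description
from (
select 1 as customer_type_id, 'Retail' as customer_type_description
union all
select 2, 'Wholesale'
) as source
left join customer_type as target on target.customer_type_id = source.customer_type_id
where target.customer_type_id is null;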

Whatever you do make the scripts re-runnable as if you set this all up correctly they will run lots (actually a few more times than lots).

How do I do that? I have like 100 tables to get data from and I can't write merge statements there is too much text and it makes me go blind

OK, the first step, whichever approach you take (merge, re-runnable inserts etc), is to create a post deployment script: just right click your project, choose "Add" then "Script" and finally "Post-Deployment Script". Note you can add more than one but only the last one added will have the Build action set to "PostDeploy":

add a new post deployment script to the project

Now you have a post deploy script it will run after every deployment you do. If you generate a script rather than deploying via sqlpackage.exe then the post deploy script is tacked onto the end.

But I can't write merge statements and certainly not for like a gazillion tables

Fear not my friend, you have a few options, you can:

  • Suck it up and become the world's leading expert on writing merge statements
  • Use sp_generate_merge to generate merge statements which you can then copy into the post deploy script
  • Use my MergeUI to import tables from SQL Server into your project and manage the merge statements

You can get more details on sp_generate_merge from here.

You can get more details on MergeUI here.

Whatever approach you use, merge statements aren't that hard to understand; they are basically a combination (any combination you like) of a select, update, insert and delete in one enormous SQL statement. For a great overview see: http://www.purplefrogsystems.com/blog/2011/12/introduction-to-t-sql-merg...

Is it really that simple?

Yes.

AdventureWorksCI Step 2 From MDF to Dot Sql Files

  • Posted on: 22 June 2015
  • By: Ed Elliott

This is part of a series on how to take the Adventureworks database and bring it in line with modern standards. How do we put a legacy SQL Server database into source control? What issues do we face and how hard is it? Then how do we build a continuous integration pipeline and possibly move towards continuous deployment. This series takes a look at how we achieve this whilst also implementing some coding standards and making an imperfect database a little bit better.

See the introduction: https://the.agilesql.club/Blog/Ed-Elliott/A-reference-for-creating-a-con...

Getting the code from the database into an SSDT project

Why?

Having the code and schema in a database is great: it is (or should be) backed up, anyone with the correct permissions can access it, and it is really open and accessible.

What is not so great is that we don't have a history of changes to the code, we can't easily version the code, and the best case scenario for working with the code and making changes is that we have SQL Server Management Studio and SQL Prompt (Redgate).

We really need to take the code and bring it into a more developer friendly environment, and today that environment is SSDT.

Requirements

We have a couple of requirements that we must have thought about (there is that damned planning again). The first is where the source control repository we will be using lives - is there an existing one or do we create one?

The second is where the project will sit, is this a database that serves many applications and should have its own solution or is it for a single application and should be a part of that solution?

Decide where the project will live and get the source control sorted, this will determine where you create your SSDT project.

How?

The first step is to create the SSDT project, so open Visual Studio and create a new "SQL Server" project. If you do not have Visual Studio, download either the Community or Express version depending on your situation, and if you do not see the "SQL Server" database project type then download the latest SSDT ISO; at the time of writing it is available from: https://msdn.microsoft.com/en-us/data/hh297027?f=255&MSPPError=-2147217396

Create a new SQL Server database project

I already have an instance of Adventureworks OLTP 2012 running locally. I am not going to tell you how to do that as a) it is already pretty well documented and, more importantly, b) if you follow along with this you should do it with databases that you actually care about - start getting one of your databases into SSDT and source control and show some real business value.

Once you have your new project we will simply do an import, so in "Solution Explorer" right click the project, choose "Import" and then "Database...". Note, if you have added any files to the project then you will not have the option to import from anything other than script files, so just delete anything you have added or create a new project and then retry:

Import from SQL Server into SSDT

Create a new connection pointing to your database, I like to tick "Include Database Settings" but you may manage those separately:

Import from SQL Server into SSDT

When you are ready click "Start"; the wizard will import all of your objects and show you any errors. If you do get any errors at this point it may well be easier to use SSMS to generate scripts of the database and then try to import those, but have a look at the error and see if it is fixable first.

If you have managed to import the database into SSDT you should end up with a list of folders containing schemas and objects and possibly some other .sql files:

List of schemas and objects in the new SSDT project

Once you have the code in SSDT you should try to build the project; surprisingly for me, the Adventureworks project builds straight away. This is the first database project I have worked on where that is the case! Typically you would build at this point and then find lots of errors - if this happens to you don't worry, keep going, it will get better!

So that is it?

Yes, pretty much. You have now taken the major step of getting your code and schema out of a SQL database and into the file system; there is a bit further to go to make it a completely deployable piece of code but it is a great start.

Shall I fix any errors now?

No, don't do that - whether you have build errors or not, check in what you have. The first thing you should do is check in the code exactly as you have retrieved it from the database, without making any changes; don't fix a reference here or delete an old stored procedure there, check in what you have.

The reason is that typically you will need to make changes - you will likely have legacy code that doesn't even compile, and it may not compile for a number of reasons - so before you make any changes check in what you have. If you are checking into an existing solution maybe take a branch and check it into that, and you really must label or tag it in some way so you can get back to it at a later date should you need it.

I wouldn't normally check in broken code but this is an exception. You really must have the confidence to delete broken unused code, and source control gives us this confidence, so check it in and be happy with that; it will get better and you will not have to check in broken code for this database forever.

It is checked in now, should I fix any errors now?

Yes, if you want to, or, considering you now have the code and schema in source control, you may want to go and do some sort of celebration dance.

What problems will you get?

The main problems appear when you have exported the code and you try to compile it: you are likely to see a large number of warnings and errors.

The warnings are quite common and you should have a plan to get rid of them where possible over time. I personally prefer to treat warnings as errors and either fix them or ignore them individually - someone has gone to the effort of writing something to tell you that something isn't as it should be, so I like to know which one of us has a problem and fix it!

Errors will stop you building the SSDT project, which will stop us generating deployment scripts to fix the database, so we need to fix the errors first. If you have dependencies on other databases, so you have code that looks like "select col_name from database.schema.table", this will be an unresolved reference and you will need to get the database that your statement references and also put it into SSDT. In fact anything you reference outside of the current project, including system views and procedures in master, will need a reference:

http://schottsql.blogspot.co.uk/2012/10/ssdt-external-database-reference...

You can have a single Visual Studio solution and reference each project, or you can build a project and reference its dacpac from your project; this is basically how references to master and msdb work, someone at Microsoft built a dacpac for them.
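
As a rough illustration of what that looks like in the code, once the other database's dacpac is referenced SSDT gives you a SQLCMD variable for its name (whatever you typed in the "Database variable" box of the reference dialog - $(OtherDb), col_name and some_table here are just example names), and the hard coded three part name becomes:

-- before: unresolved reference that stops the build
select col_name from OtherDb.dbo.some_table;

-- after: the database reference supplies $(OtherDb) at publish time
select col_name from [$(OtherDb)].dbo.some_table;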

If you have circular references, so two databases reference each other, then you will need to split out the dependencies and do something like "http://blogs.solidq.com/en/sqlserver/ssdt-how-to-solve-the-circular-refe..." or you may find it simpler to have a single project and deploy the code to multiple databases, either having redundant tables / code in each database or filtering the deployments to only include the objects that you are interested in. Whatever you do, if you have cross database calls you should generally try to remove them as they make working with the code harder and their use normally points to other issues, such as an architectural problem that should be solved some other way.

My favorite type of code in a new SSDT project is code that is just wrong or references objects that no longer exist, and as such does not compile and would never run if called on the SQL Server. As we have our original version in source control you can safely delete the code that doesn't work - go ahead and do that now and smile as you see the bad code go.

Next

That is all for now; hopefully you now have a database project in source control - you may have plenty of errors to fix and warnings to procrastinate over but it is a great toe dip into a sea of CI awesomeness. In the next post I will talk about the static data that we have and how we deal with it.

AdventureWorksCI Step 1 Planning

  • Posted on: 12 June 2015
  • By: Ed Elliott

This is part of a series on how to take the Adventureworks database and bring it in line with modern standards. How do we put a legacy SQL Server database into source control? What issues do we face and how hard is it? Then how do we build a continuous integration pipeline and possibly move towards continuous deployment. This series takes a look at how we achieve this whilst also implementing some coding standards and making an imperfect database a little bit better.

See the introduction: https://the.agilesql.club/Blog/Ed-Elliott/A-reference-for-creating-a-con...

Planning? Boreoff!

It might sound pretty dull but it is important that we understand firstly what it is we actually want to achieve and what its business value is.

Firstly, if you do not know what you want to achieve, how will you know when you are done?

Secondly, there is no point investing time and money on something that we will not see any return on.

The reason we plan is to work out if we should do it, how we can deliver real value at every step and to have a good idea that what we want is achievable within whatever constraints we are working to.

So let's plan

The first step is to take stock of why we are doing this. What is wrong with the current database?

  • There are no tests -so- it is hard to make changes without knowing if you have broken something
  • Deployments are manual -so- it takes a long time to generate the changes and they often miss something
  • The code is inconsistent, different developers have their own style -so- it is hard to understand code unless you know the person's style
  • The development databases are restores from production so contain sensitive data -so- we are at serious risk of a data breach / loss of customer trust / regulatory fines
  • Developers overwrite each other's changes -so- releases cause regressions which can mean we lose code and have to re-write things we have already written
  • There is lots of unused code and commented out code -so- it is hard to maintain; sometimes you spend hours debugging only to find out that the portion of code you are looking at in a ten thousand line procedure isn't even being hit

I do not know about you but this sounds like pretty much every legacy database I have ever worked on!

The second step is to work out what it is we will do, it is often good to draw this out on a whiteboard to make sure that you are not missing anything.

So this is the dream team, we would like:

  • To use git to store our source code -which- will stop developers overwriting each other's code, let us clear out the old unused code and make changes knowing we can go back and see what happened if things go wrong
  • To use SSDT to write our code -which- will allow us to find unused code, improve the code using refactoring and create automatic deployment scripts
  • Use powershell to deploy changes to a local instance and run all tests -which- will give us confidence in our automatic deployments and help make sure we do not break anything else
  • Use tSQLt to write unit tests -which- will mean we can start to make more rapid changes and also spend less time debugging
  • Use Visual Studio Online to manage the work -which- means we can plan and estimate the work and decide what order we do things and how much time we should invest in this project
  • Use Visual Studio Online as a continuous integration server -which- means we can validate code on check-in and eventually also run rules to validate that our coding standards are being met
  • Use Visual Studio Online to generate deployment scripts for other environments -which- means we do not have to manually build scripts and are always ready to deploy working code

There is more we could do, such as using tools like Octopus Deploy to do the actual deployments, but we have enough to be getting on with. The thing we should bear in mind is that we should use off the shelf tools; where we deviate from that we risk problems further down the line. For instance if we decided to use rexx (remember that??) to write all our scripts we would have an additional dependency everywhere which we may not necessarily want.

Estimate

So now we know roughly what we want to do, we should estimate the work. This sort of work should ideally be done in two work streams: first there is the upfront work to get the basics in place, and then there is ongoing work to improve the process. We don't know what the ongoing work will be, otherwise we would just do it upfront, so bear in mind that it doesn't have to be 100% perfect as long as you make sure you invest the time further down the line to improve the process when it needs it. Instead of just thinking about continuously integrating your work, think of it as continual improvement - and the continual never stops.

I personally like to include user stories and tasks for the work on the CI process; sure, it isn't something you can sell to customers, but whatever software you are writing has a business value and improving its quality and fixing the things that are wrong with it does have a business benefit.

To track the work I am going to use Visual Studio Online; you should hopefully already have a system for managing and prioritising work. If you don't then you should get one - even if you are a single developer, having a place to put things that need doing and to track your progress is vital to being organised and getting things done.

I have taken the requirements and built user stories for each, created tasks for each user story and assigned time estimates to each one:

my requirements written as user stories

Using this I have an indication of how long each task will take and I also have somewhere I can relate each check-in back to. It is really easy in development to go off on a tangent fixing and changing all sorts of things, so I like to tie each check-in to a task in the current sprint; if it isn't related to a task then you have to really ask yourself whether you should be doing it. This doesn't mean that you shouldn't refactor - certainly as you work with the code you should always aim to make it a little bit better than it was before - rather that it should be in conjunction with the other work you are doing.

Now that we have the estimates and can pretty easily see the dependencies, we can choose what we will do upfront and what we will do later. Perhaps the amount of work feels too much to do initially, so to start with we will just get the code into SSDT, or perhaps start using tSQLt. Each project and organisation is unique so do whatever works for you, but have a plan of where you want to get to and start heading in that direction.

In the next post we will be creating an SSDT project and getting the Adventureworks code out of the database and into the file system where code should be stored!

A reference implementation of a continuous integration pipeline for legacy SQL Server databases

  • Posted on: 10 June 2015
  • By: Ed Elliott

What am I doing?

Creating a publicly available version of the AdventureWorks 2012 OLTP database by putting it under source control, implementing unit tests and improving the design and code to fit modern development guidelines.

Why?

There are two aims: firstly to show how to take an existing database and safely get it into source control. Once it is under source control we can start to document how it works and write unit tests which will allow us to make changes to it knowing that we are not breaking backwards compatibility or losing any functionality.

This will hopefully serve as a guide on how to take an existing database that has no documentation and no tests and bring it under our control.

The second aim is to show how we can build a continuous integration and possibly continuous deployment pipeline for our database, we will point out any challenges and how they were fixed as well as giving a reference guide on how you can do this yourselves on your own databases.

Why Adventureworks?

Adventureworks, as I understand it, has been designed to show off as much different functionality as possible; I know from testing my own db tools that using Adventureworks normally finds any outstanding bugs. What it doesn't show is any best practices, so it should be in the same state that many legacy databases are in today: undocumented, untested and with no coding standards.

To be clear, this isn't a complaint about Adventureworks; this is about taking a real world imperfect database and trying to make it a little bit more perfect.

What Technology?

I am going to use SSDT to get the database into source control and use Visual Studio Online to create the repository and build definitions; however I would love to see someone else, or maybe I will, write some guides on how to do the same thing with other tools.

I have chosen to use the SQL Server 2012 version as I think that will make it more accessible to people than SQL Server 2014 or even 2016!

What should the readers do?

Again following the aims, there are two things:

Firstly I am hoping that this can be a community project, so if you have some time to improve some aspect of it, from changing the schema to writing tests or documentation, that would be amazing.

Secondly, if you haven't got your databases under source control or haven't built a continuous integration pipeline then consider following along with your own databases; where I use a public Visual Studio Online repository you can of course use your own internal or private source control.

Steps

As I publish each post I will link to them here: