Ed Elliott's blog

A virtualized CPU forced me to eat my lunch early, every day, for weeks

  • Posted on: 21 October 2017
  • By: Ed Elliott

I worked one particular contract where I was forced to take my lunch at 11:35 every day, and it was all virtualisation's fault!

To set the scene: it was a company that wasn't really used to having developers. They had a load of SQL analysts and some mainframe developers, but SQL developers writing T-SQL, C# and SSIS code were new to them. The IT management had decided that buying actual computers wasn't necessary for development; we could use the standard PCs and RDP onto a Windows 2008 R2 server with Visual Studio installed and work on that.

I know it doesn't sound ideal, but it was OK; the project was exciting. It even involved putting in a continuous deployment pipeline by accident, but we will leave that story for another day.

The setup worked pretty well, but for some reason, every day at 11:35, the CPU of my machine went crazy for about twenty minutes. This happened every single day, so after a while I asked around, and other people were seeing it too, but being honest, I think they had probably just got used to taking an early lunch, or maybe it was a late elevenses :)

When something like this happens, I normally break out Task Manager, or "task mangler" as it is affectionately known in certain circles. The thing using the CPU was always the same: the anti-virus updater. I don't mean an anti-virus scan, just the check for an updated definition.

I don't like this sort of thing, so I spent some time trying to work out what the updater did, and when I ran it by myself it was fast; it took me a little while to realise that the updater process itself wasn't the problem. Getting to this point may or may not have involved me disabling the anti-virus :)

With the updater disabled, I found other things that should not use lots of CPU, like Notepad with no text in it, suddenly using 100% CPU for a few minutes. So every day, my virtual machine and everyone else's was slow enough to make us all start our lunchtimes early. What the hell was going on?

The next logical place to investigate was the physical hardware: were our virtual machines on the same host, and did we have a noisy neighbour? Who was it?

Our virtual machines were on different hardware; no cigar :(

In the end, it was the updater, but only because the updater's schedule was the same on every server and the time syncing between the machines was perfect. Every day at the same time, normally fast virtual machines that were happy to share some decent CPUs all wanted to use a little bit of CPU at exactly the same moment, which pushed every physical CPU over capacity, and everything ended up taking ages.

The IT team staggered the schedule for checking for anti-virus updates, and the problem went away. They even inadvertently managed to re-enable my anti-virus, and I went back to taking my lunch at a more reasonable 12:00 :)

So if you virtualise a CPU, don't forget that you are sharing it; if everyone wants to use it all at once, one person might win, but you'll probably both lose.


Where do you install tSQLt?

  • Posted on: 19 October 2017
  • By: Ed Elliott

The question of where to install tSQLt is probably the most common one I get when I talk about unit testing T-SQL, so much so that I thought it would make a good topic for a blog post. I mention tSQLt in the title, but this covers all unit test code for T-SQL.

So to be more specific:

Where do you install unit tests and unit test frameworks for SQL Server, and which databases should have that code in them?

If we look at what databases might exist in a development process:

  • Local developer database
  • CI database on a build server
  • Test database for testers
  • QA database for user acceptance testing
  • Prod mirror or Pre-Production database for a production standby database
  • Production database

If we have tests written in tSQLt, plus the tSQLt framework itself, where do we install the framework and those tests, and where do we execute them?

TL;DR: The answer is quite simple; install them in these two database types:

  • Local developer database
  • CI database on a build server

Not TL;DR:

The first thing to ask is: what are unit tests?

There are two top goals for unit tests:

The first is that unit tests are small pieces of code that validate that other small pieces of code work successfully.

The second goal is that unit tests guard against other developers breaking the actual pieces of code covered by the tests - you might well be the other developer in this case, so I would always recommend not getting too worked up about this :)
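
To make the first goal concrete, here is a minimal sketch of a tSQLt test; the procedure under test (dbo.AddNumbers) and the test class name are made-up examples, not from a real project:

EXEC tSQLt.NewTestClass 'CalculatorTests';
GO
CREATE PROCEDURE CalculatorTests.[test AddNumbers returns the sum of its inputs]
AS
BEGIN
    -- act: run the small piece of code under test
    -- (dbo.AddNumbers is a hypothetical procedure with an output parameter)
    DECLARE @actual INT;
    EXEC dbo.AddNumbers @a = 2, @b = 3, @sum = @actual OUTPUT;

    -- assert: validate that the small piece of code worked successfully
    EXEC tSQLt.AssertEquals @Expected = 5, @Actual = @actual;
END;
GO
-- run everything in the test class
EXEC tSQLt.Run 'CalculatorTests';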

In the list above, these are the only two types of database that are private and that can typically be thrown away on a whim.

A local developer database should sit on the developer's machine and allow a developer to deploy and test their code locally without affecting anyone else. I have worked on projects with this setup, and it is easily the best to work with as a developer.

If you aren't able to have your database locally but have to use a remote database or, even worse, a shared database for testing, then you will likely get hit by other people's breaking changes, or by other people debugging their code and somehow blocking you from working.

When you have a shared database, you have to coordinate all sorts of things, like the best time to deploy, and that in itself is a massive time sink.

The second type of database, the one for a CI build, is typically only used by a single build at a time. Because a SQL Server database is, mostly, free to create on an existing server, there is no need to share CI databases. When you consider that you can use LocalDB to quickly and cheaply spin up a new clean, empty, fresh, happy database, why wouldn't you, unless there was an excellent reason not to?
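
To show how cheap a throwaway CI database is, here is a sketch; the instance name assumes a default LocalDB install and the database name is a made-up per-build name:

-- connect to the build agent's LocalDB instance, for example:
--   sqlcmd -S "(localdb)\MSSQLLocalDB"
-- create a clean database just for this build
CREATE DATABASE [CI_Build_1234];
GO
-- deploy the project and run the tSQLt tests here, then throw it all away
DROP DATABASE [CI_Build_1234];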

It is in these private databases that we can do things like deleting all the data in a table, setting all the dates of birth to a developer's favourite date, or creating a long-running transaction that blocks out all of the other users.
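
This sort of destructive setup is exactly what tSQLt makes safe in a private database. As a sketch (dbo.Customers and its columns are invented for illustration), a test can fake a table and then do whatever it likes to it:

-- tSQLt.FakeTable swaps dbo.Customers for an empty, constraint-free copy,
-- so the test can be as destructive as it likes (the table is a made-up example)
EXEC tSQLt.FakeTable @TableName = 'dbo.Customers';

-- fill it with known data, set every date of birth to a favourite date,
-- then delete the lot - none of this touches the real table
INSERT INTO dbo.Customers (CustomerId, DateOfBirth) VALUES (1, '19800101');
UPDATE dbo.Customers SET DateOfBirth = '19700101';
DELETE FROM dbo.Customers;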

Once we move further right in our list of environments, we are less likely to want to do things such as deleting all the data in a table, and once we hit production, it is critical we don't leave test code like that in the database to be accidentally called.

Tests for a production database should be a series of smoke tests that can operate in and around the existing live production calls and data, so they do not take out any more transactions than the database normally does, and they certainly don't modify any data other than their own.
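
As a sketch of the difference, a production smoke test might look something like this; the object names and the one-hour window are hypothetical:

-- read-only smoke tests: no fakes, no data changes, no extra transactions
-- (dbo.Orders, dbo.usp_PlaceOrder and the one-hour window are made-up examples)
IF NOT EXISTS (SELECT 1 FROM dbo.Orders
               WHERE OrderDate >= DATEADD(HOUR, -1, GETUTCDATE()))
    RAISERROR ('Smoke test failed: no orders in the last hour', 16, 1);

IF OBJECT_ID('dbo.usp_PlaceOrder') IS NULL
    RAISERROR ('Smoke test failed: dbo.usp_PlaceOrder missing after deploy', 16, 1);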

These smoke tests are also likely to be carried out via the application: a deploy happens, and a set of application tests is executed to validate that the deploy was a success. This is a long way from unit tests, which we use to validate code and guard it against breaking changes.

Anyway, enjoy!

tSQLt Test Adapter for Visual Studio 2017

  • Posted on: 3 October 2017
  • By: Ed Elliott

So, Visual Studio 2017 has been out for a while now, and I had created a version of the test adapter (which lets you run tSQLt tests from the test window within Visual Studio, with SSDT, ReadyRoll or just plain SQL scripts), but I held off on pushing it to the marketplace as it didn't work with Visual Studio 2015.

I spent quite a lot of time trying to work out how to build a single vsix that worked with both VS 2015 and 2017 and in the end gave up, so now there are two versions, one for 2015 and one for 2017. I am not putting any fixes into the 2015 version, so I would upgrade to 2017 if you want any updates.

I also fixed a couple of issues, and one nice little problem with my .NET tSQLt client where, if you passed in the name of a non-existent schema or test, the tests would return success:

https://github.com/GoEddie/tSQLt-Clients/issues/2

These now report a failure, and the client now also handles extra selects in the output from tests. Because the test adapter uses the tSQLt client, it gets the benefit of any fixes I make there (isn't hosting individual repos on GitHub awesome, even for sharing your own work!)

If you want to use the Visual Studio test window to run your tSQLt tests, grab it from inside Visual Studio, under "Extensions and Updates" on the Tools menu.

Learn how to unit test SQL Server T-SQL code

  • Posted on: 5 September 2017
  • By: Ed Elliott

A free email course on how to use tSQLt, covering both the technical aspects of writing unit tests AND the art of writing repeatable, useful unit tests for even the most complicated T-SQL code.

UPDATE: I thought that if I got a certain number of sign-ups by October 1st I would run the course, but within two days three times my initial target had subscribed, so I have closed the first course. Sign up if you want to join the wait list for the next course!

Unit testing helps us to write better code and make rapid changes to our code, and it has been generally seen as a good idea for about 10 years. Writing tests for T-SQL code is made much easier by tSQLt, but there is quite a high barrier to entry, both in the technical skills needed to get tSQLt running and in how to approach large code bases of, sometimes, unfriendly T-SQL code and tame them with unit tests.

I have successfully unit tested T-SQL code in a number of different environments, including clean greenfield environments as well as legacy projects, and I have written this course to help people get started with unit testing, and also to help them turn unit testing into a part of their development process that they can use every day to improve the quality of their work and the speed at which deployments can be made.

Are you any of these people?

  • An application developer experienced with other testing frameworks for testing application code?
  • A T-SQL developer with no testing experience?
  • A T-SQL developer with testing experience in other languages or frameworks?

If you are, then you should sign up (https://www.getdrip.com/forms/317903840/submissions/new) and let me help you learn tSQLt unit testing for SQL Server.

Why an email course?

I thought it would be an interesting way to provide actionable information regularly and to allow a level of assistance and feedback that I don't think is possible with blogging or writing articles.

How do I sign up?

Run over to: https://www.getdrip.com/forms/317903840/submissions/new and pop in your details.

The course is going to start on the 1st of October, and as it is the first one, I am limiting the number of people who can start it. If the first one is a success then I will run it again, but it won't be until at least 2018.

What will be the format?

The course will be one email a week, which will include an overview of the week's topic, some detail on the parts that need it, and an exercise for the week which can be done on a demo database or any SQL Server database code you have.

And it is free?

Yep, gratis. I am not open sourcing it yet - maybe in the future - but the course itself is free, aka "no service charge".

SSIS ForEach Enumerator File Order

  • Posted on: 4 September 2017
  • By: Ed Elliott

I saw on Slack recently a question about how the SSIS file enumerator orders (or, more specifically, doesn't order) files. I have been thinking about SSIS quite a lot lately, and while I am in no hurry to start using it day to day, it is quite an interesting tool.

So anyway, I saw this question that went like:

"does anyone know in what order files are processed in 'Foreach File Enumerator'?
I used to think it was alphabetically but after some testing this is not always the case?
Second part is there anyway to specify the order by say size or date?"

So how does SSIS order, or not order, files?

The answer to this is pretty simple, and I thought I knew it but wanted to confirm. In my mind I thought, "how do they get a directory listing?", and my mind responded, "probably using the Win32 find file APIs etc."; my mind then wandered somewhere else before I wrote a quick package that:

  • Has a ForEach loop with a breakpoint set at pre-execute
  • Has a single task in the ForEach loop with a breakpoint set at pre-execute
  • Has a variable to hold the file name

The SSIS package itself was pretty simple, hey :)

I set the file path for the enumerator to c:\ssisSearch and put a load of files and directories in it (because the Win32 find functions use a buffer and you need to call them multiple times, I wanted to make sure we covered cases where there were multiple find calls). Then I reached for my favourite tool of all, procmon.exe (I say favourite; it used to be, then I had a job where I used it literally every single day for hours and hated it, so I stopped using it, but now I'm back with it!). In procmon I set a filter on the c:\ssisSearch folder and also on DtsDebugHost.exe and ran my package - the files were returned in alphabetical order.

I then went into procmon, to the properties of the "QueryDirectory" operation on that folder, and (once the symbols had loaded) I could see that the call SSIS was making came from ForEachFileEnumerator.dll (native, not .NET, so we can't reach for Reflector) and that it calls "FindFirstFileW".

A quick hop, skip and jump over to MSDN, where the FindFirstFile documentation discusses the ordering of files:

The FindFirstFile function opens a search handle and returns information about the first file that the file system finds with a name that matches the specified pattern. This may or may not be the first file or directory that appears in a directory-listing application (such as the dir command) when given the same file name string pattern. This is because FindFirstFile does no sorting of the search results. For additional information, see FindNextFile.

FindNextFile has this:

The order in which the search returns the files, such as alphabetical order, is not guaranteed, and is dependent on the file system. If the data must be sorted, the application must do the ordering after obtaining all the results.

So basically, NTFS returns files alphabetically and FAT returns them by creation date, but don't rely on either; as the docs say, if you need a specific order (the second part of the question: by size or date, say), you have to do the sorting yourself after you have all the results.

Just a final thought: SSIS now runs on Linux, so I have no idea what the order is there :)

TSQL Tuesday - Databases and DevOps

  • Posted on: 13 June 2017
  • By: Ed Elliott

DevOps isn't running SQL Server in a container and pushing code to it from Jenkins

When we talk about DevOps, we envision being able to check in code, spin up a new environment, deploy, test and push that code to production, and be in the pub at 4.

We know that by having the right tooling in place we can make releases more reliable and more frequent, enabling us to deploy the changes that the business wants when it wants them, rather than every x days/weeks/months/years/decades. This outcome is best for everyone; no one loses, and the path to fun and profit is just that: fun and profitable.

So what do we need to do: run SQL Server in containers and write and deploy our code using SSDT? Yes, do it, but you don't need to; you can do DevOps and work towards frequent releases with a standard SQL Server instance and manually written deploy scripts that are emailed around.

So what is DevOps if you can do it without source control?

DevOps is about enabling frequent releases - that is the core element of it and to enable frequent releases we need:

  • A way to deploy code (a DBA wearing out the F5 key in SSMS is a way to deploy code)
  • A way to be confident about the changes we are about to make (hint: tests, lots of them)
  • A way to know when there is a problem with production (monitoring and alerting)
  • The ability to identify bottlenecks, work together and make improvements to the process

The last point is the most important; for me it stems from kanban and the kaizen approach of identifying bottlenecks and working together to remove them.

If you look at your existing approach to making changes, what are your bottlenecks, and how can they be improved? When you deploy changes and they go wrong, what stopped you finding out about those problems earlier? When you look at the different stages of a change, from business analysis through to troubleshooting issues reported by customers, how much time and money could have been saved by not having each issue at all, or by identifying it in developer tests or at rollout, rather than when the user complained about it?

If you truly start looking at the bottlenecks in your entire release process, it will more than likely lead you to a DevOps culture and set of practices, including the tools required to support them. But without the underlying kaizen approach of continually removing bottlenecks from your processes, you will simply pay for tooling you don't need and cover your laptop with stickers, without delivering the value that the business needs.


SSDT: Unable to connect to master or target server.

  • Posted on: 12 June 2017
  • By: Ed Elliott

Error in SSDT: "Unable to connect to master or target server" - the server displayed isn't the server but the database.

Every now and then I come across this error in SSDT, normally when trying to publish, and the odd thing is that the description never matches the actual cause (if you're desperate for the cause: it is that you can't connect). The thing I like about the description is the way it tries to tell you what is wrong and which server you are connecting to, but it fails at both; instead it tells you about an unrelated error and gives a database name instead of a server name.

What we have is:

"Unable to connect to master or target server '{0}'. You must have a user with the same password in master or target server '{0}'.\"

Twice it tries to tell you the server name, but both times it actually tells you the database name. I thought I would dig into it a little with Reflector to see where the error comes from and whether or not it would ever show the server name. So in Reflector I found the error and where it is used: if we look in, what is surely everyone's favourite SSDT dll, Microsoft.Data.Tools.Schema.Sql.dll, we can see that the error message is used in SqlDeploymentEndpointServer.OnInit, and we have something like:


catch (ExtensibilityException exception)
{
    Tracer.TraceException(TraceEventType.Verbose, TraceId.CoreServices, exception, "Error loading DSP families");
    throw new DeploymentFailedException(
        string.Format(CultureInfo.CurrentCulture, DeploymentResources.InvalidServerEndpoint, new object[] { targetDBName }),
        exception);
}

They are indeed passing targetDBName into the InvalidServerEndpoint error message, so yes, the error message will only ever show the database name.

I had a quick look at what can cause this, and it is something to do with opening the SqlConnection, which is wrapped in lots of retry logic that differs for Azure compared to other types of SQL Server - lots of interesting stuff, maybe for another post. But basically, SSDT wasn't able to open a connection: check the server name, ports, database and so on (i.e. maybe your default database is not available to that user), connect via SSMS, and when it works there, come back to SSDT.
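
If it does turn out to be the login's default database, something like this usually sorts it out; a sketch, where [MyAppUser] is a made-up login name:

-- run as an admin: point the login at a database that definitely exists and is online
-- ([MyAppUser] is a hypothetical login name)
ALTER LOGIN [MyAppUser] WITH DEFAULT_DATABASE = [master];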

Footnote: I did think about raising a Connect item but then couldn't be bothered; if anyone does, I'll be happy to vote for it!

SSDT DevPack + Visual Studio 2017

  • Posted on: 8 May 2017
  • By: Ed Elliott

I have upgraded the SSDT Dev Pack to support Visual Studio 2017, fixed a couple of minor annoyances, and started to think about how to improve it going forward.

The first feature, of sorts, is the ability to clear the connection used by the quick deployer without having to restart Visual Studio.

Secondly, I am pretty convinced that the main things people use it for are the quick deploy and the tSQLt test builder (and I use the keyword formatters), so I have moved everything else underneath a menu item saying "deprecated". If anyone really wants one of those features, let me know and I will keep it, but I will likely remove them at some point.

I am planning on bringing SQLCover into it at some point. I haven't really been using the MergeUi part; I think the better approach is to use sp_generate_merge to generate a merge statement, as it is much more reliable, but if you have a simple table then MergeUi might still be useful to you.

I will publish it to the marketplace so updates happen automatically; if you want a copy of it, go to:

https://visualstudiogallery.msdn.microsoft.com/435e7238-0e64-4667-8980-5...

Before I published it, I realised that when I wrote the code (a couple of years ago!) I had taken a couple of shortcuts; one was to always use the 120 version of the parsers instead of whatever the project was set to, so I decided to fix that first and then publish. Looking over old code is never good, ha ha.

It is now published, so you should be able to install it for VS 2017, and existing installs on 2015 will be upgraded (I hope it doesn't break for anyone; there are lots more users than when I last published it!).

SSDT Dev in Visual Studio Code

  • Posted on: 27 April 2017
  • By: Ed Elliott

I have been quite interested in VS Code and have been using it more and more recently. I use it for all my Go (#golang FTW) work and also PowerShell, and I have been toying with the SQL tools team's SQL extension, which is great. For a long time I have thought about bringing the SSDT experience to other IDEs like JetBrains IntelliJ, but because I have been using VS Code quite a lot recently, and separately have been doing more and more JavaScript and TypeScript, I thought it would be interesting to see how hard it would be to write a VS Code extension that lets me build dacpacs.

The general goal of this is not to re-create the SSDT experience in Visual Studio but to provide a lighter, faster way of developing SQL code, with an extension that:

  • is fast
  • is lightweight - memory is important on dev machines and 2 GB for a large db project is limiting
  • gives us the important things like pre/post-deploy scripts, refactoring and the ability to generate a dacpac

I am not really interested in providing UIs like the schema compare - for that, use SSDT or spend some money on the Redgate tools.

I am also not interested in replacing what the SQL tools team are doing; I am happy to leave them to do the harder, important but (to me) less interesting things like T-SQL formatting. So, with that in mind, I have started a new project that is very hacky at the moment, more an experiment to see if it will work: a VS Code extension that builds dacpacs:

https://github.com/GoEddie/vscode-ssdt/

This is basically just a wrapper around the DacFx, so there shouldn't be anything too hard in it, but it is Windows-only for now (until the DacFx is cross-platform it will only ever be Windows, but I hold out hope for a cross-platform DacFx one day!).

This works similarly to the SQL tools team's extension in that there is a .NET app that is called by the VS Code extension (TypeScript running on Node.js). If you want to try it, download the repo and run the SSDTWrap exe (not under a debugger, or you will face T-SQL-parsing first-chance-exception performance hell). Then in VS Code open the folder "src\s2" and the "extension.ts" file and press F5 - this will open a new VS Code window. Open a folder with your .sql files in it, and it will create a T-SQL model and report any errors.

If you press Ctrl+Shift+P to open the command palette and then run "build dacpac", it will generate a dacpac for you from the SQL files. You will need to put this ssdt.json file in the root of the directory you open in VS Code:


{
    "outputFile": "c:\\dev\\abc.dacpac",
    "SqlServerVersion": "Sql130",
    "references": [
        {
            "type": "same",
            "path": "C:\\Program Files (x86)\\Microsoft Visual Studio 14.0\\Common7\\IDE\\Extensions\\Microsoft\\SQLDB\\Extensions\\SqlServer\\140\\SQLSchemas\\master.dacpac"
        }
    ],
    "PreScript": "",
    "PostScript": "",
    "RefactorLog": "",
    "ignoreFilter": "*script*",
    "PublishProfilePath": "C:\\dev\\Nested2\\Nested2.publish.xml"
}

It doesn't really support a lot of things at the moment, but I will add in the support needed to build a dacpac with everything we all know and love, including the refactorlog.xml, pre/post-deploy scripts and references.

I tested with a folder of 2,000 procedures. I tried testing 10,000, but I couldn't get SSDT to load them into a project without crashing (on a top-of-the-range i7, 32 GB, SSD etc. laptop), so in the end I settled for 2,000 procs; the times to build the dacpac were:

App                    Time (milliseconds)
Visual Studio / SSDT   5630
VS Code                2051

So as well as handling larger projects, it is faster as well; a small project (one proc/table) took about 17 seconds to build the dacpac.

Anyway, it is all a bit of fun and pretty hacky at the moment, but I like using VS Code and am finding it much more lightweight than Visual Studio, so I will likely invest some more time in it.

If you feel like trying it out, good luck :)

Updating TSqlModels (DacFx)

  • Posted on: 26 April 2017
  • By: Ed Elliott

This one is for the DacFx nuts out there; it can't be a very big club, but judging from the occasional emails I get about it, the quality is very high, ha ha.

If you have a TSqlModel and you want to make a change to it, you have a couple of choices:

  • Create a new model and copy everything over
  • Use AddOrUpdateScript

AddOrUpdateScript can only update scripts that you added yourself (i.e. that you know the name of), so if you got the model from a dacpac you are out of luck.

I recently wanted to remove a file from a model I had created, so what I ended up doing was re-adding the script with these contents:


--

I didn't try an empty file but this let me remove objects from the model quite nicely.

On another note I found something quite interesting that I missed when building a model.

When you parse a script, you get back a list of parse errors and the script object; you can add the script object to the model and call validate, and even though you got parse errors, the model validation might still pass (or parse, waa haa haa haaaa).

The total set of errors is the script parse errors plus the model validation errors, which seems obvious now, but I missed it.

Hope it helps someone.

Happy DacFx'ing everyone!

