Learn how to use tSQLt to professionally unit test T-SQL code. To enroll in the free self-paced course visit here

Re-Imagining an SSIS IDE

  • Posted on: 4 February 2018
  • By: Ed Elliott

Re-Imagining the SSIS development environment.



SSIS is a powerful and versatile tool for extracting, transforming and loading data into or out of SQL Server. The development environment, BIDS and now SSDT allow developers to create, edit and debug SSIS package. I struggle with the actual IDE. The thing that I find is that the IDE is designed around a series of dialog boxes, tabs and a graphical map of the package as a whole. SSIS packages let you embed either c# or scripts and to open one of these scripts a whole separate instance of visual studio is started and I have worked on projects where opening scripts can literally take 5 minutes.

Taking 5 minutes to see what a script looks like and then realizing you need to do something else like look at another script or which variables are being passed into the script you are looking at and having to close down the second copy of visual studio, go find what you want and then go back to open the second copy of visual studio means that using SSIS can be frustrating.

It isn’t just scripts that are a struggle to work with, knowing whether to look an objects properties, the edit dialog or even the advanced editor dialog means struggling to find the information you want. I saw a recent discussion in slack where someone couldn’t work out why a column was being truncated to 1 char. It turned out that in an advanced dialog on one of the outputs the column width was set to 1. A good SSIS dev spent more than a day trying to find that option.

SSIS has a series of event handlers and to see these you need to click on each object then the drop down list of handlers to find out where code might potentially run. When you are under pressure, and you don’t know why something is happening, then event handlers can be easily and frustratingly missed.

The package itself, the .dtsx file, contains all the metadata about things like files that it needs to be able to import data with the correct file types but if you open one of the dialog boxes that requires a file, if the file isn’t there or is inaccessible you get a delay and an error message.

The main part of the UI is a graphical map of the objects in a package. I have spent quite a lot of time trying to think of a better way of showing the elements that make up a package, and it is hard to argue with the workflow approach, all the ETL tools I found do the same thing and visualizing packages as a series of interconnected tasks is useful but maybe not everything and the only way.

Because of all these issues I have really tried to stay away from SSIS development and looked for other solutions such as using biml to write packages or writing code in other languages but I keep coming back to the fact that the SSIS engine is great at what it does and if you have a lot of data or a lot of files/tables then writing a similar framework would be a lot of wasted effort. So, how can we use the SSIS engine but make better, more efficient use of our development time?

I would always advocate using biml to create packages, even simple ones as it is much easier to see what a package is doing and it is much easier to version control the biml XML than the .dtsx file. This gets us so far but doesn’t help in environments where biml hasn’t been used or to examine packages which have been deployed and the biml lost.

What I decided to do was to write a tool that could be used alongside BIDS/SSDT or even BIML with the one overarching goal of having any value, property or script, in fact, anything within a .dtsx file available with a single click, with no more than a 1-second delay at most. With that in mind, I have written K-SSIS-ed which is a tool to view .dtsx files to help make using BIDS/SSDT more efficient and less frustrating.

You can download it using the buttons at the bottom and as it is 2018 obviously it is cross platform! (i've only tested on windows but have also built linux and mac versions.

I hope someone else finds it useful as a package reader, there are no editing capabilities but exploring and checking packages should be less frustrating.
I am using TypeScript to parse the dtsx xml and there are a few versions of the dtsx format I have tried to cater for all but if you have a package that doesn't work, create a sample and open a GitHub issue here

Because SSIS can be extended with custom components and these can supply their own dialogs to configure the component, loading a .net dll and displaying a dialog goes against my main rule: one click, one second access to the information you need so I fall back to displaying the XML for most tasks and configuring custom displays for things like script tasks and SQL tasks.

This tool is definitely for someone experienced with SSIS and not for a beginner, and you will be exposed to lots of internal data and be expected to know what it means but this is a tradeoff that I would be willing to make.


For support, please raise an issue:


More Download Versions:

SSIS ForEach Enumerator File Order

  • Posted on: 4 September 2017
  • By: Ed Elliott

I saw on slack recently a question about how the ssis file enumeraror orders (or more specifically doesn't order) files. I have been thinking about ssis quite a lot lately and whil I am in no hurry to start using it day to day it is quite an interesting tool.

So anyway, I saw this question that went like:

"does anyone know in what order files are processed in 'Foreach File Enumerator'?
I used to think it was alphabetically but after some testing this is not always the case?
Second part is there anyway to specify the order by say size or date?"

So how does SSIS order files or doesn't order files?

The answer to this is pretty simple and I thouhgt I knew the answer but wanted to confirm it. In my mind I thought, "how do they get a directory listing?", and my mind responses "probably using the win32 api's find file etc", my mind then wondered somewhere else before writing a quick package that:

  • 1. Has a ForEach loop and a breakpoint set at pre-execute
  • 2. Has a single task in the ForEach loop and a breakpoint set at pre-execute
  • 3. A variable to hold the file name

Pretty simple, the ssis package looked like:

Pretty simple hey :)

I set the file path for the enumerator to c:\ssisSearch and put a load of files and directories in (because the win32 find functions have a buffer and you need to call it multiple times - I wanted to make sure we covered cases where there were multipl find calls). Then I reached for my favorite tool of all procmon.exe (I say favorite, it used to be then I had a job where I used it literally every single days for hours and hated it so stopped using it but now i'm back with it!). In procmon I set a filter on the c:\cssisSearch folder and also DtsDebugHost.exe and ran my package - the files were returned in alphabetical order.

I then went into procmon and to the properties of the "QueryDirectory" operation on that folder and (when the symbols had loaded) I could see that the call ssis was making was from the ForEachFileEnumerator.dll (native not .net so we can't grab reflector) and that calls "FindFirstFileW".

A quick hop skip and jump to msdn and FindFirstFile discusses the ordering of files here:

The FindFirstFile function opens a search handle and returns information about the first file that the file system finds with a name that matches the specified pattern. This may or may not be the first file or directory that appears in a directory-listing application (such as the dir command) when given the same file name string pattern. This is because FindFirstFile does no sorting of the search results. For additional information, see FindNextFile.

FindNextFile has this:

The order in which the search returns the files, such as alphabetical order, is not guaranteed, and is dependent on the file system. If the data must be sorted, the application must do the ordering after obtaining all the results.

So basically ntfs is alphabetical, fat date but don't rely on either.

Just a final thought, ssis runs on linux so no idea the order there :)

Site Search with Duck Duck Go