SQL Server – Are SQL Unit Tests Supposed to Be Long

best-practices, sql-server, t-sql, tsql, unit-test

I am writing stored procedures with some non-trivial business logic. I am trying to unit test them, but the actual tests end up being quite long (the shorter ones start at 40-50 lines of code and commonly use 4 different tables), which doesn't seem very "unit". (Admittedly, I format my code in a way that takes a lot of space.)
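For illustration, a tSQLt test for a procedure that touches four tables tends to take roughly this shape — a sketch only, with all table, column, and procedure names hypothetical:

```sql
-- Hypothetical schema: dbo.CalculateOrderTotals reads Orders, OrderLines,
-- Customers, and Discounts, then writes a Total back onto dbo.Orders.
EXEC tSQLt.NewTestClass 'OrderTests';
GO
CREATE PROCEDURE OrderTests.[test CalculateOrderTotals applies Gold discount]
AS
BEGIN
    -- Arrange: isolate every table the procedure reads.
    -- This setup is where most of the length comes from.
    EXEC tSQLt.FakeTable 'dbo.Orders';
    EXEC tSQLt.FakeTable 'dbo.OrderLines';
    EXEC tSQLt.FakeTable 'dbo.Customers';
    EXEC tSQLt.FakeTable 'dbo.Discounts';

    INSERT INTO dbo.Customers (CustomerId, Tier) VALUES (1, 'Gold');
    INSERT INTO dbo.Orders (OrderId, CustomerId) VALUES (10, 1);
    INSERT INTO dbo.OrderLines (OrderId, Qty, UnitPrice) VALUES (10, 2, 50.00);
    INSERT INTO dbo.Discounts (Tier, Pct) VALUES ('Gold', 0.10);

    -- Act
    EXEC dbo.CalculateOrderTotals @OrderId = 10;

    -- Assert: 2 * 50.00 with a 10% discount = 90.00
    SELECT OrderId, Total INTO #Expected FROM dbo.Orders WHERE 1 = 0;
    INSERT INTO #Expected (OrderId, Total) VALUES (10, 90.00);

    SELECT OrderId, Total INTO #Actual FROM dbo.Orders;
    EXEC tSQLt.AssertEqualsTable '#Expected', '#Actual';
END;
```

Even with nothing exotic going on, the arrange step alone is a dozen-plus lines once each dependency table is faked and seeded.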

In context of "normal" programming languages I've heard the advice to refactor the complex procedure into smaller chunks. But I don't want to do that here because:

  1. I don't want to pollute the "global namespace" with small routines called from one place only.
  2. Passing around tables from and to stored procedures is cumbersome.
  3. Custom functions can have negative effects on performance.

Am I wrong about this reasoning?
I am new to unit testing, so perhaps I am just writing my tests wrong?
Is SQL a long-winded language, and thus its unit tests are longer as well?


(I am using SQL Server with tSQLt framework, but I believe the question is system-agnostic.)

Best Answer

Am I wrong about this reasoning? I am new to unit testing, so perhaps I am just writing my tests wrong? Is SQL a long-winded language, and thus its unit tests are longer as well?

I think this is a great (though somewhat opinion-based) question. For the past few years I've been implementing tSQLt unit testing in a personal OSS project (https://dba-multitool.org/) to answer it through real-world application. Unit testing for any type of SQL definitely breaks from classic norms and patterns. My tests definitely don't feel like unit tests and are usually more complex and longer than I'd prefer, but they have saved me many times from introducing breaking changes.

The stored procedures in my example are quite complex and long, so I reached a similar impasse as you: should I suck it up and deal with very complex "unit" tests, or fundamentally modify how the stored procedure works so that it is more amenable to unit testing in general? As with many database questions, the ultimate answer, I believe, is it depends. Performance and maintainability take a front seat here, in my experience.

I found that I wasn't able to justify substantially changing the complex stored procedures to benefit from unit testing. Adding more objects, as you noted, can increase complexity, and in this case I was wary of asking end users to house a bunch of single-use objects in their database instead of offering an all-in-one utility stored procedure.

One technique that I did come up with was using optional parameters to drive test cases more precisely. For example, I have parameters @IsExpress, @SqlMajorVersion, and @SqlMinorVersion that end users should not use (they are fetched at runtime if not passed) but that my tests utilize to make sure I'm catching the errors I want around unsupported versions. Adding explicit error checks increases the overall reliability of the code without necessarily having to test the more complex logic areas explicitly, which may be hard or unreasonable to do. This marginally affects the stored procedure's length and complexity, but lets me easily isolate functionality and reliably test for it. You could take this further and use parameters to trigger select portions of the stored procedure that need testing the most, or to achieve better code coverage percentages by avoiding a messy area that isn't worth covering or is out of scope.
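A sketch of that parameter seam might look like the following — the procedure and test names here are illustrative, not the actual dba-multitool code:

```sql
-- Hypothetical procedure: the version parameters default to NULL and are
-- resolved at runtime, but a test can pass them explicitly to force a branch.
CREATE OR ALTER PROCEDURE dbo.sp_DoMaintenance
      @IsExpress       BIT     = NULL  -- test seam: fetched at runtime when NULL
    , @SqlMajorVersion TINYINT = NULL  -- test seam: fetched at runtime when NULL
AS
BEGIN
    IF @IsExpress IS NULL
        SELECT @IsExpress =
            CASE WHEN SERVERPROPERTY('EngineEdition') = 4 THEN 1 ELSE 0 END;
    IF @SqlMajorVersion IS NULL
        SELECT @SqlMajorVersion =
            CAST(SERVERPROPERTY('ProductMajorVersion') AS TINYINT);

    IF @SqlMajorVersion < 11
    BEGIN
        RAISERROR('SQL Server versions below 2012 are not supported.', 16, 1);
        RETURN;
    END;

    -- ... actual maintenance work ...
END;
GO
EXEC tSQLt.NewTestClass 'MaintenanceTests';
GO
-- The test forces the unsupported-version branch without needing an old server.
CREATE PROCEDURE MaintenanceTests.[test errors on unsupported version]
AS
BEGIN
    EXEC tSQLt.ExpectException
        @ExpectedMessage = 'SQL Server versions below 2012 are not supported.';
    EXEC dbo.sp_DoMaintenance @SqlMajorVersion = 10;
END;
```

The design trade-off is a couple of extra parameters on the public signature in exchange for deterministic, environment-independent tests of the guard clauses.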

With this approach I was able to get a fairly high code coverage percentage (98%+) and while I definitely don't test all functionality, I know that almost all of the code gets executed, errors when I need it to, and will parse on execution.

To close with a quote, Wikipedia has unit testing defined as (emphasis mine):

In computer programming, unit testing is a software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures—are tested to determine whether they are fit for use.

I won't say that Wikipedia (or the book being quoted by Wikipedia) is the end-all for technical definitions, but I do like this definition since it focuses on the end goal of unit testing, which I think tSQLt can reasonably achieve.

Besides, some unit tests are better than none, and knowing which lines of code aren't covered by unit tests may be very valuable in the future. To expand on the latter:

In 3 years, you've left the company and Employee #948 is trying to address a bug in the stored procedure. They look and see that only lines 50-65 out of 300 aren't covered by a unit test, otherwise all other functionality has been tested. They now have a pretty good shot at spending time on a likely problematic area rather than reverse engineering the entire 300 lines. Or, the next intern/new hire can be asked to fill in missing unit test coverage gaps as a learning exercise. Knowing what you aren't testing yet can help prioritize future work and diagnose problems faster.