Fast unit tests with databases: Implementation of our solution

In the first two parts of this series I wrote about our huge performance gains for database unit tests and introduced our testing framework.

After that more theoretical part, let’s get down to how we actually set up our tests and how we got that nice performance.

Author: Johannes Hartmann
Date: 13 July 2022
Reading time: 7 minutes

What has happened so far?

If you haven’t read the first parts, we’re talking about a test runtime improvement from 90 minutes down to 40 seconds. This was for a medium-sized project containing 3400 tests, 1200 of which are database tests. Those 1200 tests are running against an actual PostgreSQL database server.

You’ll find a link to a running example project in the addendum, which shows how simple and powerful our testing framework is. You might want to check that out, too.

Database with InMemory data

This particular point has nothing to do with our .NET testing framework libraries. It is just an often overlooked trick which gives you a huge performance boost for database tests and can be applied to many database services out there.

A huge performance bottleneck is database services writing data to a physical disk. Even on high speed NVMe this brings a very noticeable slowdown to the test runtime.

If you have virtualisation techniques like Docker at hand, the quickest and easiest way to gain a lot of performance is to run the test database server with its data directory mounted on an in-memory file system. Writing directly to RAM is much faster than writing to a disk. You start the service and slightly change the connection string in your tests. That’s it.

For PostgreSQL we start a dedicated Docker service for local development using the following docker-compose file:

version: '3.8'
 
services:
  postgres_test:
    image: postgres:13
    ports:
      - "5433:5432" # Port mapping to 5433 so it does not conflict with the normal postgres instance
    volumes:
      - type: tmpfs
        target: /var/lib/postgresql/data
      - type: tmpfs
        target: /dev/shm
    environment:
      POSTGRES_PASSWORD: postgres

You simply start this with

docker-compose -f postgres_test.yml up

For our build servers, which are using GitLab, we also configured those volumes to use tmpfs. So in the build pipeline we start a dedicated PostgreSQL server (using GitLab’s pipeline services) with that tmpfs mount, run our tests and have the container stopped again afterwards. This might sound like a massive overhead, but starting the server, running all 3400 tests (1200 of them database tests), and generating and uploading coverage reports still takes less than one and a half minutes in total.
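As a rough sketch of the runner side: GitLab Runner's Docker executor has a services_tmpfs setting in config.toml that mounts a directory of service containers in RAM. The fragment below is an assumption about how such a setup can look, not our exact configuration. The file name runner_tmpfs_fragment.toml is purely illustrative, and the content has to be merged into the existing [[runners]] entry of your runner.

```shell
# Write an illustrative config.toml fragment for a GitLab Runner using the Docker executor.
# Assumption: the PostgreSQL service container keeps its data in /var/lib/postgresql/data.
cat > runner_tmpfs_fragment.toml <<'EOF'
[runners.docker]
  # Mount the data directory of service containers (e.g. postgres:13) in RAM.
  [runners.docker.services_tmpfs]
    "/var/lib/postgresql/data" = "rw"
EOF
```

The job itself then only lists the PostgreSQL image under its services, and the tests connect to that service instead of localhost.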

You can apply the same trick to other databases. Examples:

For MySQL you need to mount /var/lib/mysql
MSSQL requires a special flag on the file system it uses, which isn’t set in tmpfs. Check the addendum to find out how to get MSSQL working with a tmpfs volume.
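For MySQL, a compose file analogous to the PostgreSQL one could look like the following sketch. The service name, the image tag, the host port 3307 and the password are assumptions chosen for illustration. Only the mount target /var/lib/mysql comes from the list above.

```shell
# Write a compose file for a throwaway MySQL test server with its data directory in RAM.
# Service name, image tag, port and password are illustrative assumptions.
cat > mysql_test.yml <<'EOF'
version: '3.8'

services:
  mysql_test:
    image: mysql:8
    ports:
      - "3307:3306" # Port mapping to 3307 so it does not conflict with a normal MySQL instance
    volumes:
      - type: tmpfs
        target: /var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: mysql
EOF
```

You would then start it the same way as the PostgreSQL service, with docker-compose -f mysql_test.yml up.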

Creating the test database template

The first step of running a set of database tests is creating the database template for those.

This point is now specific to PostgreSQL. While this is the only database supported by the framework currently, it can be easily extended to support any other database.

Here we’re using the library Fusonic.Extensions.UnitTests.Adapters.PostgreSql, providing the PostgreSqlUtil. This simply handles some matters with a PostgreSQL database, like creating and dropping databases, getting rid of old test databases and so on.

For creating the template we implement an ITestDbTemplateCreator in the test assembly, which looks like this:

public class TestDbTemplateCreator : ITestDbTemplateCreator
{
    public void Create(string connectionString)
    {
        //The connection string contains the test db name that gets used as prefix.
        var dbName = PostgreSqlUtil.GetDatabaseName(connectionString);
 
        //Drop all databases that may still be there from previously stopped tests.
        PostgreSqlUtil.Cleanup(connectionString, dbPrefix: dbName!);
 
        //Create the template
        PostgreSqlUtil.CreateTestDbTemplate<AppDbContext>(connectionString, o => new AppDbContext(o), seed: c => new TestDataSeed(c).Seed());
    }
}

If you run this with a database name example_test, this name is used as the prefix. Each test gets a copy of that database with a random string attached to it, like example_testX8vCj4wVDEyRfI5nQD24Tw. This only happens when the test actually accesses the database. The database gets dropped after running the test.

When creating the template, all migrations are applied and the given test data seed gets called.
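To illustrate the idea behind the per-test copies: PostgreSQL can clone a database with CREATE DATABASE ... TEMPLATE, which is much faster than applying migrations and seeding again. The following is a minimal sketch of that mechanism, not the framework's actual code. The function names new_test_db_name and clone_sql are hypothetical, and the random suffix format differs from the one the framework generates.

```shell
# Build a unique database name from a prefix plus 16 random bytes, hex encoded.
# Hypothetical helper; the framework uses its own suffix format.
new_test_db_name() {
  printf '%s_%s' "$1" "$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')"
}

# Emit the SQL statement that clones the template ($1) into a new database ($2).
# The identifiers are quoted so that mixed-case names are preserved.
clone_sql() {
  printf 'CREATE DATABASE "%s" TEMPLATE "%s";' "$2" "$1"
}
```

With the test server from above running, the output of clone_sql could be passed to psql via its -c option while connected to the postgres maintenance database. Note that PostgreSQL refuses to copy a template while other sessions are connected to it.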

You can easily run this programmatically, but usually you also want to run it from the command line, either on a build server or locally. For this we have the dotnet tool Fusonic.Extensions.UnitTests.Tools.PostgreSql. You can install and run it like this:

dotnet tool install -g Fusonic.Extensions.UnitTests.Tools.PostgreSql
pgtestutil template -c "Host=localhost;Port=5433;Database=example_test;Username=postgres;Password=postgres" -a "src/Example.Database.Tests/bin/Debug/net6.0/Example.Database.Tests.dll"

The call to pgtestutil takes the connection string to the template database, which does not have to exist at this point, and the path to the assembly where ITestDbTemplateCreator is located.

Configuring the database test project

The test base classes for database projects aren’t too complicated either:

public abstract class TestBase : TestBase<TestFixture>
{
    protected TestBase(TestFixture fixture) : base(fixture) { }
}
 
public abstract class TestBase<TFixture> : DatabaseUnitTest<AppDbContext, TFixture> where TFixture : TestFixture
{
    protected TestBase(TFixture fixture) : base(fixture) { }
}
 
public class TestSettings
{
    public string ConnectionString { get; set; } = null!;
    public string TestDbPrefix { get; set; } = null!;
    public string TestDbTemplate { get; set; } = null!;
}
 
public class TestFixture : DatabaseFixture<AppDbContext>
{
    public TestSettings TestSettings { get; } = new();
 
    protected override IConfiguration BuildConfiguration()
    {
        var configuration = base.BuildConfiguration();
        configuration.Bind(TestSettings);
        return configuration;
    }
 
    protected override void ConfigureDatabaseProviders(DatabaseFixtureConfiguration<AppDbContext> configuration)
        => configuration.UsePostgreSqlDatabase(TestSettings.ConnectionString, TestSettings.TestDbPrefix, TestSettings.TestDbTemplate)
                        .UseDefaultProviderAttribute(new PostgreSqlTestAttribute());
 
    protected sealed override void RegisterCoreDependencies(Container container)
    {
        base.RegisterCoreDependencies(container);
        // SimpleInjector configuration
    }
}

The biggest change is ConfigureDatabaseProviders. The database provider handles creating the test database (from the template), dropping that database again after the test and configuring the DbContext for a test accordingly. The provider for PostgreSQL only creates the database if the database is actually accessed. Nothing happens on the database otherwise.

The UseDefaultProviderAttribute tells it to always use PostgreSql if nothing else is configured. This makes more sense when multiple providers are configured.

With the database provider attributes you can specify per test (or per test class) which database provider should be used, or even ensure that the tested code never accesses the database.

Example:

public class SomeServiceTests : TestBase
{
    [Fact]
    [PostgreSqlTest]
    public async Task DbTest()
    {
        // This test uses EntityFramework configured for using PostgreSQL database
    }
 
    [Fact]
    [InMemoryTest]
    public async Task InMemoryTest()
    {
        // This test uses EntityFramework configured for using EntityFrameworkCore.InMemory as database
    }
 
    [Fact]
    [NoDatabase]
    public async Task NoDatabase()
    {
        // This is for tests where you want to ensure that there is no database access. Any database access throws an exception.
    }
}

You do not have to always specify an attribute. UseDefaultProviderAttribute allows you to define which configuration should be used, if no database provider attribute is set.

For projects where we use a virtualized database server with an in-memory file system, we don’t use the EntityFramework Core InMemory provider at all, as it adds little extra performance compared to the drawbacks it brings with it.

So we just have the PostgreSQL database provider configured, as in the example above, and do not explicitly add the PostgreSqlTest-Attribute anywhere, as it is already the default.

No virtualisation - InMemory during the day, Database during the night

For projects where the trick with the in-memory database server isn’t used (yet), we partially use the InMemory database of EntityFramework Core. It does not act like a real database, and thus shouldn’t be relied on completely for database tests. For example, with the InMemory provider you won’t find errors where a query cannot be translated to SQL, because it simply does not translate it. There are many more pitfalls, and I want to point to Jimmy Bogard’s blog post about this once more.

However, using the InMemory-provider allows you to test your EF queries quickly without needing a database, thus being considerably faster. With Fusonic.Extensions.UnitTests.Adapters.InMemoryDatabase you get support for exactly that.

The fancy touch here: Using our framework you can mitigate the downsides of running InMemory by running some tests as InMemory during the day (build pipeline, local development) and easily switch them to real database tests in a nightly build.

To do this, change the database provider configuration:

protected override void ConfigureDatabaseProviders(DatabaseFixtureConfiguration<AppDbContext> configuration)
{
    configuration.UsePostgreSqlDatabase(TestSettings.ConnectionString, TestSettings.TestDbPrefix, TestSettings.TestDbTemplate)
                 .UseInMemoryDatabase()
                 .UseDefaultProviderAttribute(new InMemoryTestAttribute());
 
    if (bool.TryParse(Environment.GetEnvironmentVariable("NIGHTLY"), out var isNightly) && isNightly)
        configuration.UseProviderAttributeReplacer(_ => new PostgreSqlTestAttribute());
}

To make some sense of it, let’s have a look at a test class:

public class SomeTest : TestBase
{
    [Fact]
    [PostgreSqlTest]
    public void ThisTestAlwaysUsesPostgres()
    {}
 
    [Fact]
    public void ThisDefaultsToInMemory()
    {}
}

So, what did we configure?

We added support for the InMemory database from EF Core. This means you can mark your tests (or your test classes) with the [InMemoryTest] attribute to tell the framework that it should use the EF Core InMemory database. But you don’t need to, as it is already configured as the default.

Our rule of thumb: always add at least one PostgreSqlTest per test class, so you also catch query translation errors during development and in the build pipelines.

Now you can build test classes where most of the tests run InMemory, with a good feeling that the code also runs on a real database, because at least one test always runs against the database.

But you also should ensure that all your InMemory-Tests also run against a database. And here comes the switch for the nightly tests. Set the environment variable NIGHTLY (or whatever you prefer) to true in your nightly build pipeline. With the UseProviderAttributeReplacer configuration you basically tell the framework: No matter what’s configured, use the PostgreSQL database.
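A nightly job could set the variable like in the following sketch. The function name run_nightly_tests is hypothetical, and the variable name NIGHTLY simply matches the configuration above. Keep in mind that bool.TryParse only accepts "true" or "false" (case-insensitive, surrounding whitespace is ignored), so a value like 1 or yes would leave the tests on the InMemory provider.

```shell
# Run the test suite with NIGHTLY=true so the provider attribute replacer
# switches every test to the PostgreSQL provider.
# Any extra arguments are passed through to "dotnet test".
run_nightly_tests() {
  NIGHTLY=true dotnet test "$@"
}
```

Calling run_nightly_tests without arguments runs all tests of the solution in the current directory. During the day you just call dotnet test without the variable.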

That’s it. You get security at night and the speed of InMemory during the day, when you need it most.

Addendum

Fusonic Extensions project

You can find the extended documentation for our test framework in our GitHub project for the Fusonic Extensions.

Example project

You can find a simple sample project on GitHub:

It contains an example .NET project with two test projects, one using the database and one without any database, as well as a docker configuration for PostgreSQL using an InMemory data volume.

MSSQL with a tmpfs mount

MSSQL opens its files with a flag called O_DIRECT, which the file system in use has to support. tmpfs and ZFS do not support it. For ZFS support there’s also an old, open ticket on GitHub. Luckily there’s a workaround for ZFS. It drops the need for that flag and can also be used for tmpfs.

In order to get MSSQL running with tmpfs we need to build our own image. First, create a Dockerfile:

FROM mcr.microsoft.com/mssql/server
USER root
RUN wget https://raw.githubusercontent.com/t-oster/mssql-docker-zfs/master/nodirect_open.c
RUN apt update
RUN apt install -y gcc
RUN gcc -shared -fpic -o /nodirect_open.so nodirect_open.c -ldl
RUN apt purge -y gcc
RUN apt clean
RUN echo "/nodirect_open.so" >> /etc/ld.so.preload

And a yml for docker-compose:

version: "3.9"
services:
  mssql_test:
    build: .
    environment:
      SA_PASSWORD: "SuperSecret123!"
      ACCEPT_EULA: "Y"
      LD_PRELOAD: "/nodirect_open.so"
    ports:
      - "1434:1433"
    volumes:
      - type: tmpfs
        target: /var/opt/mssql/data
      - type: tmpfs
        target: /var/opt/mssql/log

Which you then run with

docker-compose -f mssql_test.yml build
docker-compose -f mssql_test.yml up
