Using a pre-populated Database with Testcontainers

June 22, 2020

This is the third post on a series of posts about Testcontainers. This post expands on the second post so it’s recommended that you read it before this one since we’re building on the application that was created on previous posts.

The Problem

Let’s suppose you have a database that’s very big. That requires a huge amount of data and it takes a lot of time to prepare it before each test. To maintain consistency over tests you could running keep all this setup before each test class, however it can slow your tests dramatically.

Other option would be to not dispose of the database container after each test class. This way the setup would be done only once for all the tests. This approach however, comes with a great drawback, you have to manage the state of the database so that your tests remain consistent.

A third option, which I’ll explore here is to have all the setup data already created inside the docker image and then you can quickly spin up this clean state before each test class without compromising speed.

So, let’s take the simple add_users.sql:

CREATE TABLE users(
    id INT,
    first_name VARCHAR,
    last_name VARCHAR
);

INSERT INTO users (id, first_name, last_name) VALUES (2, 'Addrianne', 'Lenker'), (3, 'Phoebe', 'Bridges');

And try to build an image with this data already created inside it.

Using Custom Entrypoints

The default PostgreSQL image has something similar to initEntrypoint of Testcontainers. During startup it will scan .sql and .sh files on /docker.entrypoint.d and it will run those files during the container startup. The trick here is to find a way to run this entrypoint during the build part of the container.

However that’s not exactly what we want here. We want to be able to run those .sql files at the build phase of the container. Searching online, this issue here provides some clues to achieve what we want. It describes a way to check if a database already exists when we’re building our container, and if there’s not one we can start the database server and insert data on it.

That sounds promising, so we’re going to extend the postgres docker image for that and add this custom entrypoint which will create our data.

create a database folder on the root of your project. Create the file custom-entrypoint.sh containing the following code which was extracted from the issue mentioned above:

#!/usr/bin/env bash
set -Eeo pipefail

# Example using the functions of the postgres entrypoint to customize startup to always run files in /always-initdb.d/

source "$(which docker-entrypoint.sh)"

docker_setup_env
docker_create_db_directories
# assumption: we are already running as the owner of PGDATA

# This is needed if the container is started as `root`
#if [ "$1" = 'postgres' ] && [ "$(id -u)" = '0' ]; then
#	exec gosu postgres "$BASH_SOURCE" "$@"
#fi

if [ -z "$DATABASE_ALREADY_EXISTS" ]; then
	docker_verify_minimum_env
	docker_init_database_dir
	pg_setup_hba_conf
	
	# only required for '--auth[-local]=md5' on POSTGRES_INITDB_ARGS
	export PGPASSWORD="${PGPASSWORD:-$POSTGRES_PASSWORD}"
	
	docker_temp_server_start "$@" -c max_locks_per_transaction=256
	docker_setup_db
	docker_process_init_files /docker-entrypoint-initdb.d/*
	docker_temp_server_stop
else
	docker_temp_server_start "$@"
	docker_process_init_files /always-initdb.d/*
	docker_temp_server_stop
fi

exec postgres "$@"

Then, also inside the database folder, let’s separate add_users.sql in two files as it’s more sense to separate the creation of the table and the insertion of test data. Create a file named 01_create_tables.sql containing:

CREATE TABLE users(
    id INT,
    first_name VARCHAR,
    last_name VARCHAR
);

and another file named 02_add_test_users.sql containing:

INSERT INTO users (id, first_name, last_name) VALUES (2, 'Addrianne', 'Lenker'), (3, 'Phoebe', 'Bridges');

The naming is important here because the files are read on alphabetic order. Now to bring everything together create a Dockerfile with the following:

FROM postgres:12.3-alpine

ENV POSTGRES_USER=postgres
ENV POSTGRES_PASSWORD=postgres
ENV POSTGRES_DB=test

COPY ./custom-entrypoint.sh /usr/local/bin/custom-entrypoint.sh
COPY ./01_create_tables.sql /docker-entrypoint-initdb.d/
COPY ./02_add_test_users.sql /docker-entrypoint-initdb.d/

RUN chmod +x /usr/local/bin/custom-entrypoint.sh

USER postgres

RUN custom-entrypoint.sh postgres || exit 0

ENTRYPOINT [ "custom-entrypoint.sh" ]

# using this CMD works so that docker run ... doesnt require a command override, 
# only downside is it calls SET for this parameter on every psql issued to the container
CMD [ "-c", "max_locks_per_transaction=512" ]

The Dockerfile first sets the credentials that we’re going to use on our tests. Then it copies our script to the bin folder of the container and our .sql scripts to the /docker-entrypoint-initdb.d folder.

Then it runs our custom entrypoint, which will run those scripts if a database is not already created, which on build time it is not (unless you mount a volume with the data, but that’s not the case here).

Finally, build the image so we can use on our code by running docker build -t demo/testdb .

You should see on the logs the sql code being executed like this:

Updating tests to use the new image

Since we’re using a custom image, we need to specify that on the code. Edit DatabaseTest.kt and replace the line

class KPostgresContainer: PostgreSQLContainer<KPostgresContainer>()

with

class KPostgresContainer: GenericContainer<KPostgresContainer>("demo/testdb:latest")

We also don’t need to specify the username and password since we’re already doing that on the Dockefile. So change the container initialization to just:

@Container
val container = KPostgresContainer()
        .withExposedPorts(5432)

That’s it for the code changes. If you run your tests now they should pass and the sql scripts are run only once on the creation of the Dockerfile itself, that can save a lot of time if have a lot of scripts to run. You can check the code for this post here. Just clone and checkout the custom-entrypoint tag.

Hope you enjoyed these series of posts! Testcontainers is a very handy tool.