There's only one way to do integration tests with DBs right: use a real instance of the DB. If you need to speed up the setup/teardown process, so be it. If you need to fashion your tests to avoid uniqueness constraints, so be it. But every other approach (running in transactions and then rolling back, intercepting calls like this library does, mocking, etc.) will inevitably miss some quirk of your SQL. FWIW, at my job, I've written hundreds of tests that use a single local instance of CRDB without trouble, and there aren't any performance issues of note.
This looks more like a caching solution to me, not a way to skip integration tests with your DB. I can't seem to find how you invalidate the saved interactions, but I assume it must either detect changes or you manually clear the cache.
This seems like a reasonable solution to speed up integration tests when you need to run them frequently. Particularly useful when adopting legacy code that only supports integration testing due to poor design.
That's essentially what this is. It looks like they just spin up cockroach in a docker container and intercept queries as they're sent for some of the analytics features.
I am curious how many people mock their SQL queries and how they decide to mock vs not mock at that level.
From the perspective of the services I'm knee deep in at the moment, SQL feels like the wrong layer to mock at. Instead, when we do want to mock, we usually pass around mocked out data access objects. That way we work at the application layer of `GetN` rather than `SELECT a, b, c FROM N`.
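Roughly this shape, with hypothetical UserStore/GetUser names just for illustration: the application depends on the interface, the real implementation owns the SQL, and tests of the application layer get handed the fake.

    import (
    	"context"
    	"database/sql"
    )

    type User struct {
    	ID    int64
    	Name  string
    	Email string
    }

    // UserStore is the seam the rest of the application depends on.
    type UserStore interface {
    	GetUser(ctx context.Context, id int64) (*User, error)
    }

    // sqlUserStore is the real implementation; only it knows the SQL.
    type sqlUserStore struct{ db *sql.DB }

    func (s *sqlUserStore) GetUser(ctx context.Context, id int64) (*User, error) {
    	u := &User{}
    	err := s.db.QueryRowContext(ctx,
    		`SELECT id, name, email FROM users WHERE id = $1`, id,
    	).Scan(&u.ID, &u.Name, &u.Email)
    	return u, err
    }

    // fakeUserStore is what application-layer tests receive; no SQL anywhere.
    type fakeUserStore struct{ users map[int64]*User }

    func (f *fakeUserStore) GetUser(_ context.Context, id int64) (*User, error) {
    	if u, ok := f.users[id]; ok {
    		return u, nil
    	}
    	return nil, sql.ErrNoRows
    }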
It feels like a lot of the value of the tests we write that do execute SQL comes from them actually running it. Granted, it is slower than mocking, requires sequential rather than parallel tests, and we need to wipe the impacted tables for every test. However, the value we get is that we know a query actually has the intended semantics.
In unit tests, I like to only test a single method in isolation, with all of its dependencies mocked. So for a domain layer method that depends on some infrastructure layer code that itself does the DB access, I mock the DB access layer. And the DB access layer already hides the SQL, so you're mocking "normal" objects/methods, not SQL.
However, when unit testing the DB access code itself, I like to connect to a real-but-embedded database (for example with JVM projects, these guys have some great embedded DB libs: https://github.com/flapdoodle-oss), with a different DB per test class (different test classes run concurrently), but no cleanup between test cases (which run sequentially). This lets me test the SQL queries themselves.
Integration tests are a different story, they run against actual deployed versions of services, with real everything (including databases), nothing is mocked.
Typically there is no issue with running parallel tests against a real database. I just wrap every test case in a transaction and turn the transactions into subtransactions. This way you can just rollback at the end of each test.
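The outer wrapper is tiny with plain database/sql; a minimal sketch (the helper name is made up, and the savepoint/subtransaction plumbing for code that opens its own transactions is left out):

    import (
    	"database/sql"
    	"testing"
    )

    // withRollback hands the test a transaction and unconditionally rolls it
    // back, so nothing the test does is ever committed to the shared database.
    func withRollback(t *testing.T, db *sql.DB, fn func(tx *sql.Tx)) {
    	t.Helper()
    	tx, err := db.Begin()
    	if err != nil {
    		t.Fatal(err)
    	}
    	defer tx.Rollback()
    	fn(tx)
    }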
We typically have multiple options in our tests - some SQL logic is testable in sqlite itself, while others (crap like arrays and datediffs) we try to run on an actual postgres instance. Snowflake has been the most annoying, since it doesn't support transactions and is also relatively slow anyways.
Yep, this is the pattern I usually use. It's useful to put tests that hit a real DB behind a build flag, for situations where a DB might not be present. Also since I prefer to bake migrations into my application, those get thoroughly tested as well.
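In Go the build flag part is just a build tag; a hypothetical sketch (driver choice and the TEST_DATABASE_URL variable are assumptions), so `go test ./...` stays green on machines without a database:

    //go:build integration

    package store_test

    import (
    	"database/sql"
    	"os"
    	"testing"

    	_ "github.com/lib/pq"
    )

    // Runs only with: go test -tags=integration ./...
    func TestMigrationsAgainstRealDB(t *testing.T) {
    	dsn := os.Getenv("TEST_DATABASE_URL") // hypothetical env var
    	if dsn == "" {
    		t.Skip("TEST_DATABASE_URL not set")
    	}
    	db, err := sql.Open("postgres", dsn)
    	if err != nil {
    		t.Fatal(err)
    	}
    	defer db.Close()
    	// ...apply the baked-in migrations and assert on the resulting schema...
    }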
We require that CI builds are self-contained. So if you need a database, run it. Others mentioned Docker. Specifically there is Testcontainers. If you need cross-service integration, those don't belong in your CI builds. (So the tech/setup can also be a bit different.)
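With testcontainers-go, a self-contained build can start its own throwaway Postgres per test run. A rough sketch (image, credentials, and wait strategy are just examples, and the API details are from memory):

    import (
    	"context"
    	"fmt"
    	"testing"

    	"github.com/testcontainers/testcontainers-go"
    	"github.com/testcontainers/testcontainers-go/wait"
    )

    func TestWithThrowawayPostgres(t *testing.T) {
    	ctx := context.Background()
    	pg, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
    		ContainerRequest: testcontainers.ContainerRequest{
    			Image:        "postgres:15",
    			ExposedPorts: []string{"5432/tcp"},
    			Env:          map[string]string{"POSTGRES_PASSWORD": "test"},
    			WaitingFor:   wait.ForListeningPort("5432/tcp"),
    		},
    		Started: true,
    	})
    	if err != nil {
    		t.Fatal(err)
    	}
    	t.Cleanup(func() { _ = pg.Terminate(ctx) })

    	host, _ := pg.Host(ctx)
    	port, _ := pg.MappedPort(ctx, "5432")
    	dsn := fmt.Sprintf("postgres://postgres:test@%s:%s/postgres?sslmode=disable", host, port.Port())
    	_ = dsn // connect with database/sql here and run the real assertions
    }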
Not the person you're asking, but N connections to a sqlite :memory: backend result in N separate database instances.
This can be worked around in some scenarios, but not always or reliably. Most importantly, you can't reach the same in-memory database from multiple processes.
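Within a single process and Go's database/sql pool, the usual workaround is a shared-cache in-memory DSN; a sketch, assuming the mattn/go-sqlite3 driver:

    import (
    	"database/sql"

    	_ "github.com/mattn/go-sqlite3"
    )

    func openSharedMemoryDB() (*sql.DB, error) {
    	// Without cache=shared, each pooled connection gets its own empty
    	// :memory: database. The DB still vanishes once the last connection
    	// closes, and it remains invisible to other processes.
    	return sql.Open("sqlite3", "file::memory:?cache=shared")
    }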
I have used libeatmydata to help with this, but you just made me realize that in my current project I could just dump the test db to RAM. Thanks for the tip!
For integration testing I'm sure people do all sorts of "interesting" things. For unit tests in an automated CI build, it is often more desirable to have a smaller subset of data, which is presumably faster, since tests there also focus on more isolated workflows.
There is a better solution: you run your whole test in a database transaction, and then roll back when the test is completed. This means you only have to set up the database once and the overhead per test is minimal.
Fwiw I personally find this approach very frustrating, b/c when a test fails, all of the test data is gone, so I can't open psql/mysql CLI and look at "what did the data actually look like".
Granted, there are pros/cons (you also can't test code that issues txn begins/commits, although that is rare), but personally that particular con outweighs the pros imo/for me.
Yeah, I am not sure that I trust the average database with transactional schema modification or nested transactions. You have to be very careful about your database's settings and choose the correct isolation level. (This is more important for your actual application than whatever your tests do, but the tests are where you're doing weirder things whose implementation details are likely to raise your eyebrow.)
Personally, what I do is create a new database at the beginning of the tests, and have the tests use that. There are some downsides; if you use a random database name, you can run as many tests as you like in parallel, but some percentage of test runs will never run the cleanup code and you will have stale test databases around that need to be cleaned up. (You can simply not retain files written by your database after the CI run, of course, but then you can't debug the failing tests as easily.) If you use a fixed database name, then you will only ever have one copy of the data and won't need to clean anything up. But you can't run tests in parallel -- "go test ./..." or whatever does run tests in parallel by default, so you will see conflicts between tests this way. (In the past, I've done things the second way, but upon writing this comment, I would probably do things the first way in the future.)
Overall, I strongly agree with the advice to run your tests against an as-real-as-possible database. You detect a lot of issues this way -- queries that don't parse, database settings that affect the results (things as dumb as time zones, for example), and you get actual error codes. For example, I used to run all of my queries in a block that retried them 3 times on retryable errors, and gradually built up a list of non-retriable error codes; "syntax error", "column doesn't have a default", etc. This list is easy to build while directly developing against a database with the same settings as production, but will never work when running against mocks or sqlite. (In theory, your database vendor should provide some client library that does things like this... but nobody does.)
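For concreteness, the classification ended up as something like this (the lib/pq usage and the specific SQLSTATEs here are illustrative, not my exact list):

    import (
    	"errors"

    	"github.com/lib/pq"
    )

    // Retry by default, but bail out immediately on errors that can never
    // succeed no matter how many times the query is replayed.
    var nonRetriable = map[pq.ErrorCode]bool{
    	"42601": true, // syntax_error
    	"23502": true, // not_null_violation (column doesn't have a default)
    }

    func shouldRetry(err error) bool {
    	var pqErr *pq.Error
    	if errors.As(err, &pqErr) {
    		return !nonRetriable[pqErr.Code]
    	}
    	return false
    }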
The test running in a transaction would mean my external tools like psql/etc would not be able to see the data b/c it's uncommitted, even if I pause/catch the test before it finishes.
I did this for a while, but it masks subtle issues related to transaction boundaries. If your database usage is very straightforward, it may be adequate. But if you are, to name an example on the other end of the spectrum, in a monolith application that has pulled all the stops on fancy stuff like JPA callbacks, this should not be your approach. Also, don't rely on a local database, but rather on something that is reliably reproducible in your build, which will often make the discussed approach less attractive anyway. And if your database usage is really that straightforward, an in-memory database like H2 could serve one well.
But then you need to have an actual database running. I find the ability to mock the DB entirely very useful to reduce the amount of scaffolding an average developer needs to have in order to run the test suite.
Compared to just spinning up a DB in a container, I find the amount of setup/scaffolding required to mock the DB to be far greater.
For years I took the mocking route, but it means there is a whole class of bugs that your tests won't catch, and the scaffolding was often fragile. Few years back I switched to just running a real DB in a container - very happy with this way.
Not sure about how viable it is in Go, but our Python tests automatically `initdb` and start Postgres as part of the test suite setup, and burn it all down during teardown. Each test inside the suite gets an isolated database inside the cluster.
Nothing to manage for the regular application/test developer, no persistent background services, nothing exposed outside of the local machine, no dependency on Docker. You just need to have Postgres on your $PATH, and even that is handled for you by Nix.
We use the same approach to also run the same tests against MSSQL, and it is mostly transparent (except for all the ugly hacks involved to run an isolated MSSQL instance outside of a Docker container).
Yes, there is some scaffolding involved, but it contains zero model logic, and ~never has to be touched again once it works.
Sure, done by a different persona; and that's crucial:
Let's imagine (I think) a common scenario: a simple Go project, composed of a number of modules owned by different team members or small groups, and shared packages etc.
Now, I need to add some stateful component and I reason whether to use, say, MySQL or something else.
If I choose MySQL and I need a real MySQL to run tests against it, now I need to prepare some docker-compose or equivalent scaffolding and shove it down every other team member's throat.
I want them to be able to run all tests without having to stop and think what tests their change (possibly in a shared package) might affect.
Sure, CI will catch things. But deferring all tests failure detection to the CI stage adds latency, and often troubleshooting issues that happen on CI is hard if it's hard to rerun the same thing locally.
I witnessed a "pressure" towards preferring pure-go solutions so that the team doesn't have to switch to a more "complex" build/test harness.
Granted, this is only a problem if you managed so far to do all you needed to do with the pure Go build system (which I have to say, I like very much and I do need to have a pretty good reason before I abandon/"upgrade" to something else).
I think the scaffolding is worth it. You ask people on your team to run "docker run --restart always --name mysql -e MYSQL_ROOT_PASSWORD=foobar -p 3306:3306 -d mysql:5.whatever", and then you never think about it again. (Also be sure to firewall that off so only localhost can get to your database with a predictable root password -- lots of bots out there that will "hack" this thing in 24 hours. Try it and see!)
Compared to maintaining mocks that don't actually detect issues like invalid queries, or missing defaults, or mapping semantics between language types and database types, this is a lot simpler. You just need one thing running, and you only need to set it up once. Every feature that is available to production code is now available to your tests.
You can also have your tests launch a database container for you. I found this slow, and decided the setup overhead of one instruction in the README was worthwhile instead.
Finally, you might not need the database for every test. If you have Service 1 that depends on the database and Service 2 that only depends on Service 1, writing fakes for Service 1 for the Service 2 tests to use (avoiding the database) is productive. Service 2 will want to test the error cases for Service 1 anyway, so you will need provisions for things like "make the next call to Service 1 hang indefinitely" regardless. Some people like all integration tests to go all the way to the bottom of the stack, but I prefer testing only the boundary in depth and making simpler end-to-end tests. YMMV.
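As a sketch (names invented for illustration), the fake for Service 1 ends up looking something like this, with hooks to inject the failure modes Service 2 needs to exercise:

    import "context"

    // Service1 is the boundary Service 2 depends on.
    type Service1 interface {
    	Lookup(ctx context.Context, key string) (string, error)
    }

    // fakeService1 lets a test inject canned data, an error, or a hang.
    type fakeService1 struct {
    	responses map[string]string
    	nextErr   error // returned (once) by the next Lookup call
    	hangNext  bool  // make the next Lookup block until the context is cancelled
    }

    func (f *fakeService1) Lookup(ctx context.Context, key string) (string, error) {
    	if f.hangNext {
    		f.hangNext = false
    		<-ctx.Done()
    		return "", ctx.Err()
    	}
    	if f.nextErr != nil {
    		err := f.nextErr
    		f.nextErr = nil
    		return "", err
    	}
    	return f.responses[key], nil
    }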
Then you change from MySQL to Postgres and you have to tell everybody to start another container. Then you have 5 projects; are they all using the same password?
Here we're talking about record/replay mocks, which are based on a real database. So, of course somebody will have to run that container with MySQL so that the tests can be run in record mode (e.g. when developing the tests). But then, if your teammates don't develop those tests, they don't technically need a real database. You still want them to run your tests because those tests might depend on some code your teammates touch (e.g. some common library code).
As long as those tests can be done within a single block of code, that works fine. A lot of ORMs and connection pools do not play nicely with complex transactions. It is surprisingly difficult to get a single transaction to work correctly if the transaction happens across multiple API calls chained together, as would happen in an integration test. It would be nice if SQL databases and clients all had support for named transactions that can be created, accessed, updated, and finished all on different connections or sessions.
Besides what others have said about missing subtleties with nested transactions, CRDB adds an extra level of complexity. Transactions for CRDB must sometimes be restarted client side on a certain error type to resolve concurrency issues. Therefore transactions are handled inside the application layer as retry loops. So you often can't use generic transaction libs without at least adding some special handling for this, which is rarely worth it.
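For reference, the retry loop usually ends up funneled through something like the cockroach-go helper; a sketch (import path and signature from memory), where the closure must be safe to re-run from the top:

    import (
    	"context"
    	"database/sql"

    	"github.com/cockroachdb/cockroach-go/v2/crdb"
    )

    func transferFunds(ctx context.Context, db *sql.DB, from, to string, amount int) error {
    	// ExecuteTx re-runs the closure when CRDB returns a retryable
    	// error (SQLSTATE 40001), so it must be idempotent.
    	return crdb.ExecuteTx(ctx, db, nil, func(tx *sql.Tx) error {
    		if _, err := tx.ExecContext(ctx,
    			`UPDATE accounts SET balance = balance - $1 WHERE id = $2`, amount, from); err != nil {
    			return err
    		}
    		_, err := tx.ExecContext(ctx,
    			`UPDATE accounts SET balance = balance + $1 WHERE id = $2`, amount, to)
    		return err
    	})
    }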
This is cool, and I applaud the author’s effort. For certain projects this is likely the right way to go.
However, here is another way to solve this which is safe to use when testing packages in parallel and lets you work against your exact database (driver/protocol/version etc.):
Each package under test copies the database schema to a new database with the name “{package}_{uuid}”.
TestMain in each package is responsible for:
- cleanup of old databases with the same prefix
- creation of a new test database to work against
Tests within a package are run sequentially, and are responsible for wiping the package’s test DB at the start of each test.
The pattern of “defer cleanup” is avoided in favor of deleting at the start to avoid the edge cases of crashing in the middle of a test run.
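A rough sketch of that TestMain (the DSN, driver, and "mypackage" prefix are assumptions for illustration):

    package mypackage_test

    import (
    	"database/sql"
    	"fmt"
    	"log"
    	"os"
    	"strings"
    	"testing"

    	"github.com/google/uuid"
    	_ "github.com/lib/pq"
    )

    func TestMain(m *testing.M) {
    	admin, err := sql.Open("postgres", "postgres://localhost:5432/postgres?sslmode=disable")
    	if err != nil {
    		log.Fatal(err)
    	}

    	// Delete stale databases left over from previous (possibly crashed)
    	// runs first, instead of relying on "defer cleanup" at the end.
    	rows, err := admin.Query(`SELECT datname FROM pg_database WHERE datname LIKE 'mypackage_%'`)
    	if err != nil {
    		log.Fatal(err)
    	}
    	var stale []string
    	for rows.Next() {
    		var name string
    		if err := rows.Scan(&name); err != nil {
    			log.Fatal(err)
    		}
    		stale = append(stale, name)
    	}
    	rows.Close()
    	for _, name := range stale {
    		_, _ = admin.Exec(fmt.Sprintf(`DROP DATABASE IF EXISTS %q`, name))
    	}

    	// Fresh database for this package's test run; copy the schema into it
    	// here. Individual tests wipe the tables at the start of each test.
    	dbName := "mypackage_" + strings.ReplaceAll(uuid.New().String(), "-", "_")
    	if _, err := admin.Exec(fmt.Sprintf(`CREATE DATABASE %q`, dbName)); err != nil {
    		log.Fatal(err)
    	}

    	os.Exit(m.Run())
    }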
Interesting, this is essentially a record/replay framework at the DB driver level.
Usually I think record/replay testing is an anti-pattern b/c it's typically used against external systems (e.g. REST/RPC API calls) where the current state of the external system is not captured in the recording itself.
(I.e. someone clicked around in the vendor's UI to make it "look like this", then ran record. However 6 months later when you need to re-record your test, the state in vendor system no longer "looks like this" so it's a nightmare to know if the test is failing due to a real regression or just changed data.)
That said, copyist records the test data being inserted/set up into the db as well... so... huh, it might actually be a good idea.
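Going from memory of the copyist README (so treat the exact calls as assumptions), usage is roughly: register a wrapped driver, open a copyist session per test, and run with a -record flag when you want to refresh the recording against a real database:

    import (
    	"database/sql"
    	"testing"

    	"github.com/cockroachdb/copyist"
    )

    func init() {
    	// Wrap the real postgres driver; tests open "copyist_postgres" instead.
    	copyist.Register("postgres")
    }

    func TestQueryCustomers(t *testing.T) {
    	// Replays the recorded driver calls, or records them when the test
    	// is run with the -record flag against a real database.
    	defer copyist.Open(t).Close()

    	db, err := sql.Open("copyist_postgres", "postgresql://root@localhost:26257?sslmode=disable")
    	if err != nil {
    		t.Fatal(err)
    	}
    	defer db.Close()
    	// ...inserts made here during recording are captured in the replay too...
    }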
Nice work! I wrote my thesis about this approach a couple of years back. Instead of generating test code, though, I saved queries in a serialized graph which was extended/changed as necessary.