copyist

Mocking your SQL database in Go tests has never been easier. The copyist library automatically records low-level SQL calls made during your tests. It then generates recording files that can be used to play back those calls without connecting to the real SQL database. Run your tests again. This time, they'll run much faster, because now they do not require a database connection.

Best of all, your tests will run as if your test database was reset to a clean, well-known state between every test case. Gone are the frustrating problems where a test runs fine in isolation, but fails when run in concert with other tests that modify the database. In fact, during playback you can run different test packages in parallel, since they will not conflict with one another at the database level.

copyist imposes no overhead on production code, and it requires almost no changes to your application or testing code, as long as that code directly or indirectly uses Go's sql package (e.g. Go ORMs and the widely used sqlx package). This is because copyist runs at the driver level of Go's sql package.

What problems does copyist solve?

Imagine you have some application code that opens a connection to a Postgres database and queries some customer data:

func QueryName(db *sql.DB) string {
	rows, _ := db.Query("SELECT name FROM customers WHERE id=$1", 100)
	defer rows.Close()

	for rows.Next() {
		var name string
		rows.Scan(&name)
		return name
	}
	return ""
}

The customary way to test this code would be to create a test database and populate it with test customer data. However, what if application code modifies rows in the database, like removing customers? If the above code runs on a modified database, it may not return the expected customer. Therefore, it's important to reset the state of the database between test cases so that tests behave predictably. But connecting to a database is slow. Running queries is slow. And resetting the state of an entire database between every test is really slow.

Various mocking libraries are another alternative to using a test database. These libraries intercept calls at some layer of the application or data access stack, and return canned responses without needing to touch the database. The problem with many of these libraries is that they require the developer to manually construct the canned responses, which is time-consuming and fragile when application changes occur.

How does copyist solve these problems?

copyist includes a Go sql package driver that records the low-level SQL calls made by application and test code. When a Go test using copyist is invoked with the "-record" command-line flag, the copyist driver records all SQL calls. When the test completes, copyist generates a custom text file that contains the recorded SQL calls. The Go test can then be run again without the "-record" flag. This time the copyist driver plays back the recorded calls, without needing to access the database. The Go test is none the wiser, and runs as if it were using the database.
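
The record-then-playback idea can be illustrated with a toy model. This is only a sketch of the general technique, not copyist's actual implementation: the call and session types, the backend function, and the panic message are all made up for illustration, and the model tracks only the query text.

```go
package main

import "fmt"

// call is one recorded interaction: the SQL text that was sent and the rows
// that came back. Hypothetical type, for illustration only.
type call struct {
	query string
	rows  []string
}

// session either records calls made against a backend or plays back a
// previously recorded sequence without touching the backend.
type session struct {
	recording bool
	backend   func(query string) []string // stand-in for a real database
	calls     []call                      // stand-in for the recording file
	next      int                         // playback position
}

// Query forwards to the backend and records the answer in recording mode.
// In playback mode it returns the next recorded answer, and panics if the
// query does not match what was recorded at that position.
func (s *session) Query(query string) []string {
	if s.recording {
		rows := s.backend(query)
		s.calls = append(s.calls, call{query: query, rows: rows})
		return rows
	}
	if s.next >= len(s.calls) || s.calls[s.next].query != query {
		panic(fmt.Sprintf("unexpected call %q; regenerate recording", query))
	}
	rows := s.calls[s.next].rows
	s.next++
	return rows
}
```

A session created with recording set to true fills calls; a second session that is given those calls and no backend returns the same rows, which is why playback needs no database connection.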

How do I use copyist?

Below is the recommended test pattern for using copyist. The example shows how to unit test the QueryName function shown above.

func init() {
	copyist.Register("postgres")
}

func TestQueryName(t *testing.T) {
	defer copyist.Open(t).Close()

	db, _ := sql.Open("copyist_postgres", "postgresql://root@localhost")
	defer db.Close()

	name := QueryName(db)
	if name != "Andy" {
		t.Error("failed test")
	}
}

In your init or TestMain function (or any other place that gets called before any of the tests), call the copyist.Register function. This function registers a new driver with Go's sql package with the name copyist_<driverName>. In any tests you'd like to record, add a defer copyist.Open(t).Close() statement. This statement begins a new recording session, and then generates a playback file when Close is called at the end of the test.

copyist does need to know whether to run in "recording" mode or "playback" mode. To make copyist run in "recording" mode, invoke the test with the record flag:

go test -run TestQueryName -record

This will generate a new recording file in a testdata subdirectory, with the same name as the test file, but with a .copyist extension. For example, if the test file is called app_test.go, then copyist will generate a testdata/app_test.copyist file containing the recording for the TestQueryName test. Now try running the test again without the record flag:

go test -run TestQueryName

It should now run significantly faster. You can also define the COPYIST_RECORD environment variable (to any value) to make copyist run in recording mode:

COPYIST_RECORD=1 go test ./...

This is useful when running many test packages, some of which may not link to the copyist library, and therefore do not define the record flag.
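
The file naming convention described above can be written down as a small function. This is a sketch under assumptions: recordingPath is a hypothetical helper, not part of the copyist API, and it assumes slash-separated paths and that the testdata directory sits next to the test file.

```go
package main

import (
	"path"
	"strings"
)

// recordingPath returns where the recording for a test file would be
// expected under the naming convention described in the text.
// Hypothetical helper for illustration; not a copyist function.
func recordingPath(testFile string) string {
	dir, file := path.Split(testFile)
	base := strings.TrimSuffix(file, path.Ext(file))
	return path.Join(dir, "testdata", base+".copyist")
}
```

For the example in the text, recordingPath("app_test.go") yields testdata/app_test.copyist.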

How do I reset the database between tests?

You can call SetSessionInit to register a function that will clean your database:

func init() {
	copyist.Register("postgres")
	copyist.SetSessionInit(resetDB)
}

The resetDB function will be called by copyist each time you call copyist.Open in your tests, as long as copyist is running in "recording" mode. The session initialization function can do anything it likes, but usually it will run a SQL script against the database in order to reset it to a clean state, by dropping/creating tables, deleting data from tables, and/or inserting "fixture" data into tables that makes testing more convenient.

Troubleshooting

I'm seeing "unexpected call" panics telling me to "regenerate recording"

This just means that you need to re-run your tests with the "-record" command line flag, in order to generate new recordings. Most likely, you changed either your application or your test code so that they call the database differently, using a different sequence or content of calls.

However, there are rarer cases where you've regenerated recordings, have made no test or application changes, and yet are still seeing this error when you run your tests in different orders. This is caused by non-determinism in either your application or in the ORM you're using.

As an example of non-determinism, some ORMs send a setup query to the database when the first connection is opened in order to determine the database version. So whichever test happens to run first records an extra Query call. If you run a different test first, you'll see the "unexpected call" error, since other tests aren't expecting the extra call.

The solution to these problems is to eliminate the non-determinism. For example, in the case of an ORM sending a setup query, you might initialize it from your TestMain function:

func TestMain(m *testing.M) {
	flag.Parse()
	copyist.Register("postgres")
	copyist.SetSessionInit(resetDB)
	closer := copyist.OpenNamed("test.copyist", "OpenCopyist")
	pop.Connect("copyist-test")
	closer.Close()
	os.Exit(m.Run())
}

This triggers the first query in TestMain, which is always run before tests.
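
To see why the warm-up helps, consider a toy model of an ORM that issues a one-time setup query. This is only an illustration: fakeORM, its query method, and runTests are hypothetical and do not correspond to any real ORM or to copyist itself.

```go
package main

// fakeORM models an ORM that sends a one-time version query the first time
// it is used. Hypothetical type, for illustration only.
type fakeORM struct {
	versionChecked bool
}

// query returns the driver-level calls that a single application query
// produces, including the setup query if it has not been sent yet.
func (o *fakeORM) query(sql string) []string {
	var calls []string
	if !o.versionChecked {
		o.versionChecked = true
		calls = append(calls, "SELECT version()")
	}
	return append(calls, sql)
}

// runTests runs the named tests in the given order against a shared ORM and
// returns the calls that each test would record.
func runTests(orm *fakeORM, order []string) map[string][]string {
	recorded := make(map[string][]string)
	for _, name := range order {
		recorded[name] = orm.query("SELECT name FROM customers")
	}
	return recorded
}
```

With a fresh fakeORM, whichever test runs first records two calls and the others record one, so the recordings depend on test order. If the ORM is warmed up once before any test runs, every test records exactly one call regardless of order.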

The generated copyist recording files are too big

The size of the recording files is directly related to the number of accesses your tests make to the database, as well as the amount of data that they request. While copyist takes pains to generate efficient recording files that eliminate as much redundancy as possible, there's only so much it can do. Try to write tests that operate over smaller amounts of interesting data. For tests that require large numbers of database calls, or large amounts of data, use a different form of verification. One nice thing about copyist is that you can pick and choose which tests will use it. The right tool for the right job, and all that.

Limitations

  • Because of the way copyist works, it cannot be used with test and application code that accesses the database concurrently on multiple threads. This includes tests running with the "-parallel" testing flag, which enables tests in the same package to run in parallel. Multiple threads are problematic because the copyist driver code has no way to know which threads are associated with which tests. This limitation does not apply to running different test packages in parallel; in playback mode, this is both possible and highly encouraged! In recording mode, however, there may be problems if your tests conflict with one another at the database layer (i.e. by reading/modifying the same rows). The recommended pattern is to run test packages serially in recording mode, and then in parallel in playback mode.

  • copyist currently supports only the Postgres pq and pgx stdlib drivers. If you'd like to extend copyist to support other drivers, like MySQL or SQLite, you're invited to submit a pull request.

  • copyist does not implement every sql package driver interface and method. As a result, copyist may not fully work with drivers that rely on more advanced features. Contributions in this area are welcome.

copyist's People

Contributors

1lann, andy-kimball, chrisseto, dan-j, hectorj-thetreep, jeffswenson, jolg42, otan, petermattis, rohankmr414

copyist's Issues

Query arguments are not stored or verified

The arguments to a query are not stored when creating a recording, and not verified when playing back a recording (obviously because there is nothing to verify with). This means that it is possible to change the arguments to a query without failing existing tests, as long as the sequence of queries does not change.
As an example, consider this line in the simple query test:

rows, err := db.Query("SELECT name FROM customers WHERE id=$1", 1)

If we change the id argument to 2 or 50, the result should change to a different name or no rows in the result set, respectively. However, running the test with playback (I used pqtest but I think it should be the same for all) results in a pass with no warning that the tests need to be recorded again. If you update the test to look for the correct name or error, then the test fails and gives a message suggesting you might need to redo the recording.
I know this example is a bit silly because you are changing the test without changing the expected result or redoing the recording, but the same thing could happen if you make changes in your application logic which ends up affecting the arguments to a query.

Is there a reason to not include the query arguments in the recording? I don't think serialization should be an issue because the library already handles that for the results.

What is the inspiration for the design of copyist?

Just curious to hear more about what inspired the design of copyist, whether another library or CS concept. Never come across a mocking library explained in these terms so I'm interested to delve deeper.

Flag to disable the use of ConnQuery method

We found out that some drivers don't implement the QueryContext method, and this case is not covered in copyist right now.

What happens is that on record, the driver returns ErrSkip if the underlying driver does not support that method and falls back to the ConnPrepare method instead (as documented here). ConnPrepare is correctly recorded. The issue is that on replay the proxyConn always supports ConnQuery (and expects it in VerifyRecordWithStringArg(), which fails with unexpected call to ConnQuery), but it should really fall back to ConnPrepare as recorded.

Initially I was trying to fix the proxyConn.QueryContext method to check what's implemented on the driver, but it's impossible to check whether the driver supports the conn if we don't have an instance of it when replaying.

This behavior was observed for the sqlserver, azuresql, and trino drivers.

In this commit I added an additional flag to turn off the use of ConnQuery by proxyConn.QueryContext, which was introduced in #9. It would be nice if this got upstreamed.

Unable to run migrations with gorm and pgx v5

Using the following dependencies:

github.com/cockroachdb/copyist v1.6.0
gorm.io/driver/postgres v1.5.2
gorm.io/gorm v1.25.2
github.com/jackc/pgx/v5 v5.3.1 // indirect

Utilising gorm's migrator interface, i.e. db.Migrator().AutoMigrate(&MyStruct{}), results in an error like the following:

expected 0 arguments, got 1

It took a bit of digging, but what's happening is that a query like this is being made with 1 argument: SELECT * FROM "table" LIMIT 1, which is invalid since there aren't any ? delimiters in the query.

This works fine without copyist, and what seems to be happening is that pgx handles special arguments which implement github.com/jackc/pgx/v5.QueryExecMode, to determine how the query is executed and then pops it off the args list.

Within the database/sql package, DB.queryDC(...) makes a call to driverArgsConnLocked(...) which converts the args slice to a []driver.NamedValue. This function checks if the underlying connection implements driver.NamedValueChecker and if so, delegates handling of each argument. If the connection does not implement this interface, the defaultCheckNamedValue(...) function is used, which uses the reflect package to convert the QueryExecMode value to a raw int64.

Now pgx receives a []driver.NamedValue and doesn't recognise that the argument implements QueryExecMode, so it passes the arg along with the query, resulting in the error.

Proposed solution

Implement the driver.NamedValueChecker interface and delegate to the underlying connection if it also implements it; otherwise fall back to the same behaviour as the database/sql package.

Examples, please

The solution seems interesting and useful; however, as a golang newbie, I honestly have no idea how to use it. Any chance anyone has some examples? Readme unfortunately doesn't help.

And, to truly show off my noobiness, can I use this with mocked context.Context?

Error type in the replay mode is not the same as the error type in record mode

Hi,

I'm using copyist with Postgres. I have error-handling code similar to the example below. When running in recording mode it works as expected: errors.As can find an error that is assignable to *pq.Error.

The problem is when running in the replay mode there is no *pq.Error in the chain. I expected it to behave the same in the replay and record mode.

db, err := sqlx.Connect(driver, url)
if err != nil {
    var pqErr *pq.Error
    if errors.As(err, &pqErr) {
        return pqErr.Code == "28000"
    }
}

-record flag example

Hi, I was trying to record the output for a test and I got the following:

$ go test -run TestDictWordList -record
Foo {} no recording exists with this name: TestDictWordList/Foo

PASS

I also tried with a double dash between

$ go test -run TestDictWordList -- -record
Foo {} no recording exists with this name: TestDictWordList/Foo

PASS

The only way I could get it to work was

COPYIST_RECORD=1 go test -run TestDictWordList

FTR

$ go version
go version go1.17.5 darwin/arm64

go.mod has

github.com/cockroachdb/copyist v1.4.1
