Proposal for Test data fixtures in Grails
This proposal aims to aid unit tests that depend upon persistent domain objects. The idea is inspired by Rails' test fixtures.
Background
Unit testing is important. It matters even more in a dynamic language/framework like Groovy/Grails, where there is no explicit compilation phase to flag up silly errors and IDE refactoring support is weaker than it is for statically-typed languages.
Unit tests are only useful if the data they run on does not change between test runs. If your tests run on data that is not under the control of the testing mechanism itself and that data changes, it can break your tests, even without touching a line of code! Then what use are your tests? It's better to have a dedicated test database and test data.
Requirements
A "test fixtures" mechanism with the following characteristics:
- Each domain model class has an associated list of test data fixtures
- The test fixture data is human-readable "serialized" instances of the domain model objects (could use Groovy's literal Map (propertyName/value) or List (table columns), for example)
- Each unit test class (or method) declares which domain model class's fixtures it requires
- As part of each test method's lifecycle, the database is cleaned and the required test fixture data is loaded (could hook into setUp/tearDown)
- The test fixture data is placed within the Grails application directory structure
- The relationships of dependent domain objects can be expressed, i.e., the data in one domain model's test fixtures can depend upon another model's test fixtures
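The clean-then-load lifecycle in the list above could be sketched as follows. This is only an illustration: FixtureStore is a hypothetical in-memory stand-in for the test database, and the setUp closure stands in for whatever hook the test framework would provide.

```groovy
// Hypothetical in-memory stand-in for the test database: table name -> list of rows.
class FixtureStore {
    Map<String, List<Map>> tables = [:]

    // Remove every row so the next test starts from a known empty state.
    void clean() {
        tables.clear()
    }

    // Load the fixtures for one domain class; each fixture is a Map of property values.
    void load(String domainName, Map<String, Map> fixtures) {
        tables[domainName] = fixtures.values().collect { new LinkedHashMap(it) }
    }
}

def countryFixtures = [
    italy : [name: "Italy", code: "IT"],
    france: [name: "France", code: "FR"]
]

def store = new FixtureStore()

// What a setUp hook might do before each test method.
def setUp = {
    store.clean()
    store.load("Country", countryFixtures)
}

// A first "test" mutates the data.
setUp()
store.tables["Country"] << [name: "Spain", code: "ES"]
assert store.tables["Country"].size() == 3

// A second "test" still sees exactly the fixture data.
setUp()
assert store.tables["Country"].size() == 2
```

Because the rows are copied on load, a test that changes a row cannot alter the fixture definitions themselves.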
Possible additional nice to haves
- Ant target to run a single test class
- Ant target to copy the development schema to the test datasource
- Ant target to export test/development database data to fixtures file
An example
Let's say I am modelling places and I have Country, Region and City domain models.
In order to test instances of these classes I would have separate test fixtures files for each model class: one for Country, one for Region, one for City. My Country test fixtures would include "USA", "England", "Spain", etc. My Region test fixtures might include an instance for "New York" and "Connecticut" and my City test fixtures might include a "New York City", etc.
In my domain models, City belongs to a Region, and Region belongs to a Country. It is possible to determine the relationships of specific fixture instances from the test fixtures data alone, by foreign keys or some shared symbol: for example, that the "New York City" City fixture belongs to the "New York" Region fixture, and that this in turn belongs to the "USA" Country fixture.
Now I have some complex logic in each of these classes that I want to test, so I want to write unit tests for each.
My Country class only depends on itself (not Region or City), at least for the sake of this argument, so I only need to declare that my Country tests use the Country fixtures.
Now because the code in the City class I want to test does reference its parent Region and Country, in the tests for City I declare that I need the Country, Region and City fixtures.
Benefits
- Tests don't have to worry about loading or cleaning data - they just test. Likewise the test framework takes care of loading and cleaning test data. This is good separation of concerns.
- Each test runs on clean data, so tests are independent. Thus order is unimportant and it should be possible to run a single test class or even a single method on its own.
- Test fixtures files are version controlled along with the tests and the code they test
- Test fixture data is human readable and editable in a text editor along with the code and tests
Objections
Why not just maintain the data in a test schema?
- It's fragile. Team members need to constantly share and update data from each other, which can make merging data painful.
- Each test does not run on clean data, unless there's an explicit cleanup part of the test, which is poor separation of concerns and error-prone.
- It's not easily version controlled.
- It's not easily searchable and causes a mental "context switch" going from text editor to database client.
- Grails is doing a good job of abstracting the database - do we want to force people to resort to the database client after all?
Why not just run tests on the development database?
- It's fragile for the same reasons as above. Plus, the data almost certainly will change and break tests.
Isn't it going to be a pain maintaining both development and test data?
- Well there is an overhead for sure, but you only need as much test fixture data as you have functionality to test and if you've already seen the unit testing light, then you'll know it's worth it.
What about this or that Java/Groovy framework that already does this kind of thing?
- Great - maybe we can use it?
Solution discussion
This area is a whiteboard for discussing possible solutions
We need to agree on
- A term for the fixtures
- A file type and name convention
- A syntax
- A directory
- Test case data declaration
- Implementation details
Name
We think "dataset" may be a better name than "fixture"
File type and name convention
Probably groovy files named domain model name + "Dataset".
Syntax
Maybe something like
class CountryDataset {
    def dataset = {
        [ italy: [name: "Italy", code: "IT"],
          france: [name: "France", code: "FR"]
        ]
    }
}
class RegionDataset {
    def dataset = {
        // italy and france refer to entries in CountryDataset
        [ tuscany: [name: "Tuscany", country: italy],
          provence: [name: "Provence", country: france]
        ]
    }
}
class CityDataset {
    def dataset = {
        // tuscany and provence refer to entries in RegionDataset
        [ florence: [name: "Florence", region: tuscany],
          marseilles: [name: "Marseilles", region: provence]
        ]
    }
}
Data can be auto-generated:
class UserDataset {
    def dataset = {
        def users = []
        for (i in 0..100) {
            users << [name: "user_${i}", password: "dontcare"]
        }
        return users
    }
}
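Note that the generated dataset above returns a List, whereas the hand-written datasets return a Map keyed by a symbolic name. If generated entries also need to be referenced by name, a keyed variant might look like the sketch below. The key format "user_N" is an assumption made for illustration.

```groovy
// Generated dataset keyed by symbolic name, matching the shape of the hand-written datasets.
def dataset = {
    (0..100).collectEntries { i ->
        // toString() so that keys and names are plain Strings rather than GStrings
        def name = "user_${i}".toString()
        [name, [name: name, password: "dontcare"]]
    }
}

def users = dataset()

// 0..100 is an inclusive range, so this produces 101 users.
assert users.size() == 101
assert users.user_7.name == "user_7"
```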
Questions
What about the version and id properties? Can they be explicitly stated in the datasets? Test cases will certainly need to obtain known instances by id.
How are dependencies resolved? Is it necessary to prefix references with domain model name like
[ florence: [name: "Florence", region: Region.tuscany],
or can the Region dataset "instances" be somehow exposed to the City dataset instances, in this example?
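One way the unprefixed form might work is to evaluate each dataset closure with a delegate that knows the fixtures loaded so far. The sketch below is a possibility, not a settled design: FixtureRegistry and loadDataset are hypothetical names, and plain Maps stand in for domain instances.

```groovy
// Hypothetical registry of fixtures already loaded from other datasets, keyed by symbolic name.
class FixtureRegistry {
    Map<String, Object> known = [:]

    // Called by Groovy when a dataset closure reads an unknown name such as `tuscany`.
    def propertyMissing(String name) {
        if (!known.containsKey(name)) {
            throw new MissingPropertyException(name, FixtureRegistry)
        }
        known[name]
    }
}

// Evaluate a dataset closure against the registry, then register what it produced.
Map loadDataset(FixtureRegistry registry, Closure dataset) {
    Closure copy = (Closure) dataset.clone()
    copy.delegate = registry
    copy.resolveStrategy = Closure.DELEGATE_FIRST
    Map fixtures = copy.call()
    registry.known.putAll(fixtures)
    fixtures
}

def registry = new FixtureRegistry()

def countries = loadDataset(registry) {
    [ italy: [name: "Italy", code: "IT"],
      france: [name: "France", code: "FR"] ]
}
def regions = loadDataset(registry) {
    [ tuscany: [name: "Tuscany", country: italy],
      provence: [name: "Provence", country: france] ]
}

// The Region fixture holds a reference to the very same Country fixture.
assert regions.tuscany.country.is(countries.italy)
```

With this approach the datasets must be loaded in dependency order (Country before Region before City), and symbolic names must be unique across datasets, which is one argument for the prefixed form.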
Should we go with Map literals or are constructors better? In other words,
florence: [name: "Florence", region: tuscany]
is less verbose but pretty close to
florence: new City(name: "Florence", region: tuscany)
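The two forms are closer than they look, because Groovy's named-argument constructor call is itself Map-based. A loader could therefore accept Map literals and construct the instances itself. The sketch below uses plain Groovy classes as stand-ins for the domain classes; they are not GORM-managed.

```groovy
// Plain Groovy stand-ins for the domain classes.
class Region { String name }
class City { String name; Region region }

def tuscany = new Region(name: "Tuscany")

// Map-literal form, as it might appear in a dataset.
def florenceData = [name: "Florence", region: tuscany]

// The loader can turn the Map into an instance by passing it to the constructor.
def florence = new City(florenceData)

assert florence.name == "Florence"
assert florence.region.is(tuscany)
```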
A directory
We could reorganise the directory structure to house all test artifacts under grails-tests (like Rails):
grails-app -- for development/production env
grails-tests -- for test env
    unit
    webtest (i.e., functional)
    datasets
or is there a Maven2-style structure?
Test case data declaration
Test case classes can declare that they require some or all fixtures, eg:
class CountryTests extends GroovyTestCase {
    def requireDataset = Country
}
or
class CityTests extends GroovyTestCase {
    def requireDataset = [ Country, Region, City ]
    // or lazily as "def requireDataset = ALL"
}
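Since the declaration may be either a single class or a list, the test framework would need to normalise it before loading. A minimal sketch, assuming a hypothetical helper named requiredDatasets and empty stand-in classes:

```groovy
// Empty stand-ins for the domain classes.
class Country {}
class Region {}
class City {}

// Stand-ins for the test cases; they do not extend GroovyTestCase to keep the sketch self-contained.
class CountryTests { def requireDataset = Country }
class CityTests { def requireDataset = [Country, Region, City] }

// Normalise the declaration to a list of classes, whether one class or a list was declared.
List<Class> requiredDatasets(Object testCase) {
    def declared = testCase.requireDataset
    declared instanceof Collection ? declared.toList() : [declared]
}

assert requiredDatasets(new CountryTests()) == [Country]
assert requiredDatasets(new CityTests()) == [Country, Region, City]
```

Keeping the declared order lets the author list parent datasets first, so Country is loaded before Region and Region before City.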
Implementation details
We discussed DbUnit, but feel that an implementation built on Hibernate is going to be more integrated and conceptually cohesive.
Comments (7)
Oct 02, 2006
graeme says:
Marc and I have kind of been debating the need for this. Is it overkill? Can you put forward the arguments as to why you think this is necessary and can't just be done using an environment-specific bootstrap class?
Otherwise, with regards to the syntax, using the map syntax is a little ugly. I would do it with Groovy's builder syntax instead. Also I think class names are unnecessary and scripts should be used instead:
dataset {
    tuscany(name:"Tuscany", country:italy)
    provence(name:"Provence", country:france)
}
dataset {
    florence(name:"Florence", region: tuscany)
    marseilles(name:"Marseilles", region: provence)
}
Oct 02, 2006
graeme says:
And for dynamic construction that would be:
dataset {
    def users = []
    for (i in 0..100) {
        user(name:"user_${i}", password: "dontcare")
    }
}
Oct 02, 2006
Marc Palmer says:
Just to clarify Graeme's comments about bootstrap, we could do this trivially using the same conventions we use elsewhere and no new DSL:
class TestBootstrap {
    def init = {
        new Country(name: "Italy").save()
    }
}
This would be run only when in the Test environment. This has the huge benefit that we use the coding style we already have, we use all the same concepts, and we use the same mechanism for references between data entities as you do when working with the domain classes everywhere else in the application.
Oct 02, 2006
graeme says:
The main downside that I can see of having a single TestBootstrap class is that it limits you to a single "dataset" in terms of the test data you're working with.
Whilst with the dataset approach you can say:
def dataSets = [First, Second, Third]
to load arbitrary data in. I'm not sure how severe a limitation this is.
Oct 02, 2006
Maurice Nicholson says:
I'm not really emotionally attached to the original syntax, and the builder style looks good.
However, there are serious limitations to doing it in an environment-specific bootstrap class.
Unit testing is all about testing "units" - that is small discrete units of code. It should be possible to run tests in any order, or in isolation, or just delete some and run the rest. If that is not possible, then you are no longer testing units and your unit tests can become a fragile mess.
Now consider how it would work if there was a single set of data created once at the start of ALL tests, for ALL tests to share. One of two things happens:
either
1) your tests do ZERO maintenance of the test data itself. So tests become highly coupled, because prior tests are manipulating data that subsequent tests will use. They have to run in order, which is not always that easy. You always have to run them all, rather than being able to run them one at a time. What happens when I add new tests, which breaks the existing order and has a side effect on a subsequent test ... hmmm, not very unit-y.
or
2) you maintain the test data within each test, maybe setting-up or resetting it at the start or end of tests. Ok this is a bit better, at least tests are isolated from each other, but the code has low cohesion, so again not great.
The obvious solution is to abstract this setup/cleanup code out into the test method lifecycle (using, say, the setUp/tearDown methods), which leaves the test methods themselves solely concentrating on testing and, importantly, running in a known state every time.
Furthermore, the granularity of declaring each domain class's dataset in a separate file means that tests only need to use the data that they require. In other words, some tests will only require one or two tables populated whereas others will require more.
I really don't think it's overkill, but I will admit that it's not a feature that comes bundled with many MVC frameworks. That said, though many developers still do not write tests (shame), comprehensive baked-in support for testing could be the deciding factor for some organisations.
Of course it could always be a plugin if you don't think it belongs in core.
Oct 02, 2006
Marc Palmer says:
Why can't we use TestBootstrap, and the bootstrap is run for every test against a blank DB?
OK there might be a bit of a performance hit but it's a simple solution. If you are sanely testing against a proven in-memory DB implementation, coupled with Hibernate's caching, it shouldn't be much of an issue.
I suppose there is the Hibernate init overhead, but couldn't we have some smarts in there to keep the same Hibernate config going across all tests, but have it drop all data and re-run the TestBootstrap for each test?
Oct 02, 2006
Maurice Nicholson says:
Well, that is kind of what I'm talking about, although not dedicated to testing. It could work, but what about when there is other stuff in the bootstrapper class? All I am interested in is the persistent domain models.
Actually I would not be surprised if people begin to ask for this functionality outside of Grails, like they are with GORM, because it's a common requirement and Groovy seems to be a popular tool for unit testing traditional Java applications. But I'm not saying that should be a requirement or anything.
People should be able to choose the database using TestDataSource.groovy (or their chosen env). Personally I prefer a dedicated test database of the same implementation as my dev and production, because I know that way I won't be surprised by case-(in)sensitivity issues when it's time to go live, for example.