Everybody knows why duplication is bad. Usually, anyway. In tests
I’m often comfortable with duplication if it makes the tests more
readable. There has to be a line somewhere though, right? Certain
kinds of duplication are good while some are bad. It’s not always
easy to tell the difference between the good and bad kinds.
Identifying duplication as good or bad is interesting, but it’s not
what I want to focus on right now. More interesting to me is how bad
duplication is a sign that your code could use some design
improvements. Consider the following code:
Even in such a trivial example, we can improve our code’s design with
a little refactoring:
Okay, you’re right. Not all that interesting.
Duplication in production code is an obvious sign that the design can
be improved. It’s a smell in its own right, and the single best thing
you can do at that point is to get rid of it.
What is interesting to me is that duplication in test code can be a
sign that your design can be improved. Let’s take a look at an
example that came up at work the other day (names changed to protect
the NDA-bound). It’s an action that shows a list of published posts to
unauthenticated users, and allows authenticated users to view either
all posts or only certain filtered posts.
Not too bad. What do the specs look like? In order to fully test
this action, we need to write six examples:
- one when not logged in
- one when logged in, no filter specified
- one when published filter is specified
- one when pending filter is specified
- one when rejected filter is specified
- one when an invalid filter is specified
I haven’t made up my mind if that spec is dreadful or not. The whole
reason I’m writing this blog is that there was something about it I
didn’t like, but it didn’t jump out at me.
(Please note that the problem I’m illustrating here has nothing to do
with my use of mocks. You’d have to write all these tests when
state-based testing as well. The only difference is that your way
requires more setup and runs slower :P)
Like I said, the duplication wasn’t obvious to me. I have a
feeling that it won’t be obvious to you either, so let me take a
moment to exaggerate the hell out of the code:
That case statement is essentially what’s going on in our test code.
When we pass a certain param, a particular method gets called.
Repeat for every possible case. Ugh.
My solution was to move the filter selection to the model. It’s
actually a very simple change all around:
This reduces the number of required examples from 6 down to 2:
Of course, we still need to write specs for Post.filtered_repo, so we
haven’t reduced the total number of specs. So what’s the point?
I wish I could give you some hardcore, compelling reasons as to why
this is better. The best I can do this very moment is point to the
resulting code and hope that you agree it’s a significant
improvement. Still, here are a couple points:
Skinnier controller, fatter model
The controller already has a bunch of work to do. It has to populate
instance variables for the view, render the proper template, and
return an HTTP status code. It’s almost always a good idea to assign
a task to another object if possible.
Higher signal-noise ratio in tests
All of the controller’s responsibilities need to be tested, as well.
In fact, the full spec would look like:
Including examples for each filter only increases the spec’s length
and muddles its intent. The short, sweet spec above clearly expresses
its intent. Adding four more examples that look nearly identical
would make my eyes glaze over.
Focused tests on the model
This is a combination of the first two points. I’m sure you can
imagine the examples you might write for Post.filtered_repo. Because
Post.filtered_repo has one responsibility, the spec for it would be
small, cohesive and easy to understand.
Conclusion
I hope that gives you a bit of insight into the kinds of design
improvements you can make when you listen to your test code. What
other examples do you have?