Unit Tests for Python CouchDB Views
I recently wrote about how to write CouchDB views in Python, because I couldn’t find any documentation online explaining a good way to do it. Today I’d like to tackle a similarly neglected topic: writing unit tests for your Python CouchDB views.
Unit Tests
If you read my previous post, then you’ve already got CouchDB using real Python code for your views (and you’re not using Python code stuffed inside string literals). CouchDB doesn’t let you call any outside functions from your views, so your views necessarily won’t have any dependencies to worry about. This makes it easy to get your views into a test harness.
The only difficult thing about testing your views is that CouchDB’s view API is highly contractual: it calls your map and reduce functions multiple times in a specific order. Ideally, we should emulate this behavior inside our tests so that they are helpful, easy to read, and thorough. This emulation can be implemented as a superclass for unit tests:
from collections import defaultdict
import unittest


class MapReduceTest(unittest.TestCase):
    """Base class that emulates how CouchDB calls a view's map and reduce."""

    def simulate_map(self, class_, documents):
        # Collect every result emitted by map() for every document.
        map_results = list()
        for document in documents:
            for map_result in class_.map(document):
                map_results.append(map_result)
        return map_results

    def simulate_reduce(self, class_, map_results, group=True):
        # Sort a copy, so the caller's list is left untouched.
        map_results = sorted(map_results)
        map_dict = defaultdict(list)
        reduce_results = dict()

        if group:
            # Group the map results by key:
            for map_result in map_results:
                key = map_result[0]
                value = map_result[1]
                map_dict[key].append(value)

            # Now call reduce for each key (items() works on Python 2 and 3):
            for key, values in map_dict.items():
                reduce_results[key] = class_.reduce(keys=None, values=values, rereduce=False)
        else:
            # Call reduce once for all values:
            values = [map_result[1] for map_result in map_results]
            reduce_results[None] = class_.reduce(keys=None, values=values, rereduce=False)

        return reduce_results
The emulation of map is pretty easy. Instead of calling map a single time and taking a single result, we loop over the results of each call, because each call to map can actually emit multiple results. Moreover, we loop over a list of documents as well, so that we can take the combined output from map and feed it into reduce.
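To illustrate why the inner loop matters, here is a minimal sketch of a map function that emits several rows for one document. The CountTags view and the flatten_map helper are hypothetical names made up for this example, and the sketch assumes that map is a static generator method, which is what calling class_.map(document) on a class implies.

```python
class CountTags(object):
    """Hypothetical view: emits one (tag, 1) row per tag in a document."""

    @staticmethod
    def map(doc):
        # A single document can produce zero, one, or many rows.
        for tag in doc.get("tags", []):
            yield (tag, 1)


def flatten_map(view, documents):
    # Same idea as simulate_map: collect every row from every document.
    rows = []
    for doc in documents:
        rows.extend(view.map(doc))
    return rows


docs = [{"tags": ["a", "b"]}, {"tags": []}, {"tags": ["a"]}]
rows = flatten_map(CountTags, docs)
```

Here the first document contributes two rows, the second contributes none, and the third contributes one, so rows ends up with three entries in emit order.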
The reduce function is a little more interesting. CouchDB actually has a pretty complicated contract for reduce:
- It can give you input records in one big group or many small groups.
- It might group inputs by key or it might not. (It can also group by various parts of a compound key.)
- It may pass in partially reduced results along with the map results. This operation is known as rereduce.
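To make the rereduce case concrete, here is a minimal sketch of how it could be emulated: reduce the map values in small chunks, then reduce the partial results again with rereduce=True. The CountRows view, the simulate_rereduce helper, and the chunk_size parameter are all assumptions made up for this illustration. Real CouchDB decides the chunking itself, so this only approximates the contract.

```python
class CountRows(object):
    """Hypothetical view whose reduce must handle rereduce differently."""

    @staticmethod
    def reduce(keys, values, rereduce):
        if rereduce:
            # values are partial counts from earlier reduce calls.
            return sum(values)
        # values are raw map values, so count them.
        return len(values)


def simulate_rereduce(view, values, chunk_size=2):
    # First pass: reduce each chunk of map values separately.
    partials = [
        view.reduce(keys=None, values=values[i:i + chunk_size], rereduce=False)
        for i in range(0, len(values), chunk_size)
    ]
    # Second pass: combine the partial results in one rereduce call.
    return view.reduce(keys=None, values=partials, rereduce=True)


total = simulate_rereduce(CountRows, ["x", "y", "z", "w", "v"], chunk_size=2)
```

A reduce that ignored the rereduce flag and always returned len(values) would report 3 here (the number of chunks) instead of 5, which is exactly the kind of bug this style of test is meant to catch.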
In my implementation, I have picked the low-hanging fruit. My simulated reduce passes in all of the input records for a group in a single call, never exercises rereduce, and has limited options for grouping. While I’d like to implement the map/reduce contracts more fully, it should be possible to start writing tests against this abstraction now and then improve the abstraction later on, without unnecessarily breaking the tests that I wrote in the interim.
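As one possible direction for richer grouping, the sketch below groups map results by the first few elements of a compound key, roughly in the spirit of CouchDB’s group_level option. The group_by_level function is a hypothetical helper, not part of the superclass above, and it assumes compound keys are represented as tuples or lists.

```python
from collections import defaultdict


def group_by_level(map_results, group_level):
    """Group values by the first group_level elements of each compound key."""
    grouped = defaultdict(list)
    for key, value in map_results:
        if isinstance(key, (list, tuple)):
            # Truncate compound keys; tuples are hashable, lists are not.
            key = tuple(key[:group_level])
        # Simple (non-compound) keys are grouped by their whole value.
        grouped[key].append(value)
    return dict(grouped)


rows = [((2024, 1), 5), ((2024, 2), 7), ((2025, 1), 1)]
by_year = group_by_level(rows, 1)
```

Each list of values in by_year could then be handed to reduce, one call per truncated key, just as simulate_reduce does for whole keys.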
Example
Here’s an example test:
from couchview import MapReduceTest
from couchview.stats import CountTypes


class TestStats(MapReduceTest):
    def test_count_types(self):
        documents = [
            {"doc_type": "foo"},
            {"doc_type": "foo"},
            {"doc_type": "foo"},
            {"doc_type": "bar"},
            {"doc_type": "bar"},
        ]

        expected_map_results = [
            ("foo", 1),
            ("foo", 1),
            ("foo", 1),
            ("bar", 1),
            ("bar", 1),
        ]

        actual_map_results = self.simulate_map(CountTypes, documents)
        self.assertListEqual(expected_map_results, actual_map_results)

        expected_reduce_results = {
            "foo": 3,
            "bar": 2,
        }

        actual_reduce_results = self.simulate_reduce(CountTypes, actual_map_results)
        self.assertEqual(expected_reduce_results, actual_reduce_results)
The test is a bit verbose because of the data structures used for inputs and expected outputs, but it is also easy to read, in my opinion. I’ve written a half dozen view tests so far, and I find that they are fairly easy to write and genuinely helpful for catching and fixing errors.
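For readers who want to run the test, here is a guess at what a view like CountTypes might look like. This is a hypothetical reconstruction that is merely consistent with the expected results in the test, not the actual contents of couchview.stats. It assumes map and reduce are static methods and that reduce takes the keys, values, and rereduce arguments that the superclass passes in.

```python
class CountTypes(object):
    """Hypothetical view: counts documents by their doc_type field."""

    @staticmethod
    def map(doc):
        # Emit one (doc_type, 1) row for each document that has a type.
        if "doc_type" in doc:
            yield (doc["doc_type"], 1)

    @staticmethod
    def reduce(keys, values, rereduce):
        # Summing works for raw map values (all 1s) and for partial sums,
        # so this reduce does not need to inspect the rereduce flag.
        return sum(values)


rows = list(CountTypes.map({"doc_type": "foo"}))
count = CountTypes.reduce(keys=None, values=[1, 1, 1], rereduce=False)
```

Because the view is plain Python with no CouchDB dependency, it can be called directly like this, which is what makes the test harness so simple.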
There is still a lot of work that could be done here, but even this basic implementation has been useful to me in my work. If you have spent any time writing Python unit tests for views, please leave a note in the comments!