Best Practices for Python Unit Testing
First of all, let's distinguish unit testing from integration testing:
- A unit test verifies a unit of code in isolation.
- An integration test verifies several units of code in conjunction.
Crucially, unit tests should not make network requests, modify database tables, or alter files on disk unless there's an obvious reason for doing so. Unit tests use mocked or stubbed dependencies, optionally verifying that they are called correctly, and make assertions on the results of a single function or module. Integration tests, on the other hand, use concrete dependencies and make assertions on the results of a larger system.
You might disagree on the finer points, but this distinction works well for me.
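To make the distinction concrete, here is a minimal sketch of a unit test in that sense. The function `fetch_greeting` and its `client` dependency are hypothetical names invented for this illustration; the only real API used is `unittest.mock.Mock`.

```python
from unittest.mock import Mock


def fetch_greeting(client, name: str) -> str:
    """Hypothetical unit under test: asks a client for a greeting and trims it."""
    response = client.get_greeting(name)
    return response.strip()


def test_fetch_greeting():
    # The client is a stub, so no network request is made.
    client = Mock()
    client.get_greeting.return_value = '  hello, Ada  '

    result = fetch_greeting(client, 'Ada')

    # Assert on the result of the single function under test...
    assert result == 'hello, Ada'
    # ...and optionally verify that the dependency was called correctly.
    client.get_greeting.assert_called_once_with('Ada')
```

An integration test of the same function would pass in a real client and assert on what the larger system returns.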
I won't go over unit testing basics, like how to run pytest or how to use the @patch decorator, assuming you're familiar already.
Avoid module-level globals
Imagine we have a class defined in validator.py:
import os

import fastjsonschema
import requests

schema_url = os.environ['SCHEMA_URL']

class Validator:
    def __init__(self):
        schema = requests.get(schema_url).json()
        self.validator = fastjsonschema.compile(schema)

    def validate(self, event: dict) -> dict:
        return self.validator(event)
And we have a module named my_lambda.py that uses the class:
from validator import Validator

validator = Validator()

def handle_event(event: dict, context: object) -> dict:
    validator.validate(event)
    return {'body': 'Goodbye!', 'status': 200}
And we want to test the module:
from unittest.mock import Mock, patch

from my_lambda import handle_event

@patch('my_lambda.Validator', autospec=True)
def test_handle_event(mock_validator: Mock):
    result = handle_event({'message': 'Hello!'}, None)
    assert result == {'body': 'Goodbye!', 'status': 200}
We're patching the Validator class and not the module-level validator variable because the former lets us prevent the network request when Validator() is called, whereas the latter would only let us act afterwards.
As it's written, this test will fail during collection when my_lambda.py is imported, before the patch is even applied:
test_my_lambda.py:3: in <module>
from my_lambda import handle_event
my_lambda.py:1: in <module>
from validator import Validator
validator.py:6: in <module>
schema_url = os.environ['SCHEMA_URL']
/usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/os.py:679: in __getitem__
raise KeyError(key) from None
E KeyError: 'SCHEMA_URL'
We could set SCHEMA_URL in our .env file, but that's an extra step that would be required for every developer to run the test. Instead, let's try setting it right before the import:
import os
from unittest.mock import Mock, patch
os.environ['SCHEMA_URL'] = 'foo'
from my_lambda import handle_event
This will also fail during collection:
test_my_lambda.py:7: in <module>
from my_lambda import handle_event
my_lambda.py:3: in <module>
validator = Validator()
validator.py:10: in __init__
schema = requests.get(schema_url).json()
...
E requests.exceptions.MissingSchema: Invalid URL 'foo': No scheme supplied. Perhaps you meant http://foo?
Even though we're patching the Validator class, it's still trying to make a network request because Validator() is called when my_lambda.py is imported, before the patch is even applied. If I sound like a broken record, it's because this can be confusing:
Module-level globals are assigned when the module is imported, which makes them difficult to patch.
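Here is a small, self-contained sketch of that behavior. The module demo_settings and its load_setting function are made up for the demonstration and written to a temporary directory; they stand in for validator.py and the work it does on import.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from unittest.mock import patch

# Hypothetical module whose global is computed at import time.
MODULE_SOURCE = """
def load_setting():
    return 'real'

setting = load_setting()
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    Path(tmp_dir, 'demo_settings.py').write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp_dir)
    try:
        # Importing runs the module body, so load_setting() is called here.
        demo_settings = importlib.import_module('demo_settings')
    finally:
        sys.path.remove(tmp_dir)

# Patching the function after the import is too late for the global:
with patch.object(demo_settings, 'load_setting', return_value='fake'):
    value_from_function = demo_settings.load_setting()  # sees the patch
    value_from_global = demo_settings.setting           # already assigned
```

Inside the patch, the function returns the fake value, but the global still holds whatever was computed during the import.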
The correct way to fix this problem is to eliminate the global from my_lambda.py:
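from validator import Validator

def handle_event(event: dict, context: object) -> dict:
    # The instance is created per call, after any patch has been applied.
    validator = Validator()
    validator.validate(event)
    return {'body': 'Goodbye!', 'status': 200}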
Now the test passes, but it's still kind of ugly because it has to set SCHEMA_URL before the import. Let's remove another global, this time from validator.py:
import os

import fastjsonschema
import requests

class Validator:
    def __init__(self):
        schema_url = os.environ['SCHEMA_URL']
        schema = requests.get(schema_url).json()
        self.validator = fastjsonschema.compile(schema)

    def validate(self, event: dict) -> dict:
        return self.validator(event)
This allows us to simplify the test, restoring it to its original form:
from unittest.mock import Mock, patch

from my_lambda import handle_event

@patch('my_lambda.Validator', autospec=True)
def test_handle_event(mock_validator: Mock):
    result = handle_event({'message': 'Hello!'}, None)
    assert result == {'body': 'Goodbye!', 'status': 200}
Globals aren't always bad
If you must use a global for some reason, set it to None initially:
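from typing import Optional

from validator import Validator

validator: Optional[Validator] = None

def handle_event(event: dict, context: object) -> dict:
    global validator
    # Create the instance lazily, on the first call rather than on import.
    if validator is None:
        validator = Validator()
    validator.validate(event)
    return {'body': 'Goodbye!', 'status': 200}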
The key is to prevent any expensive computation or network requests when the module is imported to facilitate patching.
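A variation on the same idea, which is my own substitution rather than anything the code above relies on, is to let functools.lru_cache hold the single instance. The Validator below is a stand-in that only counts constructions, and get_validator is a hypothetical helper name.

```python
from functools import lru_cache


class Validator:
    """Stand-in for the real class; counts how often it is constructed."""

    instances_created = 0

    def __init__(self):
        Validator.instances_created += 1

    def validate(self, event: dict) -> dict:
        return event


@lru_cache(maxsize=None)
def get_validator() -> Validator:
    # Nothing runs at import time; the first call constructs the instance
    # and later calls reuse the cached one.
    return Validator()


def handle_event(event: dict, context: object) -> dict:
    get_validator().validate(event)
    return {'body': 'Goodbye!', 'status': 200}
```

A test can call get_validator.cache_clear() to start from a clean slate, or patch get_validator itself.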
Use dependency injection
Next, let's test the class:
from validator import Validator

def test_validator_init():
    result = Validator()
    assert result.validator is not None
This test will fail unless SCHEMA_URL is defined in the environment. As previously mentioned, we could add it to our .env file, or we could set it directly like we did before with os.environ['SCHEMA_URL'] = 'foo'. It's better to use the @patch.dict decorator because it will restore the original value after the test exits:
import os
from unittest.mock import patch

from validator import Validator

@patch.dict(os.environ, {'SCHEMA_URL': 'foo'})
def test_validator_init():
    result = Validator()
    assert result.validator is not None
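As an aside, the restore behavior of patch.dict is easy to check on its own. This sketch uses it as a context manager, which behaves the same way as the decorator, and DEMO_SCHEMA_URL is a made-up variable name chosen so the demonstration doesn't touch real configuration.

```python
import os
from unittest.mock import patch

# DEMO_SCHEMA_URL is a made-up variable name used only for this sketch.
os.environ.pop('DEMO_SCHEMA_URL', None)

with patch.dict(os.environ, {'DEMO_SCHEMA_URL': 'foo'}):
    # Inside the block the variable is visible to any code that reads it.
    inside = os.environ.get('DEMO_SCHEMA_URL')

# On exit the dictionary is put back the way it was, so the key is gone.
after = os.environ.get('DEMO_SCHEMA_URL')
```

A plain assignment to os.environ would instead leak into every test that runs afterwards.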
Either way, test_validator_init will fail because it's trying to make a network request to an invalid URL:
validator.py:9: in __init__
schema = requests.get(schema_url).json()
...
E requests.exceptions.MissingSchema: Invalid URL 'foo': No scheme supplied. Perhaps you meant http://foo?
Again, we want to prevent the network request. This is where dependency injection, a form of inversion of control, comes in handy. Let's refactor validator.py to accept a few arguments:
import os
from typing import Callable, Optional

import fastjsonschema
import requests

class Validator:
    def __init__(
        self,
        validator: Optional[Callable[[dict], dict]] = None,
        schema: Optional[dict] = None,
        schema_url: Optional[str] = None,
    ):
        if validator:
            self.validator = validator
            return
        if schema:
            self.validator = fastjsonschema.compile(schema)
            return
        if not schema_url:
            schema_url = os.environ['SCHEMA_URL']
        schema = requests.get(schema_url).json()
        self.validator = fastjsonschema.compile(schema)

    def validate(self, event: dict) -> dict:
        return self.validator(event)
It's worth a moment to understand what's happening here when Validator() is called:
- If called with a validator argument, that argument is assigned to self.validator. Any other arguments are ignored.
- If called with a schema argument, that argument is used to create self.validator.
- If called with a schema_url argument, the schema is requested from the URL, and the parsed response is used to create self.validator.
- If called without any arguments, the schema is requested from the URL specified in the environment.
This allows us to easily prevent the network request when running the unit test:
import os
from unittest.mock import Mock, patch

from validator import Validator

def test_validator_init_with_validator():
    mock_validator = Mock()
    result = Validator(validator=mock_validator)
    assert result.validator == mock_validator
The downside is we need to cover the additional complexity in the constructor. Fortunately, it can be accomplished with some straightforward patching:
@patch('fastjsonschema.compile', autospec=True)
def test_validator_init_with_schema(mock_compile):
    mock_validator = Mock()
    mock_compile.return_value = mock_validator
    result = Validator(schema={'type': 'string'})
    assert result.validator == mock_validator

@patch('fastjsonschema.compile', autospec=True)
@patch('requests.get', autospec=True)
def test_validator_init_with_schema_url(mock_get, mock_compile):
    mock_schema = Mock()
    mock_get.return_value.json.return_value = mock_schema
    mock_validator = Mock()
    mock_compile.return_value = mock_validator
    result = Validator(schema_url='foo')
    assert result.validator == mock_validator
    mock_get.assert_called_once_with('foo')

@patch('fastjsonschema.compile', autospec=True)
@patch('requests.get', autospec=True)
@patch.dict(os.environ, {'SCHEMA_URL': 'bar'})
def test_validator_init_with_no_args(mock_get, mock_compile):
    mock_schema = Mock()
    mock_get.return_value.json.return_value = mock_schema
    mock_validator = Mock()
    mock_compile.return_value = mock_validator
    result = Validator()
    assert result.validator == mock_validator
    mock_get.assert_called_once_with('bar')

def test_validator_validate():
    mock_validator = Mock()
    mock_validator.return_value = 42
    validator = Validator(mock_validator)
    result = validator.validate({})
    assert result == 42
This last unit test is dirt simple because we don't really care how fastjsonschema is implemented. It has its own unit tests.
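One detail in the tests above that is easy to trip over is the order of the mock arguments when @patch decorators are stacked: they are applied bottom-up, so the decorator closest to the function supplies the first mock, and @patch.dict supplies none. This sketch patches json.loads and json.dumps purely as convenient standard-library targets; check_argument_order and DEMO_FLAG are made-up names.

```python
import json
import os
from unittest.mock import patch


@patch('json.dumps')  # applied last, so its mock is the second argument
@patch('json.loads')  # applied first among the @patch decorators
@patch.dict(os.environ, {'DEMO_FLAG': '1'})  # patch.dict passes no argument
def check_argument_order(mock_loads, mock_dumps):
    # Each name is bound to the mock that replaced the matching function.
    assert json.loads is mock_loads
    assert json.dumps is mock_dumps
    assert os.environ['DEMO_FLAG'] == '1'
    return True
```

That is why the test functions above take mock_get before mock_compile even though the fastjsonschema.compile patch is written first.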
This article was cross-posted on Dice.com.