NautilusTrader
Developer Guide

Testing

Our automated tests serve as executable specifications for the trading platform. A healthy suite documents intended behaviour, gives contributors confidence to refactor, and catches regressions before they reach production. Tests also double as living examples that clarify complex flows and provide rapid CI feedback so issues surface early.

The suite covers these categories:

  • Unit tests
  • Integration tests
  • Acceptance tests
  • Performance tests
  • Property-based tests
  • Fuzzing
  • Memory leak tests

Property-based testing

Property testing verifies that logic holds for all valid inputs, not just hand-picked examples. We use proptest in Rust to enforce invariants.

  • Use cases: Core domain types (Price, Quantity, UnixNanos), accounting engines, matching engines, and state machines.
  • Example invariants:
    • Round-trip serialization: parse(to_string(value)) == value
    • Inverse operations: (A + B) - B == A
    • Transitivity: If A < B and B < C, then A < C
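
The platform enforces these invariants with proptest in Rust; as an illustration only, here is a plain-Python sketch of the same three checks over a toy fixed-point price type (the `Price` class below is invented for this example, not one of the platform's types):

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Toy fixed-point price: integer raw units at a fixed decimal precision."""

    raw: int
    precision: int = 2

    def __str__(self) -> str:
        sign = "-" if self.raw < 0 else ""
        mag = abs(self.raw)
        scale = 10**self.precision
        return f"{sign}{mag // scale}.{mag % scale:0{self.precision}d}"

    @classmethod
    def parse(cls, s: str, precision: int = 2) -> "Price":
        sign = -1 if s.startswith("-") else 1
        whole, frac = s.lstrip("-").split(".")
        scale = 10**precision
        return cls(sign * (int(whole) * scale + int(frac)), precision)


rng = random.Random(42)
for _ in range(1_000):
    a = rng.randint(-(10**9), 10**9)
    b = rng.randint(-(10**9), 10**9)
    c = rng.randint(-(10**9), 10**9)
    # Round-trip serialization: parse(to_string(value)) == value
    assert Price.parse(str(Price(a))) == Price(a)
    # Inverse operations: (A + B) - B == A
    assert (a + b) - b == a
    # Transitivity: if A < B and B < C, then A < C
    lo, mid, hi = sorted((a, b, c))
    assert not (lo < mid and mid < hi) or lo < hi
```

A property framework automates the input generation and shrinks failing cases to minimal counterexamples; the hand-rolled loop above only illustrates the invariants themselves.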

Fuzzing

Fuzzing feeds unstructured or adversarial data into the system to verify that it fails gracefully.

  • Use cases: Network boundaries, exchange data parsers (JSON, FIX, WebSocket feeds), and complex state machines.
  • Goal: The system returns a Result::Err and never panics, hangs, or leaks memory when encountering malformed data.
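
As an illustration of that goal, here is a hedged Python sketch: a hypothetical boundary parser (`parse_quote` is invented for this example) that converts every failure into an error value, plus a loop that feeds it random bytes and asserts that no exception ever escapes:

```python
import json
import random


def parse_quote(raw: bytes):
    """Hypothetical boundary parser: returns (ok, payload_or_error), never raises."""
    try:
        msg = json.loads(raw)
        bid = float(msg["bid"])
        ask = float(msg["ask"])
        if not (bid > 0 and ask > 0):
            return (False, "non-positive price")
        return (True, (bid, ask))
    except (ValueError, KeyError, TypeError) as e:
        # All malformed input collapses into an error value (Result::Err in Rust)
        return (False, f"malformed input: {type(e).__name__}")


# Fuzz-style loop: random bytes must never escape as an exception
rng = random.Random(0)
for _ in range(2_000):
    blob = bytes(rng.randrange(256) for _ in range(rng.randrange(64)))
    ok, _ = parse_quote(blob)
    assert isinstance(ok, bool)
```

Real fuzzing uses coverage-guided engines (e.g. cargo-fuzz) rather than uniform random bytes, but the contract under test is the same: errors are returned, never thrown.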

When building or modifying core types, write property tests to cover the mathematical boundaries.

Performance tests benchmark performance-critical components and guard against regressions as those components evolve.

Run tests with pytest, our primary test runner. Use parametrized tests and fixtures (e.g., @pytest.mark.parametrize) to avoid repetitive code and improve clarity.
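
For example, a parametrized test takes this shape (the Decimal parsing test below is illustrative, not taken from the codebase):

```python
from decimal import Decimal

import pytest


@pytest.mark.parametrize(
    ("value", "expected"),
    [
        ("1.0", Decimal("1.0")),
        ("0.5", Decimal("0.5")),
        ("-2.25", Decimal("-2.25")),
    ],
)
def test_decimal_parsing(value: str, expected: Decimal) -> None:
    # One test body covers every (value, expected) pair above
    assert Decimal(value) == expected
```

Each tuple becomes its own test case in the report, so a failure pinpoints the exact input rather than the first of several inline assertions.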

Running tests

Legacy Python tests

The legacy test suite lives under tests/ at the repository root and tests the Cython-based package. From the repository root:

make pytest
# or
uv run --active --no-sync pytest --new-first --failed-first

Python tests

The Python test suite lives under python/tests/ and tests the Rust-backed PyO3 package. It requires a built extension module (make build-debug-v2) and uses its own virtualenv under python/.venv/.

make pytest-v2

The Makefile target isolates certain test modules in separate pytest processes to avoid global Rust state conflicts. Use make pytest-v2 rather than invoking pytest directly.

For performance tests:

make test-performance
# or
uv run --active --no-sync pytest tests/performance_tests --benchmark-disable-gc --codspeed

The --benchmark-disable-gc flag prevents garbage collection from skewing results. Run performance tests in isolation (not with unit tests) to avoid interference.

Rust tests

make cargo-test
# or
cargo nextest run --workspace --features "python,ffi,high-precision,defi" --cargo-profile nextest

Testing with optional features

Use EXTRA_FEATURES to include optional features like capnp or hypersync:

# Test with capnp feature
make cargo-test EXTRA_FEATURES="capnp"

# Test with multiple features
make cargo-test EXTRA_FEATURES="capnp hypersync"

# Legacy shorthand for hypersync
make cargo-test HYPERSYNC=true

# Test specific crate with features
make cargo-test-crate-nautilus-serialization FEATURES="capnp"

IDE integration

  • PyCharm: Right-click the tests folder or file → "Run pytest".
  • VS Code: Use the Python Test Explorer extension.

Test style

General

  • Name test functions after what they exercise; you do not need to encode the expected assertions in the name.
  • Add docstrings when they clarify setup, scenarios, or expectations.
  • Group assertions when possible: perform all setup/act steps first, then assert together to avoid the act-assert-act smell.
  • Use unwrap, expect, or direct panic!/assert calls inside tests; clarity and conciseness matter more than defensive error handling here.
  • Do not capture log output to assert on log messages. Log capture in tests is fragile because loggers are global state, test execution order is non-deterministic, and the assertions break when log wording changes. Instead, verify the observable behavior (return values, state changes, side effects) that the log message reflects.
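
As an illustration of the last point, the sketch below asserts on return values and counters rather than on the warning text the component logs (`OrderThrottler` is a hypothetical example, not a platform class):

```python
import logging

logger = logging.getLogger("demo")


class OrderThrottler:
    """Hypothetical component: rejects orders over a limit and logs the event."""

    def __init__(self, max_orders: int) -> None:
        self.max_orders = max_orders
        self.sent = 0
        self.rejected = 0

    def submit(self) -> bool:
        if self.sent >= self.max_orders:
            self.rejected += 1
            logger.warning("Order rejected: limit reached")  # Incidental: not asserted
            return False
        self.sent += 1
        return True


# Assert the observable behavior (return values and counters), not the log text
throttler = OrderThrottler(max_orders=1)
assert throttler.submit() is True
assert throttler.submit() is False
assert throttler.rejected == 1
```

If the log wording changes, this test still passes; a log-capture assertion would break for no behavioral reason.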

Python tests (python/tests/)

Use pytest-style free functions and fixtures. Do not use test classes.

  • Write each test as a standalone def test_*() function.
  • Use @pytest.fixture for shared setup (instruments, engine instances, data). Prefer yield fixtures when teardown is needed (e.g., engine.dispose()).
  • Use @pytest.mark.parametrize to cover multiple inputs without duplicating test bodies.
  • Import model types from nautilus_trader.model, not from nautilus_trader.core.nautilus_pyo3.
  • Test providers live in python/tests/providers.py. Use TestInstrumentProvider and TestDataProvider for common instruments and data.
  • Mark tests that depend on unfinished features with @pytest.mark.skip(reason="WIP: <description>") rather than deleting them.
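
A minimal sketch of the yield-fixture pattern described above (`FakeEngine` is a stand-in for illustration; real tests build engines via the test kit):

```python
import pytest


class FakeEngine:
    """Stand-in engine with the dispose() lifecycle the guideline refers to."""

    def __init__(self) -> None:
        self.disposed = False

    def dispose(self) -> None:
        self.disposed = True


@pytest.fixture
def engine():
    engine = FakeEngine()
    yield engine      # Hand the engine to the test
    engine.dispose()  # Teardown runs after the test completes, pass or fail


def test_engine_starts_undisposed(engine) -> None:
    assert engine.disposed is False
```

The code after `yield` is the teardown, so every test using the fixture gets a fresh engine that is disposed even when the test fails.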

Legacy Python tests (tests/)

The legacy test suite uses a mix of test classes and free functions. New tests added to this suite may follow either pattern, but free functions with fixtures are preferred for new files.

Rust

For Rust-specific test conventions (module structure, #[rstest], parameterization), see the Rust guide.

Waiting for asynchronous effects

When waiting for background work to complete, prefer the polling helpers over arbitrary sleeps: await eventually(...) from nautilus_trader.test_kit.functions in Python, and wait_until_async(...) from nautilus_common::testing in Rust. They surface failures faster and reduce flakiness in CI because they return as soon as the condition is satisfied, or time out with a useful error.
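
As an illustration of the polling pattern these helpers implement, here is a minimal sketch (`eventually_sketch` is invented for this example; the real helpers carry more options and better error reporting):

```python
import asyncio
import time


async def eventually_sketch(condition, timeout: float = 2.0, interval: float = 0.01) -> None:
    """Poll `condition` until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout}s")
        await asyncio.sleep(interval)


async def main() -> None:
    state = {"ready": False}

    async def background() -> None:
        await asyncio.sleep(0.05)
        state["ready"] = True

    task = asyncio.create_task(background())
    # Returns as soon as the flag flips, instead of sleeping a fixed amount
    await eventually_sketch(lambda: state["ready"])
    await task


asyncio.run(main())
```

Compared with `await asyncio.sleep(1.0)`, the poll finishes in roughly 50 ms here and raises a descriptive error if the condition never holds.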

Mocks

Prefer hand-written stubs that return fixed values over mocking frameworks. Use MagicMock only when you need to assert call counts/arguments or simulate complex state changes. Avoid mocking the objects you're actually testing.
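
For example (`StubClock` and `age_ns` are illustrative names, not platform APIs):

```python
from unittest.mock import MagicMock


class StubClock:
    """Hand-written stub: a fixed return value, no mocking framework needed."""

    def timestamp_ns(self) -> int:
        return 1_700_000_000_000_000_000


def age_ns(clock, ts_event: int) -> int:
    return clock.timestamp_ns() - ts_event


# Preferred: a plain stub when only the return value matters
assert age_ns(StubClock(), 1_700_000_000_000_000_000 - 5) == 5

# MagicMock only when call counts or arguments must be asserted
mock_clock = MagicMock()
mock_clock.timestamp_ns.return_value = 10
assert age_ns(mock_clock, 4) == 6
mock_clock.timestamp_ns.assert_called_once()
```

The stub reads as ordinary code and survives refactors; the mock earns its keep only when the interaction itself is the behavior under test.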

Code coverage

We generate coverage reports with coverage and publish them to codecov.

Aim for high coverage without sacrificing appropriate error handling or causing "test-induced damage" to the architecture.

Some branches remain untestable without modifying production behaviour. For example, a final condition in a defensive if-else block may only trigger for unexpected values; leave these checks in place so future changes can exercise them if needed.

Design-time exceptions can also be impractical to test, so 100% coverage is not the target.

Excluded code coverage

We use pragma: no cover comments to exclude code from coverage when tests would be redundant. Typical examples include:

  • Asserting an abstract method raises NotImplementedError when called.
  • Asserting the final condition check of an if-else block when impossible to test (as above).

Such tests are expensive to maintain because they must track refactors while providing little value. Keep concrete implementations of abstract methods fully covered. Remove pragma: no cover when it no longer applies and restrict its use to the cases above.
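
As an illustration of the first case (the class names below are hypothetical, not platform types):

```python
class DataClient:
    """Illustrative abstract base; subclasses must implement connect()."""

    def connect(self) -> None:
        # Excluded: a test asserting this raise would only restate the line
        raise NotImplementedError("`connect` must be implemented in the subclass")  # pragma: no cover


class StubDataClient(DataClient):
    """Concrete implementations of abstract methods stay fully covered."""

    def __init__(self) -> None:
        self.connected = False

    def connect(self) -> None:
        self.connected = True


client = StubDataClient()
client.connect()
assert client.connected is True
```

The pragma excludes only the defensive raise; the concrete override remains under normal coverage.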

Debugging Rust tests

Use the default test configuration to debug Rust tests.

To build and run the full suite with debug symbols for later debugging sessions, run make cargo-test-debug instead of make cargo-test.

In IntelliJ IDEA, adjust the run configuration for parametrised #[rstest] cases so the command reads test --package nautilus-model --lib data::bar::tests::test_get_time_bar_start::case_1: remove -- --exact and append ::case_n, where n starts at 1.

In VS Code you can pick the specific test case to debug directly.

Python + Rust mixed debugging

This workflow lets you debug Python and Rust code simultaneously from a Jupyter notebook inside VS Code.

Setup

Install these VS Code extensions: Rust Analyzer, CodeLLDB, Python, Jupyter.

Step 0: Compile nautilus_trader with debug symbols

cd nautilus_trader && make build-debug-pyo3

Step 1: Set up debugging configuration

from nautilus_trader.test_kit.debug_helpers import setup_debugging

setup_debugging()

This command creates the required VS Code debugging configurations and starts a debugpy server for the Python debugger.

By default setup_debugging() expects the .vscode folder one level above the nautilus_trader root directory. Adjust the target location if your workspace layout differs.

Step 2: Set breakpoints

  • Python breakpoints: Set in VS Code in the Python source files.
  • Rust breakpoints: Set in VS Code in the Rust source files.

Step 3: Start mixed debugging

  1. In VS Code select the "Debug Jupyter + Rust (Mixed)" configuration.
  2. Start debugging (F5) or press the green run arrow.
  3. Both Python and Rust debuggers attach to your Jupyter session.

Step 4: Execute code

Run Jupyter notebook cells that call Rust functions. The debugger stops at breakpoints in both Python and Rust code.

Available configurations

setup_debugging() creates these VS Code configurations:

  • Debug Jupyter + Rust (Mixed) - Mixed debugging for Jupyter notebooks.
  • Jupyter Mixed Debugging (Python) - Python-only debugging for notebooks.
  • Rust Debugger (for Jupyter debugging) - Rust-only debugging for notebooks.

Example

Open and run the example notebook: debug_mixed_jupyter.ipynb.

Reference

Data type testing

Each data type flows through multiple layers of the platform. The table below shows where existing types are tested, so new types can follow the same pattern.

Test layer matrix

| Layer | Location | What it covers |
| --- | --- | --- |
| DataEngine subscribe | crates/data/tests/engine.rs | Engine processes subscribe/unsubscribe commands correctly. |
| DataEngine publish | crates/data/tests/engine.rs | Engine routes published data to the message bus. |
| DataActor subscribe | crates/common/src/actor/tests.rs | Actor subscribes and receives data via typed publish. |
| DataActor unsubscribe | crates/common/src/actor/tests.rs | Actor stops receiving data after unsubscribe. |
| PyO3 actor dispatch | crates/common/src/python/actor.rs | Rust handler dispatches to Python on_* method. |
| Python Actor subscribe | tests/unit_tests/common/test_actor.py | Python actor subscribes; command count increments. |
| Python Actor unsubscribe | tests/unit_tests/common/test_actor.py | Python actor unsubscribes; subscription list clears. |
| Backtest client | nautilus_trader/backtest/data_client.pyx | Backtest client overrides base subscribe/unsubscribe. |
| Adapter live tests | docs/developer_guide/spec_data_testing.md | Live data acceptance tests (DataTester). |

Coverage per data type

The following table shows which layers have test coverage for each data type. Use this as a checklist when adding a new type.

| Data type | Engine | Actor (Rust) | PyO3 dispatch | Actor (Python) | Backtest client | Adapter spec |
| --- | --- | --- | --- | --- | --- | --- |
| InstrumentAny | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| OrderBookDeltas | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| OrderBook | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| QuoteTick | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| TradeTick | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Bar | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| MarkPriceUpdate | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| IndexPriceUpdate | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| FundingRateUpdate | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| InstrumentStatus | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| InstrumentClose | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| OptionGreeks | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| OptionChainSlice | - | ✓ | ✓ | ✓ | - | ✓ |
| CustomData | ✓ | ✓ | ✓ | ✓ | ✓ | - |

OptionChainSlice is assembled by the DataEngine's OptionChainManager from per-instrument greeks and quote subscriptions. It does not have its own engine subscribe command or backtest client override.

Adding a new data type

When introducing a new data type, add tests at each layer:

  1. DataEngine (crates/data/tests/engine.rs): Add test_execute_subscribe_<type> and test_execute_unsubscribe_<type> tests. Follow the pattern in existing subscribe tests: register client, build command, call engine.execute, assert subscription list.

  2. DataActor Rust (crates/common/src/actor/tests.rs):

    • Add received_<type>: Vec<Type> field to TestDataActor.
    • Implement the on_<type> handler in the DataActor trait impl.
    • Add test_subscribe_and_receive_<type> and test_unsubscribe_<type> tests.
    • Use the typed publish function (msgbus::publish_<type>), not publish_any, for types that use TypedHandler routing.
  3. PyO3 actor dispatch (crates/common/src/python/actor.rs):

    • Add dispatch_on_<type> method that calls py_self.call_method1("on_<type>", ...).
    • Add on_<type> in the DataActor trait impl that calls the dispatch method.
    • Add #[pyo3(name = "on_<type>")] method in the #[pymethods] block.
    • Add on_<type> to RustTestDataActor wrapper and the inline Python test class.
    • Add handler test and dispatch test.
  4. Python Actor (tests/unit_tests/common/test_actor.py):

    • Add test_subscribe_<type> and test_unsubscribe_<type> tests.
    • Assert actor.subscribed_<type>() returns expected entries after subscribe and is empty after unsubscribe.
  5. Backtest client (nautilus_trader/backtest/data_client.pyx): Override subscribe_<type> and unsubscribe_<type> if the base MarketDataClient raises NotImplementedError for the method.

  6. Documentation: Add entries to actors.md callback table, strategies.md handler signatures, adapters.md subscribe method stubs, and spec_data_testing.md test cards.

Search for an existing type like instrument_close or funding_rate across all six layers to find concrete examples of the patterns described above.
