Backtesting
We are currently working on this guide.
Backtesting with NautilusTrader is a methodical simulation process that replicates trading
activities using a specific system implementation. This system is composed of various components
including the built-in engines, Cache, MessageBus, Portfolio, Actors, Strategies, Execution Algorithms,
and other user-defined modules. The entire trading simulation is predicated on a stream of historical
data processed by a BacktestEngine. Once this data stream is exhausted, the engine concludes its
operation, producing detailed results and performance metrics for in-depth analysis.
It's important to recognize that NautilusTrader offers two distinct API levels for setting up and conducting backtests:
- High-level API: Uses a BacktestNode and configuration objects (BacktestEngines are used internally).
- Low-level API: Uses a BacktestEngine directly with more "manual" setup.
Choosing an API level
Consider using the low-level API when:
- Your entire data stream can be processed within the available machine resources (e.g., RAM).
- You prefer not to store data in the Nautilus-specific Parquet format.
- You have a specific need or preference to retain raw data in its original format (e.g., CSV, binary, etc.).
- You require fine-grained control over the BacktestEngine, such as the ability to re-run backtests on identical datasets while swapping out components (e.g., actors or strategies) or adjusting parameter configurations.
Consider using the high-level API when:
- Your data stream exceeds available memory, requiring streaming data in batches.
- You want to leverage the performance and convenience of the ParquetDataCatalog for storing data in the Nautilus-specific Parquet format.
- You value the flexibility and functionality of passing configuration objects to define and manage multiple backtest runs across various engines simultaneously.
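To make the memory trade-off concrete, here is a minimal standard-library sketch of what "streaming data in batches" means in general. It is not how NautilusTrader implements streaming; the helper name iter_batches and the batch size are hypothetical and chosen only for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(stream: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches so only one batch is held in memory at a time.

    Hypothetical helper for illustration; not part of NautilusTrader.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the stream is exhausted
        yield batch


# Seven records processed three at a time; the final batch is shorter.
batches = list(iter_batches(range(7), batch_size=3))
```

With the low-level API the whole stream is loaded up front, which is roughly the case where a single batch covers the full dataset.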
Low-level API
The low-level API centers around a BacktestEngine, where inputs are initialized and added manually via a Python script.
An instantiated BacktestEngine can accept the following:
- Lists of Data objects, which are automatically sorted into monotonic order based on ts_init.
- Multiple venues, manually initialized.
- Multiple actors, manually initialized and added.
- Multiple execution algorithms, manually initialized and added.
This approach offers detailed control over the backtesting process, allowing you to manually configure each component.
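The ordering guarantee mentioned above (data sorted into monotonic order based on ts_init) can be illustrated with a small standalone sketch. It uses only the standard library and a hypothetical Event class in place of real Data objects; it shows the idea of merging several lists by timestamp, not the engine's internal implementation.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a data object carrying a ts_init timestamp."""

    kind: str
    ts_init: int  # assumed here to be an integer timestamp


def merge_by_ts_init(*streams: List[Event]) -> List[Event]:
    """Combine several lists into one list in non-decreasing ts_init order."""
    # Sort each list on its own first, then merge the sorted lists.
    ordered = [sorted(stream, key=lambda e: e.ts_init) for stream in streams]
    return list(heapq.merge(*ordered, key=lambda e: e.ts_init))


quotes = [Event("quote", 30), Event("quote", 10)]
trades = [Event("trade", 20), Event("trade", 40)]

merged = merge_by_ts_init(quotes, trades)
timestamps = [e.ts_init for e in merged]
kinds = [e.kind for e in merged]
```

The point of the sketch is that input order does not matter: quotes and trades added as separate, unsorted lists still replay as a single time-ordered stream.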
High-level API
The high-level API centers around a BacktestNode, which orchestrates the management of multiple BacktestEngine instances,
each defined by a BacktestRunConfig. Multiple configurations can be bundled into a list and processed by the node in one run.
Each BacktestRunConfig object consists of the following:
- A list of BacktestDataConfig objects.
- A list of BacktestVenueConfig objects.
- A list of ImportableActorConfig objects.
- A list of ImportableStrategyConfig objects.
- A list of ImportableExecAlgorithmConfig objects.
- An optional ImportableControllerConfig object.
- An optional BacktestEngineConfig object, with a default configuration if not specified.
Data
Data provided for backtesting drives the execution flow. Since a variety of data types can be used, it's crucial that your venue configurations align with the data being provided for backtesting. Mismatches between data and configuration can lead to unexpected behavior during execution.
NautilusTrader is primarily designed and optimized for order book data, which provides a complete representation of every price level or order in the market, reflecting the real-time behavior of a trading venue. This ensures the highest level of execution granularity and realism. However, if granular order book data is either not available or not necessary, the platform can process market data in the following descending order of detail:
- Order Book Data/Deltas (L3 market-by-order):
  - Providing comprehensive market depth and detailed order flow, with visibility of all individual orders.
- Order Book Data/Deltas (L2 market-by-price):
  - Providing market depth visibility across all price levels.
- Quote Ticks (L1 market-by-price):
  - Representing the "top of the book" by capturing only the best bid and ask prices and sizes.
- Trade Ticks:
  - Reflecting actual executed trades, offering a precise view of transaction activity.
- Bars:
  - Aggregating trading activity, typically over fixed time intervals such as 1-minute, 1-hour, or 1-day.
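To show what is lost when moving down this list, the following sketch aggregates a handful of trades into fixed-interval OHLCV bars. The Trade and Bar classes and the aggregate_bars function are hypothetical, written only for this illustration; they are not NautilusTrader types, and timestamps are assumed to be plain integer seconds.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Trade:
    """Hypothetical trade record (not a NautilusTrader type)."""

    ts: int  # assumed: seconds since some epoch
    price: float
    size: float


@dataclass
class Bar:
    """Hypothetical OHLCV bar covering one fixed interval."""

    ts_open: int
    open: float
    high: float
    low: float
    close: float
    volume: float


def aggregate_bars(trades: List[Trade], interval: int = 60) -> List[Bar]:
    """Group trades into fixed time buckets and summarize each as one bar."""
    bars: Dict[int, Bar] = {}
    for trade in sorted(trades, key=lambda t: t.ts):
        bucket = trade.ts - trade.ts % interval  # start of the interval
        bar = bars.get(bucket)
        if bar is None:
            # First trade in the bucket sets open, high, low, and close.
            bars[bucket] = Bar(bucket, trade.price, trade.price,
                               trade.price, trade.price, trade.size)
        else:
            bar.high = max(bar.high, trade.price)
            bar.low = min(bar.low, trade.price)
            bar.close = trade.price
            bar.volume += trade.size
    return [bars[key] for key in sorted(bars)]


trades = [
    Trade(0, 100.0, 1.0),
    Trade(15, 101.5, 2.0),
    Trade(45, 99.5, 1.0),
    Trade(61, 100.5, 3.0),
]
bars = aggregate_bars(trades, interval=60)
```

The first bar keeps only four prices and a total volume for three trades; the order in which the high and low occurred, and the size of each individual trade, can no longer be recovered from it.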
Choosing data: Cost vs. Accuracy
For many trading strategies, bar data (e.g., 1-minute) can be sufficient for backtesting and strategy development. This is particularly important because bar data is typically much more accessible and cost-effective compared to tick or order book data.
Given this practical reality, Nautilus is designed to support bar-based backtesting with advanced features that maximize simulation accuracy, even when working with lower granularity data.
For some trading strategies, it can be practical to start development with bar data to validate core trading ideas. If the strategy looks promising, but is more sensitive to precise execution timing (e.g., requires fills at specific prices between OHLC levels, or uses tight take-profit/stop-loss levels), you can then invest in higher granularity data for more accurate validation.
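The sensitivity to tight take-profit/stop-loss levels can be seen in a small sketch. For a long position, if both levels fall inside a single bar's low-to-high range, the bar alone cannot say which was reached first. The OhlcBar class and long_exit_outcome function are hypothetical and do not describe how NautilusTrader resolves intrabar fills; they only illustrate why higher granularity data may be needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OhlcBar:
    """Hypothetical OHLC bar (not a NautilusTrader type)."""

    open: float
    high: float
    low: float
    close: float


def long_exit_outcome(bar: OhlcBar, take_profit: float, stop_loss: float) -> str:
    """Classify what one bar can tell us about a long position's exit levels."""
    hit_take_profit = bar.high >= take_profit
    hit_stop_loss = bar.low <= stop_loss
    if hit_take_profit and hit_stop_loss:
        # Both levels traded within the bar; their order is unknown.
        return "ambiguous"
    if hit_take_profit:
        return "take_profit"
    if hit_stop_loss:
        return "stop_loss"
    return "none"


bar = OhlcBar(open=100.0, high=101.0, low=99.0, close=100.5)

tight = long_exit_outcome(bar, take_profit=100.8, stop_loss=99.2)
wide = long_exit_outcome(bar, take_profit=102.0, stop_loss=98.0)
one_sided = long_exit_outcome(bar, take_profit=100.8, stop_loss=98.0)
```

If a strategy frequently lands in the ambiguous case on bar data, that is a reasonable signal to validate it against trade tick, quote tick, or order book data instead.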