Backtesting: Python is The Way

Matthias Frank, Head of Engineering at SigTech, presents his reasons why Python will remain the number one choice for quants and traders.

Python is set to remain the programming language of choice for backtesting investment strategies, as new research reveals the world’s most popular systematic trading language is about to become even better. Guido van Rossum, Python’s creator, revealed in a recently published paper that the language can become up to five times faster, which makes backtesting with Python the optimal solution for quants.

Since its inception in the early 90s, Python has taken the finance world by storm. Initially, it was used by only a handful of finance companies and often Python’s usage was reduced to scripting or glue programming, holding different applications together.

Over time, though, Python has grown steadily in popularity due to its accessibility and the huge number of quality open-source packages available, such as pandas and NumPy, as well as its machine learning, data science and AI applications. Today, Python is the number one programming language for modern fintech companies and will soon surpass C++ as the second most popular language in finance generally.

Figure 1: The most popular programming languages for finance and FinTech

However, Python does have its drawbacks. In one word: speed. Developers and researchers transitioning from more traditional compiled programming languages like Java, C++ and C often criticize Python’s runtime performance – but with careful optimization, speed need not be an issue.

Back in 2013, SigTech decided to build its systematic investment strategy platform from the ground up in Python. To keep the platform's runtime optimal, we continuously monitor strategy runtime benchmarks for our codebase. These benchmarks are critical for maintaining current performance and for protecting the framework against sub-optimal code changes.
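As a rough illustration of this kind of runtime monitoring, the sketch below times a workload and flags a regression against a stored baseline. It is a minimal example under stated assumptions, not SigTech's actual tooling: the names `measure_runtime`, `is_regression` and `toy_backtest`, the 10% tolerance, and the moving-average workload are all hypothetical.

```python
import time
from statistics import median


def measure_runtime(func, repeats=5):
    """Return the median wall-clock runtime of func() in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return median(timings)


def is_regression(runtime, baseline, tolerance=0.10):
    """True if runtime exceeds the baseline by more than the given fraction."""
    return runtime > baseline * (1.0 + tolerance)


def toy_backtest(n_bars=50_000, window=20):
    """Hypothetical stand-in workload: a moving-average rule on synthetic prices."""
    prices = [100.0 + 0.01 * i for i in range(n_bars)]
    rolling_sum = sum(prices[:window])
    pnl = 0.0
    for i in range(window, n_bars):
        # Hold the position over this bar if the price is above its moving average.
        if prices[i] > rolling_sum / window:
            pnl += prices[i] - prices[i - 1]
        rolling_sum += prices[i] - prices[i - window]
    return pnl


if __name__ == "__main__":
    baseline = measure_runtime(toy_backtest)  # in practice, loaded from a stored record
    current = measure_runtime(toy_backtest)
    print(f"baseline={baseline:.4f}s current={current:.4f}s "
          f"regression={is_regression(current, baseline)}")
```

Using the median of several runs rather than a single timing makes the check less sensitive to one-off noise on the machine running the benchmark.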

Since the rise of cloud computing, we have seen even more compelling reasons to continue to run our framework in Python. Some parts previously coded in Python have been moved to cloud computing infrastructure or optimized using Cython. This has already improved SigTech’s backtesting engine performance by 2x across the board, and some areas have seen even greater improvements. In equity universe construction, for example, cloud compute delivers a 10x improvement over an equivalent Python implementation.

Mark Shannon, one of Microsoft’s most senior programmers, works alongside Guido van Rossum and has worked on Python performance for some time. His previous projects include HotPy, a just-in-time compiler for CPython.

He has his own Faster CPython repository where he wrote that “we want to speed up CPython by a factor of 5 over the next four releases.” Although Shannon envisages a JIT compiler eventually, this would not come until Python 3.12 in his plan. Python 3.10, currently in beta, is scheduled for release in October this year. The release schedule is roughly annual, so we might expect 3.11 in October 2022, and 3.12 in October 2023.
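To put the quoted target in perspective, a short calculation shows what a factor of 5 over four releases implies per release. This is a back-of-the-envelope sketch that assumes the speedup compounds evenly across releases, which is an assumption made here for illustration, and the function name `per_release_factor` is hypothetical.

```python
def per_release_factor(total_speedup, releases):
    """Even per-release speedup that compounds to total_speedup."""
    return total_speedup ** (1.0 / releases)


if __name__ == "__main__":
    factor = per_release_factor(5.0, 4)
    print(f"Required speedup per release: {factor:.2f}x")
    # Conversely, a 1.5x gain in each of four releases compounds to about 5x.
    print(f"1.5x over four releases: {1.5 ** 4:.2f}x")
```

Under that assumption, each release would need to be roughly 1.5 times faster than the one before it.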

SigTech users have already seen a pronounced improvement in terms of speed over the last few years, and with these upcoming changes to Python, we expect that trend to continue.

The chart below shows how the runtime for a 10-year backtest of a systematic strategy that uses and trades on minute-bar data has decreased markedly over the last few years. Within another two years, we expect the backtest to complete in under two minutes, an improvement of more than 95%.

Figure 2: Graph to show minutes taken to run backtest over the years

The performance gap between Python and compiled languages is narrowing. The unrivalled ecosystem that Python offers continues to make it the best choice for programmers and traders looking to create innovative, data-centric financial models.

Read Matthias’ original blog post here