R or Python for finance

If you want to start an argument between two financial data scientists, ask them which coding language they prefer to use: R or Python? If they have a difference of opinion, then a heated and emotional debate will inevitably follow. But who is right?

R is no longer the most widely used language for data science

Traditionally R was more common in the data science community, due to its popularity on university campuses. Many neophyte data scientists will therefore have already used R during their MSc or PhD; indeed, I first used the language myself when completing my Master's dissertation.

However, Python has now caught up. In 2016 R was the most popular language amongst data scientists, with Python coming a close second. By 2018 their positions had been reversed, with two-thirds of survey respondents preferring Python versus 49% for R.

R always had the best packages

Historically R has had a wider variety of packages for statistical analysis and visualisation. Popular libraries include dplyr, zoo and ggplot2, and there are dozens more. Python has been slow to catch up, but there are now plenty of packages available for budding data scientists, such as pandas, scipy, and matplotlib.

It's easy to code badly in R 

R is widely cited as being difficult to learn if you are used to more mainstream languages. Also, in my experience it is very easy to write bad code in R, and somewhat easier to write good Python. Object-oriented programming (OOP) in R is particularly ugly: it is bolted on as an afterthought, rather than being an integral part of the language as in Python.

Both R and Python are dynamically typed languages. This makes them very flexible, but also potentially error-prone. However, the weak typing in R is particularly dangerous. R functions have a nasty habit of returning unexpected types of object, and are then too relaxed about accepting the wrong type as an argument. This makes code difficult to debug, as the program will often crash thousands of lines after the actual error has occurred.
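One common defence against errors that surface far from their cause is to check types at the boundary of a function, so that a bad input is reported where it enters rather than thousands of lines later. The sketch below shows the idea in Python. The function name `annualised_return` and the figure of 256 trading days are assumptions made purely for illustration; they are not taken from any particular library or trading system.

```python
from numbers import Real

# Assumed number of trading days per year, for illustration only.
TRADING_DAYS = 256


def annualised_return(daily_returns):
    """Return the mean daily return scaled up to a year.

    Hypothetical helper: it rejects non-numeric inputs immediately,
    so the error points at the bad value rather than at later code.
    """
    for value in daily_returns:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            raise TypeError(
                f"expected a number, got {type(value).__name__}: {value!r}"
            )
    return TRADING_DAYS * sum(daily_returns) / len(daily_returns)


clean = annualised_return([0.01, -0.005, 0.002])

# A string that has slipped into the data is caught at the boundary.
try:
    annualised_return([0.01, "0.02", 0.002])
except TypeError as error:
    boundary_error = error
    print(f"rejected early: {error}")
```

The check costs a pass over the data, so in a tight loop it may be worth validating once when the data is loaded instead of on every call.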

R is slower than Python 

Java programmers are always sneering about how slow Python is. But Python is still significantly faster than R, by roughly a factor of four. Both languages can be sped up to a degree by embedding C or C++ code, but the interface for doing this in R is much clunkier than for Python.
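To get a feel for how much difference compiled code can make, the rough sketch below times an explicit Python loop against the built-in `sum`, which is implemented in C in the standard CPython interpreter. It is not the benchmark behind the figure quoted above and says nothing about R; the list size and repeat count are arbitrary assumptions, and the timings will vary from machine to machine.

```python
import timeit


def total_with_loop(values):
    """Sum the values with an explicit Python-level loop."""
    total = 0.0
    for value in values:
        total += value
    return total


# Arbitrary illustrative data: 100,000 whole-number floats.
prices = [float(i) for i in range(100_000)]

# Time 20 repetitions of each approach on the same data.
loop_seconds = timeit.timeit(lambda: total_with_loop(prices), number=20)
builtin_seconds = timeit.timeit(lambda: sum(prices), number=20)

print(f"explicit loop: {loop_seconds:.4f}s, built-in sum: {builtin_seconds:.4f}s")
```

The same principle applies to hand-written C or C++ extensions: the more of the inner loop that runs outside the interpreter, the less the speed of the language itself matters.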

Python is better for trading systems

Being able to use the same language in research and production environments is a major advantage for rapid deployment. I have used R for automated live trading systems in the past, but I would not do so again. The memory management in R is poor, and the typing issues mentioned above can lead to weird errors that are hard to debug. Python is easily up to the job of running live trading strategies, as long as latency is not critical (in which case C++ or Java might be better options). It does have some well-known issues, such as the Global Interpreter Lock, but in general it is a pretty robust platform for running production code.

In conclusion...

For pure data science R still has a slight edge over Python, although the gap has closed significantly. Nevertheless, the wider applications of Python make it the better all-round choice. If you’re at the start of your career then learning Python will also give you more options in the future.

Robert Carver is the former head of fixed income at quantitative hedge fund AHL. He began using R in 2005, and switched to Python in 2011. Robert is the author of 'Systematic Trading' and 'Smart Portfolios'.

R is awesome. I would bet it is used by at least a few people in every financial firm.

A buddy once told me R is just a sandbox. I like that analogy.

You use it to fuck around with the data to figure out what you want to do. Then you can either use that for reporting (which I did often), or you can recode it into Python or some other more “robust” language for production. (When I say “robust” I don’t mean in terms of statistics or finance. R is obviously good for that. I mean in terms of making production-level code to run reliably and repeatedly in various environments.)

That being said, Shiny made some pretty impressive stuff I could put on the intranet for internal use. (Which I did sparingly.) I don’t think it’d be a stretch to have it as a front-end for external use. (Though I am not sure what security issues that may raise. I have no idea about any of that stuff...)

Is R programming useful for finance?

R is widely used for credit risk analysis, at firms such as ANZ, and for portfolio management. Financial firms also use R's time-series tools to model stock market movements and forecast share prices.

How are R and Python used in finance?

R: R is used mostly by data scientists, since it is designed primarily for data analysis. It has been overtaken by Python in popularity, but because finance involves a great deal of calculation and data analysis, R may still suit you well. Python: Python is used in almost every industry for data science, machine learning, and software development.

Is Python good for finance majors?

Python is an incredibly versatile language with a very simple syntax and great readability. It is used for building highly scalable platforms and web-based applications, and is extremely useful in a burdened industry such as finance.

Do quants use Python or R?

Python, MATLAB and R: all three are mainly used for prototyping quant models, especially in hedge funds and quant trading groups within banks. Quant traders and researchers write their prototype code in these languages.