What's faster, MATLAB or Python?
You might find some useful results at the bottom of this page: http://wiki.scipy.org/PerformancePython From the introduction,
It also compares MATLAB, which seems to show speeds similar to Python with NumPy. Of course this is only one specific example; your application might allow better or worse performance. There is no harm in running the same test on both and comparing.
You can also compile NumPy against optimized libraries such as ATLAS, which provides some BLAS/LAPACK routines. These should be of comparable speed to MATLAB. I'm not sure if the NumPy downloads are already built against it, but I think ATLAS will tune the libraries to your system if you compile NumPy: http://www.scipy.org/Installing_SciPy/Windows The link has more details on what is required under the Windows platform.
EDIT: If you want to find out what performs better, C or C++, it might be worth asking a new question, although from the link above C++ has the best performance. Other solutions are quite close too, i.e. Pyrex, Python/Fortran (using f2py) and inline C++. The only matrix algebra I have ever done in C++ was using MTL to implement an Extended Kalman Filter. I guess, though, that in essence it depends on the LAPACK/BLAS libraries you are using and how well optimised they are. This link has a list of object-oriented numerical packages for many languages: http://www.oonumerics.org/oon/
In this note, I extend a previous post on comparing run-time speeds of various econometrics packages by
In addition to the above, I attempted some optimization using the Numba Python module, which has been shown to yield remarkable speedups, but saw no performance improvement for my code. The computational problem considered here is a fairly large bootstrap of a simple OLS model and is described in detail in the previous post.
tl;dr Time-consuming econometric problems are best performed in Python or Matlab. Based on this comparison, Stata is dramatically slower (particularly when parallel processing is used in either Python or Matlab). Matlab is the fastest platform when code avoids the use of certain Matlab functions (like
In [1]:
    %matplotlib inline
    import warnings
    warnings.filterwarnings('ignore')
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from timeit import timeit
    from numba import double
    from numba.decorators import jit, autojit
    from joblib import Parallel, delayed
For bootstrapping standard errors, we will consider 1,000 bootstrap replicate draws. We will explore several sample sizes ($n=\begin{bmatrix}1,000 & 10,000 & 100,000\end{bmatrix}$) for the underlying dependent and independent variables. The true parameters are
$$ \beta = \begin{bmatrix} -.5 \\ .5 \\ 10\end{bmatrix} $$
In [2]:
    reps, beta, n_array = 1000, [-.5,.5,10], [1000, 10000, 100000]
A comparison of Bootstrapping with OLS Functions
The first comparison we will perform uses the following functions:
What is calculated
It is important to note several features of these OLS functions. The Stata
Multiple Threads and Parallel Processing
In Stata and Matlab, the
First consider the bootstrap in Matlab
Starting MATLAB on ZMQ socket ipc:///tmp/pymatbridge
Send 'exit' command to kill the server
....MATLAB started and connected!
In [4]:
    %%matlab -i reps,beta,n_array -o mat_time,store_beta
    mat_time = zeros(cols(n_array),2);
    for i=1:cols(n_array)
        n=n_array(i);
        row_id =1:n;
        % two normal regressors plus a constant
        X = [normrnd(10,4,[n 2]) ones(n,1)];
        Y = X*beta' + normrnd(0,1,[n 1]);
        store_beta = zeros(reps,3);
        tic;
        for r = 1:reps
            % sample row numbers with replacement
            this_row = randsample(row_id,n,true);
            est = fitlm(X(this_row,:),Y(this_row),'linear','Intercept',false);
            store_beta(r,:) = (est.Coefficients.Estimate)';
        end
        mat_time(i,:) = [n toc];
    end
Here are the results for $N=100,000$.
In [5]:
    matlab_results = pd.DataFrame(store_beta.copy(),columns=['b1','b2','constant'])
    matlab_results.describe(percentiles=[.01,.025,.5,.95,.975,.99])
Out[5]:
The same Bootstrap in Python
Here is the Python function implementing each replicate of the bootstrap. It samples with replacement from the data, calculates the OLS estimates, and saves them in a numpy matrix.
In [6]:
    def python_boot(arg_reps,arg_row_id,arg_n,arg_X,arg_Y):
        # one row of estimates per replicate, sized from the arguments rather than from globals
        store_beta = np.zeros((arg_reps,arg_X.shape[1]))
        for r in range(arg_reps):
            this_sample = np.random.choice(arg_row_id, size=arg_n, replace=True) # gives sampled row numbers
            # Define data for this replicate:
            X_r = arg_X[this_sample,:]
            Y_r = arg_Y[this_sample]
            # Estimate model
            store_beta[r,:] = sm.regression.linear_model.OLS(Y_r,X_r).fit(disp=0).params
        return store_beta
Matlab employs a just-in-time compiler to translate code to machine binary executables. This substantially increases speed and is seamless from the user's perspective, since it is performed automatically in the background when a script is run. The Python Numba project has developed a similar just-in-time compiler, with very minimal additional coding required. One only needs to add a decorator to the function, as the @jit(nogil=True) line in the next cell does.
In [7]:
    # rewriting python_boot to make function args explicit:
    @jit(nogil=True)
    def python_boot_numba(arg_reps,arg_row_id,arg_n,arg_X,arg_Y):
        store_beta = np.zeros((1000,3))
        for r in range(arg_reps):
            this_sample = np.random.choice(arg_row_id, size=arg_n, replace=True) # gives sampled row numbers
            # Define data for this replicate:
            X_r = arg_X[this_sample,:]
            Y_r = arg_Y[this_sample]
            # Estimate model
            store_beta[r,:] = sm.regression.linear_model.OLS(Y_r,X_r).fit(disp=0).params
        return store_beta
In [8]:
    for n in [1000,10000,100000]:
        row_id = range(0,n)
        X1 = np.random.normal(10,4,(n,1))
        X2 = np.random.normal(10,4,(n,1))
        X=np.append(X1,X2,1)
        X = np.append(X,np.tile(1,(n,1)),1)
        error = np.random.randn(n,1)
        Y = np.dot(X,beta).reshape((n,1)) + error
        print "Number of observations= ",n
        %timeit python_boot(reps,row_id,n,X,Y)
        %timeit python_boot_numba(reps,row_id,n,X,Y)
    Number of observations= 1000
    1 loops, best of 3: 836 ms per loop
    1 loops, best of 3: 841 ms per loop
    Number of observations= 10000
    1 loops, best of 3: 3.58 s per loop
    1 loops, best of 3: 3.56 s per loop
    Number of observations= 100000
    1 loops, best of 3: 30.7 s per loop
    1 loops, best of 3: 32.5 s per loop
The numba speed-up (the second entry for each value of n) is very small at best, exactly as predicted by the numba project's documentation, since we don't have "native" Python code (we call numpy functions which can't be compiled in optimal ways). Also, it looks like run times scale roughly linearly in n. Next is a printout of the results for $N=100,000$.
In [9]:
    store_beta = python_boot(reps,row_id,100000,X,Y)
    results = pd.DataFrame(store_beta,columns=['b1','b2','constant'])
    results.describe(percentiles=[.01,.025,.5,.95,.975,.99])
Out[9]:
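For readers without statsmodels installed, the replicate loop described above can be sketched with NumPy alone. This is a minimal sketch and not the notebook's code: the name bootstrap_ols, the use of np.linalg.lstsq in place of the statsmodels OLS call, the seeded default_rng generator, and the small n and reps are all assumptions chosen so that it runs in a moment.

```python
import numpy as np

def bootstrap_ols(X, y, reps, rng):
    """Return a (reps, k) array of OLS estimates from resampled rows.

    Hypothetical helper: np.linalg.lstsq stands in for the statsmodels call.
    """
    n, k = X.shape
    store_beta = np.zeros((reps, k))
    for r in range(reps):
        idx = rng.integers(0, n, size=n)  # row numbers sampled with replacement
        store_beta[r, :] = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    return store_beta

# Small illustrative problem (assumed sizes, much smaller than the post's).
rng = np.random.default_rng(0)
n, reps = 2000, 200
beta_true = np.array([-0.5, 0.5, 10.0])
X = np.column_stack([rng.normal(10, 4, (n, 2)), np.ones(n)])
y = X @ beta_true + rng.normal(0, 1, n)

draws = bootstrap_ols(X, y, reps, rng)
boot_mean = draws.mean(axis=0)                      # centre of the bootstrap draws
boot_se = draws.std(axis=0, ddof=1)                 # bootstrap standard errors
ci_low, ci_high = np.percentile(draws, [2.5, 97.5], axis=0)
print(boot_mean, boot_se)
```

The percentile call mirrors the describe(percentiles=...) summary used in the notebook, restricted here to a 95% interval.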
Next we'll perform the same bootstrap in Stata.
This is run in Stata 12.1 MP (2 cores). Admittedly, this is a fairly old version of Stata, so perhaps newer ones are faster.
Produced these results
Discussion
The following chart shows the performance of each statistical package using native OLS functions.
In [10]:
    # Convert to pandas dataframe for plotting:
    matlab_data=pd.DataFrame(mat_time,columns=['n','Matlab Time'])
    python_data=pd.DataFrame([[1000,.836],[10000,3.58],[100000,30.7]],columns=['n','Python Time'])
    stata_data=pd.DataFrame([[1000,3.445],[10000,10.346],[100000,91.113]],columns=['n','Stata Time'])
    plot_data = pd.concat([matlab_data,python_data['Python Time'],stata_data['Stata Time']], axis=1)
    #plot_data = plot_data.set_index('n')
    plot_data.plot(x=['n'],y=['Matlab Time','Python Time','Stata Time'])
Out[10]:
Having run the bootstrap for $n = \begin{bmatrix}1,000 & 10,000 & 100,000 \end{bmatrix}$, we see that
Because we are relying on the "canned" OLS functions, the comparison above may be capturing the relative inefficiency of these functions rather than the underlying speed of the statistical platform.
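One way to probe this concern on a small scale is to time a general-purpose least-squares routine against the bare normal equations on the same data. The sketch below is an illustration under assumptions, not part of the original comparison: np.linalg.lstsq stands in for a "canned" routine, the function names and problem size are hypothetical, and the timings will vary by machine and BLAS build.

```python
import timeit
import numpy as np

rng = np.random.default_rng(1)
n = 10_000  # assumed sample size for illustration
X = np.column_stack([rng.normal(10, 4, (n, 2)), np.ones(n)])
y = X @ np.array([-0.5, 0.5, 10.0]) + rng.normal(0, 1, n)

def ols_lstsq(X, y):
    # General-purpose least squares; a stand-in for a "canned" routine.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ols_normal_eq(X, y):
    # Bare normal equations (X'X) b = X'y, solved without forming the inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

b_lstsq = ols_lstsq(X, y)
b_normal = ols_normal_eq(X, y)

# Time 20 calls of each; only the relative size is of interest.
t_lstsq = timeit.timeit(lambda: ols_lstsq(X, y), number=20)
t_normal = timeit.timeit(lambda: ols_normal_eq(X, y), number=20)
print("lstsq: %.4f s, normal equations: %.4f s" % (t_lstsq, t_normal))
```

Both functions should return the same coefficients up to rounding, so any gap in the timings reflects the routine rather than the estimator.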
A comparison of Bootstrapping (no functions)
We will perform the exact same analysis as before, with slight modifications to the functions so that the OLS estimates are calculated with linear algebra code in each package ($(x'x)^{-1}x'y$). For the sake of brevity, I won't show results, but instead just focus on runtimes. The Matlab code is:
In [11]:
    %%matlab -i reps,beta,n_array -o mat_time_la,store_beta
    mat_time_la = zeros(cols(n_array),2);
    for i=1:cols(n_array)
        n=n_array(i);
        row_id =1:n;
        X = [normrnd(10,4,[n 2]) ones(n,1)];
        Y = X*beta' + normrnd(0,1,[n 1]);
        store_beta = zeros(reps,3);
        tic;
        for r = 1:reps
            this_row = randsample(row_id,n,true);
            store_beta(r,:) = (inv(X(this_row,:)'*X(this_row,:))*X(this_row,:)'*Y(this_row))';
        end
        mat_time_la(i,:) = [n toc];
    end
In [12]:
    def python_boot_la(arg_reps,arg_row_id,arg_n,arg_X,arg_Y):
        # one row of estimates per replicate, sized from the arguments rather than from globals
        store_beta = np.zeros((arg_reps,arg_X.shape[1]))
        for r in range(arg_reps):
            this_sample = np.random.choice(arg_row_id, size=arg_n, replace=True) # gives sampled row numbers
            # Define data for this replicate:
            X_r = arg_X[this_sample,:]
            Y_r = arg_Y[this_sample]
            # Estimate model
            store_beta[r,:] = (np.linalg.inv(np.dot(X_r.T,X_r)).dot(np.dot(X_r.T,Y_r))).T
        return store_beta

    for n in [1000,10000,100000]:
        row_id = range(0,n)
        X1 = np.random.normal(10,4,(n,1))
        X2 = np.random.normal(10,4,(n,1))
        X=np.append(X1,X2,1)
        X = np.append(X,np.tile(1,(n,1)),1)
        error = np.random.randn(n,1)
        Y = np.dot(X,beta).reshape((n,1)) + error
        print "Number of observations= ",n
        %timeit python_boot_la(reps,row_id,n,X,Y)
    Number of observations= 1000
    1 loops, best of 3: 309 ms per loop
    Number of observations= 10000
    1 loops, best of 3: 2.12 s per loop
    Number of observations= 100000
    1 loops, best of 3: 21.7 s per loop
In [13]:
    # add results for plotting:
    # Convert to pandas dataframe for plotting:
    matlab_la_data=pd.DataFrame(mat_time_la,columns=['n','Matlab Time (LA)'])
    python_la_data=pd.DataFrame([[1000,.309],[10000,2.12],[100000,21.7]],columns=['n','Python Time (LA)'])
    plot_data_2 = pd.concat([plot_data,python_la_data['Python Time (LA)'],matlab_la_data['Matlab Time (LA)']], axis=1)
    #plot_data = plot_data.set_index('n')
    plot_data_2.plot(x=['n'],y=['Matlab Time','Python Time','Stata Time','Matlab Time (LA)','Python Time (LA)'])
Out[13]:
Discussion
The linear algebra model run times for both Python and Matlab are denoted by LA. We add them to the previous figure. The Python results are very similar, showing that the statsmodels OLS function is highly optimized. On the other hand, Matlab shows significant speed improvements and demonstrates how native linear algebra code is preferred for speed. For this example, Matlab is roughly three times faster than Python. Stata was dropped from the comparison because of lack of support in Stata's linear algebra environment (Mata) for sampling with replacement for large $N$.
Parallelizing the Bootstrap
All of the results above are run using default settings with respect to multi-threading or using multiple processing cores. Matlab and Stata automatically take advantage of multiple cores, whereas Python doesn't. The following comparison manually creates worker pools in both Matlab and Python. The current version of Matlab requires a license for the Parallel Computing Toolbox, which supports 12 workers; to get more, one would need to purchase and configure the MATLAB Distributed Computing Server, whose price depends on the number of nodes (or, roughly speaking, cores) one wants to use. To get any multi-core support in Stata, you must purchase the MP version of the program.
Here is the Matlab code starting a worker pool and running the bootstrap code:
In [14]:
    %%matlab -i reps,beta,n_array -o mat_time_la_par,store_beta
    if isempty(gcp('nocreate'))
        parpool;
    end
    mat_time_la_par = zeros(cols(n_array),2);
    for i=1:3
        n=n_array(i);
        row_id = 1:n;
        X = [normrnd(10,4,[n 2]) ones(n,1)];
        Y = X*beta' + normrnd(0,1,[n 1]);
        store_beta = zeros(reps,3);
        tic
        parfor r = 1:reps
            this_row = randsample(row_id,n,true);
            store_beta(r,:) = (ols(Y(this_row),X(this_row,:)))';
        end
        mat_time_la_par(i,:) = [n toc];
    end
    delete(gcp('nocreate'));
    Starting parallel pool (parpool) using the 'local' profile ... connected to 12 workers.
    Parallel pool using the 'local' profile is shutting down.
The following runs the bootstrap in parallel in Python. Note that when passing the n_jobs parameter to the Parallel procedure, one is not arbitrarily restricted by licensing limits.
In [15]:
    def python_boot_parallel(arg_reps):
        np.random.seed(arg_reps)
        this_sample = np.random.choice(row_id, size=n, replace=True) # gives sampled row numbers
        # Define data for this replicate:
        X_r = X[this_sample,:]
        Y_r = Y[this_sample]
        # Estimate model
        beta = (np.linalg.inv(np.dot(X_r.T,X_r)).dot(np.dot(X_r.T,Y_r))).T
        return beta

    for n in [1000,10000,100000]:
        row_id = range(0,n)
        X1 = np.random.normal(10,4,(n,1))
        X2 = np.random.normal(10,4,(n,1))
        X=np.append(X1,X2,1)
        X = np.append(X,np.tile(1,(n,1)),1)
        error = np.random.randn(n,1)
        Y = np.dot(X,beta).reshape((n,1)) + error
        print n
        %timeit Parallel(n_jobs=10)(delayed(python_boot_parallel) (arg_reps) for arg_reps in range(reps))
        %timeit Parallel(n_jobs=20)(delayed(python_boot_parallel) (arg_reps) for arg_reps in range(reps))
        %timeit Parallel(n_jobs=25)(delayed(python_boot_parallel) (arg_reps) for arg_reps in range(reps))
    1000
    1 loops, best of 3: 409 ms per loop
    1 loops, best of 3: 429 ms per loop
    1 loops, best of 3: 445 ms per loop
    10000
    1 loops, best of 3: 488 ms per loop
    1 loops, best of 3: 531 ms per loop
    1 loops, best of 3: 469 ms per loop
    100000
    1 loops, best of 3: 2.91 s per loop
    1 loops, best of 3: 1.92 s per loop
    1 loops, best of 3: 1.95 s per loop
In [16]:
    # use n_jobs=20
    python_para_data=pd.DataFrame([[1000,.429],[10000,.531],[100000,1.92]],columns=['n','Python Parallel Time'])
    matlab_para_data=pd.DataFrame(mat_time_la_par,columns=['n','Matlab Parallel Time'])
    plot_data_3 = pd.concat([plot_data_2,python_para_data['Python Parallel Time'],matlab_para_data['Matlab Parallel Time']], axis=1)
    plot_data_3.plot(x=['n'],y=['Matlab Time (LA)','Python Time (LA)','Matlab Parallel Time','Python Parallel Time'])
Out[16]:
Discussion
Both Matlab and Python show dramatic improvements when bootstrap replicates are distributed across multiple processor cores. While Matlab is the fastest for this example, Python's parallel performance is impressive. In terms of percentage gains, Python shows the largest improvement in run times when the linear algebra code is distributed over multiple processors. It is notable that Matlab's Parallel Computing Toolbox is limited to 12 workers, whereas in Python there is no limit on the number of workers. The full table of results is shown below.
Out[17]:
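The idea of giving each replicate its own seed and farming replicates out to a pool can also be sketched with only the standard library and NumPy. This is a hedged illustration, not the notebook's approach: it swaps joblib for a thread pool from concurrent.futures, the name one_replicate and the sizes are assumptions, and whether threads give any speed-up depends on how much time is spent inside NumPy calls that release the GIL, so no timing is claimed here.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

rng = np.random.default_rng(2)
n, reps = 5000, 100  # assumed sizes, far smaller than the post's
X = np.column_stack([rng.normal(10, 4, (n, 2)), np.ones(n)])
y = X @ np.array([-0.5, 0.5, 10.0]) + rng.normal(0, 1, n)

def one_replicate(seed):
    """One bootstrap replicate with its own generator, so the result does
    not depend on which worker runs it or in what order."""
    local_rng = np.random.default_rng(seed)
    idx = local_rng.integers(0, n, size=n)  # sample rows with replacement
    X_r, y_r = X[idx], y[idx]
    return np.linalg.solve(X_r.T @ X_r, X_r.T @ y_r)

# Serial reference run.
serial = np.array([one_replicate(s) for s in range(reps)])

# Same replicates distributed over a small thread pool (swap-in for joblib).
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = np.array(list(pool.map(one_replicate, range(reps))))
```

Because every replicate is seeded independently, the serial and pooled runs should agree up to floating-point rounding, which makes it easy to check that parallelizing did not change the estimates.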
Detailed info on machine this was run on:
Is it better to use MATLAB or Python?
MATLAB has very strong built-in mathematical computation capabilities, which are harder to match in plain Python. Python has no built-in matrix type, but the NumPy library provides one. MATLAB is particularly good at signal processing and image processing, areas in which, according to this answer, Python is not as strong and its performance is also much worse.
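As a small illustration of the matrix support that NumPy adds, the following sketch shows a few common operations next to their MATLAB spellings. The matrix and vector are arbitrary values chosen for the example.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

product = A @ A              # matrix product (MATLAB: A*A)
elementwise = A * A          # element-wise product (MATLAB: A.*A)
x = np.linalg.solve(A, b)    # solve A x = b (MATLAB: A\b)
A_t = A.T                    # transpose (MATLAB: A')

print(product)
print(x)
```

Note that `*` is element-wise in NumPy, the reverse of MATLAB's default, which is a common source of errors when porting code.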
Which is faster, MATLAB or NumPy?
The code is almost the same, but the performance is very different. MATLAB takes 0.252454 seconds to complete the task, while NumPy takes 0.973672151566 seconds, almost four times as long.
What is the advantage of Python over MATLAB?
Python data structures are superior to Matlab data structures. Python provides more control over the organization of one's code and better namespace management. Python makes it easy to maintain multiple versions of shared libraries. Python offers more choice in graphics packages and toolsets.
How different is MATLAB from Python?
The biggest technical difference between MATLAB and Python is that in MATLAB, everything is treated as an array, while in Python everything is a more general object. For instance, in MATLAB, strings are arrays of characters or arrays of strings, while in Python, strings have their own type of object called str.
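The Python side of that difference can be seen directly in a short session. This is a minimal sketch; the variable names are arbitrary.

```python
import numpy as np

s = "hello"
s_type = type(s).__name__      # 'str': a dedicated string type, not a numeric array
s_upper = s.upper()            # str objects carry their own methods

x = 5
x_type = type(x).__name__      # 'int': a plain number is not a 1x1 array

arr = np.array([1, 2, 3])
arr_type = type(arr).__name__  # 'ndarray': arrays are one object type among many

print(s_type, x_type, arr_type)
```

Arrays in Python come from a library type (here NumPy's ndarray) rather than being the default representation for every value.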