
Calling
fortran subroutine from python is really easy using the f2py tool that
does the compiling and wrapping of fortran code and ultimately produces a
python callable module. The process is really painless and fun! Now the
question comes why you would want to call fortran from python. One of
the important reasons could be plotting ! Yes, if you had ever wanted to
do all the post processing stuffs apart from the main simulation in a
language that provides easy-to-use plotting libraries you would have
possibly ruled out all basic languages like fortran/cpp/c/java. Most of
the people I know either use matlab or python+numpy for post processing
for obvious reasons. Well, life does not go smooth and we can not get
all the goodies all the time. There might come a time when you may feel
frustrated by the speed and memory management of python or matlab. In
such situations binding fortran with python may provide you with a
relief. In my case, I was trying to calculate correlation between
different x-y planes of a large 3D array having dimensions or ranks
2048x2048x64. The data file takes approximately of space. I was trying to
calculate correlation using Matlab's xcorr2 and it was excruciatingly
slow. It is not like Matlab is slow. I picked the wrong function for my
purpose. For large arrays xcorr2 or scipy.signal's 'correlate2d' is not
the right choice for correlation calculations and I am going to show
why. Before that, lets see how to call fortran from python. The key tool
for this purpose is f2py. I use Debian/Ubuntu linux systems and it was
installed from the repository. To get desired output and avoid surprise
you just need to declare variables in fortran subroutines with the
'intent' attribute. From 'intent(in)' or 'intent(out)' attribute f2py
will figure out how to arrange the dummy variables in call signature.
f2py removes all variables with 'intent(out)' attribute. The resulting
module from f2py will return the original 'intent(out)' variables in a
tuple when called from python. By the way I have not used any deferred
shape arrays in fortran subroutine dummy arguments. The fortran file
that I compiled using f2py is "multiLevelCrossCorrFFTM.F90" and created a
module named "mxcorr". "multiLevelCrossCorrFFTM.F90" uses the "fftw3"
library for calculating correlation. libfftw3.so will be dynamically
linked to f2py generated module without any issue. However I had hard
time forcing the compiler to search for 'fftw.h' header file. After
searching the web I realized that if the subroutine is saved with ".f90"
extension the preprocessor does not run correctly to include the header
but saving fortran files with " .F90" extension solves the problem!
The produced mxcorr module by f2py is imported in "forFromPy.py" script
and used. Full codes are given and the procedure should be apparent
from the example.
A very good tutorial on f2py can be found here :
http://www.ucs.cam.ac.uk/docs/course-notes/unix-courses/pythonfortran/files/f2py.pdf
Note
: It is important to check the call signature of the functions produced
by f2py. Issuing the command help(mxcorr) in python prompt produces
the output:
NAME
mxcorr
FILE
/home/ahsan/MEGA/javaProgramsPostProcessing/NeutralChannel/turbStruc3D_corr3D/testJava/mxcorr.so
DESCRIPTION
This module 'mxcorr' is auto-generated with f2py (version:2).
Functions:
status,res =
multilevelcrosscorrfftm(arr3d,template,nx=shape(arr3d,0),ny=shape(arr3d,1),nz=shape(arr3d,2),snx=shape(template,0),sny=shape(template,1))
compare this with the original fortran call signature
subroutine multilevelcrosscorrfftm(status, res, arr3d, nx, ny, nz, template, snx, sny)
implicit none
include 'fftw3.f'
integer,intent(in) :: nx, ny, nz, snx, sny
integer, intent(out) :: status
real(kind=8),dimension(nx,ny,nz), intent(in) :: arr3d
real(kind=8),dimension(nx,ny,nz), intent(out) :: res
It is
obvious that f2py has modified the call signature. Not only it has
removed the "intent(out)" dummy arguments from the call signature but
also it has rearranged the position of input arguments. It has moved
'template' argument from 7 th position to 2nd (next to arr3d)! Also note
'intent(out)' variables (status, res) are being returned in a tuple in
the order they appeared in original fortran subroutine.
file : forFromPy.py
[code language="python"]import numpy as np
from scipy import signal
import mxcorr
import time
import matplotlib.pyplot as plt
nx = 200; ny = 200;
imgf = np.random.random((nx,ny,1))
temf = imgf[:,:,0]
stime = time.time()
status, corrf = mxcorr.multilevelcrosscorrfftm(imgf,temf,nx,ny,1,nx,ny)
etime = time.time()
print "Elapsed time, fortran call :", etime-stime," sec";
corrp = np.zeros((nx,ny));
stime = time.time()
corrp[:,:] = signal.correlate2d(imgf[:,:,0], temf, mode='same')
etime = time.time()
print "Elapsed time, python call :", etime-stime, " sec";
corrp = corrp/(sum(sum(temf**2))*sum(sum(imgf**2)))**0.50
plt.subplot(2, 2, 1)
plt.imshow(corrf[:,:,0], cmap='RdBu', interpolation='nearest')
plt.colorbar()
plt.subplot(2, 2, 2)
plt.imshow(corrp[:,:], cmap='RdBu', interpolation='nearest')
plt.title('Correlation results')
plt.colorbar()
plt.show()
[/code]
Elapsed time, fortran call : 0.053929805755 sec
Elapsed time, python call : 5.06791615486 sec
Speed Gain : 101 times !!
Well
this is one of those extreme cases! AS has been pointed out, If it is
needed to calculate correlation between very large arrays, it will not
be wise to use the regular template matching routines like
Matlab's 'xcorr2' or in this case 'correlate2d' from scipy.signal. I don
not know the exact algorithm that are being used in these
functions but I have a feeling that the correlation/convolution is being
calculated in this cases using the trivial term by term
multiplications. The trivial algorithm "slide the template image over
all elements of the target image and multiply term by term and take the sum" will have a computational cost of approximately (nx * ny * nz * (nx * ny+ nx * ny)) whereas,
calculating the correlation/convolution using optimized fft libraries
like fftw3 will incur a total computational cost of approximately
(nz * (nx * ny * log(nx * ny))). So the speed up in the execution time
is not surprising at all. A very similar function like
"multilevelcrosscorrfftm" can be written in python using pythons fft
routines and I guess it will be as fast as the fortran routine. The
point is, it is important to realize early what algorithm we are using
for our calculations. If our data file is small, say, couple of megabytes
we will not notice the performance difference. As soon as data gets
large and large, picking the right algorithm becomes a paramount
task otherwise, one may find himself in utter frustration like me ! I
particularly choose fortran in this case because other calculations
involved in my project will make use of extensive loop operations, at
least 3 level deep for loops and data set is also large. Both Matlab and
Python have got bad reputation regarding loop executions. I am just
being cautious from the beginning and writing routines using fortran.
file : multiLevelCrossCorrFFTM.F90
[code language="cpp"]
! author : ahsan, cfd.ahsan@outlook.com, Aug 2015
subroutine multilevelcrosscorrfftm(status, res, arr3d, nx, ny, nz, template, snx, sny)
! this subroutine calcuates the correlation between the template image (template)
! and target image (arr3d). the template image is a 2d array and can be smaller than
! than any of the x-y plane of the target image array (arr3d)
! the target image is assumed to be a 3d array and each x-y plane is treated
! as a separate image. the resulting correlation array has the size of the target
! array. nx, ny, nz are the number of row (assued x-direction),
! column (assumed y-direction), and page(assumed z-direction) of the arr3d array.
!` snx and sny are the number of row and column of the template image
implicit none
include 'fftw3.f'
integer, intent(in):: nx, ny, nz, snx, sny
real(kind=8),dimension(nx,ny,nz), intent(in) :: arr3d
real(kind=8),dimension(snx,sny), intent(in) :: template
integer, intent(out) :: status
real(kind=8),dimension(nx,ny,nz), intent(out) :: res
real(kind=8), dimension(:,:), allocatable :: planeData, templateData,xx,yy
real(kind=8), dimension(:), allocatable :: onerow, onecol
complex(kind=8), dimension(:,:), allocatable :: templatefft, planeftt, corr
integer :: i,j,levelRow, levelCol, templateRow, templateCol, m ,n,nrow,ncol
integer(kind=8) :: planfd, planbd, planft, planbt
status = 0 ! status 0 mean success
! ! add an additional row or column at the end to make data and template size odd
if(mod(nx,2) == 0 .and. mod(ny,2)==0) then
levelRow = nx+1; levelCol = ny+1
else if(mod(nx,2) == 0 .and. mod(ny,2)==1) then
levelRow = nx+1; levelCol= ny;
else if(mod(nx,2) == 1 .and. mod(ny,2)==0) then
levelRow = nx; levelCol= ny+1;
else if(mod(nx,2) == 1 .and. mod(ny,2)==1) then
levelRow = nx; levelCol = ny;
end if
if(mod(snx,2) == 0 .and. mod(sny,2)==0) then
templateRow = snx+1; templateCol = sny+1;
else if(mod(snx,2) == 0 .and. mod(sny,2)==1) then
templateRow = snx+1; templateCol= sny;
else if(mod(snx,2) == 1 .and. mod(sny,2)==0) then
templateRow = snx; templateCol = sny+1;
else if(mod(snx,2) == 1 .and. mod(sny,2)==1) then
templateRow= snx; templateCol = sny;
end if
allocate(planeData(levelRow+templateRow -1, levelCol+templateCol-1), &
STAT = status)
allocate(templateData(levelRow+templateRow -1, levelCol+templateCol-1), &
STAT = status)
allocate(templatefft(size(templateData,1)/2+1, size(templateData,2)), &
STAT=status)
allocate(planeftt(size(planeData,1)/2+1, size(planeData,2)), STAT=status)
allocate(corr(size(planeData,1)/2+1, size(planeData,2)), STAT=status)
allocate(onerow(size(planeData,1)), onecol(size(planeData,2)))
allocate(xx(size(planeData,1), ((size(planeData,2))-1)/2))
allocate(yy((size(planeData,1)-1)/2,size(planeData,2)))
nrow = size(templateData,1); ncol = size(templateData,2);
do i=1,nz
planeData = 0.0
templateData = 0.0
xx = 0.0; yy = 0.0;
planeData(1:size(arr3d(:,:,i),1), 1:size(arr3d(:,:,i),2)) = arr3d(:,:,i)
templateData(1:size(template(:,:),1), 1:size(template(:,:),2)) = template
call dfftw_plan_dft_r2c_2d(planfd,size(planeData,1),size(planeData,2), &
planeData,planeftt,fftw_estimate)
call dfftw_execute_dft_r2c(planfd, planeData,planeftt)
call dfftw_destroy_plan(planfd)
call dfftw_plan_dft_r2c_2d(planft,size(templateData,1), &
size(templateData,2), templateData,templatefft,fftw_estimate)
call dfftw_execute_dft_r2c(planft, templateData,templatefft)
call dfftw_destroy_plan(planft)
corr = planeftt * conjg(templatefft)
call dfftw_plan_dft_c2r_2d(planbd,size(planeData,1), &
size(planeData,2),corr, planeData, fftw_estimate)
call dfftw_execute_dft_c2r(planbd,corr, planeData)
call dfftw_destroy_plan(planbd)
planeData = planeData / size(planeData)
! ! since fftw library does not have a fftshift function
! ! manually rearrange the data to keep highest auto
! ! correlation at center
! ! resulting correlation matrix output will conform to the matlab xcorr2 result
onecol = planeData(:,1);
xx = planeData(:,2:(size(planeData,2)-1)/2+1);
planeData(:,1:(size(planeData,2)-1)/2)= &
planeData(:, (size(planeData,2)-1)/2+2:size(planeData,2));
planeData(:, (size(planeData,2)-1)/2+2:size(planeData,2)) = xx;
planeData(:,(size(planeData,2)-1)/2+1) = onecol
onerow = planeData(1,:);
yy = planeData(2:(size(planeData,2)-1)/2+1, : )
planeData(1:(size(planeData,2)-1)/2, :)= &
planeData((size(planeData,2)-1)/2+2:size(planeData,2),: )
planeData((size(planeData,2)-1)/2+2:size(planeData,2),:) = yy
planeData((size(planeData,2)-1)/2+1,:) = onerow
! ! normalize the result to get maximum correlation of 1
planeData = planeData/dsqrt(sum(arr3d(:,:,i)**2)*sum(template**2))
! ! return the middle portion of resultant matrix that corresponds
! ! to the size of the input matrix
res(:,:,i) = planeData((templateRow-1)/2+2:size(planeData,1)- &
(templateRow-1)/2+1, &
(templateCol-1)/2+2:size(planeData,2)-(templateCol-1)/2+1)
end do
deallocate(planeData, onecol, onerow,planeftt, templatefft,corr, STAT=status)
deallocate(templateData, xx, yy, STAT=status)
end subroutine multilevelcrosscorrfftm
[/code]
!! compiling with f2py to make a python callable module
!! f2py -I/usr/include -c -L/usr/lib/x86_64-linux-gnu -lfftw3 -m mxcorr multiLevelCrossCorrFFTM.F90