Introduction
Following the suggestion mentioned in the conclusion of the previous article, the objective of this post is to speed up the computation of the 2D constant boundary element influence coefficients, $\mathbb{G}$ and $\mathbb{Q}$, by two methods and compare their performances.
Methods
The two methods to accelerate Python code that are studied in this post are Numba and F2PY. They are further examined in the following sections.
Numba
Numba is a open-source just-in-time compiler that translates a subset of Python and NumPy into fast machine code. The most common way of using Numba is through its decorators, which can be applied to the functions the user wants to compile. The following code snippet displays the application of Numba decorator jit
to a class static method that computes the influence coefficients $\mathbb{G}$ and $\mathbb{Q}$ by analytical integration.
Class static method accelerated with Numba
# Extract from element.ipynb
import numpy as np
import numba
eps = np.finfo(np.float64).eps
class Element:
@staticmethod
@numba.jit('Tuple((f8, f8))(f8[:], f8)', nopython=True, cache=True)
def _analytical_numba(field_local, element_length):
"""Get influence coefficients at a field point using analytical integration."""
a = 0.5 * element_length
x, y = field_local
if np.abs(y) <= element_length * eps \
and np.abs(np.abs(x) - a) <= element_length * eps:
G = a / np.pi * (np.log(2*a) - 1)
Q = 0.0
else:
xpa = x + a
xma = x - a
r1 = np.sqrt(xma**2 + y**2)
r2 = np.sqrt(xpa**2 + y**2)
t1 = np.arctan2(y, xma)
t2 = np.arctan2(y, xpa)
G = 0.5 / np.pi * (
y * (t1 - t2) - xma * np.log(r1) + xpa * np.log(r2) - 2 * a
)
if np.abs(y) <= element_length * eps:
# Q is discontinuous in |x| < a and y = 0.
Q = 0.0
else:
Q = -0.5 / np.pi * (t1 - t2)
return G, Q
The decorator could have been applied to a function, but it was used to a static method of the class Element
since this calculation is exclusive to its objects. In Numba version 0.61.0, applying the decorator to an instance method was not possible, as the jit
compiler could not interpret the self
argument.
The used arguments to numba.jit
are described below:
'Tuple((f8, f8))(f8[:], f8)'
: this is the signature that represents the types of the arguments, wheref8
means 8-byte float number. The first partTuple((f8, f8))
indicates two float outputs, the second part(f8[:], f8)
describe one float array and one float variable as inputs. In general, the signature is optional, but it’s necessary whencache=True
is used.nopython=True
: this argument tells Numba to run the static method entirely without the involvement of the Python interpreter. This is the recommended practice for best performance.cache=True
: this option makes Numba use a cached version of the compiled method. This avoids recompilation every time the routine is called for the first time.
F2PY
F2PY, already explored in another post, is a tool that builds Python wrappers for Fortran routines, allowing the speed of Fortran compiled code to be used from within the Python environment. Below is the Fortran code to be wrapped by F2PY.
Fortran subroutine
! Extract from incoef.f90
module incoef
implicit none
private
public :: gauss_fortran
contains
subroutine gauss_fortran(field_local, element_length, G, Q)
! Get influence coefficients at a field point using Gauss-Legendre quadrature.
implicit none
integer, parameter :: dp = kind(1.0d0)
real(dp), intent(in) :: field_local(2)
real(dp), intent(in) :: element_length
real(dp), intent(out) :: G, Q
real(dp), parameter :: pi = 3.141592653589793_dp
real(dp), parameter :: roots(2) = [0.3399810435848563_dp, 0.8611363115940526_dp]
real(dp), parameter :: weights(2) = [0.6521451548625461_dp, 0.3478548451374538_dp]
real(dp) :: a, x, y
integer :: i
a = 0.5_dp * element_length
x = field_local(1)
y = field_local(2)
G = 0.0_dp
Q = 0.0_dp
do i = 1, size(roots)
G = G + weights(i) * (intG(a * roots(i)) + intG(-a * roots(i)))
Q = Q + weights(i) * (intQ(a * roots(i)) + intQ(-a * roots(i)))
end do
G = G * 0.25_dp * a / pi
Q = -Q * 0.5_dp * a / pi
contains
real(dp) function intG(t)
real(dp), intent(in) :: t
intG = log((x - t)**2 + y**2)
end function intG
real(dp) function intQ(t)
real(dp), intent(in) :: t
intQ = y / ((x - t)**2 + y**2)
end function intQ
end subroutine gauss_fortran
end module incoef
The incoef
python module is created by running:
f2py -c incoef.f90 -m incoef
After creation of a shared library (incoef.cp312-win_amd64.pyd
on the author machine), the gauss_fortran
subroutine can then be called from a Python instance method like this:
# Extract from element.ipynb
import incoef
class Element:
# [...]
def get_influence_coefficients_gauss_fortran(self, field_global):
field_local = self.get_point_local_coordinates(field_global)
return incoef.incoef.gauss_fortran(field_local, self.length)
In the code above, the first incoef
is the Python module created by F2PY. The second incoef
is the Fortran module wrapped by F2PY. The inclusion of the Fortran subroutine inside a Fortran module allows the addition of auxilary functions that can be kept private from the Python environment.
Results
The previous post compared analytical integration and the 4-point Gauss-Legendre quadrature. Here, each method is independently reassessed, enhanced by the previously mentioned Numba and F2PY techniques, with focus on their performance improvements. The following plot shows the computation time for calculating the influence coefficients $\mathbb{G}$ and $\mathbb{Q}$ for several field points layed around a boundary element, the same conditions presented in the previous post.
On average, methods improved by Numba and Fortran perform similary, taking half the time of the original procedures.
Conclusion
The results indicate that both Numba and Fortran significantly enhance a Python routine performance. Numba’s ease of use encourages its further application, though it’s best suited for numerically oriented code using basic Python elements and NumPy, with limitations on other Python constructs. Conversely, Fortran offers greater flexibility in optimizing Python code, but the F2PY wrapper imposes constraints, requiring Fortran modules and procedures to follow a strict format distinct from traditional Fortran development.
References
- Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. 2015. Numba: a LLVM-based Python JIT compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC (LLVM ‘15). Association for Computing Machinery, New York, NY, USA, Article 7, 1–6. https://doi.org/10.1145/2833157.2833162
- NumPy Developers. 2024. F2PY. Distributed as part of NumPy. https://numpy.org/doc/stable/f2py/