Parallelisation using numba of code that calls methods of a self-made class

I have never used numba before, and I am looking for a way to parallelise a Python code on the GPU without rewriting all of it.
There are two classes I made myself, called surface and system. Roughly speaking, a system is a list of surfaces.
The function to parallelise is called trace() and belongs to the class system.
Each computation to parallelise uses the TensorFlow functions surface.sag_param and surface.champsVec that define the current surface.
It loops over system.posx and system.champsx, and inside these two loops it performs a serial computation over the surfaces (a list of objects of class surface).
This computation is quite generic, and I would like to keep this flexibility.
I wonder whether it is possible to do this parallelisation without rewriting all the code inside my classes.
I just tried to write @jit(nopython=True, parallel=True) before the function trace() and to replace range() with prange() (for numChamp and numRay), but I get the error:

TypingError: Failed at nopython (nopython frontend).
Untyped global name 'system': cannot determine Numba type <class 'libsystem.system'>

Indeed, numba does not know this type.
By setting the parameter nopython to False, I get the error: TypeError: can only concatenate tuple (not "dict_values") to tuple.
Here is the shape of my code (the two classes, with the function system.trace() to parallelise).
Thank you for your help.

class system:

    def __init__(self, **kwargs):
        self.numSurfaces = 3  # number of surfaces (surface 0: fictitious surface where the incident rays are defined, surface 2: fictitious surface where the emerging rays are observed)
        self.surfaces = ...  # list of elements of class surface

        self.champsx = ...  # 1D numpy array
        self.champsz = ...  # 1D numpy array
        self.posx = ...  # 3D numpy array
        self.posy = ...  # 3D numpy array
        self.numSurfaces = len(self.surfaces)

    def trace(self):
        # function to parallelise with respect to numChamp and numRay

        # parallel computation possible
        for numChamp in range(len(self.champsx)):
            # parallel computation possible
            for numRay in range(self.posx.shape[0]):
                # computation of a point and a direction on each surface
                surf = self.surfaces[0]
                P = np.array([self.posx[numRay, numChamp, 0],
                              self.posy[numRay, numChamp, 0],
                              surf.sag_param(self.posx[numRay, numChamp, 0],
                                             self.posy[numRay, numChamp, 0],
                                             *surf.params.values())])
                Q = np.array([np.sin(self.champsz[numChamp]) * np.cos(self.champsx[numChamp]),
                              np.sin(self.champsz[numChamp]) * np.sin(self.champsx[numChamp]),
                              np.cos(self.champsz[numChamp])])

                # sequential computation
                for numSurf in range(1, self.numSurfaces):
                    surfPrev = surf
                    surf = self.surfaces[numSurf]
                    P = surf.transferPoint(P, Q)

                    if numSurf < self.numSurfaces - 1:
                        Q = surf.transferDirection(P, Q)

                    self.posx[numRay, numChamp, numSurf] = P[0]
                    self.posy[numRay, numChamp, numSurf] = P[1]

class surface:

    def __init__(self, **kwargs):

        self.params = ...  # a dictionary containing some parameter values
        self.sag_param = ...  # a TensorFlow function depending on self.params and 2 other spatial variables
        self.champsVec = ...  # idem with self.sag_param

    def bz(self, u, v, cx, conic, cy, conicy, coefsZernike):
        ...

    def transferPoint(self, point, direction):
        # a transfer function returning a point, using sag_param and params of self
        ...

    def transferDirection(self, point, direction):
        # a transfer function returning a direction, using champsVec of self
        ...

The answer is pretty much a straight no, I’m afraid. You can’t in general apply jit to methods, for starters.

But even if you pull apart that class, it still won’t work, because the jit engine can only compile code that acts on simple and familiar types: in essence, integers, floats, bools, basic container types, and numpy arrays.

See http://numba.pydata.org/numba-doc/latest/reference/pysupported.html
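
To make that concrete, here is a rough sketch of the usual workaround: move the inner numerical work into a free function that only receives NumPy arrays and scalars, and jit that instead of the method. The parabolic sag formula, the curvatures argument, and the name trace_kernel are all invented for illustration; your real transferPoint / transferDirection maths would need to be inlined in the same array-and-scalar style.

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def trace_kernel(posx, posy, champsx, champsz, curvatures):
    # Hypothetical stand-in for system.trace(): everything numba sees
    # here is an ndarray or a scalar, so it can type and compile the loop.
    for numChamp in prange(len(champsx)):
        for numRay in range(posx.shape[0]):
            # direction of the incident ray, as in the original trace()
            Q = np.empty(3)
            Q[0] = np.sin(champsz[numChamp]) * np.cos(champsx[numChamp])
            Q[1] = np.sin(champsz[numChamp]) * np.sin(champsx[numChamp])
            Q[2] = np.cos(champsz[numChamp])
            for numSurf in range(1, curvatures.shape[0]):
                x = posx[numRay, numChamp, numSurf - 1]
                y = posy[numRay, numChamp, numSurf - 1]
                # invented parabolic sag standing in for surf.sag_param(...)
                sag = 0.5 * curvatures[numSurf] * (x * x + y * y)
                posx[numRay, numChamp, numSurf] = x + sag * Q[0]
                posy[numRay, numChamp, numSurf] = y + sag * Q[1]

The system object can stay as it is; trace() becomes a thin wrapper that unpacks self.posx, self.posy, and so on into arrays and calls trace_kernel. Note also that parallel=True targets the CPU cores; getting onto the GPU with numba means rewriting the kernel again with the separate numba.cuda API.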

Thank you John, I will try to rewrite my code following this documentation.
Do you think that numba is the best way for beginners like me to parallelise a code on the GPU?
Many thanks

You’re welcome @drogoua

I love numba, but it does take some time to understand its limitations and how to use it best.

If you’re looking to implement machine learning routines, then I believe tensorflow already implements parallelization under the hood in some of its routines, and that parallelization can target the GPU. But I don’t have experience using it myself.
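
Since you mentioned your sag_param is already a TensorFlow function, something along these lines might be a starting point. This is untested on my side, and the sag formula is just a placeholder: the idea is simply that you hand TF whole tensors of ray coordinates rather than looping ray by ray, and let it parallelise internally.

import tensorflow as tf

@tf.function  # traces the Python body into a TF graph
def sag_batch(x, y, c):
    # placeholder parabolic sag evaluated for whole tensors at once;
    # TF dispatches the ops to the GPU automatically if one is visible
    return 0.5 * c * (x * x + y * y)

x = tf.random.uniform([100000])
y = tf.random.uniform([100000])
z = sag_batch(x, y, tf.constant(0.01))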

Good luck with your studies.

John.

Hello John,
I was trying to parallelise my program using JIT, but it is not speeding up. Could you please help?
Here is my code…

import time
import datetime
import numpy as np
from math import pi
import numba
from numba import jit, njit, double, vectorize, float64, int64

#%% parameters for the calculations
mu0 = 4e-7 * pi
h_planck = 6.58212e-4  # meV*ns
mub = 5.78e-2  # meV/T
g = 2
s = 2
T = 2.0  # K
dt = 0.5e-5
Kb = 8.6e-2
Kjt = 1.5
gamma = (g * mub) / h_planck  # 1/(T*ns)
alpha = 1
mus = mub * g * s
eA=np.array([-np.sqrt(2.0/3.0),0.0,-np.sqrt(1.0/3.0)])
eB=np.array([-np.sqrt(1.0/6.0),-np.sqrt(1.0/2.0),np.sqrt(1.0/3.0)])
eC=np.array([-np.sqrt(1.0/6.0),np.sqrt(1.0/2.0),np.sqrt(1.0/3.0)])
@njit
def dot(S1,eA,eB,eC):
    result1=0.0
    result2=0.0
    result3=0.0
    for i in range(3):
        result1 += S1[i]*eA[i]
        result2 += S1[i]*eB[i]
        result3 += S1[i]*eC[i]

    return result1,result2,result3
@njit
def jahnteller1(S1):
    global Kjt
    M,N,O=dot(S1,eA,eB,eC)
    P,Q,R=M**5,N**5,O**5
    X=3.0*Kjt*((eA*P+eB*Q+eC*R))
    return X/mus
@njit
def thermal1():
    mu, sigma = 0, 1 # mean and standard deviation
    G = np.random.normal(mu, sigma, 3)
    Hth1=G*np.sqrt((2*alpha*Kb*T)/(gamma*mus*dt))
    return Hth1
#%% calculation of effective field
@njit
def h_eff(B,S1,eH):
    Heff1 = eH*B+jahnteller1(S1)+thermal1()
    return  Heff1
#%% evaluating cross products
@njit
def cross1(S1,heff1):
    result1=np.zeros(3)
    a1, a2, a3 = S1[0], S1[1], S1[2]
    b1, b2, b3 = heff1[0], heff1[1],heff1[2]
    result1[0] = a2 * b3 - a3 * b2
    result1[1] = a3 * b1 - a1 * b3
    result1[2] = a1 * b2 - a2 * b1
    return result1
@njit
def cross2(S1,X):
    result2=np.zeros(3)
    a1, a2, a3 = S1[0],S1[1],S1[2]
    c1, c2, c3 = X[0],X[1],X[2]
    result2[0] = a2 * c3 - a3 * c2
    result2[1] = a3 * c1 - a1 * c3
    result2[2] = a1 * c2 - a2 * c1
    return result2
#%% Main function to calculate the Spin S1 by calculating the effective field
@njit
def llg(S1,dt, B,eH):
    global gamma,alpha
    N_init = int(5)
    for i in range(N_init):
        heff1 = h_eff(B,S1,eH)
        X=cross1(S1,heff1)
        Y=cross2(S1,X)
        dS1dt = - gamma/(1+alpha**2) * X \
           - alpha*gamma/(1+alpha**2) * Y
        S1 += dt * dS1dt
        normS1 = np.sqrt(S1[0]*S1[0]+S1[1]*S1[1]+S1[2]*S1[2])
        S1 = S1/normS1
    Savg=np.array([0.0,0.0,0.0])
    Navg=N_init*10
    for i in range(Navg):
        heff1 = h_eff(B,S1,eH)
        X=cross1(S1,heff1)
        Y=cross2(S1,X)
        dS1dt = - gamma/(1+alpha**2) * X \
           - alpha*gamma/(1+alpha**2) * Y
        S1 += dt * dS1dt
        normS1 = np.sqrt(S1[0]*S1[0]+S1[1]*S1[1]+S1[2]*S1[2])
        S1 = S1/normS1
        Savg=Savg+S1
    Savg=Savg/Navg
    return Savg  
#%% calculating dot product
@njit
def dott(S1,K):
    result=0.0
    for i in range(3):
        result += S1[i]*K[i]
    return result


#%% initialising magn
magn = np.zeros([25, 3])
Th = []
Ph = []
B = 5.0
theta = np.linspace(0.0, np.pi, 5)
phi = np.linspace(0.0, 2 * np.pi, 5)
for i in range(len(phi)):
    for j in range(len(theta)):
        M, N = phi[i], theta[j]
        Th.append(N)
        Ph.append(M)

#%% calling the main function
for i in range(25):
    magn[i][0]=Ph[i]
    magn[i][1]=Th[i]
    eH=np.array([np.sin(Th[i])*np.cos(Ph[i]),np.sin(Th[i])*np.sin(Ph[i]),np.cos(Th[i])])
    normH = np.sqrt(eH[0]*eH[0]+eH[1]*eH[1]+eH[2]*eH[2])
    eH=eH/normH
    S1=np.array([np.sin(Th[i])*np.cos(Ph[i]),np.sin(Th[i])*np.sin(Ph[i]),np.cos(Th[i])])
    S1=llg(S1,dt,B,eH)
    K=eH*B
    Z=dott(S1,K)
    E=-Z*g*mub*s
    magn[i][2]=E

#%% printing magn
print(magn)

Hi @physics, is this the part you are parallelising?

If all your functions are jitted, then I found numba prange works really well…

#%% calling the main function
for i in range(25):
    magn[i][0]=Ph[i]
    magn[i][1]=Th[i]
    eH=np.array([np.sin(Th[i])*np.cos(Ph[i]),np.sin(Th[i])*np.sin(Ph[i]),np.cos(Th[i])])
    normH = np.sqrt(eH[0]*eH[0]+eH[1]*eH[1]+eH[2]*eH[2])
    eH=eH/normH
    S1=np.array([np.sin(Th[i])*np.cos(Ph[i]),np.sin(Th[i])*np.sin(Ph[i]),np.cos(Th[i])])
    S1=llg(S1,dt,B,eH)
    K=eH*B
    Z=dott(S1,K)
    E=-Z*g*mub*s
    magn[i][2]=E


Dear Akshay Shankar,
Thank you for your reply. It's great to know I can speed it up.
I want to parallelise the part you mentioned. What are the changes I should make?
Could you please tell me?

Hi Physics.

I have not had a chance to go through your code. Can I suggest starting with a simple minimal working example, as they say, and applying prange to that? prange seems a staple, so there are lots of examples on Stack Exchange. E.g.:
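
Something like this minimal sketch, where the work inside the loop is just a placeholder to give the threads something to chew on:

import numpy as np
from numba import njit, prange

@njit(parallel=True)
def demo(n):
    out = np.zeros(n)
    # iterations are independent, so numba can farm them out to threads
    for i in prange(n):
        acc = 0.0
        for j in range(1000):  # placeholder per-iteration work
            acc += np.sin(i * 1e-3 + j)
        out[i] = acc
    return out

demo(10)               # first call triggers compilation
result = demo(100000)  # subsequent calls run in parallel

Once that shows a speed-up over the plain range version, the same pattern (njit(parallel=True) on a function whose outer loop is a prange over independent iterations) should carry over to your loop over the 25 field directions.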