How to make the scipy minimization function multithreaded

Hello everyone! I am trying to use scipy minimize to solve a minimization problem. My objective function and gradient function require multithreaded computation. However, when these functions are called from inside scipy minimize, there is no parallel or multithreaded execution, whereas there is when I run them outside of scipy minimize. Here is my code:

import numpy as np
from scipy.optimize import minimize

def objective_and_gradient(ks_flat):
    # forward solve: current solution and the data-misfit norm
    u_now, data_error_norm = heat_solver(k, t, T, num_steps, Lx, Ly, nx, ny,
                                         k0, ks_flat, alpha, zero_source,
                                         U_initial, u_data)

    # adjoint solve: sensitivities used to build the gradient
    r_delta, ks_delta, k0_delta = D_adjoint(k, t, T, num_steps, Lx, Ly,
                                            nx, ny, k0, ks_flat, alpha,
                                            u_now, U_initial, u_data)
    # scale the gradient direction relative to the size of ks_flat
    grad = ks_delta / np.linalg.norm(ks_delta) * np.linalg.norm(ks_flat) * 0.001
    return data_error_norm, grad

result = minimize(
    fun=objective_and_gradient,
    x0=ks_initial,
    method='BFGS',
    jac=True,
    options={'maxiter': N_k, 'disp': True}
)

Here the functions heat_solver and D_adjoint are multithreaded.
I want to know whether scipy minimize is able to run in a multithreaded way. If it is, how do I do it?

Can you expand on what you mean when you say that they are multithreaded and that there is multithreaded execution when you run them outside of minimize()?

minimize() mostly uses serial algorithms that have no option for parallelism in how they call the fun callable. The next input to try is usually determined from the previous inputs, so most of the algorithms can’t generate N new inputs to evaluate in parallel; fun needs to be synchronous, at the very least. Whether it uses multithreading internally or not is an implementation detail, and the code you posted looks like a perfectly fine synchronous interface. What you are doing in the other code that makes it “run multithreaded” is the missing piece that we don’t know.
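To illustrate the distinction with a toy example (a minimal sketch I made up, not your solver): minimize() evaluates fun one point at a time, but nothing stops fun from parallelizing its own internal work, for example with a process pool.

import numpy as np
from concurrent.futures import ProcessPoolExecutor
from scipy.optimize import minimize

def expensive_term(xi):
    # stand-in for one chunk of an expensive forward solve
    return (xi - 1.0) ** 2

def fun(x):
    # minimize() calls this synchronously, one x at a time, but the
    # evaluation itself is free to fan its work out over processes
    with ProcessPoolExecutor() as pool:
        return sum(pool.map(expensive_term, x))

if __name__ == "__main__":
    result = minimize(fun, x0=np.zeros(4), method="BFGS")
    print(result.x)

The optimizer itself stays serial here; only the work inside each evaluation is parallel.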

Thanks for your reply! In my functions heat_solver and D_adjoint, I use MPI so that several cores solve the heat equation at the same time. The communicator I use is MPI.COMM_WORLD. When I run these two functions on their own, all available cores are used. However, when I run scipy minimize, which calls these functions, only one core is used. I would like these two functions to keep using MPI internally when minimize calls them.

What does the code look like when you call these functions outside of minimize() (and it uses MPI as you desire)?

I just import MPI and call these functions, and simply run the code with python. Here are some details from my heat_solver function:

import numpy as np
from mpi4py import MPI
from petsc4py import PETSc
from dolfinx import mesh
from dolfinx.mesh import create_rectangle

def heat_solver(ks):
    # the mesh is distributed across every rank in MPI.COMM_WORLD
    domain = create_rectangle(MPI.COMM_WORLD, [np.array([0, 0]), np.array([Lx, Ly])],
                              [nx, ny], mesh.CellType.triangle)
    solver = PETSc.KSP().create(domain.comm)
    solver.setType(PETSc.KSP.Type.GMRES)
    solver.getPC().setType(PETSc.PC.Type.ILU)

First the domain is created, and all available cores are used because of MPI.COMM_WORLD.
Then PETSc solves the linear problem, where domain.comm is the communicator that was used when creating the domain.
I think the problem is that when I run scipy, only one core is used, so when my objective function is called it does not use MPI as I intend.

What does the code outside of heat_solver() look like? I suspect that somewhere, it’s setting up MPI to use multiple processes. It might be obscured a bit; other packages might be doing the setup for you, as a convenience, and you just aren’t using those packages in the same way when you set up your minimize()-using script. scipy knows nothing about MPI and won’t set it up for you.
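As a quick sanity check (a small sketch of my own, not something from your code), you could print what the world communicator looks like in both scripts; if the minimize()-using script reports a size of 1, then MPI was never set up with more than one process there in the first place.

from mpi4py import MPI

comm = MPI.COMM_WORLD
# Get_size() is the number of MPI processes the script was launched with
# (e.g. via mpirun -n 4 python script.py); Get_rank() identifies this process
print(f"rank {comm.Get_rank()} of {comm.Get_size()}")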

I just call the function without doing anything else. First I define some parameters, just numpy arrays. Then I call heat_solver() and I see the multiple processes. Here is my code:

import numpy as np
from solver2 import heat_solver
k = 1               # degree of finite element test function space
t = 0               # start time
T = 1200            # end time
num_steps = 1000    # number of time steps
nx, ny = 200, 200   # grid points
Lx, Ly = 400, 400   # rectangle side lengths
alpha_true = 0.003  # decay coefficient
N_k = 60            # number of iterations
u_initial = np.zeros((k*nx+1)*(k*ny+1))
zero_init = np.zeros((k*nx+1)*(k*ny+1))
zero_source = np.zeros((num_steps, (k*nx+1)*(k*ny+1)))
one_source = np.ones((num_steps, (k*nx+1)*(k*ny+1)))

k0_true = np.ones((k*nx+1)*(k*ny+1))
ks_true = np.ones((k*nx+1)*(k*ny+1))
u_data, u_data_norm = heat_solver(k, t, T, num_steps, Lx, Ly, nx, ny, k0_true, ks_true,
                                  alpha_true, zero_source, u_initial, zero_source)

Cool. Next I would try inserting just import scipy.optimize into this script above the solver2 import. Don’t call anything from it, just import it. It’s possible that one of the FORTRAN modules adjusts something that MPI detects and decides to avoid using multiprocessing.
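Something like this at the top of your script (a sketch of that check; nothing from scipy.optimize is actually called):

import scipy.optimize  # imported first, purely to see whether it changes MPI's behaviour
import numpy as np
from solver2 import heat_solver

# ... then the same parameter definitions and heat_solver() call as in your script above

If heat_solver() still uses all the cores with that import in place, that theory is ruled out.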

Then I would try your optimization script again, but pick a pure Python optimizer. I think the trust-region ones are all pure Python. It’s possible that the machinery that handles FORTRAN callbacks is setting up the environment in such a way that MPI will infer that it should not use multiprocessing.
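For example (a sketch; 'trust-constr' is one of the methods implemented in pure Python, and the call is otherwise unchanged):

result = minimize(
    fun=objective_and_gradient,
    x0=ks_initial,
    method='trust-constr',  # pure-Python optimizer, no FORTRAN callback machinery
    jac=True,
    options={'maxiter': N_k, 'disp': True}
)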

Thanks for your advice! In my optimization script I also call heat_solver() outside the minimize function, and in that case MPI still uses multiprocessing. So I think the key is to set up an appropriate MPI environment for my objective function when it calls heat_solver(), which I do not know how to do LOL.
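For reference, one common pattern when an MPI-parallel solver has to run inside a serial optimizer (a hedged sketch of the general idea, not something from this thread; mpi_objective is just a wrapper name I made up around your objective_and_gradient) is to launch the whole script with mpirun, let only rank 0 drive minimize, and keep the other ranks in a loop so that every rank enters the solver together on each iteration:

from mpi4py import MPI
from scipy.optimize import minimize

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def mpi_objective(ks_flat):
    # only rank 0 reaches this function (it drives minimize); it tells the
    # other ranks to join the solve and shares the current parameters first
    comm.bcast("solve", root=0)
    ks_flat = comm.bcast(ks_flat, root=0)
    return objective_and_gradient(ks_flat)   # your existing fun from the first post

if rank == 0:
    result = minimize(mpi_objective, x0=ks_initial, method='BFGS',
                      jac=True, options={'maxiter': N_k, 'disp': True})
    comm.bcast("stop", root=0)               # release the waiting worker ranks
else:
    while True:
        command = comm.bcast(None, root=0)
        if command == "stop":
            break
        ks_flat = comm.bcast(None, root=0)
        objective_and_gradient(ks_flat)      # worker ranks take part in the same solve

Whether that fits depends on how heat_solver() and D_adjoint() expect to be called across ranks, so treat it only as a starting point rather than a drop-in fix.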