This page was generated from unit-5a.1-mpi/poisson_mpi.ipynb.
5.1 Poisson Equation in Parallel
NGSolve can be executed on a cluster using MPI (the Message Passing Interface). You can download poisson_mpi.py and run it as
mpirun -np 4 python3 poisson_mpi.py
This computes and saves a solution, which we can visualize with drawsolution.py:
netgen drawsolution.py
For proper parallel execution, Netgen/NGSolve must be configured with -DUSE_MPI=ON. Recent binaries for Linux and Mac should be built with MPI support. If you are unsure whether your Netgen/NGSolve supports MPI, look for output like “Including MPI version 3.1” during Netgen startup.
MPI-parallel execution using ipyparallel
For the MPI Jupyter tutorials we use the ipyparallel module. Please consult https://ipyparallel.readthedocs.io for installation instructions.
On my laptop I created the profile once with the command
ipython profile create --parallel --profile=mpi
Since I have only two physical cores, I edited the file .ipython/profile_mpi/ipcluster_config.py to allow for oversubscription:
c.MPILauncher.mpi_args = ['--oversubscribe']
I start the cluster via
ipcluster start --engines=MPI -n 4 --profile=mpi
In Jupyter, we can then connect to the cluster via a Client object of ipyparallel.
[1]:
from ipyparallel import Client
c = Client(profile='mpi')
c.ids
[1]:
[0, 1, 2, 3]
We use mpi4py (https://mpi4py.readthedocs.io/) for issuing MPI calls from Python. The %%px cell magic causes the cell to be executed in parallel on all engines:
[2]:
%%px
from mpi4py import MPI
comm = MPI.COMM_WORLD
print (comm.rank, comm.size)
[stdout:0] 0 4
[stdout:2] 2 4
[stdout:1] 1 4
[stdout:3] 3 4
The master process (rank==0) generates the mesh, and distributes it within the group of processes defined by the communicator. All other ranks receive a part of the mesh. The function mesh.GetNE(VOL) returns the local number of elements:
[3]:
%%px
from ngsolve import *
from netgen.geom2d import unit_square
if comm.rank == 0:
    mesh = Mesh(unit_square.GenerateMesh(maxh=0.1).Distribute(comm))
else:
    mesh = Mesh(netgen.meshing.Mesh.Receive(comm))
print (mesh.GetNE(VOL))
[stdout:2] 79
[stdout:0] 0
[stdout:1] 73
[stdout:3] 78
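The local counts above can be combined into global figures. The following is a minimal sketch in plain Python that only reuses the numbers printed above; the names local_ne, total_ne and imbalance are chosen here for illustration. Note that rank 0 reports zero elements in this run, so the three remaining ranks share the whole mesh.

```python
# Element counts per rank, copied from the output of the cell above.
local_ne = {0: 0, 1: 73, 2: 79, 3: 78}

# Summing over all ranks gives the size of the global mesh. Inside a %%px cell
# the same reduction could be written with mpi4py as
#     total = comm.allreduce(mesh.GetNE(VOL), op=MPI.SUM)
total_ne = sum(local_ne.values())

# Load balance among the ranks that actually hold elements:
# largest local count divided by the average local count.
worker_counts = [ne for rank, ne in local_ne.items() if rank != 0]
imbalance = max(worker_counts) / (sum(worker_counts) / len(worker_counts))

print("total elements:", total_ne)
print("imbalance factor:", round(imbalance, 3))
```

An imbalance factor close to 1 means that the automatic partitioning gave every working rank a similar share of the elements.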
We can define spaces, bilinear / linear forms, and gridfunctions in the same way as in sequential mode. But now, the degrees of freedom are distributed on the cluster following the distribution of the mesh. The finite element spaces define how the dofs match together.
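To make the idea of matching dofs concrete, here is a toy picture in plain Python. It is not the NGSolve API: the two ranks, the 1D mesh, and the names local_to_global and local_vectors are hypothetical and only illustrate how contributions to a dof shared between two sub-domains belong together.

```python
# Toy illustration only: plain Python, not the NGSolve API.
# A 1D mesh with 4 elements and 5 vertex dofs (global numbers 0..4) is split
# between two hypothetical ranks; global dof 2 lies on the interface.
local_to_global = {
    1: [0, 1, 2],   # rank 1 holds elements [0,1] and [1,2]
    2: [2, 3, 4],   # rank 2 holds elements [2,3] and [3,4]
}

# Each rank assembles only its own elements. For the load vector of f = 1
# with linear elements of length h, every element adds h/2 to both end dofs.
h = 0.25
local_vectors = {
    1: [h / 2, h, h / 2],
    2: [h / 2, h, h / 2],
}

# Matching the dofs: entries with the same global number are added up.
global_vector = [0.0] * 5
for rank, dofs in local_to_global.items():
    for local_index, global_index in enumerate(dofs):
        global_vector[global_index] += local_vectors[rank][local_index]

print(global_vector)
```

The interface dof receives one half of its value from each rank, so neither rank alone holds the complete entry.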
[4]:
%%px
fes = H1(mesh, order=3, dirichlet=".*")
u,v = fes.TnT()
a = BilinearForm(grad(u)*grad(v)*dx)
pre = Preconditioner(a, "local")
a.Assemble()
f = LinearForm(1*v*dx).Assemble()
gfu = GridFunction(fes)
from ngsolve.krylovspace import CGSolver
inv = CGSolver(a.mat, pre.mat, printrates=comm.rank==0, maxiter=200, tol=1e-8)
gfu.vec.data = inv*f.vec
[stdout:0] CG iteration 1, residual = 0.053126041648229295
CG iteration 2, residual = 0.07344287617289662
CG iteration 3, residual = 0.06110786204339784
CG iteration 4, residual = 0.049294272666440334
CG iteration 5, residual = 0.04641237052420435
CG iteration 6, residual = 0.030446095751156133
CG iteration 7, residual = 0.025763981457526706
CG iteration 8, residual = 0.016975392921489908
CG iteration 9, residual = 0.012989664894467545
CG iteration 10, residual = 0.011308228515895673
CG iteration 11, residual = 0.007800282946410086
CG iteration 12, residual = 0.003524698587117304
CG iteration 13, residual = 0.0017401033251838
CG iteration 14, residual = 0.0011366587144063154
CG iteration 15, residual = 0.0008035602552022152
CG iteration 16, residual = 0.0005243481976976639
CG iteration 17, residual = 0.0003741983982428153
CG iteration 18, residual = 0.00027497275542884286
CG iteration 19, residual = 0.0001722947196498317
CG iteration 20, residual = 0.00012857735127167933
CG iteration 21, residual = 8.997393301231656e-05
CG iteration 22, residual = 6.195695703999047e-05
CG iteration 23, residual = 4.865877020420862e-05
CG iteration 24, residual = 3.387607816378531e-05
CG iteration 25, residual = 2.2199678719710126e-05
CG iteration 26, residual = 1.5023055764553777e-05
CG iteration 27, residual = 1.0581004609879279e-05
CG iteration 28, residual = 7.3557518572924645e-06
CG iteration 29, residual = 5.352395923297321e-06
CG iteration 30, residual = 3.6339266638961192e-06
CG iteration 31, residual = 2.0444337740669425e-06
CG iteration 32, residual = 1.5751267049433527e-06
CG iteration 33, residual = 1.3159605053740326e-06
CG iteration 34, residual = 8.346360401330136e-07
CG iteration 35, residual = 4.6043838009577006e-07
CG iteration 36, residual = 2.945004617259998e-07
CG iteration 37, residual = 1.9185646830449405e-07
CG iteration 38, residual = 1.3661842726572128e-07
CG iteration 39, residual = 9.497862727053461e-08
CG iteration 40, residual = 6.613160428773548e-08
CG iteration 41, residual = 4.228230865738215e-08
CG iteration 42, residual = 2.7713291458349878e-08
CG iteration 43, residual = 1.8141562264858647e-08
CG iteration 44, residual = 1.27279033041476e-08
CG iteration 45, residual = 8.504452240284736e-09
CG iteration 46, residual = 5.564358237230814e-09
CG iteration 47, residual = 3.6807848439137528e-09
CG iteration 48, residual = 2.3751444014589703e-09
CG iteration 49, residual = 1.6993731516376926e-09
CG iteration 50, residual = 1.0657829474396408e-09
CG iteration 51, residual = 7.154229520918308e-10
CG iteration 52, residual = 5.014272479607597e-10
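The iteration printed above can be mimicked on a toy problem. The sketch below is a plain-Python preconditioned conjugate gradient method applied to a 1D finite-difference Poisson matrix, with a diagonal (Jacobi) preconditioner standing in for the "local" preconditioner. It does not use NGSolve or MPI. The helper names matvec, dot and pcg, the problem size, and the relative stopping test on sqrt(r·z) are assumptions made for illustration, not the library's implementation.

```python
# Toy preconditioned CG in plain Python; no NGSolve or MPI involved.

def matvec(x, h):
    """Apply the 1D finite-difference Laplacian tridiag(-1, 2, -1) / h^2."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append((2.0 * x[i] - left - right) / (h * h))
    return y

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def pcg(apply_a, apply_pre, b, tol=1e-8, maxiter=200):
    """Solve A x = b; stop when sqrt(r.z) has dropped by the factor tol."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the start x = 0
    z = apply_pre(r)
    p = list(z)
    rz = dot(r, z)
    rz0 = rz
    for it in range(1, maxiter + 1):
        ap = apply_a(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_pre(r)
        rz_new = dot(r, z)
        if rz_new <= tol * tol * rz0:
            return x, it
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, maxiter

n = 20                               # number of interior grid points
h = 1.0 / (n + 1)
diag = 2.0 / (h * h)                 # diagonal entry of the matrix
rhs = [1.0] * n                      # right-hand side f = 1

u, iterations = pcg(lambda v: matvec(v, h),
                    lambda r: [ri / diag for ri in r],   # Jacobi step
                    rhs, tol=1e-12)

# For f = 1 the exact solution u(x) = x (1 - x) / 2 is reproduced at the nodes.
exact = [0.5 * (i + 1) * h * (1.0 - (i + 1) * h) for i in range(n)]
max_error = max(abs(ui - ei) for ui, ei in zip(u, exact))
print("iterations:", iterations, "max error:", max_error)
```

In a parallel run the structure of the loop stays the same; only the inner products and the matrix and preconditioner applications need communication between the ranks.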
Parallel pickling allows us to serialize the distributed solution and transfer it to the client. The process with rank 0 gets the whole mesh and computed solution; all other processes get their local parts of the mesh and solution:
[5]:
gfu = c[:]["gfu"]
We can now draw the whole solution using the master process’s gfu[0].
[6]:
from ngsolve.webgui import Draw
Draw (gfu[0])
[6]:
BaseWebGuiScene
Drawing gfu[n] will draw only the part of the solution that the process with rank=n possesses.
[7]:
Draw (gfu[3])
[7]:
BaseWebGuiScene
We can also visualize the sub-domains obtained by the automatic partitioning, without using any computed solution, as follows.
[8]:
%%px
fesL2 = L2(mesh, order=0)
gfL2 = GridFunction(fesL2)
gfL2.vec.local_vec[:] = comm.rank
[9]:
gfL2 = c[:]["gfL2"]
Draw (gfL2[0])
[9]:
BaseWebGuiScene