This page was generated from unit-2.11-matrixfree/matrixfree.ipynb.
2.11 Matrix free operator application¶
Usually, we assemble matrices in sparse matrix format. Certain methods allow to improve performance and reduce memory requirements considerably by avoiding building and storing the system matrix. The counterpart of this approach is that it is now difficult to build preconditioners
Hybrid mixed method¶
consider the hybrid mixed method
where the matrix \(A\) comes from
and the matrix \(B\) from
For \(\sigma\) we use a VectorL2
space, with a Piola-mapping to the physical elements
[1]:
from ngsolve import *
from ngsolve.webgui import Draw
from netgen.geom2d import unit_square
from ngsolve.fem import MixedFE
mesh = Mesh(unit_square.GenerateMesh(maxh=0.3))
order=1
Sigma = VectorL2(mesh, order=order, piola=True)
Vt = L2(mesh, order=order-1)
Vf = FacetFESpace(mesh, order=order)
V = Vt*Vf
sigma,tau = Sigma.TnT()
(ut,uf), (vt,vf) = V.TnT()
[2]:
a = BilinearForm(Sigma)
a += sigma*tau * dx
b = BilinearForm(trialspace=Sigma, testspace=V)
b += div(sigma) * vt * dx
n = specialcf.normal(mesh.dim)
dS = dx(element_boundary=True)
b += -sigma*n*vf * dS
b.Assemble()
[2]:
<ngsolve.comp.BilinearForm at 0x7f961943b0f0>
compute element matrices …
we should observe that the matrices from the \(A\) operator are block-diagonal with \(2\times 2\) blocks, and each element matrix from \(B\) belongs to one of two equivalence classes.
[3]:
for el in mesh.Elements():
felS = Sigma.GetFE(el)
trafo = mesh.GetTrafo(el)
elmat = a.integrators[0].CalcElementMatrix(felS, trafo)
print (elmat)
0.5 0 0 0.5 0 0
0 0.25 0 0 0.25 0
0 0 0.0833333 0 0 0.0833333
0.5 0 0 1 0 0
0 0.25 0 0 0.5 0
0 0 0.0833333 0 0 0.166667
0.520958 0 0 0.183569 0 0
0 0.260479 0 0 0.0917847 0
0 0 0.0868264 0 0 0.0305949
0.183569 0 0 0.544569 0 0
0 0.0917847 0 0 0.272285 0
0 0 0.0305949 0 0 0.0907615
0.569183 0 0 0.129853 0 0
0 0.284592 0 0 0.0649265 0
0 0 0.0948639 0 0 0.0216422
0.129853 0 0 0.46885 0 0
0 0.0649265 0 0 0.234425 0
0 0 0.0216422 0 0 0.0781417
0.631472 0 0 0.217655 0 0
0 0.315736 0 0 0.108827 0
0 0 0.105245 0 0 0.0362758
0.217655 0 0 0.470922 0 0
0 0.108827 0 0 0.235461 0
0 0 0.0362758 0 0 0.0784869
0.885812 0 0 0.316764 0 0
0 0.442906 0 0 0.158382 0
0 0 0.147635 0 0 0.0527939
0.316764 0 0 0.395501 0 0
0 0.158382 0 0 0.19775 0
0 0 0.0527939 0 0 0.0659168
1.02921 0 0 0.351738 0 0
0 0.514604 0 0 0.175869 0
0 0 0.171535 0 0 0.058623
0.351738 0 0 0.363114 0 0
0 0.175869 0 0 0.181557 0
0 0 0.058623 0 0 0.0605189
0.36027 0 0 0.100212 0 0
0 0.180135 0 0 0.0501061 0
0 0 0.0600451 0 0 0.016702
0.100212 0 0 0.721798 0 0
0 0.0501061 0 0 0.360899 0
0 0 0.016702 0 0 0.1203
0.767408 0 0 0.410378 0 0
0 0.383704 0 0 0.205189 0
0 0 0.127901 0 0 0.0683963
0.410378 0 0 0.545225 0 0
0 0.205189 0 0 0.272613 0
0 0 0.0683963 0 0 0.0908709
0.495366 0 0 0.129845 0 0
0 0.247683 0 0 0.0649223 0
0 0 0.0825611 0 0 0.0216408
0.129845 0 0 0.538712 0 0
0 0.0649223 0 0 0.269356 0
0 0 0.0216408 0 0 0.0897853
0.610455 0 0 0.132925 0 0
0 0.305228 0 0 0.0664624 0
0 0 0.101743 0 0 0.0221541
0.132925 0 0 0.438475 0 0
0 0.0664624 0 0 0.219237 0
0 0 0.0221541 0 0 0.0730791
0.606985 0 0 0.245865 0 0
0 0.303492 0 0 0.122933 0
0 0 0.101164 0 0 0.0409775
0.245865 0 0 0.511462 0 0
0 0.122933 0 0 0.255731 0
0 0 0.0409775 0 0 0.0852437
0.819927 0 0 0.2964 0 0
0 0.409964 0 0 0.1482 0
0 0 0.136655 0 0 0.0494
0.2964 0 0 0.412053 0 0
0 0.1482 0 0 0.206026 0
0 0 0.0494 0 0 0.0686754
1.03869 0 0 0.538909 0 0
0 0.519343 0 0 0.269454 0
0 0 0.173114 0 0 0.0898181
0.538909 0 0 0.520295 0 0
0 0.269454 0 0 0.260147 0
0 0 0.0898181 0 0 0.0867158
0.754369 0 0 0.256653 0 0
0 0.377184 0 0 0.128327 0
0 0 0.125728 0 0 0.0427755
0.256653 0 0 0.418722 0 0
0 0.128327 0 0 0.209361 0
0 0 0.0427755 0 0 0.069787
0.443235 0 0 0.129495 0 0
0 0.221617 0 0 0.0647473 0
0 0 0.0738724 0 0 0.0215824
0.129495 0 0 0.601868 0 0
0 0.0647473 0 0 0.300934 0
0 0 0.0215824 0 0 0.100311
0.688846 0 0 -0.0113756 0 0
0 0.344423 0 0 -0.00568782 0
0 0 0.114808 0 0 -0.00189594
-0.0113756 0 0 0.363114 0 0
0 -0.00568782 0 0 0.181557 0
0 0 -0.00189594 0 0 0.0605189
0.843809 0 0 0.442685 0 0
0 0.421905 0 0 0.221342 0
0 0 0.140635 0 0 0.0737808
0.442685 0 0 0.52852 0 0
0 0.221342 0 0 0.26426 0
0 0 0.0737808 0 0 0.0880866
0.626531 0 0 0.290343 0 0
0 0.313266 0 0 0.145172 0
0 0 0.104422 0 0 0.0483905
0.290343 0 0 0.533571 0 0
0 0.145172 0 0 0.266786 0
0 0 0.0483905 0 0 0.0889285
0.589162 0 0 0.149169 0 0
0 0.294581 0 0 0.0745846 0
0 0 0.0981937 0 0 0.0248615
0.149169 0 0 0.462099 0 0
0 0.0745846 0 0 0.23105 0
0 0 0.0248615 0 0 0.0770166
0.537192 0 0 0.416452 0 0
0 0.268596 0 0 0.208226 0
0 0 0.0895319 0 0 0.0694087
0.416452 0 0 0.788234 0 0
0 0.208226 0 0 0.394117 0
0 0 0.0694087 0 0 0.131372
0.569748 0 0 0.436652 0 0
0 0.284874 0 0 0.218326 0
0 0 0.094958 0 0 0.0727754
0.436652 0 0 0.773439 0 0
0 0.218326 0 0 0.38672 0
0 0 0.0727754 0 0 0.128907
0.99655 0 0 0.255208 0 0
0 0.498275 0 0 0.127604 0
0 0 0.166092 0 0 0.0425347
0.255208 0 0 0.316222 0 0
0 0.127604 0 0 0.158111 0
0 0 0.0425347 0 0 0.0527037
0.663993 0 0 0.26225 0 0
0 0.331997 0 0 0.131125 0
0 0 0.110666 0 0 0.0437084
0.26225 0 0 0.480088 0 0
0 0.131125 0 0 0.240044 0
0 0 0.0437084 0 0 0.0800147
0.641628 0 0 0.310158 0 0
0 0.320814 0 0 0.155079 0
0 0 0.106938 0 0 0.051693
0.310158 0 0 0.539562 0 0
0 0.155079 0 0 0.269781 0
0 0 0.051693 0 0 0.089927
[4]:
for el in mesh.Elements():
felS = Sigma.GetFE(el)
felV = V.GetFE(el)
fel = MixedFE(felS, felV)
trafo = mesh.GetTrafo(el)
# try integratos 0 and 1 ...
elmat = b.integrators[0].CalcElementMatrix(fel, trafo)
print (elmat)
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 -8.66913e-17 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 0 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 0 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 5.01581e-17 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 -4.04151e-17 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 5.65519e-17 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 -1.52993e-16 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 0 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 4.47698e-18 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 0 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 0 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 0.5 0 0 1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 1.5 -0.5 0 0 -1
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Geometry-free matrix multiplication¶
NGSolve provides now the possibility to benefit from the geometry independent element matrix. Just one matrix per equivalence class is stored, and the matrix vector product is performed for all elements simultaneously using dense matrix algebra.
[5]:
from netgen.csg import *
geom = CSGeometry()
geom.Add (Sphere(Pnt(50,50,50),80) \
-Cylinder(Pnt(-100,0,0),Pnt(200,0,0), 40) \
-Cylinder(Pnt(100,-100,100),Pnt(100,200,100),40)
-Cylinder(Pnt(0,100,-100), Pnt(0,100,200),40)
-Sphere(Pnt(50,50,50),50))
# geom.Draw()
mesh = Mesh(geom.GenerateMesh(maxh=25))
mesh.Curve(5)
# Draw (mesh)
[5]:
<ngsolve.comp.Mesh at 0x7f9648976c50>
Same as above, with the geom_free
flag for the BilinearForm
:
[6]:
order=6
SetHeapSize(100*1000*1000)
Sigma = VectorL2(mesh, order=order, piola=True, tp=True)
Vt = L2(mesh, order=order-1)
Vf = FacetFESpace(mesh, order=order, dirichlet=[2])
V = Vt*Vf
print ("Sigma.ndof =", Sigma.ndof, ", V.ndof =", V.ndof)
sigma,tau = Sigma.TnT()
(ut,uf), (vt,vf) = V.TnT()
b = BilinearForm(trialspace=Sigma, testspace=V, geom_free=True)
b += div(sigma) * vt * dx
n = specialcf.normal(mesh.dim)
dS = dx(element_boundary=True)
b += -sigma*n*vf * dS
b.Assemble()
Sigma.ndof = 118692 , V.ndof = 58492
[6]:
<ngsolve.comp.BilinearForm at 0x7f9619445ff0>
Check timing with geom_free=True/False
:
[7]:
vx = b.mat.CreateRowVector()
vy = b.mat.CreateColVector()
vx[:] = 1
%timeit vy.data = b.mat * vx
3.29 ms ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
L2
finite element spaces have a Mass
method generating a linear operator for multiplication with the diagonal mass matrix. In case of variable coefficients or curved elements it uses numerical integration. It also provide an inverse operator:
[8]:
ainv = Sigma.Mass(rho=1).Inverse()
Check timig with tp=True/False
:
[9]:
vx = ainv.CreateRowVector()
vy = ainv.CreateColVector()
vx[:] = 1
%timeit vy.data = ainv * vx
54.6 ms ± 66.2 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Combine the operators to form the Schur complement, which is actually a discretization of the Laplace operator:
[10]:
Laplace = b.mat @ ainv @ b.mat.T
[11]:
f = LinearForm (V)
f += 1*vt * dx
f.Assemble()
proj = Projector(V.FreeDofs(), True)
gfu = GridFunction (V)
from time import time
start = time()
with TaskManager():
solvers.CG(mat=Laplace, pre=proj, rhs=f.vec, sol=gfu.vec, maxsteps=5000, printrates=False)
print ("time =", time()-start)
time = 53.21499180793762
The iteration count is very high since we used only the most trivial preconditioner. We can add a low order coarse-grid preconditioner to get rid of the mesh dependency, and also local inverses to improve the \(p\)-dependency.
[12]:
Draw (gfu.components[1], mesh, "gfu1", draw_vol=False);
[ ]: