This page was generated from unit-5.5-cuda/wave_cuda.ipynb.
5.5.2 Discontinuous Galerkin for the Wave Equation¶
We solve the first order wave equation by a matrix-free explicit DG method:
\begin{eqnarray*} \frac{\partial p}{\partial t} & = & \operatorname{div} u \\ \frac{\partial u}{\partial t} & = & \nabla p \end{eqnarray*}
Using DG discretization in space we obtain the ODE system \begin{eqnarray*} M_p \dot{p} & = & -B^T u \\ M_u \dot{u} & = & B p, \end{eqnarray*}
with mass-matrices \(M_p\) and \(M_u\), and the discretization matrix \(B\) for the gradient, and \(-B^T\) for the divergence.
The DG bilinear-form for the gradient is:
where \(\{ p \}\) is the average of \(p\) on the facet.
The computational advantages of a proper version of DG is:
universal element-matrices for the differntial operator \(B\)
cheaply invertible mass matrices (orthogonal polynomials, sum-factorization)
[1]:
from ngsolve import *
from netgen.occ import *
from time import time
from ngsolve.webgui import Draw
from netgen.webgui import Draw as DrawGeo
box = Box((-1,-1,-1), (1,1,0))
sp = Sphere((0.5,0,0), 0.2)
shape = box-sp
geo = OCCGeometry(shape)
h = 0.1
mesh = Mesh( geo.GenerateMesh(maxh=h))
mesh.Curve(3)
Draw(mesh);
[2]:
order = 4
fes_p = L2(mesh, order=order, all_dofs_together=True)
fes_u = VectorL2(mesh, order=order, piola=True)
fes_tr = FacetFESpace(mesh, order=order)
print ("ndof_p = ", fes_p.ndof, "+", fes_tr.ndof, ", ndof_u =", fes_u.ndof)
traceop = fes_p.TraceOperator(fes_tr, average=True)
gfu = GridFunction(fes_u)
gfp = GridFunction(fes_p)
gftr = GridFunction(fes_tr)
gfp.Interpolate( exp(-400*(x**2+y**2+z**2)))
gftr.vec.data = traceop * gfp.vec
ndof_p = 585515 + 530340 , ndof_u = 1756545
[3]:
p = fes_p.TrialFunction()
v = fes_u.TestFunction()
phat = fes_tr.TrialFunction()
n = specialcf.normal(mesh.dim)
where \(\{ p \}\) is the average of \(p\) on the facet.
[4]:
Bel = BilinearForm(trialspace=fes_p, testspace=fes_u, geom_free = True)
Bel += grad(p)*v * dx -p*(v*n) * dx(element_boundary=True)
Bel.Assemble()
Btr = BilinearForm(trialspace=fes_tr, testspace=fes_u, geom_free = True)
Btr += phat * (v*n) *dx(element_boundary=True)
Btr.Assemble();
B = Bel.mat + Btr.mat @ traceop
[5]:
invmassp = fes_p.Mass(1).Inverse()
invmassu = fes_u.Mass(1).Inverse()
[6]:
gfp.Interpolate( exp(-100*(x**2+y**2+z**2)))
gfu.vec[:] = 0
scene = Draw (gftr, draw_vol=False, order=3, min=-0.05, max=0.05);
t = 0
dt = 0.3 * h / (order+1)**2
tend = 1
op1 = dt * invmassu @ B
op2 = dt * invmassp @ B.T
cnt = 0
with TaskManager():
while t < tend:
t = t+dt
gfu.vec.data += op1 * gfp.vec
gfp.vec.data -= op2 * gfu.vec
cnt = cnt+1
if cnt%10 == 0:
gftr.vec.data = traceop * gfp.vec
scene.Redraw()
Time-stepping on the device¶
[7]:
try:
import ngsolve.ngscuda
except:
print ("Sorry, no cuda device")
Sorry, no cuda device
[8]:
gfp.Interpolate( exp(-100*(x**2+y**2+z**2)))
gfu.vec[:] = 0
scene = Draw (gftr, draw_vol=False, order=3);
t = 0
dt = 0.5 * h / (order+1)**2 / 2
tend = 0.1
op1 = (dt * invmassu @ B).CreateDeviceMatrix()
op2 = (dt * invmassp @ B.T).CreateDeviceMatrix()
devu = gfu.vec.CreateDeviceVector(copy=True)
devp = gfp.vec.CreateDeviceVector(copy=True)
cnt = 0
with TaskManager():
while t < tend:
t = t+dt
devu += op1 * devp
devp -= op2 * devu
cnt = cnt+1
if cnt%10 == 0:
gfp.vec.data = devp
gftr.vec.data = traceop * gfp.vec
scene.Redraw()
[9]:
print (op1.GetOperatorInfo())
ProductMatrix, h = 1756545, w = 585515
ScaleMatrix, scale = 0.001, h = 1756545, w = 1756545
SumMatrix, h = 1756545, w = 1756545
SumMatrix, h = 1756545, w = 1756545
BlockDiagonalMatrixSoA (bs = 3x3), h = 1756545, w = 1756545
ProductMatrix, h = 1756545, w = 1756545
ProductMatrix, h = 1756545, w = 18750
Transpose, h = 1756545, w = 18750
ConstantEBEMatrix (bs = 125x35), h = 18750, w = 1756545
BlockDiagonalMatrixSoA (bs = 3x3), h = 18750, w = 18750
ConstantEBEMatrix (bs = 125x35), h = 18750, w = 1756545
ProductMatrix, h = 1756545, w = 1756545
ProductMatrix, h = 1756545, w = 22125
Transpose, h = 1756545, w = 22125
ConstantEBEMatrix (bs = 125x35), h = 22125, w = 1756545
BlockDiagonalMatrixSoA (bs = 3x3), h = 22125, w = 22125
ConstantEBEMatrix (bs = 125x35), h = 22125, w = 1756545
SumMatrix, h = 1756545, w = 585515
SumMatrix, h = 1756545, w = 585515
ConstantEBEMatrix (bs = 105x35), h = 1756545, w = 585515
ConstantEBEMatrix (bs = 105x35), h = 1756545, w = 585515
ProductMatrix, h = 1756545, w = 585515
SumMatrix, h = 1756545, w = 530340
ConstantEBEMatrix (bs = 105x60), h = 1756545, w = 530340
ConstantEBEMatrix (bs = 105x60), h = 1756545, w = 530340
ProductMatrix, h = 530340, w = 585515
N4ngla14DiagonalMatrixIdEE, h = 530340, w = 530340
SumMatrix, h = 530340, w = 585515
ConstantEBEMatrix (bs = 60x35), h = 530340, w = 585515
ConstantEBEMatrix (bs = 60x35), h = 530340, w = 585515
[10]:
ts = time()
steps = 10
for i in range(steps):
devu += op1 * devp
devp -= op2 * devu
te = time()
print ("ndof = ", gfp.space.ndof, "+", gfu.space.ndof, ", time per step =", (te-ts)/steps)
ndof = 585515 + 1756545 , time per step = 0.12446055412292481
On the A-100:
ndof = 651035 + 1953105 , time per step = 0.0023579835891723634Time-domain PML¶
PML (perfectly matched layers) is a method for approximating outgoing waves on a truncated domain. Its time domain version leads to additional field variables in the PML region, which are coupled via time-derivatives only.
[11]:
from ring_resonator_import import *
from ngsolve import *
from ngsolve.webgui import Draw
[12]:
gfu = GridFunction(fes)
scene = Draw (gfu.components[0], order=3, min=-0.05, max=0.05, autoscale=False)
t = 0
tend = 15
tau = 2e-4
i = 0
sigma = 10 # pml damping parameter
op1 = invp@(-fullB.T-sigma*dampingp)
op2 = invu@(fullB-sigma*dampingu)
with TaskManager():
while t < tend:
gfu.vec.data += tau*Envelope(t)*srcvec
gfu.vec.data += tau*op1*gfu.vec
gfu.vec.data += tau*op2*gfu.vec
t += tau
i += 1
if i%20 == 0:
scene.Redraw()
The differential operators and coupling terms to the pml - variables are represented via linear operators;
[13]:
print (fullB.GetOperatorInfo())
print (dampingp.GetOperatorInfo())
SumMatrix, h = 72125, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
SumMatrix, h = 18560, w = 13920
SumMatrix, h = 18560, w = 13920
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
ProductMatrix, h = 18560, w = 13920
SumMatrix, h = 18560, w = 7165
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
SumMatrix, h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
SumMatrix, h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
SumMatrix, h = 72125, w = 72125
SumMatrix, h = 72125, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
ProductMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
SumMatrix, h = 13920, w = 72125
ScaleMatrix, scale = 2, h = 13920, w = 72125
N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
[14]:
print (op1.GetOperatorInfo())
ProductMatrix, h = 72125, w = 72125
SumMatrix, h = 72125, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
SumMatrix, h = 72125, w = 72125
ScaleMatrix, scale = -1, h = 72125, w = 72125
Transpose, h = 72125, w = 72125
SumMatrix, h = 72125, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
SumMatrix, h = 18560, w = 13920
SumMatrix, h = 18560, w = 13920
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
ProductMatrix, h = 18560, w = 13920
SumMatrix, h = 18560, w = 7165
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
SumMatrix, h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
SumMatrix, h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
ScaleMatrix, scale = 10, h = 72125, w = 72125
SumMatrix, h = 72125, w = 72125
SumMatrix, h = 72125, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
ProductMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
SumMatrix, h = 13920, w = 72125
ScaleMatrix, scale = 2, h = 13920, w = 72125
N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
EmbeddedTransposeMatrix, h = 72125, w = 72125
EmbeddedMatrix, h = 72125, w = 13920
N6ngcomp9ApplyMassE, h = 13920, w = 13920
[ ]: