This page was generated from unit-5.5-cuda/wave_cuda.ipynb.

5.5.2 Discontinuous Galerkin for the Wave Equation

We solve the first order wave equation by a matrix-free explicit DG method:

\begin{eqnarray*} \frac{\partial p}{\partial t} & = & \operatorname{div} u \\ \frac{\partial u}{\partial t} & = & \nabla p \end{eqnarray*}

Using DG discretization in space we obtain the ODE system \begin{eqnarray*} M_p \dot{p} & = & -B^T u \\ M_u \dot{u} & = & B p, \end{eqnarray*}

with mass-matrices \(M_p\) and \(M_u\), and the discretization matrix \(B\) for the gradient, and \(-B^T\) for the divergence.

The DG bilinear-form for the gradient is:

\[b(p,v) = \sum_{T} \Big\{ \int_T \nabla p \, v + \int_{\partial T} (\{ p \} - p) \, v_n \, ds \Big\},\]

where \(\{ p \}\) is the average of \(p\) on the facet.

The computational advantages of a proper version of DG is:

  • universal element-matrices for the differntial operator \(B\)

  • cheaply invertible mass matrices (orthogonal polynomials, sum-factorization)

[1]:
from ngsolve import *
from netgen.occ import *
from time import time

from ngsolve.webgui import Draw
from netgen.webgui import Draw as DrawGeo

box = Box((-1,-1,-1), (1,1,0))
sp = Sphere((0.5,0,0), 0.2)
shape = box-sp
geo = OCCGeometry(shape)

h = 0.1

mesh = Mesh( geo.GenerateMesh(maxh=h))
mesh.Curve(3)
Draw(mesh);
[2]:
order = 4
fes_p = L2(mesh, order=order, all_dofs_together=True)
fes_u = VectorL2(mesh, order=order, piola=True)
fes_tr = FacetFESpace(mesh, order=order)

print ("ndof_p = ", fes_p.ndof, "+", fes_tr.ndof, ", ndof_u =", fes_u.ndof)

traceop = fes_p.TraceOperator(fes_tr, average=True)

gfu = GridFunction(fes_u)
gfp = GridFunction(fes_p)
gftr = GridFunction(fes_tr)

gfp.Interpolate( exp(-400*(x**2+y**2+z**2)))
gftr.vec.data = traceop * gfp.vec
ndof_p =  651350 + 586770 , ndof_u = 1954050
[3]:
p = fes_p.TrialFunction()
v = fes_u.TestFunction()
phat = fes_tr.TrialFunction()

n = specialcf.normal(mesh.dim)
\[b(p,v) = \sum_{T} \Big\{ \int_T \nabla p \, v + \int_{\partial T} (\{ p \} - p) \, v_n \, ds \Big\},\]

where \(\{ p \}\) is the average of \(p\) on the facet.

[4]:
Bel = BilinearForm(trialspace=fes_p, testspace=fes_u, geom_free = True)
Bel += grad(p)*v * dx -p*(v*n) * dx(element_boundary=True)
Bel.Assemble()

Btr = BilinearForm(trialspace=fes_tr, testspace=fes_u, geom_free = True)
Btr += phat * (v*n) *dx(element_boundary=True)
Btr.Assemble();

B = Bel.mat + Btr.mat @ traceop
[5]:
invmassp = fes_p.Mass(1).Inverse()
invmassu = fes_u.Mass(1).Inverse()
[6]:
gfp.Interpolate( exp(-100*(x**2+y**2+z**2)))
gfu.vec[:] = 0

scene = Draw (gftr, draw_vol=False, order=3, min=-0.05, max=0.05);

t = 0
dt = 0.3 * h / (order+1)**2
tend = 1

op1 = dt * invmassu @ B
op2 = dt * invmassp @ B.T

cnt = 0
with TaskManager():
    while t < tend:
        t = t+dt
        gfu.vec.data += op1 * gfp.vec
        gfp.vec.data -= op2 * gfu.vec
        cnt = cnt+1
        if cnt%10 == 0:
            gftr.vec.data = traceop * gfp.vec
            scene.Redraw()

Time-stepping on the device

[7]:
try:
    import ngsolve.ngscuda
except:
    print ("Sorry, no cuda device")
Sorry, no cuda device
[8]:
gfp.Interpolate( exp(-100*(x**2+y**2+z**2)))
gfu.vec[:] = 0

scene = Draw (gftr, draw_vol=False, order=3);

t = 0
dt = 0.5 * h / (order+1)**2 / 2
tend = 0.1

op1 = (dt * invmassu @ B).CreateDeviceMatrix()
op2 = (dt * invmassp @ B.T).CreateDeviceMatrix()

devu = gfu.vec.CreateDeviceVector(copy=True)
devp = gfp.vec.CreateDeviceVector(copy=True)

cnt = 0
with TaskManager():
    while t < tend:
        t = t+dt
        devu += op1 * devp
        devp -= op2 * devu
        cnt = cnt+1
        if cnt%10 == 0:
            gfp.vec.data = devp
            gftr.vec.data = traceop * gfp.vec
            scene.Redraw()
[9]:
print (op1.GetOperatorInfo())
ProductMatrix, h = 1954050, w = 651350
  ScaleMatrix, scale = 0.001, h = 1954050, w = 1954050
    SumMatrix, h = 1954050, w = 1954050
      SumMatrix, h = 1954050, w = 1954050
        BlockDiagonalMatrixSoA (bs = 3x3), h = 1954050, w = 1954050
        ProductMatrix, h = 1954050, w = 1954050
          ProductMatrix, h = 1954050, w = 18375
            Transpose, h = 1954050, w = 18375
              ConstantEBEMatrix (bs = 125x35), h = 18375, w = 1954050
            BlockDiagonalMatrixSoA (bs = 3x3), h = 18375, w = 18375
          ConstantEBEMatrix (bs = 125x35), h = 18375, w = 1954050
      ProductMatrix, h = 1954050, w = 1954050
        ProductMatrix, h = 1954050, w = 23625
          Transpose, h = 1954050, w = 23625
            ConstantEBEMatrix (bs = 125x35), h = 23625, w = 1954050
          BlockDiagonalMatrixSoA (bs = 3x3), h = 23625, w = 23625
        ConstantEBEMatrix (bs = 125x35), h = 23625, w = 1954050
  SumMatrix, h = 1954050, w = 651350
    SumMatrix, h = 1954050, w = 651350
      ConstantEBEMatrix (bs = 105x35), h = 1954050, w = 651350
      ConstantEBEMatrix (bs = 105x35), h = 1954050, w = 651350
    ProductMatrix, h = 1954050, w = 651350
      SumMatrix, h = 1954050, w = 586770
        ConstantEBEMatrix (bs = 105x60), h = 1954050, w = 586770
        ConstantEBEMatrix (bs = 105x60), h = 1954050, w = 586770
      ProductMatrix, h = 586770, w = 651350
        N4ngla14DiagonalMatrixIdEE, h = 586770, w = 586770
        SumMatrix, h = 586770, w = 651350
          ConstantEBEMatrix (bs = 60x35), h = 586770, w = 651350
          ConstantEBEMatrix (bs = 60x35), h = 586770, w = 651350

[10]:
ts = time()
steps = 10
for i in range(steps):
    devu += op1 * devp
    devp -= op2 * devu
te = time()
print ("ndof = ", gfp.space.ndof, "+", gfu.space.ndof, ", time per step =", (te-ts)/steps)
ndof =  651350 + 1954050 , time per step = 0.1520322799682617

On the A-100:

ndof = 651035 + 1953105 , time per step = 0.0023579835891723634

Time-domain PML

PML (perfectly matched layers) is a method for approximating outgoing waves on a truncated domain. Its time domain version leads to additional field variables in the PML region, which are coupled via time-derivatives only.

[11]:
from ring_resonator_import import *
from ngsolve import *
from ngsolve.webgui import Draw
[12]:
gfu = GridFunction(fes)
scene = Draw (gfu.components[0], order=3, min=-0.05, max=0.05, autoscale=False)

t = 0
tend = 15
tau = 2e-4
i = 0
sigma = 10   # pml damping parameter

op1 = invp@(-fullB.T-sigma*dampingp)
op2 = invu@(fullB-sigma*dampingu)
with TaskManager():
    while t < tend:

        gfu.vec.data += tau*Envelope(t)*srcvec
        gfu.vec.data += tau*op1*gfu.vec
        gfu.vec.data += tau*op2*gfu.vec

        t += tau
        i += 1
        if i%20 == 0:
            scene.Redraw()

The differential operators and coupling terms to the pml - variables are represented via linear operators;

[13]:
print (fullB.GetOperatorInfo())
print (dampingp.GetOperatorInfo())
SumMatrix, h = 72125, w = 72125
  EmbeddedTransposeMatrix, h = 72125, w = 72125
    EmbeddedMatrix, h = 72125, w = 13920
      SumMatrix, h = 18560, w = 13920
        SumMatrix, h = 18560, w = 13920
          ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
          ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
        ProductMatrix, h = 18560, w = 13920
          SumMatrix, h = 18560, w = 7165
            ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
            ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
          SumMatrix, h = 7165, w = 13920
            ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
            ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
  EmbeddedTransposeMatrix, h = 72125, w = 72125
    EmbeddedMatrix, h = 72125, w = 13920
      SumMatrix, h = 7165, w = 13920
        ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
        ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920

SumMatrix, h = 72125, w = 72125
  SumMatrix, h = 72125, w = 72125
    EmbeddedTransposeMatrix, h = 72125, w = 72125
      EmbeddedMatrix, h = 72125, w = 13920
        N6ngcomp9ApplyMassE, h = 13920, w = 13920
    ProductMatrix, h = 72125, w = 72125
      EmbeddedMatrix, h = 72125, w = 13920
        N6ngcomp9ApplyMassE, h = 13920, w = 13920
      SumMatrix, h = 13920, w = 72125
        ScaleMatrix, scale = 2, h = 13920, w = 72125
          N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
        N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
  EmbeddedTransposeMatrix, h = 72125, w = 72125
    EmbeddedMatrix, h = 72125, w = 13920
      N6ngcomp9ApplyMassE, h = 13920, w = 13920

[14]:
print (op1.GetOperatorInfo())
ProductMatrix, h = 72125, w = 72125
  SumMatrix, h = 72125, w = 72125
    EmbeddedTransposeMatrix, h = 72125, w = 72125
      EmbeddedMatrix, h = 72125, w = 13920
        N6ngcomp9ApplyMassE, h = 13920, w = 13920
    EmbeddedTransposeMatrix, h = 72125, w = 72125
      EmbeddedMatrix, h = 72125, w = 13920
        N6ngcomp9ApplyMassE, h = 13920, w = 13920
  SumMatrix, h = 72125, w = 72125
    ScaleMatrix, scale = -1, h = 72125, w = 72125
      Transpose, h = 72125, w = 72125
        SumMatrix, h = 72125, w = 72125
          EmbeddedTransposeMatrix, h = 72125, w = 72125
            EmbeddedMatrix, h = 72125, w = 13920
              SumMatrix, h = 18560, w = 13920
                SumMatrix, h = 18560, w = 13920
                  ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
                  ConstantEBEMatrix (bs = 20x15), h = 18560, w = 13920
                ProductMatrix, h = 18560, w = 13920
                  SumMatrix, h = 18560, w = 7165
                    ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
                    ConstantEBEMatrix (bs = 20x15), h = 18560, w = 7165
                  SumMatrix, h = 7165, w = 13920
                    ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
                    ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
          EmbeddedTransposeMatrix, h = 72125, w = 72125
            EmbeddedMatrix, h = 72125, w = 13920
              SumMatrix, h = 7165, w = 13920
                ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
                ConstantEBEMatrix (bs = 15x15), h = 7165, w = 13920
    ScaleMatrix, scale = 10, h = 72125, w = 72125
      SumMatrix, h = 72125, w = 72125
        SumMatrix, h = 72125, w = 72125
          EmbeddedTransposeMatrix, h = 72125, w = 72125
            EmbeddedMatrix, h = 72125, w = 13920
              N6ngcomp9ApplyMassE, h = 13920, w = 13920
          ProductMatrix, h = 72125, w = 72125
            EmbeddedMatrix, h = 72125, w = 13920
              N6ngcomp9ApplyMassE, h = 13920, w = 13920
            SumMatrix, h = 13920, w = 72125
              ScaleMatrix, scale = 2, h = 13920, w = 72125
                N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
              N4ngla18EmbeddingTransposeE, h = 13920, w = 72125
        EmbeddedTransposeMatrix, h = 72125, w = 72125
          EmbeddedMatrix, h = 72125, w = 13920
            N6ngcomp9ApplyMassE, h = 13920, w = 13920

[ ]: