{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2.11 Matrix free operator application\n",
    "Usually, we assemble matrices in sparse matrix format. Certain methods allow to improve performance and reduce memory requirements considerably by avoiding building and storing the system matrix. The counterpart of this approach is that it is now difficult to build preconditioners"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hybrid mixed method\n",
    "\n",
    "consider the hybrid mixed method\n",
    "\n",
    "$$\n",
    "\\begin{array}{ccccl}\n",
    "A \\sigma & + & B^T u & = & f \\\\\n",
    "B u & & & = & g\n",
    "\\end{array}\n",
    "$$\n",
    "\n",
    "where the matrix $A$ comes from\n",
    "\n",
    "$$\n",
    "\\int \\sigma \\tau dx \n",
    "$$\n",
    "\n",
    "and the matrix $B$ from\n",
    "\n",
    "$$\n",
    "\\sum_T \\int_T \\operatorname{div} \\, \\sigma \\, v_T - \\int_{\\partial T} \\sigma_n v_F\n",
    "$$\n",
    "\n",
    "For $\\sigma$ we use a `VectorL2` space, with a Piola-mapping to the physical elements"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from ngsolve import *\n",
    "from ngsolve.webgui import Draw\n",
    "from netgen.geom2d import unit_square\n",
    "from ngsolve.fem import MixedFE\n",
    "\n",
    "mesh = Mesh(unit_square.GenerateMesh(maxh=0.3))\n",
    "order=1\n",
    "Sigma = VectorL2(mesh, order=order, piola=True)\n",
    "Vt = L2(mesh, order=order-1)\n",
    "Vf = FacetFESpace(mesh, order=order)\n",
    "V = Vt*Vf\n",
    "sigma,tau = Sigma.TnT()\n",
    "(ut,uf), (vt,vf) = V.TnT()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = BilinearForm(Sigma)\n",
    "a += sigma*tau * dx\n",
    "\n",
    "b = BilinearForm(trialspace=Sigma, testspace=V)\n",
    "b += div(sigma) * vt * dx\n",
    "n = specialcf.normal(mesh.dim)\n",
    "dS = dx(element_boundary=True)\n",
    "b += -sigma*n*vf * dS\n",
    "b.Assemble()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "compute element matrices ...\n",
    "\n",
    "we should observe that the matrices from the $A$ operator are block-diagonal with $2\\times 2$ blocks, and each element matrix from $B$ belongs to one of two equivalence classes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for el in mesh.Elements():\n",
    "    felS = Sigma.GetFE(el)\n",
    "    trafo = mesh.GetTrafo(el)\n",
    "    elmat = a.integrators[0].CalcElementMatrix(felS, trafo)   \n",
    "    print (elmat)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for el in mesh.Elements():\n",
    "    felS = Sigma.GetFE(el)\n",
    "    felV = V.GetFE(el)\n",
    "    fel = MixedFE(felS, felV)\n",
    "    trafo = mesh.GetTrafo(el)\n",
    "\n",
    "    # try integratos 0 and 1 ...\n",
    "    elmat = b.integrators[0].CalcElementMatrix(fel, trafo)   \n",
    "    print (elmat)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Geometry-free matrix multiplication \n",
    "NGSolve provides now the possibility to benefit from the geometry independent element matrix. Just one matrix per equivalence class is stored, and the matrix vector product is performed for all elements simultaneously using dense matrix algebra."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from netgen.csg import *\n",
    "geom = CSGeometry()\n",
    "geom.Add (Sphere(Pnt(50,50,50),80) \\\n",
    "          -Cylinder(Pnt(-100,0,0),Pnt(200,0,0), 40) \\\n",
    "          -Cylinder(Pnt(100,-100,100),Pnt(100,200,100),40)\n",
    "          -Cylinder(Pnt(0,100,-100), Pnt(0,100,200),40)\n",
    "          -Sphere(Pnt(50,50,50),50))\n",
    "# geom.Draw()\n",
    "\n",
    "mesh = Mesh(geom.GenerateMesh(maxh=25))\n",
    "mesh.Curve(5)\n",
    "# Draw (mesh)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Same as above, with the `geom_free` flag for the `BilinearForm`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "order=6\n",
    "SetHeapSize(100*1000*1000)\n",
    "Sigma = VectorL2(mesh, order=order, piola=True, tp=True)\n",
    "Vt = L2(mesh, order=order-1)\n",
    "Vf = FacetFESpace(mesh, order=order, dirichlet=[2])\n",
    "V = Vt*Vf\n",
    "print (\"Sigma.ndof =\", Sigma.ndof, \", V.ndof =\", V.ndof)\n",
    "sigma,tau = Sigma.TnT()\n",
    "(ut,uf), (vt,vf) = V.TnT()\n",
    "\n",
    "b = BilinearForm(trialspace=Sigma, testspace=V, geom_free=True)\n",
    "b += div(sigma) * vt * dx\n",
    "n = specialcf.normal(mesh.dim)\n",
    "dS = dx(element_boundary=True)\n",
    "b += -sigma*n*vf * dS\n",
    "b.Assemble()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check timing with `geom_free=True/False`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "vx = b.mat.CreateRowVector()\n",
    "vy = b.mat.CreateColVector()\n",
    "vx[:] = 1\n",
    "\n",
    "%timeit vy.data = b.mat * vx"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`L2` finite element spaces have a `Mass` method generating a linear operator for multiplication with the diagonal mass matrix. In case of variable coefficients or curved elements it uses numerical integration. It also provide an inverse operator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ainv = Sigma.Mass(rho=1).Inverse()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check timig with `tp=True/False`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "vx = ainv.CreateRowVector()\n",
    "vy = ainv.CreateColVector()\n",
    "vx[:] = 1\n",
    "\n",
    "%timeit vy.data = ainv * vx"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combine the operators to form the Schur complement, which is actually a discretization of the Laplace operator:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Laplace = b.mat @ ainv @ b.mat.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "f = LinearForm (V)\n",
    "f += 1*vt * dx\n",
    "f.Assemble()\n",
    "\n",
    "proj = Projector(V.FreeDofs(), True)\n",
    "\n",
    "gfu = GridFunction (V)\n",
    "from time import time\n",
    "start = time()\n",
    "with TaskManager():\n",
    "  solvers.CG(mat=Laplace, pre=proj, rhs=f.vec, sol=gfu.vec, maxsteps=5000, printrates=False)\n",
    "print (\"time =\", time()-start)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The iteration count is very high since we used only the most trivial preconditioner. We can add a low order coarse-grid preconditioner to get rid of the mesh dependency, and also local inverses to improve the $p$-dependency."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "Draw (gfu.components[1], mesh, \"gfu1\", draw_vol=False);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
