This page was generated from unit-5a.1-mpi/poisson_mpi.ipynb.

Poisson Equation in Parallel

NGSolve can be executed on a cluster using the MPI message passing interface. You can download poisson_mpi.py and run it as

mpirun -np 4 python3 poisson_mpi.py

This computes and saves a solution, which we can then visualize with drawsolution.py:

netgen drawsolution.py

For proper parallel execution, Netgen/NGSolve must be configured with -DUSE_MPI=ON. Recent binaries for Linux and Mac should come with MPI support. If you are unsure whether your Netgen/NGSolve supports MPI, look for output like “Including MPI version 3.1” during Netgen startup.
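
If mpi4py is installed, a quick sanity check (not part of the tutorial files, it only exercises MPI itself, not NGSolve) is to let every rank report itself:

mpirun -np 2 python3 -c "from mpi4py import MPI; c = MPI.COMM_WORLD; print(c.rank, 'of', c.size)"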

MPI-parallel execution using ipyparallel

For the MPI-parallel Jupyter tutorials we use the ipyparallel module. Please consult https://ipyparallel.readthedocs.io for installation instructions.

On my notebook, I created the profile once with the command

ipython profile create --parallel --profile=mpi

Since I have only two physical cores, I edited the file .ipython/profile_mpi/ipcluster_config.py to allow for oversubscription:

c.MPILauncher.mpi_args = ['--oversubscribe']

I start the cluster via

ipcluster start --engines=MPI -n 4 --profile=mpi
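
When you are done, the cluster can be stopped again with the matching command:

ipcluster stop --profile=mpi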

In Jupyter, we can then connect to the cluster via a Client object from ipyparallel.

[1]:
from ipyparallel import Client
c = Client(profile='mpi')
c.ids
[1]:
[0, 1, 2, 3]
[2]:
help (Client)
Help on class Client in module ipyparallel.client.client:

class Client(traitlets.traitlets.HasTraits)
 |  Client(*args, **kw)
 |
 |  A semi-synchronous client to an IPython parallel cluster
 |
 |  Parameters
 |  ----------
 |
 |  url_file : str
 |      The path to ipcontroller-client.json.
 |      This JSON file should contain all the information needed to connect to a cluster,
 |      and is likely the only argument needed.
 |      Connection information for the Hub's registration.  If a json connector
 |      file is given, then likely no further configuration is necessary.
 |      [Default: use profile]
 |  profile : bytes
 |      The name of the Cluster profile to be used to find connector information.
 |      If run from an IPython application, the default profile will be the same
 |      as the running application, otherwise it will be 'default'.
 |  cluster_id : str
 |      String id to added to runtime files, to prevent name collisions when using
 |      multiple clusters with a single profile simultaneously.
 |      When set, will look for files named like: 'ipcontroller-<cluster_id>-client.json'
 |      Since this is text inserted into filenames, typical recommendations apply:
 |      Simple character strings are ideal, and spaces are not recommended (but
 |      should generally work)
 |  context : zmq.Context
 |      Pass an existing zmq.Context instance, otherwise the client will create its own.
 |  debug : bool
 |      flag for lots of message printing for debug purposes
 |  timeout : float
 |      time (in seconds) to wait for connection replies from the Hub
 |      [Default: 10]
 |
 |  Other Parameters
 |  ----------------
 |
 |  sshserver : str
 |      A string of the form passed to ssh, i.e. 'server.tld' or 'user@server.tld:port'
 |      If keyfile or password is specified, and this is not, it will default to
 |      the ip given in addr.
 |  sshkey : str; path to ssh private key file
 |      This specifies a key to be used in ssh login, default None.
 |      Regular default ssh keys will be used without specifying this argument.
 |  password : str
 |      Your ssh password to sshserver. Note that if this is left None,
 |      you will be prompted for it if passwordless key based login is unavailable.
 |  paramiko : bool
 |      flag for whether to use paramiko instead of shell ssh for tunneling.
 |      [default: True on win32, False else]
 |
 |
 |  Attributes
 |  ----------
 |
 |  ids : list of int engine IDs
 |      requesting the ids attribute always synchronizes
 |      the registration state. To request ids without synchronization,
 |      use semi-private _ids attributes.
 |
 |  history : list of msg_ids
 |      a list of msg_ids, keeping track of all the execution
 |      messages you have submitted in order.
 |
 |  outstanding : set of msg_ids
 |      a set of msg_ids that have been submitted, but whose
 |      results have not yet been received.
 |
 |  results : dict
 |      a dict of all our results, keyed by msg_id
 |
 |  block : bool
 |      determines default behavior when block not specified
 |      in execution methods
 |
 |  Method resolution order:
 |      Client
 |      traitlets.traitlets.HasTraits
 |      traitlets.traitlets.HasDescriptors
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __del__(self)
 |      cleanup sockets, but _not_ context.
 |
 |  __getitem__(self, key)
 |      index access returns DirectView multiplexer objects
 |
 |      Must be int, slice, or list/tuple/xrange of ints
 |
 |  __init__(self, url_file=None, profile=None, profile_dir=None, ipython_dir=None, context=None, debug=False, sshserver=None, sshkey=None, password=None, paramiko=None, timeout=10, cluster_id=None, **extra_args)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  __iter__(self)
 |      Since we define getitem, Client is iterable
 |
 |      but unless we also define __iter__, it won't work correctly unless engine IDs
 |      start at zero and are continuous.
 |
 |  __len__(self)
 |      len(client) returns # of engines.
 |
 |  abort(self, jobs=None, targets=None, block=None)
 |      Abort specific jobs from the execution queues of target(s).
 |
 |      This is a mechanism to prevent jobs that have already been submitted
 |      from executing.
 |
 |      Parameters
 |      ----------
 |
 |      jobs : msg_id, list of msg_ids, or AsyncResult
 |          The jobs to be aborted
 |
 |          If unspecified/None: abort all outstanding jobs.
 |
 |  activate(self, targets='all', suffix='')
 |      Create a DirectView and register it with IPython magics
 |
 |      Defines the magics `%px, %autopx, %pxresult, %%px`
 |
 |      Parameters
 |      ----------
 |
 |      targets: int, list of ints, or 'all'
 |          The engines on which the view's magics will run
 |      suffix: str [default: '']
 |          The suffix, if any, for the magics.  This allows you to have
 |          multiple views associated with parallel magics at the same time.
 |
 |          e.g. ``rc.activate(targets=0, suffix='0')`` will give you
 |          the magics ``%px0``, ``%pxresult0``, etc. for running magics just
 |          on engine 0.
 |
 |  become_dask(self, targets='all', port=0, nanny=False, scheduler_args=None, **worker_args)
 |      Turn the IPython cluster into a dask.distributed cluster
 |
 |      Parameters
 |      ----------
 |
 |      targets: target spec (default: all)
 |          Which engines to turn into dask workers.
 |      port: int (default: random)
 |          Which port
 |      nanny: bool (default: False)
 |          Whether to start workers as subprocesses instead of in the engine process.
 |          Using a nanny allows restarting the worker processes via ``executor.restart``.
 |      scheduler_args: dict
 |          Keyword arguments (e.g. ip) to pass to the distributed.Scheduler constructor.
 |      **worker_args:
 |          Any additional keyword arguments (e.g. ncores) are passed to the distributed.Worker constructor.
 |
 |      Returns
 |      -------
 |
 |      client = distributed.Client
 |          A dask.distributed.Client connected to the dask cluster.
 |
 |  become_distributed = become_dask(self, targets='all', port=0, nanny=False, scheduler_args=None, **worker_args)
 |
 |  clear(self, targets=None, block=None)
 |      Clear the namespace in target(s).
 |
 |  close(self, linger=None)
 |      Close my zmq Sockets
 |
 |      If `linger`, set the zmq LINGER socket option,
 |      which allows discarding of messages.
 |
 |  db_query(self, query, keys=None)
 |      Query the Hub's TaskRecord database
 |
 |      This will return a list of task record dicts that match `query`
 |
 |      Parameters
 |      ----------
 |
 |      query : mongodb query dict
 |          The search dict. See mongodb query docs for details.
 |      keys : list of strs [optional]
 |          The subset of keys to be returned.  The default is to fetch everything but buffers.
 |          'msg_id' will *always* be included.
 |
 |  direct_view(self, targets='all', **kwargs)
 |      construct a DirectView object.
 |
 |      If no targets are specified, create a DirectView using all engines.
 |
 |      rc.direct_view('all') is distinguished from rc[:] in that 'all' will
 |      evaluate the target engines at each execution, whereas rc[:] will connect to
 |      all *current* engines, and that list will not change.
 |
 |      That is, 'all' will always use all engines, whereas rc[:] will not use
 |      engines added after the DirectView is constructed.
 |
 |      Parameters
 |      ----------
 |
 |      targets: list,slice,int,etc. [default: use all engines]
 |          The engines to use for the View
 |      kwargs: passed to DirectView
 |
 |  executor(self, targets=None)
 |      Construct a PEP-3148 Executor with a LoadBalancedView
 |
 |      Parameters
 |      ----------
 |
 |      targets: list,slice,int,etc. [default: use all engines]
 |          The subset of engines across which to load-balance execution
 |
 |      Returns
 |      -------
 |
 |      executor: Executor
 |          The Executor object
 |
 |  get_result(self, indices_or_msg_ids=None, block=None, owner=True)
 |      Retrieve a result by msg_id or history index, wrapped in an AsyncResult object.
 |
 |      If the client already has the results, no request to the Hub will be made.
 |
 |      This is a convenient way to construct AsyncResult objects, which are wrappers
 |      that include metadata about execution, and allow for awaiting results that
 |      were not submitted by this Client.
 |
 |      It can also be a convenient way to retrieve the metadata associated with
 |      blocking execution, since it always retrieves
 |
 |      Examples
 |      --------
 |      ::
 |
 |          In [10]: r = client.apply()
 |
 |      Parameters
 |      ----------
 |
 |      indices_or_msg_ids : integer history index, str msg_id, AsyncResult,
 |          or a list of same.
 |          The indices or msg_ids of indices to be retrieved
 |
 |      block : bool
 |          Whether to wait for the result to be done
 |      owner : bool [default: True]
 |          Whether this AsyncResult should own the result.
 |          If so, calling `ar.get()` will remove data from the
 |          client's result and metadata cache.
 |          There should only be one owner of any given msg_id.
 |
 |      Returns
 |      -------
 |
 |      AsyncResult
 |          A single AsyncResult object will always be returned.
 |
 |      AsyncHubResult
 |          A subclass of AsyncResult that retrieves results from the Hub
 |
 |  hub_history(self)
 |      Get the Hub's history
 |
 |      Just like the Client, the Hub has a history, which is a list of msg_ids.
 |      This will contain the history of all clients, and, depending on configuration,
 |      may contain history across multiple cluster sessions.
 |
 |      Any msg_id returned here is a valid argument to `get_result`.
 |
 |      Returns
 |      -------
 |
 |      msg_ids : list of strs
 |              list of all msg_ids, ordered by task submission time.
 |
 |  load_balanced_view(self, targets=None, **kwargs)
 |      construct a DirectView object.
 |
 |      If no arguments are specified, create a LoadBalancedView
 |      using all engines.
 |
 |      Parameters
 |      ----------
 |
 |      targets: list,slice,int,etc. [default: use all engines]
 |          The subset of engines across which to load-balance execution
 |      kwargs: passed to LoadBalancedView
 |
 |  purge_everything(self)
 |      Clears all content from previous Tasks from both the hub and the local client
 |
 |      In addition to calling `purge_results("all")` it also deletes the history and
 |      other bookkeeping lists.
 |
 |  purge_hub_results(self, jobs=[], targets=[])
 |      Tell the Hub to forget results.
 |
 |      Individual results can be purged by msg_id, or the entire
 |      history of specific targets can be purged.
 |
 |      Use `purge_results('all')` to scrub everything from the Hub's db.
 |
 |      Parameters
 |      ----------
 |
 |      jobs : str or list of str or AsyncResult objects
 |              the msg_ids whose results should be forgotten.
 |      targets : int/str/list of ints/strs
 |              The targets, by int_id, whose entire history is to be purged.
 |
 |              default : None
 |
 |  purge_local_results(self, jobs=[], targets=[])
 |      Clears the client caches of results and their metadata.
 |
 |      Individual results can be purged by msg_id, or the entire
 |      history of specific targets can be purged.
 |
 |      Use `purge_local_results('all')` to scrub everything from the Clients's
 |      results and metadata caches.
 |
 |      After this call all `AsyncResults` are invalid and should be discarded.
 |
 |      If you must "reget" the results, you can still do so by using
 |      `client.get_result(msg_id)` or `client.get_result(asyncresult)`. This will
 |      redownload the results from the hub if they are still available
 |      (i.e `client.purge_hub_results(...)` has not been called.
 |
 |      Parameters
 |      ----------
 |
 |      jobs : str or list of str or AsyncResult objects
 |              the msg_ids whose results should be purged.
 |      targets : int/list of ints
 |              The engines, by integer ID, whose entire result histories are to be purged.
 |
 |      Raises
 |      ------
 |
 |      RuntimeError : if any of the tasks to be purged are still outstanding.
 |
 |  purge_results(self, jobs=[], targets=[])
 |      Clears the cached results from both the hub and the local client
 |
 |      Individual results can be purged by msg_id, or the entire
 |      history of specific targets can be purged.
 |
 |      Use `purge_results('all')` to scrub every cached result from both the Hub's and
 |      the Client's db.
 |
 |      Equivalent to calling both `purge_hub_results()` and `purge_client_results()` with
 |      the same arguments.
 |
 |      Parameters
 |      ----------
 |
 |      jobs : str or list of str or AsyncResult objects
 |              the msg_ids whose results should be forgotten.
 |      targets : int/str/list of ints/strs
 |              The targets, by int_id, whose entire history is to be purged.
 |
 |              default : None
 |
 |  queue_status(self, targets='all', verbose=False)
 |      Fetch the status of engine queues.
 |
 |      Parameters
 |      ----------
 |
 |      targets : int/str/list of ints/strs
 |              the engines whose states are to be queried.
 |              default : all
 |      verbose : bool
 |              Whether to return lengths only, or lists of ids for each element
 |
 |  resubmit(self, indices_or_msg_ids=None, metadata=None, block=None)
 |      Resubmit one or more tasks.
 |
 |      in-flight tasks may not be resubmitted.
 |
 |      Parameters
 |      ----------
 |
 |      indices_or_msg_ids : integer history index, str msg_id, or list of either
 |          The indices or msg_ids of indices to be retrieved
 |
 |      block : bool
 |          Whether to wait for the result to be done
 |
 |      Returns
 |      -------
 |
 |      AsyncHubResult
 |          A subclass of AsyncResult that retrieves results from the Hub
 |
 |  result_status(self, msg_ids, status_only=True)
 |      Check on the status of the result(s) of the apply request with `msg_ids`.
 |
 |      If status_only is False, then the actual results will be retrieved, else
 |      only the status of the results will be checked.
 |
 |      Parameters
 |      ----------
 |
 |      msg_ids : list of msg_ids
 |          if int:
 |              Passed as index to self.history for convenience.
 |      status_only : bool (default: True)
 |          if False:
 |              Retrieve the actual results of completed tasks.
 |
 |      Returns
 |      -------
 |
 |      results : dict
 |          There will always be the keys 'pending' and 'completed', which will
 |          be lists of msg_ids that are incomplete or complete. If `status_only`
 |          is False, then completed results will be keyed by their `msg_id`.
 |
 |  send_apply_request(self, socket, f, args=None, kwargs=None, metadata=None, track=False, ident=None)
 |      construct and send an apply message via a socket.
 |
 |      This is the principal method with which all engine execution is performed by views.
 |
 |  send_execute_request(self, socket, code, silent=True, metadata=None, ident=None)
 |      construct and send an execute request via a socket.
 |
 |  shutdown(self, targets='all', restart=False, hub=False, block=None)
 |      Terminates one or more engine processes, optionally including the hub.
 |
 |      Parameters
 |      ----------
 |
 |      targets: list of ints or 'all' [default: all]
 |          Which engines to shutdown.
 |      hub: bool [default: False]
 |          Whether to include the Hub.  hub=True implies targets='all'.
 |      block: bool [default: self.block]
 |          Whether to wait for clean shutdown replies or not.
 |      restart: bool [default: False]
 |          NOT IMPLEMENTED
 |          whether to restart engines after shutting them down.
 |
 |  spin(self)
 |      DEPRECATED, DOES NOTHING
 |
 |  spin_thread(self, interval=1)
 |      DEPRECATED, DOES NOTHING
 |
 |  stop_dask(self, targets='all')
 |      Stop the distributed Scheduler and Workers started by become_dask.
 |
 |      Parameters
 |      ----------
 |
 |      targets: target spec (default: all)
 |          Which engines to stop dask workers on.
 |
 |  stop_distributed = stop_dask(self, targets='all')
 |
 |  stop_spin_thread(self)
 |      DEPRECATED, DOES NOTHING
 |
 |  wait(self, jobs=None, timeout=-1)
 |      waits on one or more `jobs`, for up to `timeout` seconds.
 |
 |      Parameters
 |      ----------
 |
 |      jobs : int, str, or list of ints and/or strs, or one or more AsyncResult objects
 |              ints are indices to self.history
 |              strs are msg_ids
 |              default: wait on all outstanding messages
 |      timeout : float
 |              a time in seconds, after which to give up.
 |              default is -1, which means no timeout
 |
 |      Returns
 |      -------
 |
 |      True : when all msg_ids are done
 |      False : timeout reached, some msg_ids still outstanding
 |
 |  wait_interactive(self, jobs=None, interval=1.0, timeout=-1.0)
 |      Wait interactively for jobs
 |
 |      If no job is specified, will wait for all outstanding jobs to complete.
 |
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |
 |  __new__(self, *args, **kw)
 |      Create and return a new object.  See help(type) for accurate signature.
 |
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |
 |  ids
 |      Always up-to-date ids property.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  block
 |      A boolean (True, False) trait.
 |
 |  debug
 |      A boolean (True, False) trait.
 |
 |  history
 |      An instance of a Python list.
 |
 |  metadata
 |      A trait whose value must be an instance of a specified class.
 |
 |      The value can also be an instance of a subclass of the specified class.
 |
 |      Subclasses can declare default classes by overriding the klass attribute
 |
 |  outstanding
 |      An instance of a Python set.
 |
 |  profile
 |      A trait for unicode strings.
 |
 |  results
 |      A trait whose value must be an instance of a specified class.
 |
 |      The value can also be an instance of a subclass of the specified class.
 |
 |      Subclasses can declare default classes by overriding the klass attribute
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from traitlets.traitlets.HasTraits:
 |
 |  __getstate__(self)
 |
 |  __setstate__(self, state)
 |
 |  add_traits(self, **traits)
 |      Dynamically add trait attributes to the HasTraits instance.
 |
 |  has_trait(self, name)
 |      Returns True if the object has a trait with the specified name.
 |
 |  hold_trait_notifications(self)
 |      Context manager for bundling trait change notifications and cross
 |      validation.
 |
 |      Use this when doing multiple trait assignments (init, config), to avoid
 |      race conditions in trait notifiers requesting other trait values.
 |      All trait notifications will fire after all values have been assigned.
 |
 |  notify_change(self, change)
 |
 |  observe(self, handler, names=traitlets.All, type='change')
 |      Setup a handler to be called when a trait changes.
 |
 |      This is used to setup dynamic notifications of trait changes.
 |
 |      Parameters
 |      ----------
 |      handler : callable
 |          A callable that is called when a trait changes. Its
 |          signature should be ``handler(change)``, where ``change`` is a
 |          dictionary. The change dictionary at least holds a 'type' key.
 |          * ``type``: the type of notification.
 |          Other keys may be passed depending on the value of 'type'. In the
 |          case where type is 'change', we also have the following keys:
 |          * ``owner`` : the HasTraits instance
 |          * ``old`` : the old value of the modified trait attribute
 |          * ``new`` : the new value of the modified trait attribute
 |          * ``name`` : the name of the modified trait attribute.
 |      names : list, str, All
 |          If names is All, the handler will apply to all traits.  If a list
 |          of str, handler will apply to all names in the list.  If a
 |          str, the handler will apply just to that name.
 |      type : str, All (default: 'change')
 |          The type of notification to filter by. If equal to All, then all
 |          notifications are passed to the observe handler.
 |
 |  on_trait_change(self, handler=None, name=None, remove=False)
 |      DEPRECATED: Setup a handler to be called when a trait changes.
 |
 |      This is used to setup dynamic notifications of trait changes.
 |
 |      Static handlers can be created by creating methods on a HasTraits
 |      subclass with the naming convention '_[traitname]_changed'.  Thus,
 |      to create static handler for the trait 'a', create the method
 |      _a_changed(self, name, old, new) (fewer arguments can be used, see
 |      below).
 |
 |      If `remove` is True and `handler` is not specified, all change
 |      handlers for the specified name are uninstalled.
 |
 |      Parameters
 |      ----------
 |      handler : callable, None
 |          A callable that is called when a trait changes.  Its
 |          signature can be handler(), handler(name), handler(name, new),
 |          handler(name, old, new), or handler(name, old, new, self).
 |      name : list, str, None
 |          If None, the handler will apply to all traits.  If a list
 |          of str, handler will apply to all names in the list.  If a
 |          str, the handler will apply just to that name.
 |      remove : bool
 |          If False (the default), then install the handler.  If True
 |          then unintall it.
 |
 |  set_trait(self, name, value)
 |      Forcibly sets trait attribute, including read-only attributes.
 |
 |  setup_instance(self, *args, **kwargs)
 |      This is called **before** self.__init__ is called.
 |
 |  trait_metadata(self, traitname, key, default=None)
 |      Get metadata values for trait by key.
 |
 |  trait_names(self, **metadata)
 |      Get a list of all the names of this class' traits.
 |
 |  traits(self, **metadata)
 |      Get a ``dict`` of all the traits of this class.  The dictionary
 |      is keyed on the name and the values are the TraitType objects.
 |
 |      The TraitTypes returned don't know anything about the values
 |      that the various HasTrait's instances are holding.
 |
 |      The metadata kwargs allow functions to be passed in which
 |      filter traits based on metadata values.  The functions should
 |      take a single value as an argument and return a boolean.  If
 |      any function returns False, then the trait is not included in
 |      the output.  If a metadata key doesn't exist, None will be passed
 |      to the function.
 |
 |  unobserve(self, handler, names=traitlets.All, type='change')
 |      Remove a trait change handler.
 |
 |      This is used to unregister handlers to trait change notifications.
 |
 |      Parameters
 |      ----------
 |      handler : callable
 |          The callable called when a trait attribute changes.
 |      names : list, str, All (default: All)
 |          The names of the traits for which the specified handler should be
 |          uninstalled. If names is All, the specified handler is uninstalled
 |          from the list of notifiers corresponding to all changes.
 |      type : str or All (default: 'change')
 |          The type of notification to filter by. If All, the specified handler
 |          is uninstalled from the list of notifiers corresponding to all types.
 |
 |  unobserve_all(self, name=traitlets.All)
 |      Remove trait change handlers of any type for the specified name.
 |      If name is not specified, removes all trait notifiers.
 |
 |  ----------------------------------------------------------------------
 |  Class methods inherited from traitlets.traitlets.HasTraits:
 |
 |  class_own_trait_events(name) from traitlets.traitlets.MetaHasTraits
 |      Get a dict of all event handlers defined on this class, not a parent.
 |
 |      Works like ``event_handlers``, except for excluding traits from parents.
 |
 |  class_own_traits(**metadata) from traitlets.traitlets.MetaHasTraits
 |      Get a dict of all the traitlets defined on this class, not a parent.
 |
 |      Works like `class_traits`, except for excluding traits from parents.
 |
 |  class_trait_names(**metadata) from traitlets.traitlets.MetaHasTraits
 |      Get a list of all the names of this class' traits.
 |
 |      This method is just like the :meth:`trait_names` method,
 |      but is unbound.
 |
 |  class_traits(**metadata) from traitlets.traitlets.MetaHasTraits
 |      Get a ``dict`` of all the traits of this class.  The dictionary
 |      is keyed on the name and the values are the TraitType objects.
 |
 |      This method is just like the :meth:`traits` method, but is unbound.
 |
 |      The TraitTypes returned don't know anything about the values
 |      that the various HasTrait's instances are holding.
 |
 |      The metadata kwargs allow functions to be passed in which
 |      filter traits based on metadata values.  The functions should
 |      take a single value as an argument and return a boolean.  If
 |      any function returns False, then the trait is not included in
 |      the output.  If a metadata key doesn't exist, None will be passed
 |      to the function.
 |
 |  trait_events(name=None) from traitlets.traitlets.MetaHasTraits
 |      Get a ``dict`` of all the event handlers of this class.
 |
 |      Parameters
 |      ----------
 |      name: str (default: None)
 |          The name of a trait of this class. If name is ``None`` then all
 |          the event handlers of this class will be returned instead.
 |
 |      Returns
 |      -------
 |      The event handlers associated with a trait name, or all event handlers.
 |
 |  ----------------------------------------------------------------------
 |  Readonly properties inherited from traitlets.traitlets.HasTraits:
 |
 |  cross_validation_lock
 |      A contextmanager for running a block with our cross validation lock set
 |      to True.
 |
 |      At the end of the block, the lock's value is restored to its value
 |      prior to entering the block.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from traitlets.traitlets.HasDescriptors:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)

We use mpi4py (https://mpi4py.readthedocs.io/) for issuing MPI calls from Python. The %%px cell magic executes the cell in parallel on all engines:

[3]:
%%px
from mpi4py import MPI
comm = MPI.COMM_WORLD
print (comm.rank, comm.size)
[stdout:0] 0 4
[stdout:1] 1 4
[stdout:2] 2 4
[stdout:3] 3 4
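
As a small illustration of issuing MPI calls from Python (this cell is an addition to the notebook), we can sum the rank numbers over all processes with a collective allreduce; every rank receives the same result:

%%px
# every rank contributes its rank number; all ranks receive the sum 0+1+2+3 = 6
total = comm.allreduce(comm.rank)
print (comm.rank, "sum of ranks:", total)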

The master process (rank 0) generates the mesh and distributes it within the group of processes defined by the communicator. All other ranks receive their part of the mesh; in this setup the master itself keeps no elements. The function mesh.GetNE(VOL) returns the local number of elements:

[4]:
%%px
from ngsolve import *
from netgen.geom2d import unit_square
import netgen.meshing   # needed for netgen.meshing.Mesh.Receive below

if comm.rank == 0:
    mesh = Mesh(unit_square.GenerateMesh(maxh=0.1).Distribute(comm))
else:
    mesh = Mesh(netgen.meshing.Mesh.Receive(comm))
print (mesh.GetNE(VOL))
[stdout:0] 0
[stdout:1] 77
[stdout:2] 78
[stdout:3] 75

We can define spaces, bilinear/linear forms, and grid functions in the same way as in sequential mode. But now the degrees of freedom are distributed across the cluster, following the distribution of the mesh. The finite element space knows which dofs on different ranks refer to the same global dof.

[5]:
%%px
fes = H1(mesh, order=3, dirichlet=".*")
u,v = fes.TnT()

a = BilinearForm(grad(u)*grad(v)*dx)
pre = Preconditioner(a, "local")   # Jacobi-type preconditioner, works in parallel
a.Assemble()

f = LinearForm(1*v*dx).Assemble()
gfu = GridFunction(fes)

from ngsolve.krylovspace import CGSolver
inv = CGSolver(a.mat, pre.mat, printing=comm.rank==0, maxsteps=200, tol=1e-8)   # only rank 0 prints residuals
gfu.vec.data = inv*f.vec
[stdout:0]
iteration 0 error = 0.053339234092911685
iteration 1 error = 0.0740460414310471
iteration 2 error = 0.06370425458795265
iteration 3 error = 0.04919124251788342
iteration 4 error = 0.04829652542142217
iteration 5 error = 0.031106912554491464
iteration 6 error = 0.0249356433981004
iteration 7 error = 0.01784194991245135
iteration 8 error = 0.013817492065755424
iteration 9 error = 0.011423769319077626
iteration 10 error = 0.00839386761852051
iteration 11 error = 0.004233406891736342
iteration 12 error = 0.0019377308491299874
iteration 13 error = 0.001408555531820299
iteration 14 error = 0.000991343797171786
iteration 15 error = 0.0006412217321541269
iteration 16 error = 0.00044876370080339955
iteration 17 error = 0.00029026903025157827
iteration 18 error = 0.00017121515214383354
iteration 19 error = 0.00013131363792643467
iteration 20 error = 0.00010180630781276401
iteration 21 error = 6.493748016473812e-05
iteration 22 error = 3.747328865245323e-05
iteration 23 error = 2.4894442492438547e-05
iteration 24 error = 1.8668022673771387e-05
iteration 25 error = 1.4656078306112323e-05
iteration 26 error = 1.00312807766201e-05
iteration 27 error = 6.195489468925556e-06
iteration 28 error = 3.8394953774092735e-06
iteration 29 error = 2.9619818216397206e-06
iteration 30 error = 2.191076333882794e-06
iteration 31 error = 1.5147554899846986e-06
iteration 32 error = 9.456591809733725e-07
iteration 33 error = 6.934293022267518e-07
iteration 34 error = 5.419825539015876e-07
iteration 35 error = 3.8073885692382817e-07
iteration 36 error = 2.610840714803283e-07
iteration 37 error = 1.6150747478403415e-07
iteration 38 error = 1.0843607023800082e-07
iteration 39 error = 8.266347488194315e-08
iteration 40 error = 6.401498304181558e-08
iteration 41 error = 4.089801380525814e-08
iteration 42 error = 2.518568909359594e-08
iteration 43 error = 1.7451968530291785e-08
iteration 44 error = 1.2422309974412238e-08
iteration 45 error = 8.990884037271554e-09
iteration 46 error = 5.8101609164032095e-09
iteration 47 error = 3.298163877994398e-09
iteration 48 error = 2.1285824956359945e-09
iteration 49 error = 1.5456064017184192e-09
iteration 50 error = 1.0303209026040304e-09
iteration 51 error = 6.846274435536751e-10
iteration 52 error = 4.5512272688861414e-10
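
To see how the degrees of freedom are spread over the ranks, one can compare the local and global dof counts (a small check added here; fes.ndof is the rank-local count, fes.ndofglobal the global count):

%%px
# interface dofs are shared, so the local counts sum up to more than the global count
print (comm.rank, "local ndof:", fes.ndof, "global ndof:", fes.ndofglobal)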

Parallel pickling allows us to serialize the distributed solution and transfer it to the client:

[6]:
%%px
netgen.meshing.SetParallelPickling(True)

In the next cell, all engines of the client object c pickle their local gfu objects and send them to the client. Since SetParallelPickling(True) is enabled, the communication is set up such that the object received from the rank 0 process contains the whole mesh and the computed solution.

[7]:
gfu = c[:]["gfu"]

We can now draw the whole solution using the master process’s gfu[0].

[8]:
from ngsolve.webgui import Draw
Draw (gfu[0])
[8]:

Drawing gfu[n] shows only the part of the solution that the process with rank n possesses.

[9]:
Draw (gfu[3])
[9]:

We can also visualize the subdomains obtained by the automatic mesh partitioning, independently of any computed solution, as follows.

[10]:
%%px
fesL2 = L2(mesh, order=0)            # one dof per element
gfL2 = GridFunction(fesL2)
gfL2.vec.local_vec[:] = comm.rank    # color each element by the rank that owns it
[11]:
gfL2 = c[:]["gfL2"]
Draw (gfL2[0])
[11]:
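
When the session is finished, the engines can also be shut down directly from the notebook via the Client.shutdown method listed in the help text above:

c.shutdown(hub=True, block=True)   # terminate all engines and the Hub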