armi.mpiActions module¶
This module provides an abstract class to be used to implement “MPI actions.”
MPI actions are tasks, activities, or work that can be executed on the worker nodes. The standard workflow is essentially that the master node creates an MpiAction, sends it to the workers, and then both the master and the workers invoke() it together. For example:
| Step | Code | Notes |
|---|---|---|
| 1 | master: … worker: … | master: Initializes a distribute state action. worker: Waits for something to do, as determined by the master; this happens within the worker's main loop. |
| 2 | master: … worker: … | master: Broadcasts the distribute state action to all the worker nodes. worker: Receives the action from the master, which is a DistributeStateAction. |
| 3 | master: … worker: … | Both invoke the action, and are in sync. Any broadcast or receive within the action should also be synced up. |
To create a new, custom MPI action, inherit from MpiAction and override the invokeHook() method.
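This pattern can be sketched in standalone Python. Everything below is illustrative: the stand-in base class only mimics the armi.mpiActions.MpiAction interface (the real class communicates via mpi4py), and SumPowersAction is a hypothetical custom action.

```python
# Simplified, standalone sketch of the MpiAction pattern described above.
# The stand-in base class and SumPowersAction are illustrative only; the
# real MpiAction lives in armi.mpiActions and communicates via mpi4py.

class MpiAction:
    """Minimal stand-in mimicking the armi.mpiActions.MpiAction interface."""

    def __init__(self):
        self.o = self.r = self.cs = None

    def invoke(self, o, r, cs):
        # Store the process-local operator, reactor, and settings, then
        # run the subclass-defined work.
        self.o, self.r, self.cs = o, r, cs
        return self.invokeHook()

    def invokeHook(self):
        raise NotImplementedError("Subclasses must override invokeHook().")


class SumPowersAction(MpiAction):
    """Hypothetical custom action: sums a list standing in for a reactor."""

    def invokeHook(self):
        return sum(self.r)


action = SumPowersAction()
result = action.invoke(None, [1.0, 2.0, 3.0], None)
print(result)  # 6.0
```

In a real run, the master would broadcast the action instance to the workers, and each process would call invoke() with its own operator, reactor, and settings.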
-
class
armi.mpiActions.
MpiAction
[source]¶ Bases:
object
Base of all MPI actions.
MPI Actions are tasks that can be executed without needing lots of other information. When a worker node sits in its main loop and receives an MPI Action, it will simply call invoke().
-
property
parallel
¶
-
classmethod
invokeAsMaster
(o, r, cs)[source]¶ Simplified method to call from the master process.
This can be used in place of:

```python
someInstance = MpiAction()
someInstance = COMM_WORLD.bcast(someInstance, root=0)
someInstance.invoke(o, r, cs)
```
Interestingly, the code above can be used in two ways:
1. Both the master and the workers run the above code at the same time, or
2. only the master runs the above code, and the broadcast is handled by the worker's main loop.
Option 2 is the most common usage.
Warning
This method will not work if the constructor (i.e. __init__) requires additional arguments. Since the method body is so simple, it is strongly discouraged to add *args or **kwargs arguments to this method.
- Parameters
o (armi.operators.Operator) – If an operator is not necessary, supply None.
r (armi.reactor.reactors.Reactor) – If a reactor is not necessary, supply None.
-
_mpiOperationHelper
(obj, mpiFunction)[source]¶ Strips the operator, reactor, and cs off of the mpiAction before the MPI operation.
-
broadcast
(obj=None)[source]¶ A wrapper around bcast. On the master node it can be run with an equals sign, so that the same line of code works on both the master and the worker nodes.
- Parameters
obj – Any object that can be broadcast. If it is None, the action broadcasts itself, which triggers it to run on the workers (assuming the workers are in the worker main loop).
See also
armi.operators.operator.OperatorMPI.workerOperate()
receives this on the workers and calls
invoke
Notes
The standard bcast method creates a new instance even for the root process. Consequently, when passing an object, references to the original object can be broken. Therefore, this method returns the original object when called by the master node, or the broadcast object when called on the worker nodes.
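The identity-preserving behavior in the notes can be sketched with a fake communicator so it runs without MPI. FakeComm and this broadcast helper are hypothetical stand-ins, not the real implementation; mpi4py's bcast is modeled here as a deep copy because it pickles and re-creates objects.

```python
import copy


class FakeComm:
    """Hypothetical stand-in for an MPI communicator."""

    def __init__(self, rank):
        self.rank = rank

    def bcast(self, obj, root=0):
        # Like mpi4py's bcast, this returns a new instance of the object,
        # even on the root process.
        return copy.deepcopy(obj)


def broadcast(comm, obj):
    # Mirror the documented contract: the master keeps its original
    # reference, while workers receive the broadcast copy.
    received = comm.bcast(obj, root=0)
    return obj if comm.rank == 0 else received


data = {"cycle": 1}
on_master = broadcast(FakeComm(rank=0), data)
on_worker = broadcast(FakeComm(rank=1), data)
print(on_master is data, on_worker is data)  # True False
```

This keeps code like `state = action.broadcast(state)` valid on every rank: the master's reference survives unchanged, and workers end up with an equal copy.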
-
gather
(obj=None)[source]¶ A wrapper around MPI_COMM.gather.
- Parameters
obj – Any object that can be gathered. If it is None, the action gathers itself.
Notes
The returned list will contain a reference to the original gathered object, without making a copy of it.
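The reference-preserving contract in the notes can be sketched with a fake single-rank communicator (FakeComm and this gather helper are illustrative stand-ins so the example runs without MPI):

```python
# Sketch of the gather contract described in the notes: the list returned
# on the root holds a reference to the original gathered object, not a copy.

class FakeComm:
    """Hypothetical single-rank stand-in for an MPI communicator."""

    def __init__(self, rank):
        self.rank = rank

    def gather(self, obj, root=0):
        # mpi4py's gather returns a list on the root and None elsewhere;
        # with a single rank, that list is just [obj].
        return [obj] if self.rank == root else None


def gather(comm, obj):
    # Keep the original reference in the root's result list.
    return comm.gather(obj, root=0)


payload = {"power": 100.0}
gathered = gather(FakeComm(rank=0), payload)
print(gathered[0] is payload)  # True
```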
-
invoke
(o, r, cs)[source]¶ This method is called by worker nodes and is passed the worker node's operator, reactor, and settings file.
- Parameters
o (armi.operators.operator.Operator) – the operator for this process
r (armi.reactor.reactors.Reactor) – the reactor represented in this process
cs (armi.settings.caseSettings.Settings) – the case settings
- Returns
result – result from invokeHook
- Return type
-
static
mpiFlatten
(allCPUResults)[source]¶ Flatten results back into the same order they were in before being distributed with mpiIter.
See also
mpiIter()
used for distributing objects/tasks
-
static
mpiIter
(objectsForAllCoresToIter)[source]¶ Generate the subset of objects one node is responsible for in MPI.
Notes
Each CPU will get a similar number of objects. E.g., if there are 12 objects and 5 CPUs, the first 2 CPUs will get 3 objects each and the last 3 CPUs will get 2 each.
- Parameters
objectsForAllCoresToIter (list) – List of all objects that need an MPI calculation performed on them. Note that since len() is needed, this method cannot accept a generator.
See also
mpiFlatten()
used for collecting results
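The distribution rule in the notes can be sketched in pure Python. Here rank and size are explicit parameters so the example runs without MPI (the real method is a generator keyed off the process's MPI rank); the helper name is illustrative.

```python
# Sketch of the mpiIter distribution rule: with N objects and P ranks,
# the first N % P ranks each get one extra object.

def mpi_iter(objects, rank, size):
    base, remainder = divmod(len(objects), size)
    if rank < remainder:
        start = rank * (base + 1)
        stop = start + base + 1
    else:
        start = remainder * (base + 1) + (rank - remainder) * base
        stop = start + base
    yield from objects[start:stop]


# 12 objects over 5 ranks: the first 2 ranks get 3, the last 3 ranks get 2.
chunks = [list(mpi_iter(range(12), rank, 5)) for rank in range(5)]
print([len(c) for c in chunks])  # [3, 3, 2, 2, 2]

# Concatenating the per-rank chunks in rank order recovers the original
# ordering, which is what mpiFlatten relies on.
flat = [x for chunk in chunks for x in chunk]
print(flat == list(range(12)))  # True
```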
-
invokeHook
()[source]¶ This method must be overridden in sub-classes.
This method is called by worker nodes, and has access to the worker node's operator, reactor, and settings (through self.o, self.r, and self.cs). It must return a boolean value of True or False; otherwise, the worker node will raise an exception and terminate execution.- Returns
result – Dependent on implementation
- Return type
-
armi.mpiActions.
runActions
(o, r, cs, actions, numPerNode=None, serial=False)[source]¶ Run a series of MpiActions in parallel, or in series if serial=True.
Notes
The number of actions DOES NOT need to match armi.MPI_SIZE.
Calling this method may invoke an MPI Split, which will change MPI_SIZE for the duration of the action. This allows someone to call MPI operations without being blocked by tasks which are not doing the same thing.
-
armi.mpiActions.
runActionsInSerial
(o, r, cs, actions)[source]¶ Run a series of MpiActions in serial.
Notes
This will set the MpiAction.serial attribute to True, so that the MpiAction.broadcast and MpiAction.gather methods simply return the supplied value.
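A minimal sketch of this serial fallback, assuming a hypothetical EchoAction with the same invoke(o, r, cs) signature (the real runActionsInSerial operates on MpiAction instances):

```python
class EchoAction:
    """Hypothetical action with the MpiAction invoke(o, r, cs) signature."""

    def __init__(self, value):
        self.value = value
        self.serial = False

    def invoke(self, o, r, cs):
        return self.value * 2


def run_actions_in_serial(o, r, cs, actions):
    # Invoke each action in turn; marking it serial means broadcast/gather
    # would pass values straight through instead of touching MPI.
    results = []
    for action in actions:
        action.serial = True
        results.append(action.invoke(o, r, cs))
    return results


results = run_actions_in_serial(None, None, None, [EchoAction(i) for i in range(3)])
print(results)  # [0, 2, 4]
```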
-
class
armi.mpiActions.
DistributionAction
(actions)[source]¶ Bases:
armi.mpiActions.MpiAction
This MpiAction scatters the workload of multiple actions to available resources.
Notes
This currently only works from the root (of COMM_WORLD). Eventually, it would be nice to make it possible for sub-tasks to manage their own communicators and spawn their own work within some sub-communicator.
This performs an MPI Split operation and takes over armi.MPI_COMM and associated variables. For this reason, it is possible that when someone thinks they have distributed information to all nodes, it may only be the subset that was necessary to perform the number of actions needed by this DistributionAction.
-
exception
armi.mpiActions.
MpiActionError
[source]¶ Bases:
Exception
Exception class raised when error conditions occur during an MpiAction.
-
class
armi.mpiActions.
DistributeStateAction
(skipInterfaces=False)[source]¶ Bases:
armi.mpiActions.MpiAction
-
invokeHook
()[source]¶ Sync up all nodes with the reactor, the cs, and the interfaces.
Notes
This is run by all workers and the master any time the code needs to sync all processors.
-
_distributeInterfaces
()[source]¶ Distribute the interfaces to all MPI nodes.
Interface copy description

Since interfaces store information that can influence a calculation, it is important in branch searches to make sure that no information is carried forward from these runs on either the master node or the workers. However, there are interfaces that cannot be distributed, making this a challenge.

To solve this problem, any interface that cannot be distributed is simply re-initialized. If any information needs to be given to the worker nodes on a non-distributable interface, additional function definitions (and likely soul searching as to why needed distributable information is on a non-distributable interface) are required to pass the information around.
See also
armi.interfaces.Interface.preDistributeState()
runs on master before DS
armi.interfaces.Interface.postDistributeState()
runs on master after DS
armi.interfaces.Interface.interactDistributeState()
runs on workers after DS
-
-
armi.mpiActions.
_diagnosePickleError
(o)[source]¶ Scans through various parts of the reactor to identify which part cannot be pickled.
Notes
So, you’re having a pickle error and you don’t know why. This method will help you find the problem. It doesn’t always catch everything, but it does help.
We also find that modifying the Python library as documented here tells us which object can’t be pickled by printing it out.
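The attribute-by-attribute scanning idea can be sketched as follows (find_unpicklable_attrs is an illustrative helper, not the actual _diagnosePickleError implementation, which walks specific parts of the reactor):

```python
import pickle


def find_unpicklable_attrs(obj):
    """Try to pickle each instance attribute to pinpoint which one fails."""
    bad = []
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception:
            bad.append(name)
    return bad


class Holder:
    """Toy object with one picklable and one unpicklable attribute."""

    def __init__(self):
        self.ok = [1, 2, 3]
        self.bad = lambda x: x  # lambdas cannot be pickled


print(find_unpicklable_attrs(Holder()))  # ['bad']
```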