armi.bookkeeping.db.database3 module¶
ARMI Database implementation, version 3.
This implementation of the database is a significant departure from the previous one. One of the foundational concepts in this version is that a reactor model should be fully recoverable from the database itself, all the way down to the component level. As a result, the structure of the underlying data is bound to the hierarchical Composite Reactor Model, rather than an ad hoc collection of Block parameter fields and other parameters. Furthermore, this format is intended to be more dynamic, permitting as-yet undeveloped levels and classes in the Composite Reactor Model to be supported as they are added. More high-level discussion is contained in The Database File.
The most important contents of this module are the DatabaseInterface, the Database3 class, the Layout class, and the special data packing/unpacking functions. The Database3 class contains most of the functionality for interacting with the underlying data. This includes things like dumping a Reactor state to the database and loading it back again, as well as extracting historical data for a given object or collection of objects from the database file. When interacting with the database file, the Layout class is used to help map the hierarchical Composite Reactor Model to the flat representation in the database.
Refer to armi.bookkeeping.db for notes about versioning.
Minor revision changelog¶
3.1: Improve the handling of reading/writing grids.
3.2: Change the strategy for storing large attributes from using an Object Reference to an external dataset to using a special string starting with an "@" symbol (e.g., "@/c00n00/attrs/5_linkedDims"). This was done to support copying time node datasets from one file to another without invalidating the references. Support is maintained for reading previous versions, and for performing a mergeHistory() and converting to the new reference strategy, but the old version cannot be written.
3.3: Compress the way locations are stored in the database and allow MultiIndex locations to be read and written.
3.4: Modified the way that locations are stored in the database to include complete indices for indices that can be composed from multiple grids. This was done since the space is already being used to be able to store them, and because having complete indices allows for more efficient means of extracting information based on location without having to compose the full model.
-
armi.bookkeeping.db.database3.describeInterfaces(cs)[source]¶
Function for exposing interface(s) to other code.
-
class armi.bookkeeping.db.database3.DatabaseInterface(r, cs)[source]¶
Bases: armi.interfaces.Interface
Handles interactions between the ARMI data model and the persistent data storage system.
This reads/writes the ARMI state to/from the database and helps derive other state information from the stored data.
Construct an interface.
The r and cs arguments are required, but may be None, where appropriate for the specific Interface implementation.
- Parameters
- Raises
RuntimeError – Interfaces derived from Interface must define their name
-
name = 'database'¶
-
property database¶
Presents the internal database object, if it exists.
-
interactBOL()[source]¶
Initialize the database if the main interface was not available. (Beginning of Life)
-
initDB(fName: Optional[os.PathLike] = None)[source]¶
Open the underlying database to be written to, and write input files to the DB.
Notes
Main Interface calls this so that the database is available as early as possible in the run. The database interface interacts near the end of the interface stack (so that all the parameters have been updated) while the Main Interface interacts first.
-
interactEveryNode(cycle, node)[source]¶
Write to database.
DBs should receive the state information of the run at each node.
-
interactEOC(cycle=None)[source]¶
In case anything changed since last cycle (e.g. rxSwing), update DB. (End of Cycle)
-
interactDistributeState() → None[source]¶
Reconnect to pre-existing database.
DB is created and managed by the master node only, but we can still connect to it from workers to enable things like history tracking.
-
distributable()[source]¶
Return true if this can be MPI broadcast.
Notes
Cases where this isn’t possible include the database interface, where the SQL driver cannot be distributed.
-
prepRestartRun(dbCycle, dbNode)[source]¶
Load the data history from the database being restarted from.
-
_getLoadDB(fileName)[source]¶
Return the database to load from in order of preference.
Notes
If fileName is present, only that database is returned, since we were specifically instructed to load from it.
-
loadState(cycle, timeNode, timeStepName='', fileName=None, updateGlobalAssemNum=True)[source]¶
Loads a fresh reactor and applies it to the Operator.
Notes
Will load preferentially from the fileName if passed. Otherwise, will load from the existing database in memory or cs["reloadDBName"], in that order.
- Raises
RuntimeError – If fileName is specified and that file does not have the time step. If fileName is not specified and neither the database in memory, nor the cs[“reloadDBName”] have the time step specified.
-
getHistory(comp: armi.reactor.composites.ArmiObject, params: Optional[Sequence[str]] = None, timeSteps: Optional[MutableSequence[Tuple[int, int]]] = None, byLocation: bool = False) → Dict[str, Dict[Tuple[int, int], Any]][source]¶
Get historical parameter values for a single object.
This is mostly a wrapper around the same function on the Database3 class, but knows how to return the current value as well.
See also
-
getHistories(comps: Sequence[armi.reactor.composites.ArmiObject], params: Optional[Sequence[str]] = None, timeSteps: Optional[MutableSequence[Tuple[int, int]]] = None, byLocation: bool = False) → Dict[armi.reactor.composites.ArmiObject, Dict[str, Dict[Tuple[int, int], Any]]][source]¶
Get historical parameter values for one or more objects.
This is mostly a wrapper around the same function on the Database3 class, but knows how to return the current value as well.
See also
-
class armi.bookkeeping.db.database3.Database3(fileName: os.PathLike, permission: str)[source]¶
Bases: armi.bookkeeping.db.database.Database
Version 3 of the ARMI Database, handling serialization and loading of Reactor states.
This implementation of the database pushes all objects in the Composite Reactor Model into the database. This process is aided by the Layout class, which handles the packing and unpacking of the structure of the objects, their relationships, and their non-parameter attributes.
See also
doc/user/outputs/database for more details.
Create a new Database3 object.
- Parameters
fileName – name of the file
permission – file permissions, write (“w”) or read (“r”)
-
timeNodeGroupPattern = re.compile('^c(\\d\\d)n(\\d\\d)$')¶
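This compiled pattern names the time-node HDF5 groups (e.g. c00n00). A minimal sketch of using it to recover (cycle, node) pairs from group names; the helper function name here is hypothetical, not part of the ARMI API:

```python
import re

# Same pattern as Database3.timeNodeGroupPattern
TIME_NODE_GROUP = re.compile(r"^c(\d\d)n(\d\d)$")

def parseTimeNodeGroup(groupName):
    """Return (cycle, node) for names like 'c02n05', else None (hypothetical helper)."""
    match = TIME_NODE_GROUP.match(groupName)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Note that statepoint-suffixed groups such as "c00n00-special" deliberately do not match the bare pattern.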
-
property version¶
-
property versionMajor¶
-
property versionMinor¶
-
static grabLocalCommitHash()[source]¶
Try to determine the local Git commit.
We have to handle errors arising when the code is run on a system that doesn't have Git installed, or is simply not run from inside a repository.
- Returns
The commit hash if it exists, otherwise "unknown".
- Return type
str
-
splitDatabase(keepTimeSteps: Sequence[Tuple[int, int]], label: str) → str[source]¶
Discard all data except for specific time steps, retaining old data in a separate file.
This is useful when performing more exotic analyses, where each “time step” may not represent a specific point in time, but something more nuanced. For example, equilibrium cases store a new “cycle” for each iteration as it attempts to converge the equilibrium cycle. At the end of the run, the last “cycle” is the converged equilibrium cycle, whereas the previous cycles constitute the path to convergence, which we typically wish to discard before further analysis.
- Parameters
keepTimeSteps – A collection of the time steps to retain
label – An informative label for the backed-up database. Usually something like “-all-iterations”. Will be interposed between the source name and the “.h5” extension.
- Returns
The name of the new, backed-up database file.
- Return type
str
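The label interposition described above (source name, then label, then the ".h5" extension) can be sketched as follows; this is a guess at the naming logic, not ARMI's exact implementation:

```python
import os

def backupName(fileName, label):
    """Interpose a label between the source name and its extension (sketch)."""
    base, ext = os.path.splitext(str(fileName))
    return base + label + ext
```

For example, with the label "-all-iterations", "myCase.h5" would become "myCase-all-iterations.h5".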
-
property fileName¶
-
loadCS()[source]¶
Attempt to load settings from the database file.
Notes
There are no guarantees here. If the database was written from a different version of ARMI than you are using, these results may not be usable. For instance, the database could have been written by a much older or newer version of ARMI than the code you are using.
-
loadBlueprints()[source]¶
Attempt to load reactor blueprints from the database file.
Notes
There are no guarantees here. If the database was written from a different version of ARMI than you are using, these results may not be usable. For instance, the database could have been written by a much older or newer version of ARMI than the code you are using.
-
loadGeometry()[source]¶
This is primarily just used for migrations. The "geometry files" were replaced by the systems: and grids: sections of Blueprints.
-
writeInputsToDB(cs, csString=None, geomString=None, bpString=None)[source]¶
Write inputs into the database based on the CaseSettings.
This is not DRY on purpose. The goal is that any particular Database implementation should be very stable, so we don't want it to be easy to change one Database implementation's behavior when trying to change another's.
Notes
This is hard-coded to read the entire file contents into memory and write that directly into the database. We could have the cs/blueprints/geom write to a string; however, the ARMI log file contains a hash of each file's contents. In the future, we should be able to reproduce a calculation with confidence that the inputs are identical.
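Reading entire file contents into memory and hashing them, as the note describes, might look like this; the helper name and the choice of SHA-1 are assumptions for illustration, not ARMI's actual implementation:

```python
import hashlib

def readContentsAndHash(path):
    """Read a file fully into memory and return (contents, hex digest) for logging."""
    with open(path, "rb") as f:
        contents = f.read()
    return contents, hashlib.sha1(contents).hexdigest()
```

Storing the raw contents (rather than a re-serialized form) ensures the logged hash and the database copy agree byte-for-byte.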
-
mergeHistory(inputDB, startCycle, startNode)[source]¶
Copy time step data up to, but not including, the passed cycle and node.
Notes
This is used for restart runs with the standard operator, for example. The current time step (being loaded from) should not be copied, as that time step's data will be written at the end of the time step.
-
__exit__(type, value, traceback)[source]¶
Typically we don't care why it broke, but we want the DB to close.
-
genTimeStepGroups(timeSteps: Sequence[Tuple[int, int]] = None) → Generator[h5py._hl.group.Group, None, None][source]¶
Returns a generator of HDF5 Groups for all time nodes, or for the passed selection.
-
getLayout(cycle, node)[source]¶
Return a Layout object representing the requested cycle and time node.
-
genTimeSteps() → Generator[Tuple[int, int], None, None][source]¶
Returns a generator of (cycle, node) tuples that are present in the DB.
-
genAuxiliaryData(ts: Tuple[int, int]) → Generator[str, None, None][source]¶
Returns a generator of names of auxiliary data on the requested time point.
-
getAuxiliaryDataPath(ts: Tuple[int, int], name: str) → str[source]¶
Get a string describing a path to an auxiliary data location.
- Parameters
ts – The time step that the auxiliary data belongs to
name – The name of the auxiliary data
- Returns
An absolute location for storing auxiliary data with the given name for the given time step
- Return type
str
-
getH5Group(r, statePointName=None)[source]¶
Get the H5Group for the current ARMI timestep.
This method can be used to allow other interfaces to place data into the database at the correct timestep.
-
hasTimeStep(cycle, timeNode, statePointName='')[source]¶
Returns True if (cycle, timeNode, statePointName) is contained in the database.
-
Copy DB to run working directory.
Needed when multiple MPI processes need to read the same DB, for example when a history is needed from independent runs (e.g. for fuel performance on a variety of assemblies).
Notes
At some future point, we may implement a client-server like DB system which would render this kind of operation unnecessary.
-
load(cycle, node, cs=None, bp=None, statePointName=None, allowMissing=False)[source]¶
Load a new reactor from (cycle, node).
Case settings, blueprints, and geom can be provided by the client, or read from the database itself. Providing these from the client can be useful for snapshot runs or the like, where one expects to use results from a run with different settings and then continue with new settings. Even in this case, the blueprints and geom should probably be the same as in the original run.
- Parameters
cycle (int) – cycle number
node (int) – time node
cs (armi.settings.Settings (optional)) – if not provided one is read from the database
bp (armi.reactor.Blueprints (Optional)) – if not provided one is read from the database
statePointName (str) – Optional arbitrary statepoint name (e.g., “special” for “c00n00-special/”)
allowMissing (bool) – Whether to emit a warning rather than crash when reading a database with undefined parameters. Default False.
- Returns
root – The top-level object stored in the database; usually a Reactor.
- Return type
-
_compose(comps, cs, parent=None)[source]¶
Given a flat collection of all of the ArmiObjects in the model, reconstitute the hierarchy.
-
static _addHomogenizedNumberDensityParams(blocks, h5group)[source]¶
Create on-the-fly block homog. number density params for XTVIEW viewing.
See also
-
getHistoryByLocation(comp: armi.reactor.composites.ArmiObject, params: Optional[List[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[str, Dict[Tuple[int, int], Any]][source]¶
Get the parameter histories at a specific location.
-
getHistoriesByLocation(comps: Sequence[armi.reactor.composites.ArmiObject], params: Optional[List[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[armi.reactor.composites.ArmiObject, Dict[str, Dict[Tuple[int, int], Any]]][source]¶
Get the parameter histories at specific locations.
- This has a number of limitations, which should in practice not be too limiting:
The passed objects must have IndexLocations. This type of operation doesn’t make much sense otherwise.
The passed objects must exist in a hierarchy that leads to a Core object, which serves as an anchor that can fully define all index locations. This could possibly be made more general by extending grids, but that gets a little more complicated.
All requested objects must exist under the same anchor object, and at the same depth below it.
All requested objects must have the same type.
- Parameters
comps (list of ArmiObject) – The components/composites that currently occupy the location that you want histories at. ArmiObjects are passed, rather than locations, because this makes it easier to figure out things related to layout.
params (List of str, optional) – The parameter names for the parameters that we want the history of. If None, all parameter history is given
timeSteps (List of (cycle, node) tuples, optional) – The time nodes that you want history for. If None, all available time nodes will be returned.
-
getHistory(comp: armi.reactor.composites.ArmiObject, params: Optional[Sequence[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[str, Dict[Tuple[int, int], Any]][source]¶
Get parameter history for a single ARMI Object.
- Parameters
comps – An individual ArmiObject
params – parameters to gather
- Returns
Dictionary of str/list pairs.
- Return type
dict
-
getHistories(comps: Sequence[armi.reactor.composites.ArmiObject], params: Optional[Sequence[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[armi.reactor.composites.ArmiObject, Dict[str, Dict[Tuple[int, int], Any]]][source]¶
Get the parameter histories for a sequence of ARMI Objects.
This implementation is unaware of the state of the reactor outside of the database itself, and is therefore not usually what client code should be calling directly during normal ARMI operation. It only knows about historical data that have actually been written to the database. Usually one wants to be able to get historical, plus current data, for which the similar method on the DatabaseInterface may be more useful.
- Parameters
comps – Something that is iterable multiple times
params – parameters to gather.
timeSteps – Selection of time nodes to get data for. If omitted, return full history
- Returns
Dictionary ArmiObject (input): dict of str/list pairs containing ((cycle, node), value).
- Return type
dict
-
armi.bookkeeping.db.database3._packLocations(locations: List[armi.reactor.grids.LocationBase], minorVersion: int = 4) → Tuple[List[str], List[Tuple[int, int, int]]][source]¶
Extract information from a location needed to write it to this DB.
Each locator has one locationType and up to N location-defining datums, where N is the number of entries in a possible multiindex, or just 1 for everything else.
Shrink grid locator names for storage efficiency.
Notes
Contains some conditionals to still load databases made before DB version 3.3, which can be removed once no users care about those DBs anymore.
-
armi.bookkeeping.db.database3._packLocationsV1(locations: List[armi.reactor.grids.LocationBase]) → Tuple[List[str], List[Tuple[int, int, int]]][source]¶
Delete when reading v <=3.2 DBs is no longer wanted.
-
armi.bookkeeping.db.database3._packLocationsV2(locations: List[armi.reactor.grids.LocationBase]) → Tuple[List[str], List[Tuple[int, int, int]]][source]¶
Location packing implementation for minor version 3. See release notes above.
-
armi.bookkeeping.db.database3._packLocationsV3(locations: List[armi.reactor.grids.LocationBase]) → Tuple[List[str], List[Tuple[int, int, int]]][source]¶
Location packing implementation for minor version 4. See release notes above.
-
armi.bookkeeping.db.database3._unpackLocations(locationTypes, locData, minorVersion: int = 4)[source]¶
Convert location data as read from DB back into data structure for building reactor model.
location and locationType will only have different lengths when multiindex locations are used.
-
armi.bookkeeping.db.database3._unpackLocationsV1(locationTypes, locData)[source]¶
Delete when reading v <=3.2 DBs is no longer wanted.
-
armi.bookkeeping.db.database3._unpackLocationsV2(locationTypes, locData)[source]¶
Location unpacking implementation for minor version 3+. See release notes above.
-
class armi.bookkeeping.db.database3.Layout(version: Tuple[int, int], h5group=None, comp=None)[source]¶
Bases: object
The Layout class describes the hierarchical layout of the composite Reactor model in a flat representation.
A Layout is built up by starting at the root of a composite tree and recursively appending each node in the tree to the list of data. So for a typical Reactor model, the data will be ordered by depth-first search: [r, c, a1, a1b1, a1b1c1, a1b1c2, a1b2, a1b2c1, …, a2, …].
The layout is also responsible for storing Component attributes, like location, material, and temperatures (from blueprints), which aren’t stored as Parameters. Temperatures, specifically, are rather complicated beasts in ARMI, and more fundamental changes to how we deal with them may allow us to remove them from Layout.
Notes
As this format is liable to be consumed by other code, it is important to specify its structure so that code attempting to read/write Layouts can make safe assumptions. Below is a list of things to be aware of. More will be added as issues arise or things become more precise:
Elements in Layout are stored in depth-first order. This permits use of algorithms such as Pre-Order Tree Traversal for efficient traversal of regions of the model.
indexInData increases monotonically within each object type. This means that, for instance, the data for all HexBlock children of a given parent are stored contiguously within the HexBlock group, and will not be interleaved with data from the HexBlock children of any of the parent's siblings.
Aside from the hierarchy itself, there is no guarantee what order objects are stored in the layout. "The Core" is not necessarily the first child of the Reactor, and is not guaranteed to use the zeroth grid.
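The depth-first ordering described above can be illustrated with a toy composite tree; the Node class and the names here are invented for illustration, not part of ARMI:

```python
class Node:
    """Minimal stand-in for a composite model object."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

def flattenDepthFirst(root):
    """Append each node before its children, mirroring Layout's pre-order layout."""
    out = [root.name]
    for child in root.children:
        out.extend(flattenDepthFirst(child))
    return out

# reactor -> assembly a1 (blocks a1b1, a1b2), assembly a2 (block a2b1)
r = Node("r", [Node("a1", [Node("a1b1"), Node("a1b2")]),
               Node("a2", [Node("a2b1")])])
```

Here flattenDepthFirst(r) yields ["r", "a1", "a1b1", "a1b2", "a2", "a2b1"], matching the [r, c, a1, a1b1, …] ordering described for a full Reactor model.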
-
_createLayout(comp)[source]¶
Populate a hierarchical representation and group the reactor model items by type.
This is used when writing a reactor model to the database.
Notes
This is recursive.
See also
_readLayout() does the opposite.
-
_readLayout(h5group)[source]¶
Populate a hierarchical representation and group the reactor model items by type.
This is used when reading a reactor model from a database.
See also
_createLayout() does the opposite.
-
static computeAncestors(serialNum, numChildren, depth=1) → List[Optional[int]][source]¶
Return a list containing the serial number of the parent corresponding to each object at the given depth.
Depth in this case means how many layers to reach up to find the desired ancestor. A depth of 1 will yield the direct parent of each element, a depth of 2 would yield the element's parent's parent, and so on.
The zero-th element will always be None, as the first object is the root element and so has no parent. Subsequent depths will result in more Nones.
This function is useful for forming a lightweight sense of how the database contents stitch together, without having to go to the trouble of fully unpacking the Reactor model.
- Parameters
serialNum (List of int) – List of serial numbers for each object/element, as laid out in Layout
numChildren (List of int) – List of numbers of children for each object/element, as laid out in Layout
Note
This is not using a recursive approach for a couple of reasons. First, the iterative form isn't so bad; we just need two stacks. Second, the interface of the recursive function would be pretty unwieldy: we are progressively consuming two lists, which we would need to keep passing down with an index/cursor, or progressively slice as we go, which would be pretty inefficient.
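The stack-based iterative idea the note describes might look like this for a depth of 1; this is a sketch under the stated input convention (pre-order lists of serial numbers and child counts), not ARMI's actual computeAncestors implementation:

```python
def computeParents(serialNums, numChildren):
    """Return the direct-parent serial number for each object in depth-first order."""
    parents = []
    stack = []  # each entry: [serialNum, children not yet consumed]
    for sn, nc in zip(serialNums, numChildren):
        # the current top of the stack, if any, is this node's parent
        parents.append(stack[-1][0] if stack else None)
        if stack:
            stack[-1][1] -= 1
        if nc > 0:
            # this node has children coming next in pre-order; make it the parent
            stack.append([sn, nc])
        else:
            # leaf: pop every ancestor whose subtree is now fully consumed
            while stack and stack[-1][1] == 0:
                stack.pop()
    return parents
```

For a tree r(children a, b) with a(child c), laid out pre-order as r, a, c, b, the parents come out as [None, r, a, r].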
-
armi.bookkeeping.db.database3.allSubclasses(cls)[source]¶
This currently includes Materials… and it should not.
-
armi.bookkeeping.db.database3.packSpecialData(data: numpy.ndarray, paramName: str) → Tuple[Optional[numpy.ndarray], Dict[str, Any]][source]¶
Reduce data that wouldn't otherwise play nicely with HDF5/numpy arrays to a format that will.
This is the main entry point for conforming "strange" data into something that will both fit into a numpy array/HDF5 dataset, and be recoverable to its original-ish state when reading it back in. This is accomplished by detecting a handful of known offenders and using various HDF5 attributes to store necessary auxiliary data. It is important to keep in mind that the data that is passed in has already been converted to a numpy array, so the top dimension always represents the collection of composites that are storing the parameters. For instance, if we are dealing with a Block parameter, the first index in the numpy array of data is the block index; so if each block has a parameter that is a dictionary, data would be an ndarray where each element is a dictionary.
This routine supports a number of different "strange" things:
Dict[str, float]: These are stored by finding the set of all keys for all instances, and storing those keys as a list in an attribute. The data themselves are stored as arrays indexed by object, then key index. Dictionaries lacking data for a key store a nan in its place. This will work well in instances where most objects have data for most keys.
Jagged arrays: These are stored by concatenating all of the data into a single, one-dimensional array, and storing attributes to describe the shapes of each object’s data, and an offset into the beginning of each object’s data.
Arrays with None in them: These are stored by replacing each instance of None with a magical value that shouldn't be encountered in realistic scenarios.
- Parameters
data – An ndarray storing the data that we want to stuff into the database. These are usually dtype=Object, which is how we usually end up here in the first place.
paramName – The parameter name that we are trying to store data for. This is mostly used for diagnostics.
See also
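The jagged-array strategy above can be sketched in plain Python, with lists standing in for numpy arrays; the attribute names ("jagged", "shapes", "offsets") are invented for illustration and may differ from what ARMI actually writes:

```python
def packJagged(arrays):
    """Concatenate jagged per-object data; record each object's length and offset."""
    flat, shapes, offsets = [], [], []
    for arr in arrays:
        offsets.append(len(flat))   # where this object's data begins
        shapes.append(len(arr))     # how much data this object has
        flat.extend(arr)
    attrs = {"jagged": True, "shapes": shapes, "offsets": offsets}
    return flat, attrs

def unpackJagged(flat, attrs):
    """Invert packJagged using the stored shapes and offsets."""
    return [
        flat[off:off + shape]
        for off, shape in zip(attrs["offsets"], attrs["shapes"])
    ]
```

The flat array is what lands in the HDF5 dataset; the shapes and offsets ride along as dataset attributes so the per-object structure can be rebuilt on read.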
-
armi.bookkeeping.db.database3.unpackSpecialData(data: numpy.ndarray, attrs, paramName: str) → numpy.ndarray[source]¶
Extract data from a specially-formatted HDF5 dataset into a numpy array.
This should invert the operations performed by packSpecialData().
- Parameters
data – Specially-formatted data array straight from the database.
attrs – The attributes associated with the dataset that contained the data.
paramName – The name of the parameter that is being unpacked. Only used for diagnostics.
- Returns
An ndarray containing the closest possible representation of the data that was originally written to the database.
- Return type
numpy.ndarray
See also
-
armi.bookkeeping.db.database3.replaceNonsenseWithNones(data: numpy.ndarray, paramName: str) → numpy.ndarray[source]¶
Replace special nonsense values with None.
This essentially reverses the operations performed by replaceNonesWithNonsense().
- Parameters
data – The array from the database that contains special None nonsense values.
paramName – The parameter name whose data we are dealing with. Only used for diagnostics.
See also
-
armi.bookkeeping.db.database3.replaceNonesWithNonsense(data: numpy.ndarray, paramName: str, nones: numpy.ndarray = None) → numpy.ndarray[source]¶
Replace instances of None with nonsense values that can be detected/recovered when reading.
- Parameters
data – The numpy array containing None values that need to be replaced.
paramName – The name of the parameter whose data we are treating. Only used for diagnostics.
nones – An array containing the index locations of the None elements. It is a little strange to pass these in, but we find these indices to determine whether we need to call this function in the first place, so we might as well pass them in so the operation need not be performed again.
Notes
This only supports situations where the data is a straight-up None, or a valid, database-storable numpy array (or something easily convertible to one, e.g. tuples/lists with numerical values). This does not support, for instance, a numpy ndarray with some Nones in it.
For example, the following is supported:
[[1, 2, 3], None, [7, 8, 9]]
However, the following is not:
[[1, 2, 3], [4, None, 6], [7, 8, 9]]
See also
replaceNonsenseWithNones() – Reverses this operation.
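A toy version of the supported round trip; the sentinel value and function bodies here are invented for illustration, and ARMI's actual nonsense values differ:

```python
SENTINEL = -9.99e33  # hypothetical magic value unlikely to occur in real data

def replaceNonesWithNonsenseSketch(data):
    """Replace whole-entry Nones with sentinel-filled rows of matching length."""
    width = next(len(row) for row in data if row is not None)
    return [[SENTINEL] * width if row is None else list(row) for row in data]

def replaceNonsenseWithNonesSketch(packed):
    """Restore None for rows that are entirely the sentinel value."""
    return [None if all(v == SENTINEL for v in row) else row for row in packed]
```

This mirrors the supported case from the docstring ([[1, 2, 3], None, [7, 8, 9]]); rows with Nones embedded inside them would not survive this scheme, matching the stated limitation.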
-
armi.bookkeeping.db.database3._writeAttrs(obj, group, attrs)[source]¶
Handle safely writing attributes to a dataset, handling large data if necessary.
This will attempt to store attributes directly onto an HDF5 object if possible, falling back to proper datasets and reference attributes if necessary. This is needed because HDF5 tries to fit attributes into the object header, which has limited space. If an attribute is too large, h5py raises a RuntimeError. In such cases, this will store the attribute data in a proper dataset and place a reference to that dataset in the attribute instead.
In practice, this takes linkedDims attrs from a particular component type (like c00n00/Circle/id) and stores them in new datasets (like c00n00/attrs/1_linkedDims, c00n00/attrs/2_linkedDims) and then sets the object's attrs to links to those datasets.
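The spill-to-dataset fallback, combined with the "@"-string reference strategy from the changelog, can be sketched with plain dicts standing in for the HDF5 attribute and dataset stores; the size threshold, helper names, and exact path scheme are assumptions (real h5py attribute writes raise RuntimeError when the object header overflows, rather than checking a length up front):

```python
ATTR_SIZE_LIMIT = 8  # hypothetical stand-in for the HDF5 object-header limit

def writeAttrsSketch(attrs, group, datasets):
    """Store small attrs directly; spill large ones to datasets behind an '@' reference."""
    stored = {}
    for i, (name, value) in enumerate(sorted(attrs.items()), start=1):
        if isinstance(value, list) and len(value) > ATTR_SIZE_LIMIT:
            path = "{}/attrs/{}_{}".format(group, i, name)
            datasets[path] = value        # real data lives in a proper dataset
            stored[name] = "@" + path     # attribute holds only the reference string
        else:
            stored[name] = value
    return stored

def resolveAttrsSketch(stored, datasets):
    """Follow '@' references back to the real data, as _resolveAttrs does."""
    return {
        name: datasets[val[1:]] if isinstance(val, str) and val.startswith("@") else val
        for name, val in stored.items()
    }
```

Because the reference is a path string rather than an HDF5 Object Reference, copying a time node group to another file keeps the links valid, which is the motivation given in the 3.2 changelog entry.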
-
armi.bookkeeping.db.database3._resolveAttrs(attrs, group)[source]¶
Reverse the action of _writeAttrs.
This reads actual attrs and looks for the real data in the datasets that the attrs were pointing to.
This reads actual attrs and looks for the real data in the datasets that the attrs were pointing to.
-
armi.bookkeeping.db.database3.collectBlockNumberDensities(blocks) → Dict[str, numpy.ndarray][source]¶
Collect block-by-block homogenized number densities for each nuclide.
Long ago, composition was stored on block params. No longer; they are on the component numberDensity params. These block-level params are still useful to see compositions in some visualization tools. Rather than keep them on the reactor model, we dynamically compute them here and slap them in the database. These are ignored upon reading and will not affect the results.
Remove this once a better viz tool can view composition distributions. Also remove the try/except in _readParams.