armi.bookkeeping.db.database3 module

ARMI Database implementation, version 3.

This implementation of the database is a significant departure from the previous one. One of the foundational concepts in this version is that a reactor model should be fully recoverable from the database itself, all the way down to the component level. As a result, the structure of the underlying data is bound to the hierarchical Composite Reactor Model, rather than an ad hoc collection of Block parameter fields and other parameters. Furthermore, this format is intended to be more dynamic, permitting as-yet undeveloped levels and classes in the Composite Reactor Model to be supported as they are added. More high-level discussion is contained in The Database File.

The most important contents of this module are the DatabaseInterface, the Database3 class, the Layout class, and the special data packing/unpacking functions. The Database3 class contains most of the functionality for interacting with the underlying data. This includes things like dumping a Reactor state to the database and loading it back again, as well as extracting historical data for a given object or collection of objects from the database file. When interacting with the database file, the Layout class is used to help map the hierarchical Composite Reactor Model to the flat representation in the database.

Refer to armi.bookkeeping.db for notes about versioning.

Minor revision changelog

  • 3.1: Improve the handling of reading/writing grids.

  • 3.2: Change the strategy for storing large attributes from using an Object Reference to an external dataset to using a special string starting with an “@” symbol (e.g., “@/c00n00/attrs/5_linkedDims”). This was done to support copying time node datasets from one file to another without invalidating the references. Support is maintained for reading previous versions, and for performing a mergeHistory() and converting to the new reference strategy, but the old version cannot be written.

  • 3.3: Compress the way locations are stored in the database and allow MultiIndex locations to be read and written.

  • 3.4: Modified the way that locations are stored in the database to include complete indices for indices that can be composed from multiple grids. This was done because the space needed to store them is already being used, and because having complete indices allows more efficient extraction of information by location without having to compose the full model.

armi.bookkeeping.db.database3.getH5GroupName(cycle, timeNode, statePointName=None)[source]
armi.bookkeeping.db.database3.describeInterfaces(cs)[source]

Function for exposing interface(s) to other code

armi.bookkeeping.db.database3.updateGlobalAssemblyNum(r)[source]
class armi.bookkeeping.db.database3.DatabaseInterface(r, cs)[source]

Bases: armi.interfaces.Interface

Handles interactions between the ARMI data model and the persistent data storage system.

This writes the ARMI state to the database, reads it back, and helps derive whatever state information can be derived.

Construct an interface.

The r and cs arguments are required, but may be None, where appropriate for the specific Interface implementation.

Parameters
  • r (Reactor) – A reactor to attach to

  • cs (Settings) – Settings object to use

Raises

RuntimeError – Interfaces derived from Interface must define their name

name = 'database'
property database

Presents the internal database object, if it exists.

interactBOL()[source]

Initialize the database if the main interface was not available. (Beginning of Life)

initDB(fName: Optional[os.PathLike] = None)[source]

Open the underlying database to be written to, and write input files to DB.

Notes

Main Interface calls this so that the database is available as early as possible in the run. The database interface interacts near the end of the interface stack (so that all the parameters have been updated) while the Main Interface interacts first.

interactEveryNode(cycle, node)[source]

Write to database.

DBs should receive the state information of the run at each node.

interactEOC(cycle=None)[source]

In case anything changed since last cycle (e.g. rxSwing), update DB. (End of Cycle)

interactEOL()[source]

DBs should be closed at run’s end. (End of Life)

interactError()[source]

Get shutdown state information even if the run encounters an error

interactDistributeState() → None[source]

Reconnect to pre-existing database.

DB is created and managed by the master node only but we can still connect to it from workers to enable things like history tracking.

distributable()[source]

Return true if this can be MPI broadcast.

Notes

Cases where this isn’t possible include the database interface, where the SQL driver cannot be distributed.

prepRestartRun(dbCycle, dbNode)[source]

Load the data history from the database being restarted from.

_getLoadDB(fileName)[source]

Return the database to load from in order of preference.

Notes

If fileName is given, only that database is returned, since the caller has specifically asked to load from it.

loadState(cycle, timeNode, timeStepName='', fileName=None, updateGlobalAssemNum=True)[source]

Loads a fresh reactor and applies it to the Operator.

Notes

Will load preferentially from the fileName if passed. Otherwise will load from existing database in memory or cs[“reloadDBName”] in that order.

Raises

RuntimeError – If fileName is specified and that file does not have the time step. If fileName is not specified and neither the database in memory, nor the cs[“reloadDBName”] have the time step specified.

getHistory(comp: armi.reactor.composites.ArmiObject, params: Optional[Sequence[str]] = None, timeSteps: Optional[MutableSequence[Tuple[int, int]]] = None, byLocation: bool = False) → Dict[str, Dict[Tuple[int, int], Any]][source]

Get historical parameter values for a single object.

This is mostly a wrapper around the same function on the Database3 class, but knows how to return the current value as well.

getHistories(comps: Sequence[armi.reactor.composites.ArmiObject], params: Optional[Sequence[str]] = None, timeSteps: Optional[MutableSequence[Tuple[int, int]]] = None, byLocation: bool = False) → Dict[armi.reactor.composites.ArmiObject, Dict[str, Dict[Tuple[int, int], Any]]][source]

Get historical parameter values for one or more objects.

This is mostly a wrapper around the same function on the Database3 class, but knows how to return the current value as well.
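The return annotation above describes a nested mapping: parameter name, then (cycle, node), then value. The following is a minimal sketch of consuming such a mapping, assuming only that shape. The parameter names and values are made up for illustration, and as_time_series is a hypothetical helper, not part of ARMI.

```python
# Hypothetical history in the documented shape:
# {parameter name: {(cycle, node): value}}.
# The parameter names and values are made up for illustration.
history = {
    "power": {(0, 1): 1.2e6, (0, 0): 1.0e6, (1, 0): 1.1e6},
    "burnup": {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 1.0},
}


def as_time_series(history, param):
    """Return ([(cycle, node), ...], [value, ...]) sorted by time step."""
    # Tuples sort lexicographically, so (0, 1) comes before (1, 0).
    items = sorted(history[param].items())
    steps = [ts for ts, _ in items]
    values = [value for _, value in items]
    return steps, values


steps, values = as_time_series(history, "power")
```

Sorting on the (cycle, node) keys is enough to recover chronological order as long as cycles and nodes are numbered in time order.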

class armi.bookkeeping.db.database3.Database3(fileName: os.PathLike, permission: str)[source]

Bases: armi.bookkeeping.db.database.Database

Version 3 of the ARMI Database, handling serialization and loading of Reactor states.

This implementation of the database pushes all objects in the Composite Reactor Model into the database. This process is aided by the Layout class, which handles the packing and unpacking of the structure of the objects, their relationships, and their non-parameter attributes.

See also

doc/user/outputs/database for more details.

Create a new Database3 object.

Parameters
  • fileName – name of the file

  • permission – file permissions, write (“w”) or read (“r”)

timeNodeGroupPattern = re.compile('^c(\\d\\d)n(\\d\\d)$')
property version
property versionMajor
property versionMinor
open()[source]
static grabLocalCommitHash()[source]

Try to determine the local Git commit.

We have to be sure to handle the errors raised when the code is run on a system that doesn’t have Git installed, or when the code is simply not run from inside a repo.

Returns

The commit hash if it exists, otherwise “unknown”.

Return type

str

close(completedSuccessfully=False)[source]

Close the DB and perform cleanups and auto-conversions.

splitDatabase(keepTimeSteps: Sequence[Tuple[int, int]], label: str) → str[source]

Discard all data except for specific time steps, retaining old data in a separate file.

This is useful when performing more exotic analyses, where each “time step” may not represent a specific point in time, but something more nuanced. For example, equilibrium cases store a new “cycle” for each iteration as it attempts to converge the equilibrium cycle. At the end of the run, the last “cycle” is the converged equilibrium cycle, whereas the previous cycles constitute the path to convergence, which we typically wish to discard before further analysis.

Parameters
  • keepTimeSteps – A collection of the time steps to retain

  • label – An informative label for the backed-up database. Usually something like “-all-iterations”. Will be interposed between the source name and the “.h5” extension.

Returns

The name of the new, backed-up database file.

Return type

str
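The label placement described above can be sketched with pathlib. This is only an illustration of the naming rule stated in the parameter description; backup_name is a hypothetical helper and the file name is made up.

```python
from pathlib import Path


def backup_name(source, label):
    """Interpose `label` between the file stem and its extension."""
    path = Path(source)
    return str(path.with_name(path.stem + label + path.suffix))


new_name = backup_name("equilibrium.h5", "-all-iterations")
print(new_name)
```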

property fileName
loadCS()[source]

Attempt to load settings from the database file

Notes

There are no guarantees here. If the database was written from a different version of ARMI than you are using, these results may not be usable. For instance, the database could have been written by a much older or newer version of ARMI than the one you are using.

loadBlueprints()[source]

Attempt to load reactor blueprints from the database file

Notes

There are no guarantees here. If the database was written from a different version of ARMI than you are using, these results may not be usable. For instance, the database could have been written by a much older or newer version of ARMI than the one you are using.

loadGeometry()[source]

This is primarily just used for migrations. The “geometry files” were replaced by systems: and grids: sections of Blueprints.

writeInputsToDB(cs, csString=None, geomString=None, bpString=None)[source]

Write inputs into the database based on the CaseSettings.

This is not DRY on purpose. The goal is that any particular Database implementation should be very stable, so we don’t want it to be easy to change one Database implementation’s behavior when trying to change another’s.

Notes

This is hard-coded to read the entire file contents into memory and write that directly into the database. We could have the cs/blueprints/geom write to a string; however, the ARMI log file contains a hash of each file’s contents. In the future, we should be able to reproduce a calculation with confidence that the inputs are identical.

readInputsFromDB()[source]
mergeHistory(inputDB, startCycle, startNode)[source]

Copy time step data up to, but not including the passed cycle and node.

Notes

This is used for restart runs with the standard operator, for example. The current time step (being loaded from) should not be copied, as that time step’s data will be written at the end of the time step.
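The "up to, but not including" rule can be sketched as a filter over (cycle, node) tuples. This is a minimal illustration, assuming that lexicographic tuple order matches time order; steps_to_copy is a hypothetical helper and not the ARMI implementation.

```python
def steps_to_copy(time_steps, start_cycle, start_node):
    """Keep only the time steps strictly before (start_cycle, start_node)."""
    cutoff = (start_cycle, start_node)
    # Tuple comparison is lexicographic: cycle first, then node.
    return [ts for ts in time_steps if ts < cutoff]


available = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
copied = steps_to_copy(available, start_cycle=1, start_node=1)
print(copied)
```

The restart point (1, 1) itself is excluded, which matches the note that its data is written again at the end of that time step.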

__enter__()[source]

Context management support

__exit__(type, value, traceback)[source]

Typically we don’t care why it broke but we want the DB to close

genTimeStepGroups(timeSteps: Sequence[Tuple[int, int]] = None) → Generator[h5py._hl.group.Group, None, None][source]

Returns a generator of HDF5 Groups for all time nodes, or for the passed selection.

getLayout(cycle, node)[source]

Return a Layout object representing the requested cycle and time node.

genTimeSteps() → Generator[Tuple[int, int], None, None][source]

Returns a generator of (cycle, node) tuples that are present in the DB.
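As a rough illustration of how (cycle, node) tuples relate to group names, the sketch below applies the same regular expression as the documented timeNodeGroupPattern to a list of names. The group names are hypothetical and parse_time_steps is not an ARMI function.

```python
import re

# Same pattern as the documented Database3.timeNodeGroupPattern.
TIME_NODE_GROUP_PATTERN = re.compile(r"^c(\d\d)n(\d\d)$")


def parse_time_steps(group_names):
    """Yield (cycle, node) for each name that is a plain time node group."""
    for name in group_names:
        match = TIME_NODE_GROUP_PATTERN.match(name)
        if match:
            yield int(match.group(1)), int(match.group(2))


# Hypothetical top-level group names from a database file.
names = ["inputs", "c00n01", "c00n00", "c01n00", "c00n00-special"]
time_steps = sorted(parse_time_steps(names))
print(time_steps)
```

Because the pattern is anchored at both ends, names carrying a statepoint suffix such as "c00n00-special" do not match.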

genAuxiliaryData(ts: Tuple[int, int]) → Generator[str, None, None][source]

Returns a generator of names of auxiliary data on the requested time point.

getAuxiliaryDataPath(ts: Tuple[int, int], name: str) → str[source]

Get a string describing a path to an auxiliary data location.

Parameters
  • ts – The time step that the auxiliary data belongs to

  • name – The name of the auxiliary data

Returns

An absolute location for storing auxiliary data with the given name for the given time step

Return type

str

keys()[source]
getH5Group(r, statePointName=None)[source]

Get the H5Group for the current ARMI timestep.

This method can be used to allow other interfaces to place data into the database at the correct timestep.

hasTimeStep(cycle, timeNode, statePointName='')[source]

Returns True if (cycle, timeNode, statePointName) is contained in the database.
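A membership check of this kind can be sketched by building the group name and looking it up. The name format below is inferred from timeNodeGroupPattern and from the "c00n00-special/" example in the load() documentation. Both helpers are hypothetical stand-ins, not the ARMI functions.

```python
def group_name(cycle, time_node, state_point_name=None):
    """Build a name like 'c00n00', or 'c00n00-special' with a statepoint."""
    name = f"c{cycle:02d}n{time_node:02d}"
    if state_point_name:
        name += f"-{state_point_name}"
    return name


def has_time_step(group_names, cycle, time_node, state_point_name=""):
    """Return True if the corresponding group name is present."""
    return group_name(cycle, time_node, state_point_name) in group_names


# Hypothetical set of groups present in a file.
groups = {"c00n00", "c00n01", "c00n00-special"}
print(has_time_step(groups, 0, 1))
print(has_time_step(groups, 0, 1, "special"))
```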

writeToDB(reactor, statePointName=None)[source]

Write reactor data to the DB

syncToSharedFolder()[source]

Copy DB to run working directory.

Needed when multiple MPI processes need to read the same db, for example when a history is needed from independent runs (e.g. for fuel performance on a variety of assemblies).

Notes

At some future point, we may implement a client-server like DB system which would render this kind of operation unnecessary.

load(cycle, node, cs=None, bp=None, statePointName=None, allowMissing=False)[source]

Load a new reactor from (cycle, node).

Case settings, blueprints, and geom can be provided by the client, or read from the database itself. Providing these from the client could be useful when performing snapshot runs or the like, where it is expected to use results from a run using different settings, then continue with new settings. Even in this case, the blueprints and geom should probably be the same as the original run.

Parameters
  • cycle (int) – cycle number

  • node (int) – time node

  • cs (armi.settings.Settings (optional)) – if not provided one is read from the database

  • bp (armi.reactor.Blueprints (Optional)) – if not provided one is read from the database

  • statePointName (str) – Optional arbitrary statepoint name (e.g., “special” for “c00n00-special/”)

  • allowMissing (bool) – Whether to emit a warning, rather than crash if reading a database with undefined parameters. Default False.

Returns

root – The top-level object stored in the database; usually a Reactor.

Return type

ArmiObject

static _assignBlueprintsParams(blueprints, groupedComps)[source]
_compose(comps, cs, parent=None)[source]

Given a flat collection of all of the ArmiObjects in the model, reconstitute the hierarchy.

_writeParams(h5group, comps)[source]
static _addHomogenizedNumberDensityParams(blocks, h5group)[source]

Create on-the-fly block homog. number density params for XTVIEW viewing.

static _readParams(h5group, compTypeName, comps, allowMissing=False)[source]
getHistoryByLocation(comp: armi.reactor.composites.ArmiObject, params: Optional[List[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[str, Dict[Tuple[int, int], Any]][source]

Get the parameter histories at a specific location.

getHistoriesByLocation(comps: Sequence[armi.reactor.composites.ArmiObject], params: Optional[List[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[armi.reactor.composites.ArmiObject, Dict[str, Dict[Tuple[int, int], Any]]][source]

Get the parameter histories at specific locations.

This has a number of limitations, which should in practice not be too limiting:
  • The passed objects must have IndexLocations. This type of operation doesn’t make much sense otherwise.

  • The passed objects must exist in a hierarchy that leads to a Core object, which serves as an anchor that can fully define all index locations. This could possibly be made more general by extending grids, but that gets a little more complicated.

  • All requested objects must exist under the same anchor object, and at the same depth below it.

  • All requested objects must have the same type.

Parameters
  • comps (list of ArmiObject) – The components/composites that currently occupy the location that you want histories at. ArmiObjects are passed, rather than locations, because this makes it easier to figure out things related to layout.

  • params (List of str, optional) – The parameter names for the parameters that we want the history of. If None, all parameter history is given

  • timeSteps (List of (cycle, node) tuples, optional) – The time nodes that you want history for. If None, all available time nodes will be returned.

getHistory(comp: armi.reactor.composites.ArmiObject, params: Optional[Sequence[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[str, Dict[Tuple[int, int], Any]][source]

Get parameter history for a single ARMI Object.

Parameters
  • comp – An individual ArmiObject

  • params – parameters to gather

Returns

Dictionary of str/list pairs.

Return type

dict

getHistories(comps: Sequence[armi.reactor.composites.ArmiObject], params: Optional[Sequence[str]] = None, timeSteps: Optional[Sequence[Tuple[int, int]]] = None) → Dict[armi.reactor.composites.ArmiObject, Dict[str, Dict[Tuple[int, int], Any]]][source]

Get the parameter histories for a sequence of ARMI Objects.

This implementation is unaware of the state of the reactor outside of the database itself, and is therefore not usually what client code should be calling directly during normal ARMI operation. It only knows about historical data that have actually been written to the database. Usually one wants to be able to get historical, plus current data, for which the similar method on the DatabaseInterface may be more useful.

Parameters
  • comps – Something that is iterable multiple times

  • params – parameters to gather.

  • timeSteps – Selection of time nodes to get data for. If omitted, return full history

Returns

Dictionary ArmiObject (input): dict of str/list pairs containing ((cycle, node), value).

Return type

dict

armi.bookkeeping.db.database3._packLocations(locations: List[armi.reactor.grids.LocationBase], minorVersion: int = 4) → Tuple[List[str], List[Tuple[int, int, int]]][source]

Extract information from a location needed to write it to this DB.

Each locator has one locationType and up to N location-defining datums, where N is the number of entries in a possible multiindex, or just 1 for everything else.

Shrink grid locator names for storage efficiency.

Notes

Contains some conditionals to still load databases made before db version 3.3 which can be removed once no users care about those DBs anymore.

armi.bookkeeping.db.database3._packLocationsV1(locations: List[armi.reactor.grids.LocationBase]) → Tuple[List[str], List[Tuple[int, int, int]]][source]

Delete when reading v <= 3.2 DBs is no longer wanted.

armi.bookkeeping.db.database3._packLocationsV2(locations: List[armi.reactor.grids.LocationBase]) → Tuple[List[str], List[Tuple[int, int, int]]][source]

Location packing implementation for minor version 3. See release notes above.

armi.bookkeeping.db.database3._packLocationsV3(locations: List[armi.reactor.grids.LocationBase]) → Tuple[List[str], List[Tuple[int, int, int]]][source]

Location packing implementation for minor version 4. See release notes above.

armi.bookkeeping.db.database3._unpackLocations(locationTypes, locData, minorVersion: int = 4)[source]

Convert location data as read from DB back into data structure for building reactor model.

location and locationType will only have different lengths when multiindex locations are used.

armi.bookkeeping.db.database3._unpackLocationsV1(locationTypes, locData)[source]

Delete when reading v <= 3.2 DBs is no longer wanted.

armi.bookkeeping.db.database3._unpackLocationsV2(locationTypes, locData)[source]

Location unpacking implementation for minor version 3+. See release notes above.

class armi.bookkeeping.db.database3.Layout(version: Tuple[int, int], h5group=None, comp=None)[source]

Bases: object

The Layout class describes the hierarchical layout of the composite Reactor model in a flat representation.

A Layout is built up by starting at the root of a composite tree and recursively appending each node in the tree to the list of data. So for a typical Reactor model, the data will be ordered by depth-first search: [r, c, a1, a1b1, a1b1c1, a1b1c2, a1b2, a1b2c1, …, a2, …].

The layout is also responsible for storing Component attributes, like location, material, and temperatures (from blueprints), which aren’t stored as Parameters. Temperatures, specifically, are rather complicated beasts in ARMI, and more fundamental changes to how we deal with them may allow us to remove them from Layout.

Notes

As this format is liable to be consumed by other code, it is important to specify its structure so that code attempting to read/write Layouts can make safe assumptions. Below is a list of things to be aware of. More will be added as issues arise or things become more precise:

  • Elements in Layout are stored in depth-first order. This permits use of algorithms such as Pre-Order Tree Traversal for efficient traversal of regions of the model.

  • indexInData increases monotonically within each object type. This means that, for instance, the data for all HexBlock children of a given parent are stored contiguously within the HexBlock group, and will not be interleaved with data from the HexBlock children of any of the parent’s siblings.

  • Aside from the hierarchy itself, there is no guarantee what order objects are stored in the layout. “The Core” is not necessarily the first child of the Reactor, and is not guaranteed to use the zeroth grid.
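The depth-first ordering described above can be sketched with a small pre-order traversal. Node is a hypothetical stand-in for an ArmiObject, and flatten is an illustration only. It records just names and child counts, whereas the real Layout stores considerably more.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for an ArmiObject; not an ARMI class."""

    name: str
    children: List["Node"] = field(default_factory=list)


def flatten(root):
    """Return (names, num_children) in pre-order (depth-first) order."""
    names, num_children = [], []

    def visit(node):
        # Record the node before its children: pre-order traversal.
        names.append(node.name)
        num_children.append(len(node.children))
        for child in node.children:
            visit(child)

    visit(root)
    return names, num_children


a1 = Node("a1", [
    Node("a1b1", [Node("a1b1c1"), Node("a1b1c2")]),
    Node("a1b2", [Node("a1b2c1")]),
])
tree = Node("r", [Node("c", [a1, Node("a2")])])
names, num_children = flatten(tree)
print(names)
```

The flat list of names plus the per-object child counts is enough to rebuild the tree, which is what makes a flat on-disk representation workable.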

_createLayout(comp)[source]

Populate a hierarchical representation and group the reactor model items by type.

This is used when writing a reactor model to the database.

Notes

This is recursive.

See also

_readLayout()

does the opposite

_readLayout(h5group)[source]

Populate a hierarchical representation and group the reactor model items by type.

This is used when reading a reactor model from a database.

See also

_createLayout()

does the opposite

_initComps(caseTitle, bp)[source]
writeToDB(h5group)[source]
static computeAncestors(serialNum, numChildren, depth=1) → List[Optional[int]][source]

Return a list containing the serial number of the parent corresponding to each object at the given depth.

Depth in this case means how many layers to reach up to find the desired ancestor. A depth of 1 will yield the direct parent of each element, depth of 2 would yield the element’s parent’s parent, and so on.

The zero-th element will always be None, as the first object is the root element and so has no parent. Subsequent depths will result in more Nones.

This function is useful for forming a lightweight sense of how the database contents stitch together, without having to go to the trouble of fully unpacking the Reactor model.

Parameters
  • serialNum (List of int) – List of serial numbers for each object/element, as laid out in Layout

  • numChildren (List of int) – List of numbers of children for each object/element, as laid out in Layout

Note

This is not using a recursive approach for a couple of reasons. First, the iterative form isn’t so bad; we just need two stacks. Second, the interface of the recursive function would be pretty unwieldy. We are progressively consuming two lists, which we would need to either keep passing down with an index/cursor or progressively slice as we go, which would be pretty inefficient.
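One way to realize the two-stack idea is sketched below. This is an independent illustration written from the description above, not the ARMI source. It assumes the inputs are in depth-first order, and compute_ancestors is a hypothetical name.

```python
from typing import List, Optional


def compute_ancestors(serial_nums, num_children, depth=1):
    """Return the serial number of the ancestor `depth` levels up, or None."""
    ancestors: List[Optional[int]] = []
    open_serials: List[int] = []    # objects still waiting for children
    open_remaining: List[int] = []  # children not yet seen for each of them

    for serial, n_kids in zip(serial_nums, num_children):
        # The ancestor `depth` levels up sits `depth` entries from the top.
        if len(open_serials) >= depth:
            ancestors.append(open_serials[-depth])
        else:
            ancestors.append(None)
        # This object consumes one child slot of its direct parent.
        if open_remaining:
            open_remaining[-1] -= 1
        if n_kids > 0:
            open_serials.append(serial)
            open_remaining.append(n_kids)
        # Close every object whose children have all been visited.
        while open_remaining and open_remaining[-1] == 0:
            open_serials.pop()
            open_remaining.pop()
    return ancestors


# Layout order: r, c, a1, a1b1, a1b1c1, a1b1c2, a1b2, a1b2c1, a2
serials = [10, 11, 12, 13, 14, 15, 16, 17, 18]
kids = [1, 2, 2, 2, 0, 0, 1, 0, 0]
parents = compute_ancestors(serials, kids)
grandparents = compute_ancestors(serials, kids, depth=2)
```

Here parents is the direct-parent list and grandparents reaches two levels up, with None wherever no such ancestor exists.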

armi.bookkeeping.db.database3.allSubclasses(cls)[source]

This currently includes Materials… and it should not.

armi.bookkeeping.db.database3.packSpecialData(data: numpy.ndarray, paramName: str) → Tuple[Optional[numpy.ndarray], Dict[str, Any]][source]

Reduce data that wouldn’t otherwise play nicely with HDF5/numpy arrays to a format that will.

This is the main entry point for conforming “strange” data into something that will both fit into a numpy array/HDF5 dataset, and be recoverable to its original-ish state when reading it back in. This is accomplished by detecting a handful of known offenders and using various HDF5 attributes to store necessary auxiliary data. It is important to keep in mind that the data that is passed in has already been converted to a numpy array, so the top dimension is always representing the collection of composites that are storing the parameters. For instance, if we are dealing with a Block parameter, the first index in the numpy array of data is the block index; so if each block has a parameter that is a dictionary, data would be an ndarray, where each element is a dictionary.

This routine supports a number of different “strange” things:

  • Dict[str, float]: These are stored by finding the set of all keys for all instances, and storing those keys as a list in an attribute. The data themselves are stored as arrays indexed by object, then key index. Dictionaries lacking data for a key store a nan in its place. This will work well in instances where most objects have data for most keys.

  • Jagged arrays: These are stored by concatenating all of the data into a single, one-dimensional array, and storing attributes to describe the shapes of each object’s data, and an offset into the beginning of each object’s data.

  • Arrays with None in them: These are stored by replacing each instance of None with a magical value that shouldn’t be encountered in realistic scenarios.
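The dictionary and jagged-array strategies can be sketched with plain numpy. This is a minimal illustration of the ideas described above, not the ARMI implementation. All function names are hypothetical, and the returned keys, shapes and offsets stand in for what would be stored as HDF5 attributes.

```python
import numpy as np


def pack_dicts(dicts):
    """Store Dict[str, float] values as a 2-D array indexed by object, key."""
    keys = sorted(set().union(*dicts))
    data = np.array(
        [[d.get(k, np.nan) for k in keys] for d in dicts], dtype=float
    )
    return data, keys


def pack_jagged(arrays):
    """Concatenate per-object arrays; also return shapes and offsets."""
    shapes = [a.shape for a in arrays]
    sizes = [a.size for a in arrays]
    # Offset of each object's data within the flat array.
    offsets = np.concatenate(([0], np.cumsum(sizes)[:-1])).astype(int)
    flat = np.concatenate([a.ravel() for a in arrays])
    return flat, shapes, offsets


def unpack_jagged(flat, shapes, offsets):
    """Invert pack_jagged using the stored shapes and offsets."""
    out = []
    for shape, offset in zip(shapes, offsets):
        size = int(np.prod(shape))
        out.append(flat[offset:offset + size].reshape(shape))
    return out


# Made-up example values.
data, keys = pack_dicts([{"U235": 0.1, "U238": 0.9}, {"U238": 1.0}])
arrays = [
    np.array([1.0, 2.0, 3.0]),
    np.array([[4.0, 5.0], [6.0, 7.0]]),
    np.array([8.0]),
]
flat, shapes, offsets = pack_jagged(arrays)
restored = unpack_jagged(flat, shapes, offsets)
```

In the dictionary case, the object lacking a "U235" entry gets a nan in that column, as described in the first bullet.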

Parameters
  • data – An ndarray storing the data that we want to stuff into the database. These are usually dtype=Object, which is how we usually end up here in the first place.

  • paramName – The parameter name that we are trying to store data for. This is mostly used for diagnostics.

armi.bookkeeping.db.database3.unpackSpecialData(data: numpy.ndarray, attrs, paramName: str) → numpy.ndarray[source]

Extract data from a specially-formatted HDF5 dataset into a numpy array.

This should invert the operations performed by packSpecialData().

Parameters
  • data – Specially-formatted data array straight from the database.

  • attrs – The attributes associated with the dataset that contained the data.

  • paramName – The name of the parameter that is being unpacked. Only used for diagnostics.

Returns

An ndarray containing the closest possible representation of the data that was originally written to the database.

Return type

numpy.ndarray

armi.bookkeeping.db.database3.replaceNonsenseWithNones(data: numpy.ndarray, paramName: str) → numpy.ndarray[source]

Replace special nonsense values with None.

This essentially reverses the operations performed by replaceNonesWithNonsense().

Parameters
  • data – The array from the database that contains special None nonsense values.

  • paramName – The param name whose data we are dealing with. Only used for diagnostics.

armi.bookkeeping.db.database3.replaceNonesWithNonsense(data: numpy.ndarray, paramName: str, nones: numpy.ndarray = None) → numpy.ndarray[source]

Replace instances of None with nonsense values that can be detected/recovered when reading.

Parameters
  • data – The numpy array containing None values that need to be replaced.

  • paramName – The name of the parameter whose data we are treating. Only used for diagnostics.

  • nones – An array containing the index locations of the None elements. It is a little strange to pass these in, but we find these indices to determine whether we need to call this function in the first place, so we might as well pass them in and avoid performing the operation again.

Notes

This only supports situations where the data is a straight-up None, or a valid, database-storable numpy array (or something easily convertible to one, e.g. tuples/lists with numerical values). This does not support, for instance, a numpy ndarray with some Nones in it.

For example, the following is supported:

[[1, 2, 3], None, [7, 8, 9]]

However, the following is not:

[[1, 2, 3], [4, None, 6], [7, 8, 9]]

See also

replaceNonsenseWithNones()

Reverses this operation.
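The round trip for the supported case can be sketched as follows. The choice of an all-NaN row as the sentinel is an assumption made for this illustration, and the document does not state which nonsense values ARMI actually uses. Both function names are hypothetical.

```python
import numpy as np


def nones_to_sentinel(rows):
    """Replace each whole-entry None with an all-NaN row of matching shape."""
    # Use the first non-None entry to learn the shape of a row.
    template = next(np.asarray(r, dtype=float) for r in rows if r is not None)
    filler = np.full(template.shape, np.nan)
    return np.array(
        [filler if r is None else np.asarray(r, dtype=float) for r in rows]
    )


def sentinel_to_nones(packed):
    """Turn all-NaN rows back into None (assumed sentinel convention)."""
    return [None if np.isnan(row).all() else row.tolist() for row in packed]


packed = nones_to_sentinel([[1, 2, 3], None, [7, 8, 9]])
restored = sentinel_to_nones(packed)
print(restored)
```

With this particular sentinel, a row of genuine NaN data would be indistinguishable from None, which is why a sentinel should be a value that does not occur in realistic data.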

armi.bookkeeping.db.database3._writeAttrs(obj, group, attrs)[source]

Handle safely writing attributes to a dataset, handling large data if necessary.

This will attempt to store attributes directly onto an HDF5 object if possible, falling back to proper datasets and reference attributes if necessary. This is needed because HDF5 tries to fit attributes into the object header, which has limited space. If an attribute is too large, h5py raises a RuntimeError. In such cases, this will store the attribute data in a proper dataset and place a reference to that dataset in the attribute instead.

In practice, this takes linkedDims attrs from a particular component type (like c00n00/Circle/id) and stores them in new datasets (like c00n00/attrs/1_linkedDims, c00n00/attrs/2_linkedDims) and then sets the object’s attrs to links to those datasets.

armi.bookkeeping.db.database3._resolveAttrs(attrs, group)[source]

Reverse the action of _writeAttrs.

This reads actual attrs and looks for the real data in the datasets that the attrs were pointing to.
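The “@” reference strategy from the 3.2 changelog entry can be sketched with plain dictionaries standing in for HDF5 attributes and datasets. The size threshold is a made-up trigger for this illustration. According to the description above, the real code instead reacts to h5py failing to store an oversized attribute. Neither function is the ARMI implementation.

```python
def write_attrs(attrs, store, prefix, max_items=4):
    """Return inline attrs; oversized values are moved into `store`."""
    inline = {}
    for i, (key, value) in enumerate(attrs.items()):
        # Hypothetical size check standing in for the HDF5 header limit.
        if isinstance(value, (list, tuple)) and len(value) > max_items:
            path = f"{prefix}/attrs/{i}_{key}"
            store[path] = value
            inline[key] = "@" + path  # string reference, not an object ref
        else:
            inline[key] = value
    return inline


def resolve_attrs(inline, store):
    """Follow '@' references back to the data they point at."""
    resolved = {}
    for key, value in inline.items():
        if isinstance(value, str) and value.startswith("@"):
            resolved[key] = store[value[1:]]
        else:
            resolved[key] = value
    return resolved


store = {}  # stands in for datasets in the file
attrs = {"shape": [2, 2], "linkedDims": list(range(10))}
inline = write_attrs(attrs, store, "/c00n00")
print(inline["linkedDims"])
```

Because the reference is just a path string, copying a time node group into another file leaves it valid as long as the group keeps its name.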

armi.bookkeeping.db.database3.collectBlockNumberDensities(blocks) → Dict[str, numpy.ndarray][source]

Collect block-by-block homogenized number densities for each nuclide.

Long ago, composition was stored on block params. No longer; they are on the component numberDensity params. These block-level params are still useful to see compositions in some visualization tools. Rather than keep them on the reactor model, we dynamically compute them here and slap them in the database. These are ignored upon reading and will not affect the results.

Remove this once a better viz tool can view composition distributions. Also remove the try/except in _readParams