GaPFlow.Database

class GaPFlow.Database(md: Any, db: dict, num_extra_features: int = 1)

Bases: object

Container for GP training datasets.

Handles dataset initialization, normalization, data addition, and optional dtool integration for persistent dataset storage.

Parameters:
  • md (GaPFlow.md.MolecularDynamics) – An instance of the MD runner object. Adding a data point calls its run method.

  • db (dict) –

    Configuration dictionary with keys:

    • 'dtool_path' : str, path where training data is stored and loaded from.

    • 'init_size' : int, minimum dataset size.

    • 'init_width' : float, relative sampling width.

    • 'init_method' : str, name of the (quasi-)random initialization method (‘lhc’, ‘rand’, ‘sobol’).

    • 'init_seed' : int, random seed for initialization.

  • num_extra_features (int) – Number of additional features (besides solution, gap height, and gradients) stored with the database (default is 1).
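As an illustration, a configuration dictionary with the keys listed above could look like the following. The values and the check_db_config helper are hypothetical and not part of GaPFlow; only the key names, types, and the three init_method options come from the parameter list. Whether every key is mandatory is not stated here, so the helper simply assumes they all are.

```python
# Hypothetical example values; only the key names come from the parameter list above.
db_config = {
    "dtool_path": "data/train",  # where training data is stored and loaded from
    "init_size": 5,              # minimum dataset size
    "init_width": 0.01,          # relative sampling width
    "init_method": "lhc",        # one of 'lhc', 'rand', 'sobol'
    "init_seed": 42,             # random seed for initialization
}

EXPECTED_TYPES = {
    "dtool_path": str,
    "init_size": int,
    "init_width": float,
    "init_method": str,
    "init_seed": int,
}
ALLOWED_METHODS = ("lhc", "rand", "sobol")


def check_db_config(cfg: dict) -> list:
    """Return a list of problems found in cfg (empty if none). Illustrative helper only."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in cfg:
            problems.append(f"missing key: {key}")
        elif not isinstance(cfg[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    if cfg.get("init_method") not in ALLOWED_METHODS:
        problems.append("init_method should be one of " + ", ".join(ALLOWED_METHODS))
    return problems
```

Running check_db_config(db_config) on the dictionary above returns an empty list; a misspelled init_method would produce one entry.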

__init__(md: Any, db: dict, num_extra_features: int = 1) -> None

Methods

__init__(md, db[, num_extra_features])

add_data(Xnew)

Add new data entries to the database.

get_readme_list_local()

Get list of dtool README files for existing MD runs from a local directory.

get_readme_list_remote()

Get list of dtool README files for existing MD runs from a remote data server (via dtool_lookup_api).

initialize(Xtest[, dim])

Initialize database.

set_training_path(new_path[, check_temporary])

Set training path.

write()

Write the dataset arrays to disk (if the simulation output path is specified).

Attributes

X_scale

Normalization constants for input features.

X_shift

Normalization constants for input features.

Xtrain

Normalized input features of shape (Ntrain, Nfeat).

Xtrain_target

Normalized input features of shape (Ntrain, Nfeat).

Y_scale

Normalization constants for observations.

Y_shift

Normalization constants for observations.

Ytrain

Normalized observations of shape (Ntrain, 13).

Ytrain_err

Normalized observation error of shape (Ntrain, 13).

config

Configuration parameters of the database object.

has_mock_md

Flag that indicates whether the attached MD runner is a 'mock' object.

md_config

Configuration parameters of the attached MD runner object.

num_features

Number of possible features, actual ones are selected from GP's active_dims.

output_path

Simulation output path.

size

Number of training samples currently stored.

training_path

Local storage location of dtool datasets.

property X_scale: Float[Array, 'Ntrain Nfeat']

Normalization constants for input features.

property X_shift: Float[Array, 'Ntrain Nfeat']

Normalization constants for input features.

property Xtrain: Float[Array, 'Ntrain Nfeat']

Normalized input features of shape (Ntrain, Nfeat).

property Xtrain_target: Float[Array, 'Ntrain Nfeat']

Normalized input features of shape (Ntrain, Nfeat).

property Y_scale: Float[Array, 'Ntrain 13']

Normalization constants for observations.

property Y_shift: Float[Array, 'Ntrain 13']

Normalization constants for observations.

property Ytrain: Float[Array, 'Ntrain 13']

Normalized observations of shape (Ntrain, 13).

property Ytrain_err: Float[Array, 'Ntrain 13']

Normalized observation error of shape (Ntrain, 13).
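The exact normalization behind the shift and scale constants is not spelled out on this page. A common convention, assumed for this sketch, is a per-column affine transform (x - shift) / scale. The function names below are hypothetical, the choice of mean and maximum absolute deviation is an assumption, and plain Python lists stand in for JAX arrays.

```python
def fit_shift_scale(rows):
    """Per-column shift (mean) and scale (max absolute deviation, 1.0 if zero)."""
    ncols = len(rows[0])
    shift = [sum(r[j] for r in rows) / len(rows) for j in range(ncols)]
    scale = []
    for j in range(ncols):
        dev = max(abs(r[j] - shift[j]) for r in rows)
        scale.append(dev if dev > 0 else 1.0)  # avoid division by zero
    return shift, scale


def normalize(rows, shift, scale):
    """Apply (x - shift) / scale column by column."""
    return [[(x - s) / c for x, s, c in zip(r, shift, scale)] for r in rows]


def denormalize(rows, shift, scale):
    """Invert normalize: x * scale + shift."""
    return [[x * c + s for x, s, c in zip(r, shift, scale)] for r in rows]
```

Keeping shift and scale alongside the normalized arrays, as the properties above suggest, makes it possible to map stored data back to physical units.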

add_data(Xnew: Float[Array, 'Ntrain Nfeat']) -> None

Add new data entries to the database.

Parameters:

Xnew (jax.Array) – New samples of shape (Nnew, Nfeat).
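The constructor documentation notes that adding a data point calls the MD runner's run method. The sketch below mimics that flow with hypothetical stand-ins (MockMD, TinyDatabase). The real run signature and return value are not documented on this page and are assumed here.

```python
class MockMD:
    """Stand-in MD runner: returns a fake observation and error for one input row."""

    def __init__(self):
        self.calls = 0

    def run(self, x):
        # Assumed interface: one input row in, (observation, error) out.
        self.calls += 1
        y = [sum(x)]    # placeholder observation
        y_err = [0.0]   # placeholder observation error
        return y, y_err


class TinyDatabase:
    """Minimal container: one MD run per added sample (assumed behaviour)."""

    def __init__(self, md):
        self.md = md
        self.X, self.Y, self.Y_err = [], [], []

    @property
    def size(self):
        return len(self.X)

    def add_data(self, Xnew):
        for x in Xnew:
            y, y_err = self.md.run(x)
            self.X.append(list(x))
            self.Y.append(y)
            self.Y_err.append(y_err)


db = TinyDatabase(MockMD())
db.add_data([[1.0, 2.0], [3.0, 4.0]])  # two rows, so two MD runs
```

Because each new row triggers a run, the cost of add_data in this sketch scales with the number of rows in Xnew.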

property config: dict

Configuration parameters of the database object.

get_readme_list_local()

Get list of dtool README files for existing MD runs from a local directory.

Returns:

List of dicts containing the readme content

Return type:

list
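A rough sketch of what collecting README files from a local directory could involve, assuming each dtool dataset is a subdirectory of the base path with a README.yml at its top level. The helper name and the returned dict layout are hypothetical. The documented method returns the README content as dicts, which would need a YAML parser; this stdlib-only sketch keeps the raw text instead.

```python
from pathlib import Path


def list_local_readmes(base_path):
    """Return one dict per dataset directory under base_path that has a README.yml."""
    readmes = []
    base = Path(base_path)
    if not base.is_dir():
        return readmes  # nothing stored yet
    for dataset_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        readme = dataset_dir / "README.yml"
        if readme.is_file():
            readmes.append({"dataset": dataset_dir.name, "text": readme.read_text()})
    return readmes
```

Directories without a README.yml are skipped, and a missing base path yields an empty list rather than an error.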

get_readme_list_remote()

Get list of dtool README files for existing MD runs from a remote data server (via dtool_lookup_api).

In the future, one should be able to pass a valid MongoDB query string to select data.

Returns:

List of dicts containing the readme content

Return type:

list

property has_mock_md: bool

Flag that indicates whether the attached MD runner is a ‘mock’ object.

initialize(Xtest: Float[Array, 'Ntrain Nfeat'], dim: int = 1) -> Float[Array, 'Ntrain Nfeat']

Initialize database.

Parameters:
  • Xtest (jax.Array) – Candidate test points of shape (n_test, 6).

  • dim (int) – Dimension of the fluid problem (either 1 or 2, defaults to 1).
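How initialize selects points is not described on this page. As one illustration of the 'lhc' option of init_method, the sketch below draws a seeded Latin hypercube sample in the unit cube using only the standard library. The function name and the mapping of init_size and init_seed onto its arguments are assumptions, not the library's implementation.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Latin hypercube sample in [0, 1)^n_dims: one point per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each of the n_samples equal-width strata.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)  # decouple the strata across dimensions
        columns.append(col)
    # Transpose columns into a list of sample points.
    return [list(point) for point in zip(*columns)]


# e.g. init_size = 5 samples, init_seed = 1 (hypothetical mapping)
points = latin_hypercube(5, 3, seed=1)
```

Each coordinate axis ends up with exactly one sample per stratum, which spreads a small initial dataset more evenly than independent uniform draws.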

property md_config: dict

Configuration parameters of the attached MD runner object.

property num_features: int

Number of possible features, actual ones are selected from GP’s active_dims.

property output_path: str

Simulation output path.

set_training_path(new_path: str, check_temporary: bool = False) -> None

Set training path.

This modifies the storage location of dtool basepaths, also for the attached MD runner object.

Parameters:

new_path (str) – Training path

property size: int

Number of training samples currently stored.

property training_path: str

Local storage location of dtool datasets.

write() -> None

Write the dataset arrays to disk (if the simulation output path is specified).