GaPFlow.Database#
- class GaPFlow.Database(md: Any, db: dict, num_extra_features: int = 1)#
Bases: object
Container for GP training datasets.
Handles dataset initialization, normalization, data addition, and optional dtool integration for persistent dataset storage.
- Parameters:
md (GaPFlow.md.MolecularDynamics) – An instance of the MD runner object. Adding a data point will lead to calling its run method.
db (dict) –
Configuration dictionary with keys:
'dtool_path': str, path where training data is stored and loaded from.
'init_size': int, minimum dataset size.
'init_width': float, relative sampling width.
'init_method': str, name of the (quasi-)random initialization method (‘lhc’, ‘rand’, ‘sobol’).
'init_seed': int, random seed for initialization.
num_extra_features (int) – Number of additional features (next to solution, gap height + gradients) stored with the database (default is 1).
- __init__(md: Any, db: dict, num_extra_features: int = 1) None#
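As a minimal sketch of the db argument, the dictionary below uses the key names from the parameter list above. All values are illustrative, and the helper check_db_config is hypothetical (not part of GaPFlow). Treating every key as mandatory is also an assumption.

```python
# Illustrative values only; the key names come from the parameter list above.
db = {
    "dtool_path": "data/train",  # hypothetical directory for training data
    "init_size": 5,              # minimum dataset size
    "init_width": 0.01,          # relative sampling width
    "init_method": "lhc",        # one of 'lhc', 'rand', 'sobol'
    "init_seed": 42,             # random seed for initialization
}

REQUIRED_KEYS = {"dtool_path", "init_size", "init_width", "init_method", "init_seed"}
ALLOWED_METHODS = {"lhc", "rand", "sobol"}


def check_db_config(cfg: dict) -> None:
    """Hypothetical helper (not part of GaPFlow): basic sanity checks."""
    missing = REQUIRED_KEYS - cfg.keys()
    if missing:
        raise KeyError(f"missing config keys: {sorted(missing)}")
    if cfg["init_method"] not in ALLOWED_METHODS:
        raise ValueError(f"unknown init_method: {cfg['init_method']!r}")


check_db_config(db)

# With a GaPFlow.md.MolecularDynamics instance `md` available, the documented
# constructor call would then be:
#   database = GaPFlow.Database(md, db, num_extra_features=1)
```

Checking the dictionary before constructing the database catches typos in key names early, before any MD run is triggered.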
Methods
__init__(md, db[, num_extra_features])
add_data(Xnew) – Add new data entries to the database.
get_readme_list_local() – Get list of dtool README files for existing MD runs from a local directory.
get_readme_list_remote() – Get list of dtool README files for existing MD runs from a remote data server (via dtool_lookup_api)
initialize(Xtest[, dim]) – Initialize database.
set_training_path(new_path[, check_temporary]) – Set training path.
write() – Write the dataset arrays to disk (if the simulation output path is specified).
Attributes
X_scale – Normalization constants for input features.
X_shift – Normalization constants for input features.
Xtrain – Normalized input features of shape (Ntrain, Nfeat).
Xtrain_target – Normalized input features of shape (Ntrain, Nfeat).
Y_scale – Normalization constants for observations
Y_shift – Normalization constants for observations
Ytrain – Normalized observations of shape (Ntrain, 13).
Ytrain_err – Normalized observation error of shape (Ntrain, 13).
config – Configuration parameters of the database object.
has_mock_md – Flag that indicates whether the attached MD runner is a 'mock' object.
md_config – Configuration parameters of the attached MD runner object.
num_features – Number of possible features, actual ones are selected from GP's active_dims.
output_path – Simulation output path
size – Number of training samples currently stored.
training_path – Local storage location of dtool datasets.
- property X_scale: Float[Array, 'Ntrain Nfeat']#
Normalization constants for input features.
- property X_shift: Float[Array, 'Ntrain Nfeat']#
Normalization constants for input features.
- property Xtrain: Float[Array, 'Ntrain Nfeat']#
Normalized input features of shape (Ntrain, Nfeat).
- property Xtrain_target: Float[Array, 'Ntrain Nfeat']#
Normalized input features of shape (Ntrain, Nfeat).
- property Y_scale: Float[Array, 'Ntrain 13']#
Normalization constants for observations
- property Y_shift: Float[Array, 'Ntrain 13']#
Normalization constants for observations
- property Ytrain: Float[Array, 'Ntrain 13']#
Normalized observations of shape (Ntrain, 13).
- property Ytrain_err: Float[Array, 'Ntrain 13']#
Normalized observation error of shape (Ntrain, 13).
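The properties above do not state how the shift and scale constants are computed or applied. The sketch below shows one common convention, an affine map x_norm = (x - shift) / scale with shift taken as the per-feature minimum and scale as the per-feature range. This convention and the function names are assumptions for illustration, not the GaPFlow implementation.

```python
def fit_shift_scale(rows):
    """Per-feature shift (column minimum) and scale (column range).

    Assumed convention for illustration only; constant columns get scale 1.0
    to avoid division by zero.
    """
    cols = list(zip(*rows))
    shift = [min(c) for c in cols]
    scale = [(max(c) - min(c)) or 1.0 for c in cols]
    return shift, scale


def normalize(rows, shift, scale):
    """Apply x_norm = (x - shift) / scale feature by feature."""
    return [[(x - s) / k for x, s, k in zip(row, shift, scale)] for row in rows]


def denormalize(rows, shift, scale):
    """Invert the affine map: x = x_norm * scale + shift."""
    return [[x * k + s for x, s, k in zip(row, shift, scale)] for row in rows]


X = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
shift, scale = fit_shift_scale(X)
Xn = normalize(X, shift, scale)
print(Xn)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

Storing shift and scale next to the normalized arrays makes it possible to map predictions back to physical units with the inverse transform.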
- add_data(Xnew: Float[Array, 'Ntrain Nfeat']) None#
Add new data entries to the database.
- Parameters:
Xnew (jax.Array) – New samples of shape (Nnew, Nfeat).
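According to the class description, adding a data point leads to a call of the MD runner's run method. The toy classes below sketch that control flow only. MockMD, TinyDatabase, and the assumed signature run(x) are hypothetical stand-ins, and the fake observation has nothing to do with real MD output.

```python
class MockMD:
    """Hypothetical stand-in for an MD runner; only `run` is assumed."""

    def __init__(self):
        self.calls = 0

    def run(self, x):
        # A real runner would launch a simulation; here we return a fake
        # one-component observation so the flow can be followed.
        self.calls += 1
        return [sum(x)]


class TinyDatabase:
    """Toy container, not the GaPFlow implementation."""

    def __init__(self, md):
        self.md = md
        self.X = []
        self.Y = []

    @property
    def size(self):
        # Number of samples currently stored.
        return len(self.X)

    def add_data(self, Xnew):
        # One runner call per new sample; inputs and observations are
        # appended together so both lists stay aligned.
        for x in Xnew:
            y = self.md.run(x)
            self.X.append(list(x))
            self.Y.append(y)


md = MockMD()
database = TinyDatabase(md)
database.add_data([[1.0, 2.0], [3.0, 4.0]])
print(database.size, md.calls)  # 2 2
```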
- property config: dict#
Configuration parameters of the database object.
- get_readme_list_local()#
Get list of dtool README files for existing MD runs from a local directory.
- Returns:
List of dicts containing the readme content
- Return type:
list
- get_readme_list_remote()#
Get list of dtool README files for existing MD runs from a remote data server (via dtool_lookup_api).
In the future, one should be able to pass a valid MongoDB query string to select data.
- Returns:
List of dicts containing the readme content
- Return type:
list
- property has_mock_md: bool#
Flag that indicates whether the attached MD runner is a ‘mock’ object.
- initialize(Xtest: Float[Array, 'Ntrain Nfeat'], dim: int = 1) Float[Array, 'Ntrain Nfeat']#
Initialize database.
- Parameters:
Xtest (jax.Array) – Candidate test points of shape (n_test, 6).
dim (int) – Dimension of the fluid problem (either 1 or 2, defaults to 1)
- property md_config: dict#
Configuration parameters of the attached MD runner object.
- property num_features: int#
Number of possible features, actual ones are selected from GP’s active_dims.
- property output_path: str#
Simulation output path
- set_training_path(new_path: str, check_temporary: bool = False) None#
Set training path.
This modifies the storage location of dtool basepaths, also for the attached MD runner object.
- Parameters:
new_path (str) – Training path
- property size: int#
Number of training samples currently stored.
- property training_path: str#
Local storage location of dtool datasets.
- write() None#
Write the dataset arrays to disk (if the simulation output path is specified).