Kinetica.jl API
CRN Representation
Representing Chemical Species
Kinetica.SpeciesData
— TypeSpeciesData(smi_list, xyz_list[, unique_species=true])
SpeciesData(smi_list, xyz_list, level[, unique_species=true])
SpeciesData(xyz_file[, unique_species=true, fix_radicals=true])
SpeciesData(xyz_file, level[, unique_species=true, fix_radicals=true])
Bidirectional String-Int dictionary for chemical species.
Can either be constructed from an array of SMILES strings and their corresponding ExtXYZ frames, or from a single XYZ file with one or multiple species present. If unique_species=true
, will not include any duplicate species if present. If fix_radicals=true
in the XYZ file loading case, will attempt to tidy up radical SMILES with OBCR.
Both constuctors can optionally be given a level
argument, indicating the exploration level which a species (or set of species) was first discovered. If this is not provided, assumes species are entering the struct at level 1.
Contains fields for:
- SMILES string -> integer ID dictionary (
toInt
) - Integer ID -> SMILES string dictionary (
toStr
) - Number of species (
n
) - ExtXYZ structures of species (
xyz
) - Integer ID -> initial discovered level dictionary (
level_found
) - Dictionary of per-species cached values (
cache
)
Base.push!
— Methodpush!(sd::SpeciesData, smi::String, xyz::Dict{String, Any})
push!(sd::SpeciesData, smi::String, xyz::Dict{String, Any}, level::Int)
Add a species to SpeciesData
.
Optionally takes a specified exploration level (defaults to 1 if not provided). Does not account for smi
already existing within sd
. To ensure no overlap, use push_unique!
.
Base.push!
— Methodpush!(sd::SpeciesData, smis::Vector{String}, xyzs::Vector{Any})
push!(sd::SpeciesData, smis::Vector{String}, xyzs::Vector{Any}, level::Int)
Add an array of species to SpeciesData
.
Optionally takes a specified exploration level (defaults to 1 if not provided). Does not account for smi
already existing within sd
. To ensure no overlap, use push_unique!
.
Base.push!
— Methodpush!(sd::SpeciesData, xyz_file::String[, fix_radicals=true])
push!(sd::SpeciesData, xyz_file::String, level::Int[, fix_radicals=true])
Add all species in xyz_file
to sd
.
Optionally takes a specified exploration level (defaults to 1 if not provided). Does not account for smi
already existing within sd
. To ensure no overlap, use push_unique!
.
Kinetica.push_unique!
— Functionpush_unique!(sd::SpeciesData, smi::String, xyz::Dict{String, Any})
push_unique!(sd::SpeciesData, smi::String, xyz::Dict{String, Any}, level::Int)
Add a species SMILES to a SpeciesData
, as long as it does not already exist there.
Optionally takes a specified exploration level (defaults to 1 if not provided).
push_unique!(sd::SpeciesData, xyz_file::String[, fix_radicals=true])
push_unique!(sd::SpeciesData, xyz_file::String, level::Int[, fix_radicals=true])
Add species in xyz_file
to sd
, as long as they do not already exist there.
Optionally takes a specified exploration level (defaults to 1 if not provided).
push_unique!(sd::SpeciesData, smis::String, xyzs::Vector{Dict{String, Any}})
push_unique!(sd::SpeciesData, smis::String, xyzs::Vector{Dict{String, Any}}, level::Int)
Add an array of species to SpeciesData
, as long as each does not already exist there.
Optionally takes a specified exploration level (defaults to 1 if not provided).
Representing Reactions
Kinetica.RxData
— Typerd = RxData(sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}},
rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}},
dH::Vector{<:AbstractFloat}[, unique_rxns=true, max_molecularity=2])
rd = RxData(sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}},
rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}},
dH::Vector{<:AbstractFloat}, level::Int[, unique_rxns=true, max_molecularity=2])
Data container for reactions.
Constructor creates a new RxData
reaction data store from lists of reactant and product SMILES by cross-referencing species IDs with sd
. sd
should therefore already have all species input here loaded in.
reacs
and prods
should be the raw SMILES arrays from ingest_cde_run()
, i.e. without any duplicate species removed due to stoichiometry. This constructor will determine stoichiometry and output the unique form of each set of reactants/products.
Can optionally be given a level
argument, indicating the exploration level which a set of reactions was discovered. If this is not provided, assumes reactions are entering the struct at level 1.
By default, adds a maximum of 1 of each reaction type when unique_rxns = true
, and only admits reactions with a maximum molecularity of 2 (i.e. bimolecular reactions).
Contains fields for:
- Number of reactions encountered (
nr
) - Atom-mapped reaction SMILES for unambiguous linking of atom indices in reactants and products (
mapped_rxns
) - Unique IDs of reactants for each reaction (
id_reacs
) - Unique IDs of products for each reaction (
id_prods
) - Stoichiometries of reactants for each reaction (
stoic_reacs
) - Stoichiometries of products for each reaction (
stoic_prods
) - Reaction enthalpies (
dH
) - Reaction hashes, used for unique identification (
rhash
) - The exploration level on which reactions were first found (
level_found
)
Base.push!
— Methodpush!(rd::RxData, sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}},
rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}},
dH::Vector{<:AbstractFloat}[, unique_rxns=true, max_molecularity=2])
push!(rd::RxData, sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}},
rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}},
dH::Vector{<:AbstractFloat}, level::Int[, unique_rxns=true, max_molecularity=2])
Adds an array of reactions to rd
.
Extends an RxData
reaction data store from lists of reactant and product SMILES by cross-referencing species IDs with sd
. sd
should therefore already have all species input here loaded in.
reacs
and prods
should be the raw SMILES arrays from ingest_cde_run()
, i.e. without any duplicate species removed due to stoichiometry. This function will determine stoichiometry and output the unique form of each set of reactants/products.
Can optionally be given a level
argument, indicating the exploration level which a set of reactions was discovered. If this is not provided, assumes reactions are entering the struct at level 1.
By default, adds a maximum of 1 of each reaction type when unique_rxns = true
, and only admits reactions with a maximum molecularity of 2 (i.e. bimolecular reactions).
Base.splice!
— Methodsplice!(rd::RxData, rids::Vector{Int})
Removes reactions at indices rids
from rd
.
Kinetica.get_reverse_rhash
— Functionget_reverse_rhash(sd, rd, rid)
Returns the reverse reaction hash for the reaction at rid
in rd
.
Useful when needing to identify if a reverse reaction is already in a CRN without having to look through many species permutations.
CRN Initialisation
Kinetica.init_network
— Functioninit_network([iType=Int64, fType=Float64])
Initialises an empty reaction network.
Returns an empty SpeciesData{iType}
and RxData{iType, fType}.
CDE Interface
Kinetica.CDE
— TypeCDE(template_dir::String [, env_threads, cde_exec, sampling_seed, radius, nrxn, parallel_runs, parallel_exes, write_stdout, write_stderr, allow_errors])
CDE runner. Initialised through keyword-based struct, run through functor.
Struct contains fields for:
- CDE template directory (
template_dir
) - Environmental multithreading number of threads (Optional, defaults to 1 thread;
env_threads
) - Path to CDE executable (Optional, defaults to CDE packaged within CDEjll; `cdeexec`)
- Seed for CDE's RNG (Optional, setting to 0 indicates seed should be random;
sampling_seed
) - Radius for exploration of breakdown space (Optional, Default = 50;
radius
) - Number of mechanisms to generate within a single CDE run (Optional, Default = 1;
nrxn
) - Number of parallel CDE runs to execute (Optional, Default = 1;
parallel_runs
) - Maximum number of parallel CDE executables to run at any time (Optional, Default = 1;
parallel_exes
) - Whether to write CDE's stdout to file (Optional, Default =
false
;write_stdout
) - Whether to write CDE's stderr to file (Optional, Default =
false
;write_stderr
) - Whether to allow functions to continue running if CDE errors are detected (Optional, Default =
false
;allow_errors
)
Additionally, some fields are usually modified within Kinetica, and are not intended to be changed by users.
- Main reaction directory (
rdir
) - XYZ file of starting molecule/material (
init_xyz
)
Kinetica.ingest_cde_run
— Functioningest_cde_run(rdir::String, rcount[, fix_radicals=true])
Reads in the results from a CDE run.
Separates out fragment species from each available reaction's reactants and products, forming arrays of their SMILES strings and ExtXYZ geometries.
OBCanonicalRadicals can be enabled to tidy up radical SMILES using the fix_radicals
parameter.
Returns reac_smis, reac_xyzs, reac_systems, prod_smis, prod_xyzs, prod_systems, dH
, where:
reac_smis
andprod_smis
are arrays of the SMILES of each reaction's reactants and products;reac_xyzs
andprod_xyzs
are their corresponding geometries as ExtXYZ frames;reac_systems
andprod_systems
are the ExtXYZ frames of the systems of molecules that came out of CDE;dH
is an array of reaction energies.
Kinetica.import_mechanism
— Functionimport_mechanism(loc::ExploreLoc, rcount[, max_molecularity=2])
Create a CRN's initial SpeciesData
and RxData
from a CDE generated mechanism(s).
Reads in the results of a CDE run at the rcount
reaction directory specified by loc
. Returns a new SpeciesData
and RxData
containing the unique species and reactions within these results, provided these reactions do not exceed the maximum molecularity set by max_molecularity
, which defaults to only accepting unimolecular and bimolecular reactions.
Kinetica.import_mechanism!
— Functionimport_mechanism!(sd::SpeciesData, rd::RxData, loc::ExploreLoc, rcount[, max_molecularity=2])
Extend a CRN's SpeciesData
and RxData
from a CDE generated mechanism(s).
Reads in the results of a CDE run at the rcount
reaction directory specified by loc
. Extends sd
and rd
with the unique species and reactions within these results, provided these reactions do not exceed the maximum molecularity set by max_molecularity
, which defaults to only accepting unimolecular and bimolecular reactions.
Kinetica.import_network
— Functionimport_network(rdir_head::String)
Imports a network from a level tree within rdir_head
.
Recurses through level directories in rdir_head
, then recurses through subspace directories in each level. Within each subspace, imports all mechanisms within each CDE run.
Returns the resulting network (an instance of SpeciesData
and RxData
).
Exploration
Kinetica.ExploreLoc
— TypeExploreLoc(rdir_head::String, level::Int, subspace::Int)
Container for iterative exploration location data.
Struct contains fields for:
- Head directory of CRN ('rdir_head`)
- Current level index within head directory (
level
) - Current subspace index within level (
subspace
)
Base.pathof
— Methodpathof(loc::ExploreLoc[, to_level=false])
Returns the path of the current subspace in the current level in the head directory specified by loc
.
If to_level=true
, only returns the path to the current level, excluding the current subspace.
Kinetica.find_current_loc
— Functionfind_current_loc(rdir_head::String)
Finds the current exploration location in a partially explored CRN.
Returns an ExploreLoc
for the current exploration location.
Looks for level directories within rdir_head
. If none are found, points to the initial subspace of the initial level. Otherwise finds the latest level with a 'seeds.in' file. If subspaces are not found within this level, points to the initial subspace of this level. Otherwise finds the latest subspace without an 'isconv' convergence file, and points here. If all subspaces in this level are converged, emits a warning and points to the last subspace in this level.
Kinetica.DirectExplore
— TypeDirectExplore(rdir_head::String, reac_smiles::Vector{String}, cde::CDE[,
maxiters::Int=1000, rxn_convergence_threshold::Int=5,
modify_network_on_solve::Bool=true])
Keyword-based container for parameters used in direct CRN exploration.
Contains fields for:
- Top level of CRN exploration directory (
rdir_head
) - SMILES string(s) of main breakdown reactant(s) being studied (
reac_smiles
) CDE
instance (cde
)- Maximum number of iterations to perform (
maxiters
) - Number of iterations with no change in reactions to consider as converged (
rxn_convergence_threshold
) - Whether to allow CRN modification after solving (
modify_network_on_solve
)
Kinetica.IterativeExplore
— TypeIterativeExplore(rdir_head::String, reac_smiles::VEctor{String}, cde::CDE[, maxiters::Int=1000,
rxn_convergence_threshold::Int=5, seed_convergence_threshold::Int=3, seed_conc=0.05,
n_undirected_levels::Int=0, independent_blacklist::Vector{String}=[],
inert_species::Vector{String}=[], modify_network_on_solve::Bool=true])
Keyword-based container for parameters used in iterative kinetics-based CRN exploration.
Contains fields for:
- Top level of CRN exploration directory (
rdir_head
) - SMILES string(s) of main breakdown reactant(s) being studied (
reac_smiles
) CDE
instance (cde
)- Maximum number of iterations to perform (
maxiters
) - Number of subspace iterations with no change in reactions to consider a subspace converged (
rxn_convergence_threshold
) - Number of level iterations with no change in seeds to consider the network converged (
seed_convergence_threshold
) - Concentration above which species will be selected as seeds each level (
seed_conc
) - Number of undirected levels of exploration to perform at the start (
n_undirected_levels
) - Blacklist of species to avoid doing independent subspace explorations on (
independent_blacklist
) - Inert species that should not be considered for reaction (
inert_species
) - Whether to allow CRN modification after solving (
modify_network_on_solve
)
Kinetica.explore_network
— Functionexplore_network(exploremethod::DirectExplore, solvemethod[, savedir])
explore_network(exploremethod::IterativeExplore, solvemethod[, savedir])
Runs network exploration with one of the available methods.
If exploremethod isa DirectExplore
, runs a single-level network exploration to attempt to locate all relevant reactions in a radius of exploremethod.cde.radius
species from the starting system.
If exploremethod isa IterativeExplore
, runs a multi-level iterative network exploration using kinetic simulations to identify seed species for successive levels, in order to fully characterise the reaction space relevant to the conditions in solvemethod.conditions
.
Kinetica.load_past_seeds
— Functionload_past_seeds(loc::ExploreLoc)
Loads in SMILES of all seeds from previous levels.
Returns an array of these SMILES.
Kinetica.load_current_seeds
— Functionload_current_seeds(loc::ExploreLoc)
Loads in SMILES of all seeds from current level.
Returns an array of these SMILES.
Kinetica.identify_next_seeds
— Functionidentify_next_seeds(sol, sd::SpeciesData, seed_conc<:AbstractFloat[, elim_small_na::Int=0,
ignore::Vector{String}=[], saveto::Union{String, Nothing}=nothing])
identify_next_seeds(sol, sd::SpeciesData[, elim_small_na::Int=0, ignore::Vector{String}=[],
saveto::Union{String, Nothing}=nothing])
Selects seed species for the next level of network exploration.
Identifies species in sol
with a maximum concentration above seed_conc
and returns an array of the SMILES of these species. If seed_conc
is not provided, assumes all species in sd
should become seeds.
Species meeting selection criteria can be manually ignored by including their SMILES in the ignore
argument. Similarly, if species below a certain number of atoms are not desired as seeds (e.g. if they are too small to break down), this number of atoms can be set with elim_small_na
.
If saveto
is set to a file path, the selected seeds and their maximum concentrations are output to this file (usually generated automatically as a 'seeds.out' file in the current level of exploration).
Molecule System
Kinetica.system_from_smiles
— Functionsystem_from_smiles(smiles::String[, dmin::Float64=5.0, maxiters::Int=200])
system_from_smiles(smiles::String[, saveto::String, dmin::Float64=5.0, maxiters::Int=200])
Forms a single XYZ system out of the molecules in smiles
.
Useful for making unified molecular systems with no overlap for feeding into CDE. smiles
should be a single String
with individual species separated by '.'. dmin
represents the minimum molecule-molecule distance that should be allowed.
If the argument saveto
is provided, outputs the optimised system to a file at this path. If not, returns the optimised system as a single ExtXYZ dict.
Kinetica.system_from_mols
— Functionsystem_from_mols(mols::Vector{Dict{String, Any}[, dmin::Float64=5.0, maxiters::Int=200])
system_from_mols(mols::Vector{Dict{String, Any}[, saveto::String, dmin::Float64=5.0, maxiters::Int=200])
Forms a single XYZ system out of the molecules in mols
.
Useful for making unified molecular systems with no overlap for feeding into CDE. mols
should be a Vector
of ExtXYZ frames. dmin
represents the minimum molecule-molecule distance that should be allowed.
If the argument saveto
is provided, outputs the optimised system to a file at this path. If not, returns the optimised system as a single ExtXYZ dict.
Kinetica.molsys_opt
— Functionmolsys_opt(mols::Vector{Dict{String, Any}}, dmin::Float64, maxiters::Int)
Optimises positions of molecules in mols
to ensure they are all at least dmin
Angstroms apart.
Creates and solves an N-body spring-driven particle system and transforms molecular coordinates to these particles to check for proximity.
Returns a new translated Vector
of molecules when finished.