Kinetica.jl API

CRN Representation

Representing Chemical Species

Kinetica.SpeciesData — Type

SpeciesData(smi_list, xyz_list[, unique_species=true])
SpeciesData(smi_list, xyz_list, level[, unique_species=true])
SpeciesData(xyz_file[, unique_species=true, fix_radicals=true])
SpeciesData(xyz_file, level[, unique_species=true, fix_radicals=true])

Bidirectional String-Int dictionary for chemical species.

Can either be constructed from an array of SMILES strings and their corresponding ExtXYZ frames, or from a single XYZ file with one or multiple species present. If unique_species=true, will not include any duplicate species if present. If fix_radicals=true in the XYZ file loading case, will attempt to tidy up radical SMILES with OBCR.

Both constuctors can optionally be given a level argument, indicating the exploration level which a species (or set of species) was first discovered. If this is not provided, assumes species are entering the struct at level 1.

Contains fields for:

SMILES string -> integer ID dictionary (toInt)
Integer ID -> SMILES string dictionary (toStr)
Number of species (n)
ExtXYZ structures of species (xyz)
Integer ID -> initial discovered level dictionary (level_found)
Dictionary of per-species cached values (cache)

Base.push! — Method

push!(sd::SpeciesData, smi::String, xyz::Dict{String, Any})
push!(sd::SpeciesData, smi::String, xyz::Dict{String, Any}, level::Int)

Add a species to SpeciesData.

Optionally takes a specified exploration level (defaults to 1 if not provided). Does not account for smi already existing within sd. To ensure no overlap, use push_unique!.

Base.push! — Method

push!(sd::SpeciesData, smis::Vector{String}, xyzs::Vector{Any})
push!(sd::SpeciesData, smis::Vector{String}, xyzs::Vector{Any}, level::Int)

Add an array of species to SpeciesData.

Optionally takes a specified exploration level (defaults to 1 if not provided). Does not account for smi already existing within sd. To ensure no overlap, use push_unique!.

Base.push! — Method

push!(sd::SpeciesData, xyz_file::String[, fix_radicals=true])
push!(sd::SpeciesData, xyz_file::String, level::Int[, fix_radicals=true])

Add all species in xyz_file to sd.

Optionally takes a specified exploration level (defaults to 1 if not provided). Does not account for smi already existing within sd. To ensure no overlap, use push_unique!.

Kinetica.push_unique! — Function

push_unique!(sd::SpeciesData, smi::String, xyz::Dict{String, Any})
push_unique!(sd::SpeciesData, smi::String, xyz::Dict{String, Any}, level::Int)

Add a species SMILES to a SpeciesData, as long as it does not already exist there.

Optionally takes a specified exploration level (defaults to 1 if not provided).

push_unique!(sd::SpeciesData, xyz_file::String[, fix_radicals=true])
push_unique!(sd::SpeciesData, xyz_file::String, level::Int[, fix_radicals=true])

Add species in xyz_file to sd, as long as they do not already exist there.

Optionally takes a specified exploration level (defaults to 1 if not provided).

push_unique!(sd::SpeciesData, smis::String, xyzs::Vector{Dict{String, Any}})
push_unique!(sd::SpeciesData, smis::String, xyzs::Vector{Dict{String, Any}}, level::Int)

Add an array of species to SpeciesData, as long as each does not already exist there.

Optionally takes a specified exploration level (defaults to 1 if not provided).

Representing Reactions

Kinetica.RxData — Type

rd = RxData(sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}}, 
            rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}}, 
            dH::Vector{<:AbstractFloat}[, unique_rxns=true, max_molecularity=2])
rd = RxData(sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}}, 
            rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}}, 
            dH::Vector{<:AbstractFloat}, level::Int[, unique_rxns=true, max_molecularity=2])

Data container for reactions.

Constructor creates a new RxData reaction data store from lists of reactant and product SMILES by cross-referencing species IDs with sd. sd should therefore already have all species input here loaded in.

reacs and prods should be the raw SMILES arrays from ingest_cde_run(), i.e. without any duplicate species removed due to stoichiometry. This constructor will determine stoichiometry and output the unique form of each set of reactants/products.

Can optionally be given a level argument, indicating the exploration level which a set of reactions was discovered. If this is not provided, assumes reactions are entering the struct at level 1.

By default, adds a maximum of 1 of each reaction type when unique_rxns = true, and only admits reactions with a maximum molecularity of 2 (i.e. bimolecular reactions).

Contains fields for:

Number of reactions encountered (nr)
Atom-mapped reaction SMILES for unambiguous linking of atom indices in reactants and products (mapped_rxns)
Unique IDs of reactants for each reaction (id_reacs)
Unique IDs of products for each reaction (id_prods)
Stoichiometries of reactants for each reaction (stoic_reacs)
Stoichiometries of products for each reaction (stoic_prods)
Reaction enthalpies (dH)
Reaction hashes, used for unique identification (rhash)
The exploration level on which reactions were first found (level_found)

Base.push! — Method

push!(rd::RxData, sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}}, 
      rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}}, 
      dH::Vector{<:AbstractFloat}[, unique_rxns=true, max_molecularity=2])
push!(rd::RxData, sd::SpeciesData, reacs::Vector{Vector{String}}, prods::Vector{Vector{String}}, 
      rsys::Vector{Dict{String, Any}}, psys::Vector{Dict{String, Any}}, 
      dH::Vector{<:AbstractFloat}, level::Int[, unique_rxns=true, max_molecularity=2])

Adds an array of reactions to rd.

Extends an RxData reaction data store from lists of reactant and product SMILES by cross-referencing species IDs with sd. sd should therefore already have all species input here loaded in.

reacs and prods should be the raw SMILES arrays from ingest_cde_run(), i.e. without any duplicate species removed due to stoichiometry. This function will determine stoichiometry and output the unique form of each set of reactants/products.

Can optionally be given a level argument, indicating the exploration level which a set of reactions was discovered. If this is not provided, assumes reactions are entering the struct at level 1.

By default, adds a maximum of 1 of each reaction type when unique_rxns = true, and only admits reactions with a maximum molecularity of 2 (i.e. bimolecular reactions).

Base.splice! — Method

splice!(rd::RxData, rids::Vector{Int})

Removes reactions at indices rids from rd.

Kinetica.get_rhash — Function

get_rhash(sd::SpeciesData, rd::RxData, rid)

Returns the reaction hash for the reaction at rid in rd.

Kinetica.get_reverse_rhash — Function

get_reverse_rhash(sd::SpeciesData, rd::RxData, rid)

Returns the reverse reaction hash for the reaction at rid in rd.

Useful when needing to identify if a reverse reaction is already in a CRN without having to look through many species permutations.

CRN Initialisation

Kinetica.init_network — Function

init_network([iType=Int64, fType=Float64])

Initialises an empty reaction network.

Returns an empty SpeciesData{iType} and RxData{iType, fType}.

CDE Interface

Kinetica.CDE — Type

CDE(template_dir::String [, env_threads, cde_exec, sampling_seed, radius, nrxn, parallel_runs, parallel_exes, write_stdout, write_stderr, allow_errors])

CDE runner. Initialised through keyword-based struct, run through functor.

Struct contains fields for:

CDE template directory (template_dir)
Environmental multithreading number of threads (Optional, defaults to 1 thread; env_threads)
Path to CDE executable (Optional, defaults to CDE packaged within CDEjll; `cdeexec`)
Seed for CDE's RNG (Optional, setting to 0 indicates seed should be random; sampling_seed)
Radius for exploration of breakdown space (Optional, Default = 50; radius)
Number of mechanisms to generate within a single CDE run (Optional, Default = 1; nrxn)
Number of parallel CDE runs to execute (Optional, Default = 1; parallel_runs)
Maximum number of parallel CDE executables to run at any time (Optional, Default = 1; parallel_exes)
Whether to write CDE's stdout to file (Optional, Default = false; write_stdout)
Whether to write CDE's stderr to file (Optional, Default = false; write_stderr)
Whether to allow functions to continue running if CDE errors are detected (Optional, Default = false; allow_errors)

Additionally, some fields are usually modified within Kinetica, and are not intended to be changed by users.

Main reaction directory (rdir)
XYZ file of starting molecule/material (init_xyz)

Kinetica.ingest_cde_run — Function

ingest_cde_run(rdir::String, rcount[, fix_radicals=true, duplicate_reverse=true])

Reads in the results from a CDE run.

Separates out fragment species from each available reaction's reactants and products, forming arrays of their SMILES strings and ExtXYZ geometries.

OBCanonicalRadicals can be enabled to tidy up radical SMILES using the fix_radicals parameter. If duplicate_reverse=true, the reverse reactions are added to the output arrays.

Returns reac_smis, reac_xyzs, reac_systems, prod_smis, prod_xyzs, prod_systems, dH, where:

reac_smis and prod_smis are arrays of the SMILES of each reaction's reactants and products;
reac_xyzs and prod_xyzs are their corresponding geometries as ExtXYZ frames;
reac_systems and prod_systems are the ExtXYZ frames of the systems of molecules that came out of CDE;
dH is an array of reaction energies.

Kinetica.import_mechanism — Function

import_mechanism(loc::ExploreLoc, rcount[, max_molecularity=2, duplicate_reverse=true, unique_rxns=true])

Create a CRN's initial SpeciesData and RxData from a CDE generated mechanism(s).

Reads in the results of a CDE run at the rcount reaction directory specified by loc. Returns a new SpeciesData and RxData containing the unique species and reactions within these results, provided these reactions do not exceed the maximum molecularity set by max_molecularity, which defaults to only accepting unimolecular and bimolecular reactions.

duplicate_reverse specifies whether to add the reverse reactions of each of the imported reactions. unique_rxns specifies whether to only add unique reactions to RxData.

Kinetica.import_mechanism! — Function

import_mechanism!(sd::SpeciesData, rd::RxData, loc::ExploreLoc, rcount[, max_molecularity=2, duplicate_reverse=true, unique_rxns=true])

Extend a CRN's SpeciesData and RxData from a CDE generated mechanism(s).

Reads in the results of a CDE run at the rcount reaction directory specified by loc. Extends sd and rd with the unique species and reactions within these results, provided these reactions do not exceed the maximum molecularity set by max_molecularity, which defaults to only accepting unimolecular and bimolecular reactions.

duplicate_reverse specifies whether to add the reverse reactions of each of the imported reactions. unique_rxns specifies whether to only add unique reactions to RxData.

Kinetica.import_network — Function

import_network(rdir_head::String)

Imports a network from a level tree within rdir_head.

Recurses through level directories in rdir_head, then recurses through subspace directories in each level. Within each subspace, imports all mechanisms within each CDE run.

Returns the resulting network (an instance of SpeciesData and RxData).

Exploration

Kinetica.ExploreLoc — Type

ExploreLoc(rdir_head::String, level::Int, subspace::Int)

Container for iterative exploration location data.

Struct contains fields for:

Head directory of CRN ('rdir_head`)
Current level index within head directory (level)
Current subspace index within level (subspace)

Base.pathof — Method

pathof(loc::ExploreLoc[, to_level=false])

Returns the path of the current subspace in the current level in the head directory specified by loc.

If to_level=true, only returns the path to the current level, excluding the current subspace.

Kinetica.find_current_loc — Function

find_current_loc(rdir_head::String)

Finds the current exploration location in a partially explored CRN.

Returns an ExploreLoc for the current exploration location.

Looks for level directories within rdir_head. If none are found, points to the initial subspace of the initial level. Otherwise finds the latest level with a 'seeds.in' file. If subspaces are not found within this level, points to the initial subspace of this level. Otherwise finds the latest subspace without an 'isconv' convergence file, and points here. If all subspaces in this level are converged, emits a warning and points to the last subspace in this level.

Kinetica.DirectExplore — Type

DirectExplore(rdir_head::String, reac_smiles::Vector{String}, cde::CDE[, 
              maxiters::Int=1000, rxn_convergence_threshold::Int=5,
              modify_network_on_solve::Bool=true])

Keyword-based container for parameters used in direct CRN exploration.

Contains fields for:

Top level of CRN exploration directory (rdir_head)
SMILES string(s) of main breakdown reactant(s) being studied (reac_smiles)
CDE instance (cde)
Maximum number of iterations to perform (maxiters)
Number of iterations with no change in reactions to consider as converged (rxn_convergence_threshold)
Whether to allow CRN modification after solving (modify_network_on_solve)

Kinetica.IterativeExplore — Type

IterativeExplore(rdir_head::String, reac_smiles::VEctor{String}, cde::CDE[, maxiters::Int=1000, 
                 rxn_convergence_threshold::Int=5, seed_convergence_threshold::Int=3, seed_conc=0.05,
                 n_undirected_levels::Int=0, independent_blacklist::Vector{String}=[], 
                 inert_species::Vector{String}=[], modify_network_on_solve::Bool=true])

Keyword-based container for parameters used in iterative kinetics-based CRN exploration.

Contains fields for:

Top level of CRN exploration directory (rdir_head)
SMILES string(s) of main breakdown reactant(s) being studied (reac_smiles)
CDE instance (cde)
Maximum number of iterations to perform (maxiters)
Number of subspace iterations with no change in reactions to consider a subspace converged (rxn_convergence_threshold)
Number of level iterations with no change in seeds to consider the network converged (seed_convergence_threshold)
Concentration above which species will be selected as seeds each level (seed_conc)
Number of undirected levels of exploration to perform at the start (n_undirected_levels)
Blacklist of species to avoid doing independent subspace explorations on (independent_blacklist)
Inert species that should not be considered for reaction (inert_species)
Whether to allow CRN modification after solving (modify_network_on_solve)

Kinetica.explore_network — Function

explore_network(exploremethod::DirectExplore, solvemethod[, savedir])
explore_network(exploremethod::IterativeExplore, solvemethod[, savedir])

Runs network exploration with one of the available methods.

If exploremethod isa DirectExplore, runs a single-level network exploration to attempt to locate all relevant reactions in a radius of exploremethod.cde.radius species from the starting system.

If exploremethod isa IterativeExplore, runs a multi-level iterative network exploration using kinetic simulations to identify seed species for successive levels, in order to fully characterise the reaction space relevant to the conditions in solvemethod.conditions.

Kinetica.load_past_seeds — Function

load_past_seeds(loc::ExploreLoc)

Loads in SMILES of all seeds from previous levels.

Returns an array of these SMILES.

Kinetica.load_current_seeds — Function

load_current_seeds(loc::ExploreLoc)

Loads in SMILES of all seeds from current level.

Returns an array of these SMILES.

Kinetica.identify_next_seeds — Function

identify_next_seeds(sol, sd::SpeciesData, seed_conc<:AbstractFloat[, elim_small_na::Int=0, 
                    ignore::Vector{String}=[], saveto::Union{String, Nothing}=nothing])
identify_next_seeds(sol, sd::SpeciesData[, elim_small_na::Int=0, ignore::Vector{String}=[],
                    saveto::Union{String, Nothing}=nothing])

Selects seed species for the next level of network exploration.

Identifies species in sol with a maximum concentration above seed_conc and returns an array of the SMILES of these species. If seed_conc is not provided, assumes all species in sd should become seeds.

Species meeting selection criteria can be manually ignored by including their SMILES in the ignore argument. Similarly, if species below a certain number of atoms are not desired as seeds (e.g. if they are too small to break down), this number of atoms can be set with elim_small_na.

If saveto is set to a file path, the selected seeds and their maximum concentrations are output to this file (usually generated automatically as a 'seeds.out' file in the current level of exploration).

Molecule System

Kinetica.system_from_smiles — Function

system_from_smiles(smiles::Vector{String}[, dmin::Float64=5.0, maxiters::Int=200])
system_from_smiles(smiles::Vector{String}[, saveto::String, dmin::Float64=5.0, maxiters::Int=200])

Forms a single XYZ system out of the molecules in smiles.

Useful for making unified molecular systems with no overlap for feeding into CDE. smiles should be a single String with individual species separated by '.'. dmin represents the minimum molecule-molecule distance that should be allowed.

If the argument saveto is provided, outputs the optimised system to a file at this path. If not, returns the optimised system as a single ExtXYZ dict.

Kinetica.system_from_mols — Function

system_from_mols(mols::Vector{Dict{String, Any}[, dmin::Float64=5.0, maxiters::Int=200])
system_from_mols(mols::Vector{Dict{String, Any}[, saveto::String, dmin::Float64=5.0, maxiters::Int=200])

Forms a single XYZ system out of the molecules in mols.

Useful for making unified molecular systems with no overlap for feeding into CDE. mols should be a Vector of ExtXYZ frames. dmin represents the minimum molecule-molecule distance that should be allowed.

If the argument saveto is provided, outputs the optimised system to a file at this path. If not, returns the optimised system as a single ExtXYZ dict.

Kinetica.molsys_opt — Function

molsys_opt(mols::Vector{Dict{String, Any}}, dmin::Float64, maxiters::Int)

Optimises positions of molecules in mols to ensure they are all at least dmin Angstroms apart.

Creates and solves an N-body spring-driven particle system and transforms molecular coordinates to these particles to check for proximity.

Returns a new translated Vector of molecules when finished.