Step-by-Step Instructions
This page illustrates how to run a simulation from scratch.
from pyxtal_ff import PyXtal_FF
Define the source of data
TrainData = "pyxtal_ff/datasets/SiO2/OUTCAR_SiO2"
At the moment, the following formats are accepted:
ase.db
json
OUTCAR
In principle, one can easily write a utility function for another format, following the style shown in the utility section.
Among these formats, we recommend ase.db. Following the ase db conventions, you need to add the following additional tags to each atoms object:
from ase.db import connect
# Suppose you have the following variables
# - struc: ase atoms objects
# - eng: total DFT energy
# - forces: DFT Forces: N*3 array
# - stress: DFT Stress: 1*6 stress [in GPa, xx, yy, zz, xy, xz, yz]
# - db_name: the filename to store the information and pass to pyxtal_ff
data = {'dft_energy': eng,
        'dft_force': forces,
        'dft_stress': stress,
        #'group': group,
       }
with connect(db_name) as db:
    db.write(struc, data=data)
Note that different codes arrange the stress tensor in different orders and units. PyXtal_FF strictly uses GPa and the order [xx, yy, zz, xy, xz, yz].
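As an illustration of the reordering, the sketch below converts a stress given in ASE's Voigt convention ([xx, yy, zz, yz, xz, xy] in eV/Å^3) to the order and unit stated above. The function name `to_pyxtal_ff_stress` is hypothetical and not part of PyXtal_FF. Sign conventions, which also differ between codes, are deliberately not handled here.

```python
# Hypothetical helper (not part of PyXtal_FF): reorder and rescale a stress
# from ASE's Voigt convention to the [xx, yy, zz, xy, xz, yz] order in GPa.

EV_PER_A3_TO_GPA = 160.2176634  # 1 eV/Angstrom^3 expressed in GPa


def to_pyxtal_ff_stress(voigt_ev_a3):
    """Convert [xx, yy, zz, yz, xz, xy] in eV/A^3 to [xx, yy, zz, xy, xz, yz] in GPa."""
    if len(voigt_ev_a3) != 6:
        raise ValueError("expected 6 stress components")
    xx, yy, zz, yz, xz, xy = voigt_ev_a3
    return [v * EV_PER_A3_TO_GPA for v in (xx, yy, zz, xy, xz, yz)]


# Made-up input values, only to show the call.
stress = to_pyxtal_ff_stress([0.01, 0.02, 0.03, 0.004, 0.005, 0.006])
```

The returned list can then be stored under the 'dft_stress' key. For a code that uses yet another order, only the unpacking line needs to change.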
Choosing the descriptor
Four types of descriptors are available (see Atomic Descriptors). Each of them needs some additional parameters to be defined as follows.
BehlerParrinello (ACSF, wACSF)
parameters = {'G2': {'eta': [0.003214, 0.035711, 0.071421,
                             0.124987, 0.214264, 0.357106],
                     'Rs': [0]},
              'G4': {'lambda': [-1, 1],
                     'zeta': [1],
                     'eta': [0.000357, 0.028569, 0.089277]}
             }
descriptor = {'type': 'ACSF',
              'parameters': parameters,
              'Rc': 5.0,
             }
wACSF is also supported. In this case, the number of descriptors will depend linearly on the number of atoms in the system.
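To give a sense of what the `eta`, `Rs` and `Rc` parameters control, here is a minimal sketch of the textbook Behler-Parrinello radial function G2 with a cosine cutoff. It is an assumption that this matches the general form used here; PyXtal_FF's own implementation may scale or normalize terms differently, and the function names are hypothetical.

```python
import math


def cosine_cutoff(r, rc):
    """Smoothly decay from 1 at r = 0 to 0 at r >= rc."""
    if r >= rc:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0)


def g2(distances, eta, rs, rc):
    """Textbook radial symmetry function for one atom.

    distances: distances (in Angstrom) from the central atom to its neighbors.
    """
    return sum(math.exp(-eta * (r - rs) ** 2) * cosine_cutoff(r, rc)
               for r in distances)


# One G2 value per (eta, Rs) pair, using some of the eta values from the
# example above with Rs = 0 and Rc = 5.0.
neighbors = [1.6, 1.6, 2.6, 3.1]  # made-up neighbor distances
values = [g2(neighbors, eta, 0.0, 5.0)
          for eta in [0.003214, 0.035711, 0.071421]]
```

A larger `eta` narrows the Gaussian, so distant neighbors contribute less, while `Rc` sets the distance beyond which neighbors are ignored entirely.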
EAD
parameters = {'L': 2, 'eta': [0.36],
              'Rs': [0. , 0.75, 1.5 , 2.25, 3. , 3.75, 4.5]}
descriptor = {'type': 'EAD',
              'parameters': parameters,
              'Rc': 5.0,
             }
SO4
descriptor = {'type': 'SO4',
              'Rc': 5.0,
              'parameters': {'lmax': 3},
             }
SO3
descriptor = {'type': 'SO3',
              'Rc': 5.0,
              'parameters': {'lmax': 4, 'nmax': 3},
             }
Defining your optimizer
The optimizer is defined by a dictionary which contains two keys:
method
parameters
Currently, the method options are
L-BFGS-B
SGD
ADAM
If SGD or ADAM is chosen, the default learning rate is 1e-3. Usually, one only needs to specify the method. If no optimizer is defined, L-BFGS-B will be used.
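The defaults described above can be summarized in a small sketch. The helper `resolve_optimizer` and the `'lr'` key name for the learning rate are hypothetical, chosen only for illustration; the exact spellings that PyXtal_FF accepts should be checked against the library.

```python
def resolve_optimizer(optimizer=None):
    """Return an optimizer dictionary with the defaults described in the text."""
    if optimizer is None:
        # No optimizer defined: fall back to L-BFGS-B.
        return {'method': 'L-BFGS-B', 'parameters': {}}
    method = optimizer['method']
    parameters = dict(optimizer.get('parameters', {}))
    if method in ('SGD', 'ADAM'):
        # 'lr' is an assumed key name for the learning rate.
        parameters.setdefault('lr', 1e-3)
    return {'method': method, 'parameters': parameters}


print(resolve_optimizer())
print(resolve_optimizer({'method': 'ADAM'}))
```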
Setting the NN parameters
model = {'system' : ['Si','O'],
         'hiddenlayers': [30, 30],
         'activation': ['tanh', 'tanh', 'linear'],
         'batch_size': None,
         'epoch': 1000,
         'force_coefficient': 0.05,
         'alpha': 1e-5,
         'path': 'SiO2-BehlerParrinello/',
         'restart': None, #'SiO2-BehlerParrinello/30-30-checkpoint.pth',
         'optimizer': {'method': 'lbfgs'},
        }
system: a list of elements involved in the training; list, e.g., ['Si', 'O'].
hiddenlayers: the number of nodes in each hidden layer; list or dict, default: [6, 6].
activation: activation functions used in each layer; list or dict, default: ['tanh', 'tanh', 'linear'].
batch_size: the number of samples (structures) used for each iteration of NN training; int, default: all structures.
force_coefficient: parameter to scale the force contribution relative to the energy in the loss function; float, default: 0.03.
stress_coefficient: balance parameter to scale the stress contribution relative to the energy; float, default: None.
alpha: L2 penalty (regularization term) parameter; float, default: 1e-5.
restart: path to a saved file from which to continue NN training where it was left off; string, default: None.
optimizer: the optimizer used in NN training.
epoch: the number of times all of the training vectors are used once to update the weights; int, default: 100.
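To show how `force_coefficient` and `alpha` enter the training objective, the sketch below combines an energy error, a force error, and an L2 penalty into one scalar. It is a schematic, assuming mean-squared errors and a simple weighted sum; the exact normalization in PyXtal_FF may differ, the stress term is omitted for brevity, and `total_loss` is a hypothetical name.

```python
def mse(pred, ref):
    """Mean-squared error between two equally long sequences of floats."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)


def total_loss(e_pred, e_ref, f_pred, f_ref, weights,
               force_coefficient=0.03, alpha=1e-5):
    """Schematic loss: energy MSE + scaled force MSE + L2 penalty on weights."""
    energy_term = mse(e_pred, e_ref)
    force_term = force_coefficient * mse(f_pred, f_ref)
    penalty = alpha * sum(w ** 2 for w in weights)
    return energy_term + force_term + penalty


# Made-up numbers: two structures, flattened force components, three weights.
loss = total_loss(e_pred=[-10.1, -20.0], e_ref=[-10.0, -20.0],
                  f_pred=[0.1, 0.0, -0.2], f_ref=[0.0, 0.0, 0.0],
                  weights=[0.5, -0.5, 1.0])
```

Raising `force_coefficient` makes the fit prioritize forces over energies, and raising `alpha` pushes the weights toward smaller values.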
Note that many of these keys have default values, so the simplest way to define the model is to specify only the system key:
model = {'system' : ['Si','O']}
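The effect of relying on defaults can be sketched as a dictionary merge. The default values are the ones listed in the parameter descriptions above; the helper `with_defaults` is hypothetical and only mimics what the library presumably does internally.

```python
# Default values as listed in the parameter descriptions above.
NN_DEFAULTS = {
    'hiddenlayers': [6, 6],
    'activation': ['tanh', 'tanh', 'linear'],
    'batch_size': None,          # None means all structures
    'force_coefficient': 0.03,
    'stress_coefficient': None,
    'alpha': 1e-5,
    'restart': None,
    'epoch': 100,
}


def with_defaults(model):
    """Return a copy of model in which missing keys take their default values."""
    merged = dict(NN_DEFAULTS)
    merged.update(model)
    return merged


full_model = with_defaults({'system': ['Si', 'O']})
```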
Alternatively, you can reuse the values from a previous run by defining the restart key:
model = {'restart': 'Si-O-BehlerParrinello/30-30-parameters.json'}
Setting the linear regression models
model = {'algorithm': 'PR',
         'system' : ['Si'],
         'force_coefficient': 1e-4,
         'order': 1,
         'alpha': 0,
        }
alpha: L2 penalty (regularization term) parameter; float, default: 1e-5.
order: linear regression (1) or quadratic fit (2).
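To illustrate what `order` changes, the sketch below counts the fitting coefficients for a descriptor vector of length d. It assumes a standard polynomial model with a bias term, d linear terms, and, for order 2, one term per unordered pair of descriptors. Whether PyXtal_FF counts terms exactly this way is an assumption, and `n_coefficients` is a hypothetical name.

```python
def n_coefficients(d, order):
    """Number of coefficients in a polynomial fit over d descriptors."""
    if order not in (1, 2):
        raise ValueError("order must be 1 (linear) or 2 (quadratic)")
    count = 1 + d                      # bias + linear terms
    if order == 2:
        count += d * (d + 1) // 2      # products x_j * x_k with j <= k
    return count


for d in (10, 30):
    print(d, n_coefficients(d, 1), n_coefficients(d, 2))
```

The quadratic fit grows roughly with the square of the descriptor length, which is why a nonzero `alpha` becomes more relevant for `order` 2.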
Invoking the simulation
Finally, one just needs to load the defined data, descriptors and NN model into PyXtal_FF and execute the run function.
# TestData is assumed to be defined in the same way as TrainData above.
ff = PyXtal_FF(descriptors=descriptor, model=model)
ff.run(TrainData=TrainData, TestData=TestData)