Getting started
Hydride adds hydrogen atoms to molecular models where these are missing, for example structures that were determined via low-resolution X-ray diffraction. Instead of using force field parameters to place hydrogen atoms at their optimum bond angle and distance, it uses a large fragment library to perform this task. This allows Hydride to assign hydrogen atoms to a broad range of molecules.
For each heavy atom in a molecule, i.e. an atom that is not hydrogen, a fragment is created. Each fragment contains the central heavy atom and all heavy atoms directly bonded to it. The fragment is then looked up in the fragment library, which stores for each fragment the number of bonded hydrogen atoms and their positions relative to the heavy atom and its bond partners.
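The fragment creation step can be illustrated with a minimal sketch. It is not Hydride's implementation: the function name heavy_atom_fragments and the plain-list representation of elements and bonds are assumptions made for illustration, and the real library lookup presumably uses more information than the atom indices collected here.

```python
from collections import defaultdict


def heavy_atom_fragments(elements, bonds):
    """Map each heavy atom index to the sorted indices of its heavy bond partners.

    elements: list of element symbols, one per atom
    bonds: iterable of (i, j) index pairs
    """
    neighbors = defaultdict(set)
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)

    fragments = {}
    for center, element in enumerate(elements):
        if element == "H":
            # Hydrogen atoms never act as fragment centers
            continue
        fragments[center] = sorted(
            n for n in neighbors[center] if elements[n] != "H"
        )
    return fragments


# Heavy atoms of ethanol: C(0) - C(1) - O(2)
elements = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]
fragments = heavy_atom_fragments(elements, bonds)
```

Each entry of fragments corresponds to one lookup in the fragment library: the central atom together with its heavy bond partners.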
The fragment library is compiled from the Chemical Component Dictionary, containing all molecules that appear in the PDB. This means that hydrogen atoms can be assigned to any molecule/residue from the PDB and all molecules that share comparable groups. Special cases, which are not covered by the standard fragment library, can be resolved by adding a version of this molecule with hydrogen atoms to the fragment library.
After the hydrogen atoms are added, the conformation is not yet optimal. In a second step, Hydride relaxes the dihedral angles of terminal heavy atoms carrying hydrogen atoms. This step minimizes steric clashes and restores hydrogen bonds.
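To make the idea of relaxing a dihedral angle more concrete, the following sketch rotates a single hydrogen atom about a bond axis and picks the angle that keeps it farthest from nearby atoms. This is only a toy model under stated assumptions: the function names are hypothetical, the angle is found by a coarse scan, and the distance criterion is a stand-in that covers steric clashes only. It does not model hydrogen bonds and is not the objective Hydride actually optimizes.

```python
import math


def rotate_about_axis(point, axis_start, axis_end, angle):
    """Rotate point by angle (radians) about the axis axis_start -> axis_end."""
    axis = [e - s for s, e in zip(axis_start, axis_end)]
    length = math.sqrt(sum(c * c for c in axis))
    k = [c / length for c in axis]
    v = [p - s for p, s in zip(point, axis_start)]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    # Rodrigues' rotation formula
    rotated = [
        v[i] * cos_a + k_cross_v[i] * sin_a + k[i] * k_dot_v * (1 - cos_a)
        for i in range(3)
    ]
    return [r + s for r, s in zip(rotated, axis_start)]


def least_clashing_angle(hydrogen, axis_start, axis_end, other_atoms, steps=36):
    """Scan rotation angles and return the one that maximizes the distance
    between the rotated hydrogen and its nearest atom in other_atoms."""

    def min_distance(angle):
        pos = rotate_about_axis(hydrogen, axis_start, axis_end, angle)
        return min(
            math.sqrt(sum((p - o) ** 2 for p, o in zip(pos, other)))
            for other in other_atoms
        )

    angles = [2 * math.pi * i / steps for i in range(steps)]
    return max(angles, key=min_distance)


# A hydrogen that can rotate about the z-axis, with one atom close to it
best_angle = least_clashing_angle(
    hydrogen=(1.0, 0.0, 1.3),
    axis_start=(0.0, 0.0, 0.0),
    axis_end=(0.0, 0.0, 1.0),
    other_atoms=[(2.0, 0.0, 1.3)],
)
```

In this example the hydrogen ends up on the side of the axis opposite to the neighboring atom, i.e. after a rotation by half a turn.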
For an in-depth explanation of the underlying algorithm please refer to the journal article.
Installation
Hydride requires Python 3.7 or later.
You can install Hydride via
$ pip install hydride
Alternatively, you can check out the official repository and build and install the package via
$ pip install .
Note that this way the installation may take a few minutes, as the fragment library is built and the C-extensions are compiled.
Basic usage
The simplest invocation of the Hydride command line program is
$ hydride -i input_structure.pdb -o output_structure.pdb
Hydride reads a molecular model without hydrogen atoms from input_structure.pdb, adds hydrogen atoms and writes the resulting model to output_structure.pdb.
If the input structure already contains hydrogen atoms, these are removed automatically.
Hydride supports the PDB, PDBx/mmCIF, MMTF, MOL and SDF formats.
If no input structure file path is given, the file is read from STDIN.
In this case the format cannot be inferred from the file extension, so it must be given explicitly via the -I parameter.
Conversely, if no output file path is given, the structure is written to STDOUT and the -O parameter is required.
This way Hydride can be used in a chain of commands.
$ some_tool | hydride -I pdb -O pdb | some_other_tool > output_structure.pdb
All command line parameters and their usage are described in detail in Command line interface.
Structure files often lack proper formal charge values for each atom, which leads to incorrect hydrogen additions, for example protonated carboxy groups. For amino acids, this problem can be mitigated by recalculating formal charges at a given pH value.
$ hydride --charges 7.0 -i input_structure.pdb -o output_structure.pdb
Note that only formal charges in amino acids are updated this way. For all other molecules the formal charge values from the input structure file are taken. Furthermore, the underlying method assigns charges based on the pK values of the free amino acid. The chemical environment of a residue is not taken into account.
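The general idea of assigning charges from pK values can be sketched as follows. This is an illustration only: the function name side_chain_charge is hypothetical, the pK values are approximate textbook values for free amino acid side chains and not necessarily the ones Hydride uses, and the simple threshold rule (a group counts as deprotonated when the pH is above its pK) is an assumption.

```python
# Approximate textbook side chain pK values of free amino acids.
# Illustrative only, not taken from Hydride.
ACIDIC_PK = {"ASP": 3.65, "GLU": 4.25, "CYS": 8.18, "TYR": 10.07}
BASIC_PK = {"HIS": 6.00, "LYS": 10.53, "ARG": 12.48}


def side_chain_charge(res_name, ph):
    """Return the assumed formal side chain charge of a residue at a given pH."""
    if res_name in ACIDIC_PK:
        # Acidic groups are neutral when protonated, -1 when deprotonated
        return -1 if ph > ACIDIC_PK[res_name] else 0
    if res_name in BASIC_PK:
        # Basic groups are +1 when protonated, neutral when deprotonated
        return 1 if ph < BASIC_PK[res_name] else 0
    # Residues without a titratable side chain stay neutral
    return 0


charges_at_ph_7 = {
    name: side_chain_charge(name, 7.0) for name in ("ASP", "LYS", "HIS", "ALA")
}
```

Because such a rule depends only on the residue name and the pH, it cannot capture pK shifts caused by the chemical environment of a residue, which matches the limitation described above.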
Python API
Hydride is not only a command line program, but also a Python library that builds on AtomArray objects from the Biotite package.
The hydride package provides two central functions: add_hydrogen() and relax_hydrogen().
While the former adds hydrogen atoms with appropriate bond angles and
lengths using the fragment library, the latter takes a structure
containing hydrogen atoms and optimizes the hydrogen positions by
rotating about dihedral angles of terminal groups.
Usually, both functions are called in succession, for example:
hydrogenated_atoms, _ = hydride.add_hydrogen(heavy_atoms)
hydrogenated_atoms.coord = hydride.relax_hydrogen(hydrogenated_atoms)
However, these functions can also be used independently: relax_hydrogen() can be omitted if relaxation is not necessary for the use case.
Conversely, add_hydrogen() does not need to be called if the AtomArray already contains hydrogen atoms and only steric clashes need to be resolved.
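For orientation, a rough end-to-end sketch of both functions applied to a PDB file could look like this. The wrapper function hydrogenate_pdb is hypothetical and not part of Hydride. The sketch relies on Biotite's PDBFile class for reading and writing, and it assumes that add_hydrogen() needs bond information and a charge annotation, which is why both are requested when the structure is read.

```python
def hydrogenate_pdb(input_path, output_path):
    """Add and relax hydrogen atoms for the first model of a PDB file."""
    # Imported here so that this sketch can be loaded even if the
    # optional dependencies are not installed
    import biotite.structure.io.pdb as pdb
    import hydride

    in_file = pdb.PDBFile.read(input_path)
    # Read bonds and formal charges along with the coordinates
    heavy_atoms = in_file.get_structure(
        model=1, extra_fields=["charge"], include_bonds=True
    )

    # First place the hydrogen atoms, then relax their positions
    hydrogenated_atoms, _ = hydride.add_hydrogen(heavy_atoms)
    hydrogenated_atoms.coord = hydride.relax_hydrogen(hydrogenated_atoms)

    out_file = pdb.PDBFile()
    out_file.set_structure(hydrogenated_atoms)
    out_file.write(output_path)
    return hydrogenated_atoms
```

Calling hydrogenate_pdb("input_structure.pdb", "output_structure.pdb") would then roughly correspond to the basic command line invocation, without the optional charge recalculation.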
Additional information
An in-depth explanation of the underlying algorithm will be available in the upcoming journal article.