Families of Equivalent, Interconvertible Representations
Klotho computes with several informationally equivalent representations:
by an idea of
David Searls; it was the need to avoid coding
such a long representation for many molecules that led us to develop Klotho.
Notice that certain information is implicit in the first three representations and is
inserted into the terminal form by the action of the grammar.
For example, only a few atom numbers are indicated in the configuration rule (just enough
to ensure that the resulting compound is numbered the way biochemists number it), and no
isotope is given; complete enumeration of the atoms occurs in the edge and key-pair
descriptions; and isotope and most charge information is inserted in the terminal form.
The isotope is something of a relic from the days when we thought we would trace the atoms
computationally the way one does experimentally, by labelling the compound with a
different isotope. In fact, since all the atom numbers are distinct, all that is required
is to map atom numbers from one compound to another through the reactions and tabulate
the mappings. Because we use substituent groups to make even larger molecules, we
explicitly generate, in the other three representations, all hydrogens not already
indicated in the configuration rule.
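The atom-number mapping described above can be illustrated with a small Python sketch. It is not Klotho's own code or data format. The names `AtomMap`, `compose`, and `trace`, the compounds "A", "B", and "C", and the atom numbers are all hypothetical and chosen only for illustration. The sketch assumes each reaction step is given as a dictionary from substrate atom numbers to product atom numbers.

```python
from typing import Dict, List, Tuple

# Hypothetical type: atom number in the substrate -> atom number in the product.
AtomMap = Dict[int, int]


def compose(first: AtomMap, second: AtomMap) -> AtomMap:
    """Follow every atom through two successive steps.

    Atoms that the second step does not carry forward are dropped.
    """
    return {src: second[mid] for src, mid in first.items() if mid in second}


def trace(start_atom: int, steps: List[Tuple[str, AtomMap]]) -> List[Tuple[str, int]]:
    """Tabulate where one atom sits after each step.

    Stops as soon as the atom is no longer carried into the next product.
    """
    table: List[Tuple[str, int]] = []
    current = start_atom
    for product_name, mapping in steps:
        if current not in mapping:
            break
        current = mapping[current]
        table.append((product_name, current))
    return table


# Invented example: compound A -> compound B -> compound C.
a_to_b: AtomMap = {1: 3, 2: 2, 3: 1}   # the step renumbers atoms 1 and 3
b_to_c: AtomMap = {1: 1, 2: 2}         # atom 3 of B leaves with another product

a_to_c = compose(a_to_b, b_to_c)
path_of_atom_3 = trace(3, [("B", a_to_b), ("C", b_to_c)])
path_of_atom_1 = trace(1, [("B", a_to_b), ("C", b_to_c)])
```

Because atom numbers are distinct within a compound, a plain dictionary per step is enough in this sketch, and no isotope label is needed to follow an atom.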
So that we can check the configuration rules for the compounds, Klotho also
generates isomeric SMILES strings; we sometimes call these representations ``external'' to
distinguish them from those of Klotho proper.
These, and the PDB files resulting from passing the strings through CONCORD, are
not informationally equivalent to Klotho's representations, though some of the
information is intensionally coded by CONCORD and by
the conventions with which we view three-dimensional structures.