| ... Toolkits |
|
Digital Chemistry produces a range of Toolkit components,
based on BCI technology, that let users develop
custom applications exploiting a wide range of
chemoinformatics techniques. The components offer
many high-level functions so that applications can
be built rapidly with a minimum of source code.
The Toolkit API also exposes many lower-level calls
for users who need finer control. |
| Integrated
Components |
| Individually, each
Toolkit Component allows developers to quickly create
applications in specific areas of chemoinformatics
such as clustering or fingerprint generation. The
Components are also highly integrated, with the
output of each being freely interchangeable with
the others. For example, the Fingerprint Component can
be used to generate structure fingerprints for sets
of molecules, or the Markush Components can calculate
physicochemical properties or fingerprints for the
members of a combinatorial library. These data may
then be passed to the Clustering or Diversity Components
for further analysis. |
| Support
for Third Party Toolkits and Standards |
| The Digital Chemistry
Toolkit integrates seamlessly with toolkits available
from both Accelrys and Daylight, with native data
being freely transferable between all systems. The
Toolkit also makes extensive use of common file
storage formats, for example those from MDL and
Daylight. |
| Platform
and Language support |
A number of language
wrappers are available, allowing the development
of applications that may subsequently be deployed
in any single- or multi-tiered architecture. Each
language wrapper follows the same implementation
pattern, making it as easy as possible to work in
different languages. Current support is for:
- Platform support: Windows (NT, 2000, XP), Sun Solaris
(Developer and GCC) and Linux (GCC).
- Language support: C/C++, Java, Visual Basic
(6 and .NET), Perl and Python.
|
Available
toolkits
|
| Markush
Toolkit Components |
| The Markush components
allow the rapid analysis of very large combinatorial
libraries. The components have been specifically
designed to calculate properties and perform library
analysis very efficiently, deferring any enumeration
until it is absolutely necessary. Key features
are: |
- Extremely rapid enumeration of library members,
typically > 500,000 structures per second.
- Calculation of physicochemical properties
(including the Lipinski "rule of 5"
properties), topological indices and structure
fingerprints for library members, without the
need for prior enumeration of the structures.
- Ability to define custom properties based
upon fragment weightings.
- Identification of the overlap between combinatorial
libraries (also without full enumeration).
- Full structure and substructure searching
within combinatorial libraries (also without
enumeration).
- Library data is interchangeable with other
parts of the toolkit, for example to allow clustering
of library members with the Clustering components.
|
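The idea of calculating library properties without enumeration can be illustrated with a small sketch. It assumes a property that is strictly additive over a scaffold and its substituents (a "fragment weighting"), in which case the size, range and mean for the whole library follow from per-site statistics. The names `SCAFFOLD`, `SITES` and all functions and numbers below are hypothetical illustrations, not part of the Toolkit API, and the Toolkit's own method may well differ.

```python
from itertools import product
from math import prod
from statistics import mean

# Hypothetical fragment weightings: one additive contribution for the
# scaffold and one for each candidate substituent at each variable site.
SCAFFOLD = 78.0
SITES = [
    [1.0, 15.0, 17.0, 35.5],  # candidates at site R1
    [1.0, 29.0, 43.0],        # candidates at site R2
    [15.0, 31.0],             # candidates at site R3
]

def library_size(sites):
    """Number of library members: the product of the site sizes."""
    return prod(len(site) for site in sites)

def property_range(scaffold, sites):
    """Smallest and largest value of the additive property in the library."""
    low = scaffold + sum(min(site) for site in sites)
    high = scaffold + sum(max(site) for site in sites)
    return low, high

def property_mean(scaffold, sites):
    """Library mean: every combination occurs exactly once, so the mean
    of the sums equals the scaffold value plus the sum of the site means."""
    return scaffold + sum(mean(site) for site in sites)

def enumerate_values(scaffold, sites):
    """Brute-force reference that does enumerate every member."""
    return [scaffold + sum(combo) for combo in product(*sites)]
```

The per-site functions touch 4 + 3 + 2 values, whereas `enumerate_values` visits all 24 products; the gap grows multiplicatively with each extra site, which is why avoiding enumeration matters for very large libraries.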
| |
| Clustering
Toolkit Component |
| Digital Chemistry's
Clustering Component has been designed to deal with
very large datasets in as short a time as possible.
A range of different clustering algorithms is available,
including parallel implementations for suitable
hardware. Key features are: |
- Hierarchical clustering methods: Ward's,
group-average and divisive K-means.
- Non-hierarchical clustering methods: K-means,
Jarvis-Patrick.
- Clustering of floating-point dataprints as
well as binary fingerprints.
- Capable of clustering millions of data points
rapidly.
- Simple hierarchy traversal allowing access
to each level's parents and siblings.
- A range of analysis functions for finding optimum
partition levels, plus the ability to define
stepped partitions based upon varying levels.
- Ability to cluster more efficiently in parallel
on suitable hardware.
|
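To make one of the non-hierarchical methods listed above concrete, here is a minimal sketch of a common variant of Jarvis-Patrick clustering on binary fingerprints. It assumes fingerprints are held as Python sets of "on" bit positions and uses Tanimoto similarity; two items are joined when each appears in the other's list of `k` nearest neighbours and the lists share at least `k_min` entries. All function names are hypothetical, and this is not the Toolkit's implementation or API.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints held as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def nearest_neighbours(fps, k):
    """For each fingerprint, the indices of its k most similar others.
    Ties are broken by the lower index so that results are repeatable."""
    result = []
    for i, fp in enumerate(fps):
        others = [j for j in range(len(fps)) if j != i]
        others.sort(key=lambda j: (-tanimoto(fp, fps[j]), j))
        result.append(set(others[:k]))
    return result

def jarvis_patrick(fps, k, k_min):
    """Cluster by shared nearest neighbours; returns lists of indices."""
    neighbours = nearest_neighbours(fps, k)
    parent = list(range(len(fps)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(fps)):
        for j in neighbours[i]:
            mutual = j > i and i in neighbours[j]
            if mutual and len(neighbours[i] & neighbours[j]) >= k_min:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(fps)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

This sketch compares every pair, so it is quadratic in the number of fingerprints and is intended only to show the logic; clustering millions of data points calls for the optimised or parallel implementations described above.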
| |
| Diversity
Toolkit Component |
| Exploration of chemical
space, for tasks such as compound acquisition, high-throughput
screening and combinatorial library design, is assisted
by comparative measures of the structural and property
diversity of the datasets concerned. These measures
also enable the implementation of methods for the
selection of diverse subsets of compounds. Key features
are the rapid: |
- Identification of common features in large
datasets.
- Calculation of numerical diversity measures,
based on average intermolecular dissimilarity.
- Selection of subsets from large datasets,
based on maximising either the average or the
minimum distance between molecules selected
for the subset.
- Calculation of the change in diversity that
would occur if two datasets were to be merged.
- Identification of redundant features used
as descriptors.
|
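The measures listed above can be sketched in a few lines. The example below assumes fingerprints held as sets of "on" bit positions and uses Tanimoto distance as the dissimilarity; it computes the average intermolecular dissimilarity, the change in that measure when two datasets are merged, and a greedy MaxMin selection, which is one well-known heuristic for picking a subset with a large minimum distance. The function names are hypothetical, the greedy result is not guaranteed to be optimal, and none of this is claimed to be the Toolkit's own algorithm or API.

```python
from itertools import combinations

def dissimilarity(a, b):
    """Tanimoto distance between two fingerprints held as sets of bits."""
    union = len(a | b)
    return 1.0 - len(a & b) / union if union else 0.0

def mean_dissimilarity(fps):
    """Average dissimilarity over all distinct pairs (0.0 if fewer than 2)."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        return 0.0
    return sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)

def diversity_change(dataset, additions):
    """Change in mean dissimilarity if 'additions' were merged into 'dataset'."""
    return mean_dissimilarity(dataset + additions) - mean_dissimilarity(dataset)

def maxmin_select(fps, n, seed=0):
    """Greedy MaxMin: start from 'seed', then repeatedly add the candidate
    whose nearest already-selected member is furthest away.
    Ties go to the lower index."""
    selected = [seed]
    while len(selected) < min(n, len(fps)):
        remaining = [i for i in range(len(fps)) if i not in selected]
        best = max(
            remaining,
            key=lambda i: (
                min(dissimilarity(fps[i], fps[s]) for s in selected),
                -i,
            ),
        )
        selected.append(best)
    return selected
```

A subset chosen to maximise the average distance instead would replace the inner `min` with a mean over the selected members; the two criteria generally pick different compounds.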
| |
| Fingerprint
and Dictionary Generation Toolkit Component |
| The Fingerprint
and Dictionary Generation Components allow the generation
of binary structure fingerprints and fragment dictionaries
from structures held in a number of industry standard
formats. The Fingerprint Component uses any of a
number of supplied dictionaries to create binary
fingerprints. The Dictionary Component allows the
creation and analysis of user dictionaries, tailored
for specific datasets. Key features are: |
- Rapid generation of fingerprints of any length
based upon generalised dictionary fragments.
- Ability to define custom dictionaries of
any number of user-defined screens. Screens
are not limited to identifying specific substructures;
they may also contain fragments at a number of
levels of generalisation.
- Source structures may be read from a number
of industry standard formats, e.g. SMILES or
MOL files.
- A range of statistical functions to assist
with the creation of the most effective dictionaries.
- Generated fingerprints are easily transferred
to other Toolkit Components for Clustering or
Diversity analysis.
|
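The relationship between a fragment dictionary and a binary fingerprint can be shown with a toy sketch. It assumes the fragments of each structure have already been perceived and are supplied as plain string labels (the label format here is invented for illustration); the dictionary assigns each retained fragment an index, and the fingerprint sets the bit at that index folded onto the requested length. The function names are hypothetical and do not reflect the Toolkit API or its dictionary file format.

```python
def build_dictionary(fragment_sets, min_count=1):
    """Keep fragments occurring in at least min_count structures and
    assign each one an index, in sorted order for repeatability."""
    counts = {}
    for fragments in fragment_sets:
        for frag in set(fragments):
            counts[frag] = counts.get(frag, 0) + 1
    kept = sorted(frag for frag, count in counts.items() if count >= min_count)
    return {frag: index for index, frag in enumerate(kept)}

def fingerprint(fragments, dictionary, length):
    """Binary fingerprint as an integer bit mask of 'length' bits.
    Fragments missing from the dictionary are ignored; indices beyond
    the requested length are folded back with a modulo."""
    bits = 0
    for frag in fragments:
        index = dictionary.get(frag)
        if index is not None:
            bits |= 1 << (index % length)
    return bits

def to_bitstring(bits, length):
    """Render the mask as a string with bit 0 on the left."""
    return format(bits, "0{}b".format(length))[::-1]
```

Raising `min_count` is a crude stand-in for the statistical functions mentioned above: it drops fragments too rare in the dataset to be useful screens.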
| |
| MOLSMART
Toolkit Component |
| This component offers all
the functionality of the MOLSMART package, but as a programmable
component that can be embedded in other applications.
It converts structures or structural
queries represented in MDL formats into their equivalent
Daylight formats. Key features are: |
- Conversion of MDL structure and reaction
query formats to Daylight SMARTS and SMIRKS
strings.
- Conversion of MDL MOL files to Daylight SMILES
strings.
- Identification of complete aromatic rings.
- Automatic calculation of atom-atom maps,
even for unbalanced reactions.
|
| |
| Digital
Chemistry Products |
| If you would like
more information about the Digital Chemistry Toolkit,
licensing or pricing, please contact us; our
details are given opposite. |
|