Toolkits
Digital Chemistry produces a range of Toolkit components, based on BCI technology, which allow users to develop custom applications exploiting a wide range of chemoinformatics techniques. The components are designed to offer many high-level functions which enable applications to be rapidly built with the minimum of source code. However, the Toolkit API also contains many lower-level calls giving users all the flexibility they may require.
Integrated Components
Individually, each Toolkit Component allows developers to quickly create applications in specific areas of chemoinformatics such as clustering or fingerprint generation. However, the Components are also highly integrated, with the output of each being freely interchangeable with the others. For example, the Fingerprint Component can be used to generate structure fingerprints for sets of molecules, and the Markush Components can be used to calculate physicochemical properties or fingerprints for library members. These data may then be passed to the Clustering or Diversity Components for further analysis.
Support for Third Party Toolkits and Standards
The Digital Chemistry Toolkit integrates seamlessly with toolkits available from both Accelrys and Daylight, with native data being freely transferable between all systems. The Toolkit also makes extensive use of common file storage formats, for example those from MDL and Daylight.
Platform and Language support

A number of language wrappers are available, allowing the development of applications which may subsequently be deployed in any single- or multi-tiered architecture. Each language wrapper follows the same implementation pattern, making the use of different languages as easy as possible. Current support is for:
Platform support: Windows (NT, 2000, XP), Sun Solaris (Developer and GCC) and Linux (GCC).
Language support: C/C++, Java, Visual Basic (6 and .NET), Perl and Python.



Available toolkits

Markush Toolkit Components
The Markush components allow the rapid analysis of very large combinatorial libraries. The components have been specifically designed to very efficiently calculate properties and perform library analysis without requiring any enumeration until absolutely necessary. Key features are:
  • Extremely rapid enumeration of library members, typically > 500,000 structures per second.
  • Calculation of physicochemical properties (including the Lipinski "rule of 5" properties), topological indices and structure fingerprints for library members, without the need for prior enumeration of the structures.
  • Ability to define custom properties based upon fragment weightings.
  • Identification of the overlap between combinatorial libraries (also without full enumeration).
  • Full structure and substructure searching within combinatorial libraries (also without enumeration).
  • Library data is interchangeable with other parts of the toolkit, for example to allow clustering of library members with the Clustering components.
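As a rough illustration of how library-wide figures can be obtained without enumeration, the sketch below computes the size of a toy library and the mean of an additive property directly from its building blocks, then checks the result against brute-force enumeration. It is not the Toolkit API: the scaffold, R-group names, mass values and function names are all hypothetical, and the approach only applies to properties that are a simple sum of fragment contributions.

```python
from itertools import product
from math import prod

# Hypothetical toy library: one scaffold with two substitution sites.
# Each fragment carries an additive contribution (illustrative mass values).
SCAFFOLD_MASS = 76.0
R_GROUPS = {
    "R1": {"H": 1.0, "CH3": 15.0, "Cl": 35.5},
    "R2": {"OH": 17.0, "NH2": 16.0},
}


def library_size(r_groups):
    """Number of library members: the product of the option counts per site."""
    return prod(len(options) for options in r_groups.values())


def mean_property(scaffold_value, r_groups):
    """Library-wide mean of an additive property, using per-site means only.

    Because every combination of substituents occurs exactly once, the mean
    over the library equals the scaffold value plus the mean at each site.
    """
    return scaffold_value + sum(
        sum(options.values()) / len(options) for options in r_groups.values()
    )


def mean_property_enumerated(scaffold_value, r_groups):
    """Brute-force reference: enumerate every member and average."""
    totals = [
        scaffold_value + sum(combo)
        for combo in product(*(options.values() for options in r_groups.values()))
    ]
    return sum(totals) / len(totals)


if __name__ == "__main__":
    print(library_size(R_GROUPS))
    print(mean_property(SCAFFOLD_MASS, R_GROUPS))
    print(mean_property_enumerated(SCAFFOLD_MASS, R_GROUPS))
```

The non-enumerated calculation touches each building block once, so its cost grows with the sum of the R-group counts rather than their product.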
Clustering Toolkit Component
Digital Chemistry's Clustering Component has been designed to deal with very large datasets in as short a time as possible. A range of different clustering algorithms is available, including parallel implementations for suitable hardware. Key features are:
  • Hierarchical clustering methods: Ward's, Group-average and Divisive K-means.
  • Non-hierarchical clustering methods: K-means, Jarvis-Patrick.
  • Clustering of floating point dataprints as well as binary fingerprints.
  • Capable of clustering millions of data points rapidly.
  • Simple hierarchy traversal allowing access to each level's parents and siblings.
  • A range of analysis functions allowing optimum partition levels to be discovered, and the ability to define stepped partitions based upon varying levels.
  • Ability to cluster more efficiently in parallel on suitable hardware.
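To make one of the non-hierarchical methods concrete, here is a minimal sketch of Jarvis-Patrick clustering on binary fingerprints, using Tanimoto similarity to build nearest-neighbour lists. It is a plain-Python illustration, not the Clustering Component's implementation: the function names and parameters `k` and `k_min` are assumptions, fingerprints are stored as Python integers, and the variant shown excludes each item from its own neighbour list.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two binary fingerprints stored as Python ints."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


def jarvis_patrick(fps, k, k_min):
    """Cluster fingerprints with the Jarvis-Patrick shared-neighbour rule.

    Two items are joined when each appears in the other's list of k nearest
    neighbours and the two lists share at least k_min members.
    Returns a sorted list of clusters, each a list of item indices.
    """
    n = len(fps)

    # Build the k-nearest-neighbour list for every item (O(n^2) comparisons).
    neighbours = []
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: tanimoto(fps[i], fps[j]),
            reverse=True,
        )
        neighbours.append(set(ranked[:k]))

    # Union-find structure used to merge items into clusters.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            mutual = j in neighbours[i] and i in neighbours[j]
            if mutual and len(neighbours[i] & neighbours[j]) >= k_min:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    fingerprints = [0b00001111, 0b00000111, 0b00001110,
                    0b11110000, 0b01110000, 0b11100000]
    print(jarvis_patrick(fingerprints, k=2, k_min=1))
```

The all-pairs neighbour search here is quadratic in the number of items, so it is only suitable for small examples; handling millions of data points would need a far more efficient neighbour search.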
Diversity Toolkit Component
Exploration of chemical space, for tasks such as compound acquisition, high-throughput screening and combinatorial library design, is assisted by comparative measures of the structural and property diversity of the datasets concerned. These measures also enable the implementation of methods for the selection of diverse subsets of compounds. Key features are the rapid:
  • Identification of common features in large datasets.
  • Calculation of numerical diversity measures, based on average intermolecular dissimilarity.
  • Selection of subsets from large datasets, based on maximising either the average or the minimum distance between molecules selected for the subset.
  • Calculation of the change in diversity that would occur if two datasets were to be merged.
  • Identification of redundant features used as descriptors.
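The measures listed above can be sketched in a few lines of Python. The example below assumes binary fingerprints stored as integers and uses Tanimoto distance as the dissimilarity; it shows a diversity score based on average intermolecular dissimilarity, a greedy subset selection that maximises the minimum distance to the compounds already chosen (often called MaxMin), and the change in diversity on merging two datasets. All function names are hypothetical and this is not the Diversity Component's API.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """Tanimoto dissimilarity (1 - similarity) of two int-encoded fingerprints."""
    union = bin(a | b).count("1")
    similarity = bin(a & b).count("1") / union if union else 1.0
    return 1.0 - similarity


def mean_pairwise_dissimilarity(fps):
    """Diversity measure: average dissimilarity over all distinct pairs."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        return 0.0
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


def maxmin_select(fps, size, seed_index=0):
    """Greedy MaxMin selection of a diverse subset.

    Starting from seed_index, repeatedly add the unselected item whose
    distance to its nearest already-selected item is largest.
    Returns the selected indices in the order they were picked.
    """
    selected = [seed_index]
    # Distance from each item to its nearest selected item so far.
    nearest = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]
    while len(selected) < min(size, len(fps)):
        remaining = [i for i in range(len(fps)) if i not in selected]
        candidate = max(remaining, key=lambda i: nearest[i])
        selected.append(candidate)
        for i, fp in enumerate(fps):
            nearest[i] = min(nearest[i], tanimoto_distance(fp, fps[candidate]))
    return selected


def diversity_change_on_merge(fps_a, fps_b):
    """Change in mean pairwise dissimilarity if fps_b were merged into fps_a."""
    merged = list(fps_a) + list(fps_b)
    return mean_pairwise_dissimilarity(merged) - mean_pairwise_dissimilarity(fps_a)


if __name__ == "__main__":
    dataset = [0b00001111, 0b00000111, 0b11110000, 0b00001110]
    print(mean_pairwise_dissimilarity(dataset))
    print(maxmin_select(dataset, size=2))
```

The greedy selection is a heuristic: it does not guarantee the globally optimal subset, but it needs only one distance pass over the dataset per selected compound.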
Fingerprint and Dictionary Generation Toolkit Component
The Fingerprint and Dictionary Generation Components allow the generation of binary structure fingerprints and fragment dictionaries from structures held in a number of industry standard formats. The Fingerprint Component uses any of a number of supplied dictionaries to create binary fingerprints. The Dictionary Component allows the creation and analysis of user dictionaries, tailored for specific datasets. Key features are:
  • Rapidly generate fingerprints of any length based upon generalised dictionary fragments.
  • Ability to define custom dictionaries of any number of user defined screens. Screens need not only identify specific sub-structures, but may also contain fragments at a number of levels of generalisation.
  • Source structures may be read from a number of industry standard formats, e.g. SMILES or MOL files.
  • A range of statistical functions to assist with the creation of the most effective dictionaries.
  • Generated fingerprints are easily transferred to other Toolkit Components for Clustering or Diversity analysis.
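The relationship between a fragment dictionary and a fingerprint can be illustrated with a small sketch. It assumes that fragment perception has already been done elsewhere, so each structure is represented simply as a set of fragment labels; the labels, the `min_count` threshold and the function names are hypothetical and do not reflect the Components' actual interfaces.

```python
def build_dictionary(fragment_sets, min_count=1):
    """Build a fragment dictionary from a dataset.

    Every fragment that occurs in at least min_count structures is kept and
    assigned a bit position (in sorted order, so the result is reproducible).
    """
    counts = {}
    for fragments in fragment_sets:
        for frag in set(fragments):
            counts[frag] = counts.get(frag, 0) + 1
    kept = sorted(frag for frag, count in counts.items() if count >= min_count)
    return {frag: bit for bit, frag in enumerate(kept)}


def fingerprint(fragments, dictionary):
    """Binary fingerprint as a Python int: one bit per dictionary fragment present."""
    bits = 0
    for frag in fragments:
        bit = dictionary.get(frag)
        if bit is not None:  # fragments missing from the dictionary are ignored
            bits |= 1 << bit
    return bits


if __name__ == "__main__":
    # Hypothetical pre-computed fragment labels for three structures.
    structures = [{"C-C", "C-O"}, {"C-C", "C-N"}, {"C-Cl"}]
    dictionary = build_dictionary(structures)
    for fragments in structures:
        print(format(fingerprint(fragments, dictionary), "04b"))
```

Raising `min_count` drops rare fragments from the dictionary, which is one simple way occurrence statistics can be used to tailor a dictionary to a dataset.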
MOLSMART Toolkit Component
This offers all the functionality of the MOLSMART package, but as a programmable component that can be embedded in other applications. It allows the conversion of structures or structural queries represented in MDL formats into their equivalent Daylight formats. Key features are:
  • Conversion of MDL structure and reaction query formats to Daylight SMARTS and SMIRKS strings.
  • Conversion of MDL MOL files to Daylight SMILES strings.
  • Identification of complete aromatic rings.
  • Automatic calculation of atom-atom maps, even for unbalanced reactions.
Digital Chemistry Products
If you would like more information about the Digital Chemistry Toolkit, licensing issues or pricing, please contact us; our details are given opposite.
  For general enquiries, contact:
T: +44 (0)113 2181851
F: +44 (0)113 2181869
E: info@digitalchemistry.co.uk

The Iron Shed
Harewood House Estate
Harewood
Leeds LS17 9LF
United Kingdom