|
Anyone familiar with chemical structure databases will recognise that 'small-molecule' data essentially falls into two distinct categories: (i) patent structures and (ii) everything else! There is nothing inherently different in the chemistry of these two groups (or even in the individual molecules themselves); the difference lies simply in what the structural notation used for each represents.
Patent, or Markush, structures differ from single-molecule structures only in that each Markush structure represents a group of related single molecules. The benefit of Markush notation is that its concise representation allows very large (potentially infinite!) numbers of molecules to be described efficiently and, owing to the presence of a common core, lets a chemist readily assimilate the important structural information. However, the additional complexity required to represent these structures in computer databases has meant that the storage technologies and search methods for the two structural classes have diverged, to such an extent that the wealth of information available in patent databases is largely inaccessible using the search and analysis tools available to most chemists.
Until now...
Imagine the advantages of being able to search patent databases alongside corporate, public and project databases, using a single substructure query drawn in your favourite drawing package. Torus users already have a taste of the benefits of such integration: they can search seamlessly across databases of single molecules and vast combinatorial libraries stored as 's-variant' Markush structures. |