Saturday 17 July 2010

Ultrafast Shape Recognition

A few days ago, a colleague working on QSAR and virtual screening asked me to write a Python implementation of Pedro Ballester's Ultrafast Shape Recognition (USR) descriptor.

Ballester's descriptor is a fast way to find molecules whose shape closely resembles that of a lead compound. It has been shown to avoid the alignment problem, since molecules do not need to be superimposed before they are compared, and to be up to 1500 times faster to calculate than other current methodologies. The descriptor assumes that a molecule's shape can be uniquely defined by the relative positions of its atoms, and that three-dimensional shape can be characterised by one-dimensional distributions.
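
To make the idea of "one-dimensional distributions" concrete, here is a minimal sketch of how a USR-style descriptor can be computed. It is not the code from my repository, and it rests on a few assumptions: as I understand the published method, distances from every atom are measured to four reference points (the centroid, the atom closest to the centroid, the atom farthest from the centroid, and the atom farthest from that one), and each distance distribution is summarised by three moments, giving 12 numbers. The function names (`usr_descriptor`, `usr_similarity`, `_moments`) are made up for illustration, and the choice of mean, standard deviation and signed cube root of the third central moment is one common normalisation; implementations differ on this detail.

```python
import math


def _moments(distances):
    """Summarise a list of distances by three moments (assumed normalisation)."""
    n = len(distances)
    mean = sum(distances) / n
    second = sum((d - mean) ** 2 for d in distances) / n
    third = sum((d - mean) ** 3 for d in distances) / n
    std = math.sqrt(second)
    # Signed cube root keeps the third moment in units of length.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, std, skew]


def usr_descriptor(coords):
    """Return a 12-element shape descriptor for a list of (x, y, z) atom positions."""
    n = len(coords)
    # Reference point 1: the molecular centroid.
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [math.dist(c, centroid) for c in coords]

    # Reference points 2 and 3: atoms closest to and farthest from the centroid.
    cst = coords[d_ctd.index(min(d_ctd))]
    fct = coords[d_ctd.index(max(d_ctd))]
    d_cst = [math.dist(c, cst) for c in coords]
    d_fct = [math.dist(c, fct) for c in coords]

    # Reference point 4: the atom farthest from reference point 3.
    ftf = coords[d_fct.index(max(d_fct))]
    d_ftf = [math.dist(c, ftf) for c in coords]

    return _moments(d_ctd) + _moments(d_cst) + _moments(d_fct) + _moments(d_ftf)


def usr_similarity(a, b):
    """Inverse of (1 + mean absolute difference); 1.0 means identical descriptors."""
    diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + diff)
```

Because only interatomic distances enter the calculation, translating or rotating a molecule leaves the descriptor unchanged, which is why no alignment step is needed. Comparing two molecules then reduces to comparing two short lists of numbers with `usr_similarity`.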

The source code, along with HTML API documentation, can be found on my GitHub. I hope it is of use to people. I have also uploaded a trial dataset, A42731 (Substance P antagonists), and the resulting USR descriptor file.
