Sunday, 17 October 2010
The Future Is Bright, The Future is Orange
A recent post by Richard L. Apodaca on the use of KNIME workflows in Eclipse for cheminformatics provoked me to look at another piece of software, Orange. Orange has been around for some time, and is an open-source data visualization and mining toolkit written in the Python programming language. The GUI is built on Qt.
I recently downloaded the Mac OS X bundle, and was pleasantly surprised by the ease with which workflows could be created (see attached image). Using the Orange GUI is easy: it allows you to read in files of different formats, process or filter attributes, cleanly visualize the data and data distributions, classify data, show confusion matrices and ROC curves, etc.
Since I am a big fan of Eclipse, I wanted to access the scripting side of the Orange library through Eclipse. Setting up a PyDev project is easy. However, when I came to run my program:
'''
Created on Oct 16, 2010
Example of using orange python -> constructs Naive Bayesian Classifier
@author: eoc21
'''
import os, sys, orange

class ClassifierExample():
    def __init__(self, fileName):
        self.data = orange.ExampleTable(fileName)
        self.classifier = orange.BayesLearner(self.data[2:])

    def runBayesLearner(self):
        for i in range(2, 20):
            c = self.classifier(self.data[i])
            print "original", self.data[i].getclass(), "classified as", c

    def printProbabilities(self):
        for i in range(2, 20):
            p = self.classifier(self.data[i], orange.GetProbabilities)
            print "%d: %5.3f (originally %s)" % (i+1, p[1], self.data[i].getclass())

if __name__ == '__main__':
    example = ClassifierExample(sys.argv[1])
    example.runBayesLearner()
    example.printProbabilities()
I came up against the error "orange.so can't work with 64 bit architecture". Since I'm running Snow Leopard, which defaults to 64 bits, I had to set an environment variable called VERSIONER_PYTHON_PREFER_32_BIT to yes.
Everything then worked cleanly.
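For anyone who prefers not to set the variable globally, here is a minimal sketch of launching a script with it set for a single run. The helper names (env_with_32bit_python, run_script_32bit) are my own, not part of Orange, and I am assuming the interpreter is Apple's system Python at /usr/bin/python, since as far as I know the VERSIONER_* variables are only honoured by that build.

```python
import os
import subprocess

# Environment variable mentioned above; read by Apple's system Python wrapper.
PREFER_32_BIT = "VERSIONER_PYTHON_PREFER_32_BIT"


def env_with_32bit_python(base_env=None):
    """Return a copy of the environment with the 32-bit preference set.

    The mapping passed in (or os.environ) is left unmodified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[PREFER_32_BIT] = "yes"
    return env


def run_script_32bit(script, args=(), python="/usr/bin/python"):
    """Run `script` in a child process that asks for a 32-bit interpreter.

    The default interpreter path is an assumption; adjust it for your setup.
    Returns the exit code of the child process.
    """
    command = [python, script] + list(args)
    return subprocess.call(command, env=env_with_32bit_python())
```

Calling run_script_32bit with the path to the script above and a data file (both placeholders for your own paths) should then behave like setting the variable in the shell before running the program. In Eclipse, the same variable can presumably be added to the environment of the PyDev run configuration instead.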
Orange is primarily for machine learning; however, it also has tools to support workflows in bioinformatics. One can also use the molecule visualizer to view SMILES strings from a file.
And the AstraZeneca team in Mölndal, Sweden is developing a cheminformatics plugin for Orange:
http://github.com/AZCompTox/AZOrange
Hi,
I am trying to set up Orange with Eclipse. I couldn't find any write-up on this. If you have one available, could you please help me?