A little bit of computer science has made it possible to cut the time it takes to find candidate genes from years to milliseconds. Researchers at Stanford University have developed a simple processing method for searching through large public databases of genes and their associated proteins. Using Boolean logic (if A then B, if not C then D, etc.), they have found a way to suggest which genes are responsible for different stages of complex chemical processes in our cells. Biologists generally take months or even years to find these genes experimentally, but the Boolean search method takes just a fraction of a second. If widely applied, the new method may accelerate genetic research enormously. Great things happen when sciences come together.
There are tens of thousands of human genes, and we are still uncertain about the exact function of most of them. Still, scientists have mapped the proteins associated with many genes. Large databases of these gene-protein relationships have been compiled, and many are open for researchers to access. There are huge amounts of data here, but turning them into meaningful scientific insight isn’t easy. Luckily, Debashis Sahoo of Stanford had an insight: the expression of genes is often asymmetric. Some gene X will be expressed only when gene A is not expressed. This lets you use Boolean logic to sort genes.
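To make the asymmetry idea concrete, here is a minimal Python sketch. It is not Sahoo’s code: the sample data, the gene names, and the helper functions `never_together` and `mutually_exclusive_with` are all invented for illustration, and it assumes each gene has already been reduced to a simple expressed/not-expressed call per sample.

```python
# Toy data (invented for illustration): one True/False per sample,
# True meaning the gene was measured as expressed in that sample.
samples = {
    "A": [True, True, False, False, False],
    "X": [False, False, True, False, True],
    "Y": [True, False, True, False, False],
}


def never_together(first, second):
    """Return True if no sample has both genes expressed at once."""
    return not any(a and b for a, b in zip(first, second))


def mutually_exclusive_with(gene, data):
    """List the other genes that are never expressed alongside `gene`."""
    return sorted(
        name
        for name, calls in data.items()
        if name != gene and never_together(data[gene], calls)
    )


print(mutually_exclusive_with("A", samples))
```

In this toy data set, X never shows up in a sample where A is expressed, while Y does, so only X passes the check. Real expression measurements are noisy, so a practical version would need to tolerate a few exceptions rather than demand a perfect rule.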
From that simple beginning, Sahoo was able to perform a sort of Boolean analysis to determine a gene’s importance in a given protein pathway. Say you know that a protein pathway starts with gene A and ends with gene B, but you don’t know much about what happens in between. You can look through all the other genes and their associated proteins and find those that are not expressed at the same time as A, but are expressed with B. These genes may code for proteins that act at intermediate steps of the pathway.
That’s an overly simplified explanation of the Boolean net Sahoo assembled, but you get the idea. With the right “if-then” filters, Sahoo was able to sort through thousands of genes in a fraction of a second, finding just those that are likely to be important in a given protein pathway. Instantly he had a short list of genes that geneticists could examine.
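The simplified pathway filter described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the published method: the expression table, the gene names G1 to G3, and the functions `never_with`, `always_with`, and `pathway_candidates` are hypothetical, and reading “expressed with B” as “expressed in every sample where B is expressed” is an assumption.

```python
# Invented example data: True means "expressed" in that sample.
expression = {
    "A":  [True,  True,  False, False, False, False],
    "B":  [False, False, False, True,  True,  True],
    "G1": [False, False, True,  True,  True,  True],
    "G2": [True,  False, False, True,  False, False],
    "G3": [False, False, False, False, True,  False],
}


def never_with(gene, other):
    """True if no sample expresses both genes."""
    return not any(g and o for g, o in zip(gene, other))


def always_with(gene, other):
    """True if every sample that expresses `other` also expresses `gene`."""
    return all(g for g, o in zip(gene, other) if o)


def pathway_candidates(data, start, end):
    """Genes never seen with `start` but always present when `end` is."""
    return sorted(
        name
        for name, calls in data.items()
        if name not in (start, end)
        and never_with(calls, data[start])
        and always_with(calls, data[end])
    )


print(pathway_candidates(expression, "A", "B"))
```

Here only G1 survives both filters: it is absent whenever A is expressed and present whenever B is. G2 is ruled out because it appears alongside A in one sample, and G3 because it is missing from some samples that express B. Each check is a single pass over the samples, which is why this kind of filter can run across thousands of genes almost instantly.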
Does all this computer science sorting actually yield results? You bet. As discussed in PNAS, Sahoo and other Stanford researchers used their Boolean method to comb through a database looking for genes related to B-cells (a type of immune system cell). They found 62 genes that could be involved in B-cell pathways. They then looked at the DNA from 41 strains of mice that had been modified to have disruptions (so-called ‘knockouts’) in those genes. In 26 of those 41 strains, they found mice that had disrupted B-cells. In other words, the Boolean method looks to have been about 63% accurate (26 of 41) in suggesting genes that might be related to B-cell development. That may not sound like much, but think of all the thousands of genes that they looked through: it’s like finding needles in a haystack!
And this is just the beginning. Those databases of genes and proteins can now be sorted using Sahoo’s method to suggest new genes to explore. Think of an important gene out there, say the FOXO3A gene that may be linked to longevity. Wouldn’t you like to know which genes are related to it? Oh yes. Scientists are starting to suspect that most of the characteristics we want to investigate (longevity, intelligence, resistance to a certain disease) may rely on a very complex interaction of different genes. Sahoo’s Boolean method will help researchers hunt down related genes quickly and effectively and get a better understanding of those complex interactions. We already have large stores of genetic material, known as biobanks, ready to be sequenced and examined. As that information becomes available, analytical methods like this one will allow us to turn raw data into meaningful insights about where to focus genetic research. In the end, genetics is an information science, and with the right application of computer skills we’ll be able to accelerate its progress. That’s going to mean quicker and better results. So give some praise to computer logic. If Boolean Then awesome.
[image credit: Singularity Hub]
[source: Stanford News, PNAS]