One of the ambitious goals of modern biology is to disentangle the links between genome and phenome in living organisms. In other words, how does the information encoded in an organism's DNA relate to the functions that organism performs and the forms it expresses? Answering that question involves many complexities. One approach that has proved successful is to look simply at patterns of gene presence and absence, sampled in a balanced way across the tree of life, and to use supervised clustering to define sets of genes that link distantly related organisms to a common function (see here). We are looking to extend these models to metagenomic inference, and to go beyond proteins to understand gene regulation across the diversity of life.
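As a rough illustration of the presence/absence idea described above, the following minimal Python sketch scores each gene by how much more often it occurs in organisms known to share a function than in organisms that lack it. Everything here is an assumption made for illustration: the organism and gene names are toy data, the function names `gene_scores` and `marker_genes` and the `min_score` threshold are hypothetical, and a simple frequency difference stands in for the supervised clustering used in the actual work.

```python
from typing import Dict, List, Set


def gene_scores(presence: Dict[str, Set[str]],
                has_function: Dict[str, bool]) -> Dict[str, float]:
    """Score each gene by its presence frequency among organisms with the
    function minus its frequency among organisms without it (-1.0 to 1.0)."""
    positives = [org for org in presence if has_function[org]]
    negatives = [org for org in presence if not has_function[org]]
    if not positives or not negatives:
        raise ValueError("need at least one organism in each group")

    # Every gene observed in at least one organism.
    all_genes = set().union(*presence.values())

    scores = {}
    for gene in all_genes:
        pos_freq = sum(gene in presence[o] for o in positives) / len(positives)
        neg_freq = sum(gene in presence[o] for o in negatives) / len(negatives)
        scores[gene] = pos_freq - neg_freq
    return scores


def marker_genes(presence: Dict[str, Set[str]],
                 has_function: Dict[str, bool],
                 min_score: float = 0.75) -> List[str]:
    """Return genes whose score meets the (hypothetical) threshold, sorted by name."""
    scores = gene_scores(presence, has_function)
    return sorted(g for g, s in scores.items() if s >= min_score)


# Toy presence/absence data: organism -> set of genes detected in its genome.
presence = {
    "org_A": {"gene1", "gene2", "gene3"},
    "org_B": {"gene1", "gene2"},
    "org_C": {"gene3", "gene4"},
    "org_D": {"gene4"},
}
# Toy labels: whether each organism is known to perform the function.
has_function = {"org_A": True, "org_B": True, "org_C": False, "org_D": False}

markers = marker_genes(presence, has_function)
print(markers)  # prints ['gene1', 'gene2']
```

In this toy case `gene1` and `gene2` occur only in the organisms that perform the function, so they are the ones selected. A real analysis would also need to account for shared ancestry, which is one reason sampling in a balanced way across the tree of life matters.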
The computational tool "PredictTrophicMode" was developed as one aspect of this project.