

  Research  




SNPit (SNP Integration Tool)

I am currently working on SNPit for my dissertation.
SNPit is an integration system that lets you search multiple data sources and extract information on a wide range of possible predictors of functional SNPs. The Main Search link at the top of the page searches all the available data sources, or you can use the navigation menu in the center to search individual sources.

SNPit offers two query interfaces: a graphical user interface and a text-based web servlet.






Study Design
In this study, we look at a subset of functional predictors gathered by linking different databases through the Biomediator data integration system. We extracted data from dbSNP, EntrezGene, UCSC Browser, HGMD, ECR Browser, Haplotter, and SIFT. A user interface that stresses ease of use will be developed from a requirements assessment conducted with multiple potential users.

A case study will also be conducted using recent successful whole-genome association studies that have been replicated. The SNPs used in those studies will be tested to see whether our system currently predicts the functionally relevant ones.



Biomediator Data Integration System
Biomediator is a federated data integration system, meaning that the owners of the data retain their ownership. The user can traverse multiple databases and query only the information pertinent to the question being asked. The owners determine what is or isn’t released, which gives them the ability to limit access to their data.

Currently, the integration system has been implemented across various biological domains. Relevant to this proposal, data related to functional SNPs will be linked together for genetic research purposes. Biomediator has interfaces to over 15 public databases (such as Entrez, Swissprot, and OMIM) as well as many private databases of experimental results (including phenotypic databases, genetic databases, imaging databases, and expression array databases).

Central to Biomediator’s generalizability is its source knowledge base (SKB). The SKB includes descriptions of the data sources, mappings from each source to the mediated schema, and the mediated schema itself. The mediated schema is a general outline that incorporates all the common objects and mappings for the data sources. The SKB is what gives the system its flexibility: it can be customized for different end users. The other three components of the system are:
1) generalized wrappers that translate the data sources syntactically;
2) a metawrapper that sits between the data source and the user query and translates semantically; and
3) the query processor, which allows users to query against the mediated schema.
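
The way these four components fit together can be illustrated with a small sketch. It is a toy model only, not Biomediator's actual code or API: the source names, field names, record values, and the `SourceDescription`, `translate`, and `query` helpers are all hypothetical, and the in-memory lists stand in for remote databases.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical in-memory "sources" with made-up records; real sources
# would be remote databases reached over the network.
TOY_DBSNP_ROWS = [{"rs_id": "rs0001", "gene_symbol": "GENE_A"}]
TOY_SIFT_ROWS = [{"snp": "rs0001", "prediction": "deleterious"}]


@dataclass
class SourceDescription:
    """One entry in a toy source knowledge base (SKB)."""
    name: str
    fetch: Callable[[], List[dict]]  # stands in for a generalized wrapper
    mapping: Dict[str, str]          # source field -> mediated-schema field


# The toy SKB: source descriptions plus their mappings to the mediated
# schema, whose fields here are snp_id, gene, and sift_prediction.
SKB = [
    SourceDescription(
        "toy_dbsnp",
        lambda: TOY_DBSNP_ROWS,
        {"rs_id": "snp_id", "gene_symbol": "gene"},
    ),
    SourceDescription(
        "toy_sift",
        lambda: TOY_SIFT_ROWS,
        {"snp": "snp_id", "prediction": "sift_prediction"},
    ),
]


def translate(source: SourceDescription, row: dict) -> dict:
    """Metawrapper step: rename source fields to mediated-schema fields."""
    return {source.mapping[k]: v for k, v in row.items() if k in source.mapping}


def query(snp_id: str, skb: List[SourceDescription] = SKB) -> dict:
    """Query processor step: ask every source and merge what is known."""
    merged = {"snp_id": snp_id}
    for source in skb:
        for row in source.fetch():
            record = translate(source, row)
            if record.get("snp_id") == snp_id:
                merged.update(record)
    return merged


if __name__ == "__main__":
    print(query("rs0001"))
```

In this sketch, adding a new source means adding one entry to the SKB list; the query function itself does not change, which is the kind of flexibility the SKB is meant to provide.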