GO Database Perl Module

Introduction

The go-db-perl module is an object/relational API for querying the GO database and receiving Perl objects. Using the API you can write Perl scripts to:

go-db-perl comes with a number of applications for loading and querying the GO database, including a Tk graphical user interface and a powerful command-line interface, "GOshell".

You should have a solid understanding of object-oriented Perl before using the go-db-perl modules.

Installation

You should have a local copy of the GO MySQL database. You will also need the go-perl modules.

Please follow the installation instructions for go-perl.

Scripts

go-db-perl comes with scripts in the scripts directory to help with querying and loading databases.

Command Line Options

The following command-line options are available to all database scripts. If you are connecting to a local MySQL installation that is not password-protected, you should only need the -d option.

Data access scripts

Data loading scripts

GO::AppHandle

The core class in the API is GO::AppHandle, which mediates between your code and the database.

After downloading go-db-perl, consult the POD documentation, either with the perldoc command,

perldoc GO/AppHandle.pm

or in the online documentation.

Fetching Objects from the DB

The AppHandle takes requests, queries the database, and turns the results into Perl objects. See the go-perl documentation for a description of the object model.
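As a minimal sketch of this request cycle, the script below connects to a database and fetches a single term by accession. The database name "go", the host, and the accession are placeholder assumptions, and the connect and get_term calls should be checked against the POD for your installed version of GO::AppHandle.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use GO::AppHandle;

# Assumption: a local MySQL database named "go" that is not password-protected.
my $apph = GO::AppHandle->connect(-dbname => "go", -dbhost => "localhost");

# Ask the AppHandle for one term, identified by its accession (placeholder value).
my $term = $apph->get_term({acc => "GO:0003677"});
die "term not found\n" unless $term;

# The result is a term object from the go-perl object model, not a database row.
printf "%s %s\n", $term->acc, $term->name;
```

The script never issues SQL itself; the AppHandle builds the query and constructs the object.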

Database Loading

How database loading works

First, a file (in any ontology format, or a gene association file) is parsed using GO::Parser, which generates an OBO-XML stream. This stream is transformed by an XSLT stylesheet into a different kind of XML that is isomorphic to the GO database schema. This godb-xml can then be loaded into the database using a generic loader.
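The middle stage of this pipeline, the XSLT step, can be sketched in isolation. This is an illustration under assumptions rather than the code go-db-perl ships: it uses the CPAN modules XML::LibXML and XML::LibXSLT, which may not be what the bundled scripts use, and it takes both the OBO-XML file and the stylesheet path from the command line because their real names and locations depend on your installation.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Usage: perl oboxml2godbxml.pl INPUT.obo-xml STYLESHEET.xsl > OUTPUT.godb-xml
# (the script name and file names are hypothetical)
my ($obo_xml_file, $xsl_file) = @ARGV;
die "usage: $0 INPUT.obo-xml STYLESHEET.xsl\n" unless defined $xsl_file;

# Compile the stylesheet that maps OBO-XML onto schema-shaped XML.
my $xslt       = XML::LibXSLT->new;
my $style_doc  = XML::LibXML->load_xml(location => $xsl_file);
my $stylesheet = $xslt->parse_stylesheet($style_doc);

# Apply it to the OBO-XML produced by the parsing stage.
my $source = XML::LibXML->load_xml(location => $obo_xml_file);
my $result = $stylesheet->transform($source);

# Emit the transformed XML, ready to hand to the loader.
binmode STDOUT;
print $stylesheet->output_as_bytes($result);
```

Because the output mirrors the database schema, the loading stage needs no knowledge of ontologies or associations, which is what allows a generic loader to be used.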

Database loading components

See also the XML documentation.

Future Directions

We're currently looking at alternatives to the object/relational approach to querying the database via Perl and other languages. On the one hand, the API allows us to reuse code and provide a simplified interface to some complex queries. On the other hand, it requires a lot of hard-to-maintain code. And while the API approach works well with queries that follow certain set patterns, it is less suited to arbitrary queries; for those you need the full expressive power of a query language such as SQL.

DBStag

We are developing a library called DBStag (see the Stag project page for details), which transforms the results of multijoin SQL queries into nested XML. It also allows for SQL reuse in the form of Stag SQL templates. We have provided a number of these templates for GO in the stag-templates directory.
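A minimal sketch of the SQL-to-XML idea, using the DBIx::DBStag module, might look like the following. The database name, the accession, and the table and column names (term, term_synonym) are assumptions about a local installation of the GO schema, so adjust them to match your database; the query is written inline here rather than taken from one of the provided templates.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use DBIx::DBStag;

# Assumption: a local MySQL database named "go" that is not password-protected.
my $dbh = DBIx::DBStag->connect("dbi:mysql:go");

# A join across two tables (table and column names are assumed).
# DBStag should nest each term_synonym row inside its parent term element.
my $sql = q[
  SELECT term.acc, term.name, term_synonym.term_synonym
  FROM term INNER JOIN term_synonym ON (term.id = term_synonym.term_id)
  WHERE term.acc = 'GO:0003677'
];

# Print the joined rows as nested XML instead of a flat result table.
print $dbh->selectall_xml($sql);
$dbh->disconnect;
```

The query stays ordinary SQL, so any join the database supports can be expressed, while the caller still receives a hierarchical result.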

We expect to stop development on GO::AppHandle by 2005 and switch to an approach such as DBStag, which combines the expressive power of a language such as SQL with hierarchical XML query results.

DBStag also allows for querying of the GO database using SQL templates; see the GODB SQL documentation.


Chris Mungall
Last modified: Thu Feb 10 15:30:54 PST 2005