DAS Workshop 2010 ProServer Tutorial

Part 1

Andy Jenkinson, EMBL-EBI, 7th April 2010

Overview

This tutorial is intended to help you understand how ProServer, a DAS server written in Perl, is structured. It will also guide you through setting up ProServer on your machine and exploring the examples that come with the distribution.

The tutorial assumes you are familiar with Perl and are operating on a Linux platform.

Basic Architecture

ProServer is a standalone server, meaning you do not need to run a separate web server such as Apache. It handles all of the communications, query parsing and XML output functions in a set of "core" modules, and uses plugins to run actual data sources. Plugins are responsible for adapting actual data to the DAS protocol. The server and its plugins are configured using an INI file.

Each data source is represented in ProServer by an instance of a plugin module, called a SourceAdaptor. Simple data sources, especially those based on files, can sometimes be set up without requiring any code at all by using a pre-existing SourceAdaptor. More often, running a custom data source will require you to write your own. This will be covered in Part 2.

Procedure

The lifecycle of a typical DAS features request is as follows:

  1. Client issues request.
  2. Core parses and checks request content.
  3. Core obtains the data source's SourceAdaptor object.
  4. Core passes the extracted query parameters to the SourceAdaptor object via the das_features method.
  5. SourceAdaptor handles basic logic/iteration and delegates to the build_features method (implemented in a subclass).
  6. SourceAdaptor subclass extracts the relevant data from storage and returns a uniform Perl data structure.
  7. SourceAdaptor constructs an XML response and passes it back to the core.
  8. Core sends the response back to the client.
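
The steps above can be mimicked with a toy model in plain shell. This is only a sketch of the division of labour, not ProServer code: the function names simply echo the roles in the list, the feature rows are invented, and the XML is a heavily simplified stand-in for a real DAS response. The request format (/das/SOURCE/features?segment=ID:START,END) is the one used by the DAS features command.

```shell
# Toy model of the request lifecycle. None of this is ProServer code.

# Step 2: the "core" parses /das/<source>/features?segment=<id>:<start>,<end>
parse_request() {
    path=${1%%\?*}                  # /das/mygff/features
    query=${1#*\?segment=}          # Y:1,100
    source_name=${path#/das/}
    source_name=${source_name%%/*}  # mygff
    segment=${query%%:*}            # Y
    range=${query#*:}               # 1,100
    printf '%s %s %s %s\n' "$source_name" "$segment" "${range%,*}" "${range#*,}"
}

# Step 6: the "subclass" returns plain rows (id, start, end) that overlap
# the query. The three data rows are invented for illustration.
build_features() {
    # $1=segment $2=start $3=end
    printf '%s\n' 'Y feat1 10 50' 'Y feat2 200 300' 'X feat3 20 40' |
    awk -v seg="$1" -v s="$2" -v e="$3" \
        '$1 == seg && $4 + 0 >= s + 0 && $3 + 0 <= e + 0 { print $2, $3, $4 }'
}

# Steps 5 and 7: the "base class" calls build_features and wraps each row in XML.
das_features() {
    printf '<SEGMENT id="%s" start="%s" stop="%s">\n' "$1" "$2" "$3"
    build_features "$1" "$2" "$3" | while read -r id fstart fend; do
        printf '  <FEATURE id="%s"><START>%s</START><END>%s</END></FEATURE>\n' \
            "$id" "$fstart" "$fend"
    done
    printf '</SEGMENT>\n'
}

# Steps 1 to 8 end to end: request string in, XML out.
handle_request() {
    set -- $(parse_request "$1")    # source, segment, start, end
    das_features "$2" "$3" "$4"
}

handle_request '/das/mygff/features?segment=Y:1,100'
```

Only build_features knows anything about the data; everything else is generic. That is the same split that lets ProServer reuse its core for every data source.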

Downloading and Building ProServer

The best way to get ProServer is via the Subversion repository. The trunk typically contains the latest stable version, so it includes the latest bugfixes. To download it to your home directory, open a terminal and type the following:

cd ~
svn checkout http://proserver.svn.sf.net/svnroot/proserver/trunk Bio-Das-ProServer

When the download is complete, enter the Bio-Das-ProServer directory that was created:

cd Bio-Das-ProServer

Take a moment to read the installation section of the README file. Proceed to build ProServer as per the instructions:

perl Build.PL
./Build

Running ProServer

The distribution contains a Perl script called proserver, located in the eg directory, that you should use to run ProServer. During development, run this script with the -x option, which prevents the process from forking and directs log output to your terminal rather than to a file. Try running the script in your terminal:

eg/proserver -x

If all is well, the server will start and output some information about its (default) configuration. If not, you should be able to diagnose the problem. Commonly errors arise from:

Exploring the Examples

With ProServer still running, point your browser to the URL and port where it is listening:

http://localhost:9000

You should see the ProServer homepage. From this page you can click the "SOURCES" link. This will execute the DAS sources command, via the URL:

http://localhost:9000/das/sources

You should see a table listing the DAS sources being hosted from your server. There are two examples configured by default: "mysimple" and "mygff". The sources command provides some metadata about each source that is useful for client software. In particular, the "coordinates" and "capabilities" properties help a client to know whether a source contains data that is relevant to it.

The sources command, like all DAS commands, produces XML output. The browser uses an XSL stylesheet to convert this XML into the coloured, human-readable page you see. To see the raw DAS XML, use the "view source" option of your browser. (NOTE: If you see no output at all at this point, make sure you are running the server from the Bio-Das-ProServer directory so that ProServer can find its XSL stylesheets.)

Take some time to explore the capabilities of the example DAS sources:

mygff

mysimple
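
If you prefer to explore from the terminal, it helps to know how DAS command URLs are put together. The helper below is a small sketch: the das_url name and the DAS_BASE variable are made up for this example, the base URL is the default from this tutorial, and the ?segment=ID:START,END syntax is taken from the DAS protocol rather than from ProServer itself.

```shell
# Build a DAS command URL of the form <base>/<source>/<command>[?segment=...].
# das_url and DAS_BASE are hypothetical names used only in this sketch.
das_url() {
    # $1=source name, $2=DAS command, $3=optional segment such as Y:1,100
    base=${DAS_BASE:-http://localhost:9000/das}
    if [ -n "${3:-}" ]; then
        printf '%s/%s/%s?segment=%s\n' "$base" "$1" "$2" "$3"
    else
        printf '%s/%s/%s\n' "$base" "$1" "$2"
    fi
}

# The segment here is the test range listed for mygff in the default configuration.
das_url mygff features Y:1,100
```

With the server running, curl "$(das_url mygff features Y:1,100)" should fetch the raw XML directly, without the XSL transformation a browser applies.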

Modifying the Examples

ProServer configures itself from an INI file, which you can specify using the '-c' command-line option. This file defines settings such as the port number the server should listen on, the root directory in which to look for static content, and details of the DAS sources it is serving. In Part 2 you will write your own INI file, but for now take a quick look at the default one. It is located at eg/proserver.ini.

Each section of the INI file begins with a name in square brackets. Server options such as the port number are in the [general] section. Every other section is treated as a DAS source hosted by the server, each representing an individual source of data.
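
Because every section other than [general] is a DAS source, you can list the sources a configuration defines with a couple of standard tools. The sketch below runs against a cut-down, made-up INI file; the list_sources name, the "port" key and the properties shown for each source are assumptions for illustration, not a copy of eg/proserver.ini.

```shell
# List the DAS sources in a ProServer-style INI file:
# every [section] name except [general].
list_sources() {
    sed -n 's/^\[\(.*\)\][[:space:]]*$/\1/p' "$1" | grep -v '^general$'
}

# Demo on an invented, cut-down configuration.
ini=$(mktemp)
cat > "$ini" <<'EOF'
[general]
port = 9000

[mysimple]
state = on

[mygff]
adaptor = file
state   = on
EOF

list_sources "$ini"   # prints mysimple, then mygff
rm -f "$ini"
```

Running list_sources eg/proserver.ini from the Bio-Das-ProServer directory should give you the same names that the sources command reports, at least for sources whose state is on.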

Find the [mygff] section in the INI file to see how the "mygff" DAS source is configured.

[mygff]
adaptor       = file
state         = on
description   = An example source using a GFF file
doc_href      = http://another.homepage.com
; Properties for the 'file' SourceAdaptor to allow it to read GFF2 files
filename      = eg/data/example.gff
cols          = segment,method,type,start,end,score,ori,phase,note,note
feature_query = field0 lceq %segment AND field4 >= %start AND field3 <= %end
comment       = ^#
separator     = \t|\s*;\s*
; Coordinate system and test range:
coordinates   = NCBIM_37,Chromosome,Mus musculus -> Y:1,100

Do not worry about the specifics of each property, though hopefully you will have a vague idea what they do. This exercise is merely to familiarise you with where some of the metadata comes from. From here you can see that the actual data is coming from a file, eg/data/example.gff - so have a look inside using a text editor.
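
The feature_query property is worth a second look, because it is what turns a client's segment and range into a selection of lines from the file. Reading it against the cols property, field0 is the segment column, field3 the start and field4 the end, so the query keeps features that overlap the requested range. The awk sketch below applies the same test to a tiny invented file. It is an approximation only: it assumes lceq means a case-insensitive comparison, and it splits on tabs alone rather than on the full separator pattern.

```shell
# Print GFF lines on a segment that overlap the range start..end.
# Mirrors: field0 lceq %segment AND field4 >= %start AND field3 <= %end
# (awk fields are 1-based, so field0 is $1, field3 is $4, field4 is $5).
gff_overlaps() {
    # $1=GFF file, $2=segment, $3=start, $4=end
    awk -F '\t' -v seg="$2" -v s="$3" -v e="$4" '
        /^#/ { next }    # skip comment lines, like "comment = ^#"
        tolower($1) == tolower(seg) && $5 + 0 >= s + 0 && $4 + 0 <= e + 0
    ' "$1"
}

# Demo data: invented rows with segment, method, type, start, end.
gff=$(mktemp)
printf '%s\n' '# invented example rows' > "$gff"
printf 'Y\tcurated\tgene\t10\t50\n'   >> "$gff"
printf 'y\tcurated\tgene\t90\t150\n'  >> "$gff"
printf 'Y\tcurated\tgene\t200\t300\n' >> "$gff"
printf 'X\tcurated\tgene\t20\t40\n'   >> "$gff"

gff_overlaps "$gff" Y 1 100   # prints the first two feature rows
rm -f "$gff"
```

Note that the second row is returned even though it ends at 150: a feature only has to overlap the requested range, not lie wholly inside it.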

Now download another GFF file and save it:

curl 'ftp://ftp.sanger.ac.uk/pub/wormbase/live_release/genomes/c_elegans/genome_feature_tables/GFF2/CHROMOSOME_MtDNA.gff' > CHROMOSOME_MtDNA.gff

[Alternative download: CHROMOSOME_MtDNA.gff]

Edit the mygff source section in proserver.ini to point to this new file and update the coordinates accordingly:

filename      = CHROMOSOME_MtDNA.gff
coordinates   = WS_200,Chromosome,Caenorhabditis elegans -> CHROMOSOME_MtDNA:1,100
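
The coordinates value packs two things into one line: a coordinate system (three comma-separated parts) and, after the arrow, a test range on one segment. The sketch below pulls the value apart so you can see which piece is which. The parse_coordinates name and the labels it prints are invented for this example; the layout is inferred from the two coordinates lines shown in this tutorial rather than from ProServer's documentation.

```shell
# Split "NAME,TYPE,SPECIES -> SEGMENT:START,END" into labelled parts.
# The labels are this sketch's own, not official ProServer terms.
parse_coordinates() {
    coord_sys="${1%% -> *}"        # WS_200,Chromosome,Caenorhabditis elegans
    test_range="${1##* -> }"       # CHROMOSOME_MtDNA:1,100
    name="${coord_sys%%,*}"        # WS_200
    rest="${coord_sys#*,}"
    coord_type="${rest%%,*}"       # Chromosome
    species="${rest#*,}"           # Caenorhabditis elegans
    printf 'name=%s\ntype=%s\nspecies=%s\ntest_segment=%s\ntest_range=%s\n' \
        "$name" "$coord_type" "$species" "${test_range%%:*}" "${test_range#*:}"
}

parse_coordinates 'WS_200,Chromosome,Caenorhabditis elegans -> CHROMOSOME_MtDNA:1,100'
```

The test segment is the part most likely to need changing when you swap data files, since it has to name a segment that actually appears in the new file.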

Stop and re-start your server (Ctrl-C to stop your interactive session on the terminal) and take another look at your modified DAS source. If it doesn't work, keep an eye out for errors in the terminal output.
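
If the modified source returns no features, one thing worth checking is whether the segment name you are querying matches the first column of the new file. The two helpers below are a minimal sketch for that check; gff_segments and has_segment are names made up here, and the demo file is invented. The comparison ignores case, on the assumption that the lceq operator in feature_query does the same.

```shell
# List the distinct segment names (column 1) of a tab-separated GFF file.
gff_segments() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Succeed if the named segment occurs in the file, ignoring case.
has_segment() {
    gff_segments "$1" | grep -qixF -- "$2"
}

# Demo on a tiny invented file.
gff=$(mktemp)
printf '# invented rows\n'                        > "$gff"
printf 'CHROMOSOME_MtDNA\tcurated\tgene\t1\t50\n'  >> "$gff"
printf 'CHROMOSOME_MtDNA\tcurated\texon\t1\t20\n'  >> "$gff"

gff_segments "$gff"    # prints CHROMOSOME_MtDNA
has_segment "$gff" chromosome_mtdna && echo "segment found"
rm -f "$gff"
```

Against the downloaded file, gff_segments CHROMOSOME_MtDNA.gff shows the names you can use in the test range and in your segment queries.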