Building a Synchronous SADI Service With Perl and Generated OWL Modules (Advanced ... but simpler!)

This document summarises the steps needed to develop and implement a synchronous SADI web service using SADISeS (SADI Service Support). The service described here is about as complicated as your average service. With SADI, the hardest part of service implementation is producing the OWL output that your service advertises. To remove some of the complexity around OWL, we will use Perl modules that were automatically generated by the script sadi-generate-datatypes.pl.

The main thing to understand is that SADISeS does not give you a full implementation of your service. You still need to program the business logic (e.g. to extract data from your database) - but you do not need to worry about the SADI or HTTP protocol details.

At the end of this tutorial, you should have 3 things: a SADI service definition file, a Perl module containing your business logic, and a Perl CGI entry script for your service.

The service definition file is a properties-based file that contains the information needed to describe what a SADI service will do. In most cases, this file will be located in the ~/Perl-SADI/definitions directory* and can be manually edited to reflect what your SADI service does.

The SADISeS-generated Perl module contains a place for you to insert the business logic of your service. In most cases, this file will be located in the ~/Perl-SADI/services/Service/ directory*.

Finally, the Perl CGI entry script provides the interface through which a user interacts with your SADI service. This script is also generated by SADISeS and is located in the ~/Perl-SADI/cgi directory*.

*For those of you developing your SADI services using MS Windows, the ~ refers to your home directory. On Windows, this is usually C:\Users\Your_Name.

These 3 things are all you need to create a single SADI web service and SADISeS helps you do just that without worrying too much about the SADI protocol.

Let's now move on towards building our SADI service. In step 1 below, we will make sure that you have all of the dependencies in order.

Table of Contents

Step 1: What is needed
Step 2: Service definition generation
Step 3: Service generation
Step 4: OWL to Perl module generation
Step 5: Service implementation
Step 6: Service testing
Step 7: Service deployment
Step 8: Service testing using HTTP
Step 9: Service registration

Step 1: What is needed

To implement SADI services using SADISeS, you need to have the following installed on your machine:

      1. Perl - Perl has to be installed on your machine
      2. A web server - this document assumes that you are using Apache2
      3. Perl SADI - available on CPAN
      4. ODO - also available on CPAN

To implement this particular service, you will also need:

      1. LWP::Simple - available on CPAN or from your favorite Perl package manager (bundled with many Perl distributions)
      2. XML::LibXML - also available on CPAN or from your favorite Perl package manager

Once you have installed Perl SADI and all of its dependencies (such as ODO) on your machine, you will have to run through the SADISeS setup. This is only done once per user of SADISeS (unless you are upgrading Perl-SADISeS). For more information, please view the SADI::SADI documentation.

Step 2: Service definition generation

Before we can generate any code, we need to tell SADISeS a little bit about our service. This is done via a definitions file.

The service that we are going to implement in this document takes a UniProt record, looks up the GO records associated with it, and annotates the UniProt record with them.

To generate a definition file for your service, issue the following command at the command prompt:

$ sadi-generate-services.pl -D Uniprot2GoAuto

Basically, this tells SADISeS that we would like to generate a definition file for the service 'Uniprot2GoAuto'. The generated file can be found in ~/Perl-SADI/definitions/Uniprot2GoAuto.

If you open the generated definitions file, you will see something like the following:

# leave the following line as is!
ServiceName = Uniprot2GoAuto

# modify the values below as you see fit.
ServiceType = http://someontology.org/services/sometype
InputClass = http://someontology.org/datatypes\#Input1
OutputClass = http://someontology.org/datatypes\#Output1
Description = A implementation of the 'Uniprot2GoAuto' service
UniqueIdentifier = urn:lsid:myservices:Uniprot2GoAuto
Authority = authority.for.Uniprot2GoAuto
Authoritative = 1
Provider = myaddress@organization.org
ServiceURI = http://localhost/cgi-bin/Uniprot2GoAuto
URL = http://localhost/cgi-bin/Uniprot2GoAuto
SignatureURL = http://localhost/cgi-bin/Uniprot2GoAuto

SADISeS has done a nice job in preparing this file for us!

Beware: if a value contains the characters # or =, you must escape them with a backslash (\), as in the generated InputClass and OutputClass values above.

We will have to edit it slightly, to ensure that we specify our actual inputs/outputs. For now, we will leave the URL as is.

Set the InputClass to http://purl.oclc.org/SADI/LSRN/UniProt_Record and the OutputClass to http://purl.oclc.org/SADI/LSRN/AnnotatedUniProt_Record, then save and close the file.

The edited file is shown below for clarity:

# leave the following line as is!
ServiceName = Uniprot2GoAuto

# modify the values below as you see fit.
ServiceType = http://someontology.org/services/sometype
InputClass = http://purl.oclc.org/SADI/LSRN/UniProt_Record
OutputClass = http://purl.oclc.org/SADI/LSRN/AnnotatedUniProt_Record
Description = A implementation of the 'Uniprot2GoAuto' service
UniqueIdentifier = urn:lsid:myservices:Uniprot2GoAuto
Authority = authority.for.Uniprot2GoAuto
Authoritative = 1
Provider = myaddress@organization.org
ServiceURI = http://localhost/cgi-bin/Uniprot2GoAuto
URL = http://localhost/cgi-bin/Uniprot2GoAuto
SignatureURL = http://localhost/cgi-bin/Uniprot2GoAuto

Step 3: Service generation

Now that we have a definitions file, the next step in building a SADI service is to generate the actual service code!

To generate our service skeleton, issue the following command at the command prompt:

$ sadi-generate-services.pl Uniprot2GoAuto

SADISeS will then go ahead and generate 2 things for you: an entry script for your service (located in the ~/Perl-SADI/cgi/ directory) and a service implementation file (located in the ~/Perl-SADI/services/Service/ directory).

Your entry script will be called Uniprot2GoAuto. The implementation file is called Uniprot2GoAuto.pm. Go ahead and look at both files. Soon, we will be editing Uniprot2GoAuto.pm.

Step 4: OWL to Perl module generation

To use the OWL to Perl module feature of SADISeS, we need the sadi-generate-datatypes.pl script that is packaged with SADISeS. This script takes either a filename or a URL of an OWL file and produces Perl modules that you can use in your SADI services to populate your service's output data.

To generate Perl modules based on an OWL file, issue the following command at the command prompt:

$ sadi-generate-datatypes.pl -u http://sadiframework.org/examples/uniprot2go.owl

Basically, this tells SADISeS that we would like to generate Perl modules based on the OWL file located at http://sadiframework.org/examples/uniprot2go.owl. The generated modules can be found in the ~/Perl-SADI/generated/ folder.

However, the uniprot2go.owl file is slightly incomplete (or our OWL parser is a bit lazy), so we also need to generate datatypes for the ontology located at http://purl.oclc.org/SADI/LSRN/ as well:

$ sadi-generate-datatypes.pl -u http://purl.oclc.org/SADI/LSRN/

We have now generated Perl modules for the OWL classes defined in both files, and we can continue on to implementing our service!

Step 5: Service implementation

Now that we are ready to implement the business logic, we will have to find, open and edit the module Uniprot2GoAuto.pm (look in the folder ~/Perl-SADI/services/Service/).

SADISeS automatically created this file for you and left just the subroutine process_it for you to code your implementation. Fortunately, SADISeS provides some sample code for you to see how some operations are done!

One of the very first things that you will see in process_it is the line:

 foreach my $input (@inputs) { ...

Basically, our service is iterating over any and all inputs received that are of class InputClass. It is then up to us to use that data in our business logic and output the result as class OutputClass.

The input, $input, is of type RDF::Core::Resource.

Our business logic will be placed after the line:

# do something with $input ... (sorry, can't help with that)

Our business logic:

Before doing anything, we need to add the following use statements:

# modules we use to get record and parse it
use LWP::Simple qw(!head);
use XML::LibXML;

# our generated owl stuff
use sadiframework::org::examples::uniprot2go::AnnotatedUniProtRecord;
use sadiframework::org::ontologies::predicates::hasGOTerm;
use purl::oclc::org::SADI::LSRN::GO_Record;
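
The package names above appear to mirror the URI of each OWL class: the scheme is dropped, the .owl suffix is removed, and the remaining parts are joined with ::. The sketch below reproduces that pattern for the three modules used in this tutorial. It is only an illustration inferred from these names, not the algorithm that sadi-generate-datatypes.pl actually uses, and package_for_uri is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the generated package name for an OWL class URI.
# Inferred from the module names in this tutorial, not taken from SADISeS.
sub package_for_uri {
    my ($uri) = @_;
    $uri =~ s{^https?://}{};          # drop the scheme
    $uri =~ s{\.owl(?=[#/]|$)}{};     # drop a ".owl" file suffix
    # split on path, fragment and host separators, ignoring empty pieces
    my @parts = grep { length } split m{[/#.]}, $uri;
    return join '::', @parts;
}

print package_for_uri($_), "\n" for (
    'http://sadiframework.org/examples/uniprot2go.owl#AnnotatedUniProtRecord',
    'http://sadiframework.org/ontologies/predicates.owl#hasGOTerm',
    'http://purl.oclc.org/SADI/LSRN/GO_Record',
);
```

If a generated module cannot be found, browsing the ~/Perl-SADI/generated/ folder remains the reliable way to confirm its real name.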

Reading the inputs:

# read the record id
my $record_id = $input->getURI;
# keep just the trailing identifier (e.g. P12345)
$record_id = $1 if $record_id =~ m{/(\w+)$};
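
To see what the pattern match does in isolation, here is a minimal standalone sketch. It runs outside of SADISeS, and record_id_from_uri is a hypothetical helper name introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the identifier after the last "/" of a URI,
# or undef when the URI does not end in a word-like identifier.
sub record_id_from_uri {
    my ($uri) = @_;
    return $uri =~ m{/(\w+)$} ? $1 : undef;
}

# The input URI used later in this tutorial
print record_id_from_uri('http://purl.uniprot.org/uniprot/P12345'), "\n";   # prints P12345
```

Because \w only matches letters, digits and underscores, a URI ending in anything else (a trailing slash, for instance) yields no identifier.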

Now we get the associated UniProt record from UniProtKB, if it exists, and extract the GO record ids:

# now get the URL - we are using uniprot.org's restful methods to get the record
my $url = "http://www.uniprot.org/uniprot/$record_id.xml";
my $xml = get($url);
next unless defined $xml;

# xml was obtained, now use XPATH to extract GO ids
my $parser = XML::LibXML->new();
my $doc    = $parser->parse_string($xml);
my $xpath  = "//*[local-name() = 'dbReference'][\@type='GO']";
my $xpc    = XML::LibXML::XPathContext->new();
my $nodes  = $xpc->findnodes( $xpath, $doc->documentElement );

# before iterating over the nodes, let's create our AnnotatedUniProtRecord
my $record = sadiframework::org::examples::uniprot2go::AnnotatedUniProtRecord->new($input->getURI);

# now iterate over the found nodes
foreach my $context ( $nodes->get_nodelist ) {
   my $go = $context->getAttribute('id');
   # for each id, use the add_hasGOTerm() method to add the id
   $record->add_hasGOTerm(
      purl::oclc::org::SADI::LSRN::GO_Record->new("http://lsrn.org/$go")
   );
}
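
The XPath step can be tried on its own, without a network connection or SADISeS. The sketch below uses a small invented XML fragment as a stand-in for a real UniProt record (the real document is far larger, and the namespace URI here is only an assumption for illustration). It shows why local-name() is used: the elements sit in a default namespace, so a plain //dbReference would match nothing.

```perl
use strict;
use warnings;
use XML::LibXML;

# Invented stand-in for a UniProt XML record, for illustration only.
my $xml = <<'XML';
<uniprot xmlns="http://uniprot.org/uniprot">
  <entry>
    <dbReference type="GO" id="GO:0005759"/>
    <dbReference type="PDB" id="1ABC"/>
    <dbReference type="GO" id="GO:0004069"/>
  </entry>
</uniprot>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# local-name() matches the element name regardless of its namespace;
# the second predicate keeps only cross-references of type GO.
my @go_ids = map { $_->getAttribute('id') }
    $doc->documentElement->findnodes(q{//*[local-name() = 'dbReference'][@type='GO']});

print "$_\n" for @go_ids;
```

Only the two GO cross-references are printed; the PDB entry is filtered out by the @type predicate.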

Populating our output:

# fill in the output nodes
$core->addOutputData(
   node => $record,
);

Notice how we call addOutputData() each time we wish to add something to our output document. Save and close the file. We will test our service in the next section.

Step 6: Service testing

Now that we have implemented our service, we will test it to make sure that it works. The input that we will be using is shown below:

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:lsrn="http://purl.oclc.org/SADI/LSRN/"
   xmlns:uniprot2go="http://sadiframework.org/examples/uniprot2go.owl#">
  <lsrn:UniProt_Record rdf:about="http://purl.uniprot.org/uniprot/P12345"/>
</rdf:RDF>

Copy and save the input to a file; this document assumes that the file is named uni2go-input.xml.

Our SADI service can then be tested with the following command:

$ sadi-testing-service.pl Service::Uniprot2GoAuto uni2go-input.xml

The expected output for our service:

<rdf:RDF
xmlns:a="http://sadiframework.org/ontologies/predicates.owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
<rdf:Description rdf:about="http://purl.uniprot.org/uniprot/P12345">
<rdf:type rdf:resource="http://sadiframework.org/examples/uniprot2go.owl#AnnotatedUniProtRecord"/>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0005759">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0005886">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0004069">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0006869">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0006457">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
</rdf:Description>
</rdf:RDF>

If you have no errors, then proceed to step 7! If you have some errors, hopefully the SADISeS stack trace will help you pinpoint the problem!

Step 7: Service deployment

Deploying our SADI service is extremely straightforward!

The only thing you need to do is to tell your web server where the CGI script that we generated is located.

If you recall, our service's CGI script was called Uniprot2GoAuto (one of the files generated using sadi-generate-services.pl).

Make a symbolic link from the cgi-bin directory of your web server (e.g. on some Linux distributions using the Apache web server, the cgi-bin directory is /usr/lib/cgi-bin) to the generated CGI script.

For example:

cd /usr/lib/cgi-bin
sudo ln -s /home/ekawas/Perl-SADI/cgi/Uniprot2GoAuto .  

Every time that you generate a cgi service using Perl SADI, you will have to perform an operation similar to this one for the service that you created in order to deploy it.

Step 8: Service testing using HTTP

Now that the service has been deployed, you can test it using HTTP. The input that we will be using is shown below:

<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:lsrn="http://purl.oclc.org/SADI/LSRN/"
   xmlns:uniprot2go="http://sadiframework.org/examples/uniprot2go.owl#">
  <lsrn:UniProt_Record rdf:about="http://purl.uniprot.org/uniprot/P12345"/>
</rdf:RDF>

Copy and save the input to a file.

Assuming that you saved the file under the name uni2go-input.xml, our SADI service can be tested with the following command:

$ sadi-testing-service.pl -e http://localhost/cgi-bin/Uniprot2GoAuto uni2go-input.xml

Of course, you may need to change the URL http://localhost/cgi-bin/Uniprot2GoAuto to the actual address that you deployed the service to.

When we call the sadi-testing-service.pl script with the -e option, we tell it to call our service using HTTP POST. We must then provide the script with one required parameter and, optionally, a second:

      1. the url to the service
      2. an optional file containing the input to our service
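
For readers curious about what such an HTTP POST looks like, the sketch below builds an equivalent request by hand with HTTP::Request (installed alongside LWP). It is not how sadi-testing-service.pl is implemented; build_sadi_request is a hypothetical helper, and the application/rdf+xml content type is an assumption. The request is only built here; the lines that would send it are left commented out so the example runs without a deployed service.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

# Hypothetical helper: build (but do not send) a POST request with RDF/XML input.
sub build_sadi_request {
    my ( $endpoint, $rdf ) = @_;
    my $request = HTTP::Request->new( POST => $endpoint );
    $request->header( 'Content-Type' => 'application/rdf+xml' );   # assumed content type
    $request->content($rdf);
    return $request;
}

# The same input document used throughout this tutorial
my $rdf = <<'RDF';
<rdf:RDF
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:lsrn="http://purl.oclc.org/SADI/LSRN/">
  <lsrn:UniProt_Record rdf:about="http://purl.uniprot.org/uniprot/P12345"/>
</rdf:RDF>
RDF

my $request = build_sadi_request( 'http://localhost/cgi-bin/Uniprot2GoAuto', $rdf );
print $request->method, ' ', $request->uri, "\n";

# Once the service is deployed, send the request and show the reply:
# my $response = LWP::UserAgent->new( timeout => 30 )->request($request);
# print $response->status_line, "\n", $response->decoded_content, "\n";
```

Building the request separately from sending it makes it easy to inspect the headers and body before involving the web server.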

The expected output should be very similar to the output we saw above when we tested our service:

HTTP/1.1 200 OK
Connection: close
Date: Fri, 30 Oct 2009 20:11:39 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.3 with Suhosin-Patch
Vary: Accept-Encoding
Content-Type: text/xml; charset=ISO-8859-1
Client-Date: Fri, 30 Oct 2009 20:11:40 GMT
Client-Peer: 127.0.0.1:80
Client-Response-Num: 1
Client-Transfer-Encoding: chunked

<rdf:RDF
xmlns:a="http://sadiframework.org/ontologies/predicates.owl#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>
<rdf:Description rdf:about="http://purl.uniprot.org/uniprot/P12345">
<rdf:type rdf:resource="http://sadiframework.org/examples/uniprot2go.owl#AnnotatedUniProtRecord"/>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0005759">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0005886">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0004069">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0006869">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
<a:hasGOTerm>
<rdf:Description rdf:about="http://lsrn.org/GO:0006457">
<rdf:type rdf:resource="http://purl.oclc.org/SADI/LSRN/GO_Record"/>
</rdf:Description>
</a:hasGOTerm>
</rdf:Description>
</rdf:RDF>

To see what else the service testing script can do, run it without parameters or with the -h parameter.

Step 9: Service registration

Before we can register our service, we will need to open up the service definition file. Once this file is open, we need to verify a few things first!

First of all, we need to ensure that URL and SignatureURL both point to the remote HTTP address of our entry script Uniprot2GoAuto (during testing, this was http://localhost/cgi-bin/Uniprot2GoAuto).

Second of all, we ... actually, there is no second of all! We just need to make sure that when we enter the remote HTTP address of our entry script Uniprot2GoAuto in our web browser, we see some XML (the SADI service signature) in response.

To actually register our service, we need to open our browser to http://sadiframework.org/registry/ and enter our service URL into the textbox. Once we have done that, sadiframework.org will add our service to the list of services that it knows about!

That's all there is to constructing synchronous Perl SADI services!