# $Id: INSTALL,v 1.16 2002/10/11 09:48:42 birney Exp $

                 Bioperl Install Directions

o System requirements

   - Tested on all common forms of Unix, Win9X/NT/2000, and Mac OS X
     (see the PLATFORMS file for more details)

   - perl 5.005 or later *.

   - ANSI C or Gnu C compiler for optional extra XS extensions

   - External modules: Bioperl uses functionality provided in other
     Perl modules. Some of these are included in the standard perl
     install and some need to be obtained from the CPAN (www.cpan.org)
     site. The list of external modules is included at the bottom of
     this INSTALL document and in the bptutorial.pl tutorial.

     The Bioperl Bundle (Bundle::BioPerl), available through CPAN,
     makes this process easier: install the bundle in your CPAN shell
     and all necessary modules will be installed.

   - Bioperl on ActiveState for Windows

     An ActiveState PPM distribution is available for Bioperl. Issue
     the following commands in your PPM shell to install Bioperl.

        PPM>repository add Bioperl http://bioperl.org/DIST/

     Get the number of the Bioperl repository:

        PPM>repository

     Set the Bioperl repository, find Bioperl, install Bioperl:

        PPM>repository set
        PPM>search *
        PPM>install

     As of this writing the Bundle::BioPerl bundle is not found in any
     ActiveState package, but the following command should work in
     Command Prompt:

        >perl -MCPAN -e "install Bundle::BioPerl"

     Do not use a Unix make to install Perl modules this way; use
     nmake instead. See "DEPENDENCIES AND Bundle::BioPerl" below for
     more information on this useful bundle.

   - Additional information on using Bioperl with MacOS can be found
     at http://bioperl.org/Core/mac-bioperl.html

   - Additional information on using Bioperl with OS X can be found
     at http://www.tc.umn.edu/~cann0010/Bioperl_OSX_install.html

   - External programs

     Bioperl can interface with some external programs for executing
     analyses.
     These include clustalw and t_coffee for multiple sequence
     alignments (Bio::Tools::Run::Alignment::Clustalw and
     Bio::Tools::Run::Alignment::TCoffee); blastall, blastpgp, and
     bl2seq for BLAST analyses (Bio::Tools::Run::StandAloneBlast); and
     all the programs in the EMBOSS suite (Bio::Factory::EMBOSS).
     Additionally, Bioperl can submit and retrieve jobs from the NCBI
     Blast queue via the Bio::Tools::Run::RemoteBlast module. See the
     documentation for these modules for more information.

   - Environment Variables [optional]

     Some modules which run external programs need certain environment
     variables set. If you do not have a local copy of the specific
     executable you do not need to set these variables. Additionally,
     the modules will attempt to locate the specific applications in
     your runtime PATH variable.

     Setting environment variables on unix means adding a line like
     the following to your shell initialization:

        (for bash,sh)   export BLASTDIR=/data1/blast
        (for csh,tcsh)  setenv BLASTDIR /data1/blast

     The environment variables include:

     Bio::Tools::Run::StandAloneBlast
        o BLASTDIR - specifies where the NCBI blastall, blastpgp,
          bl2seq, etc. are located. A 'data' directory is assumed to
          be present in this directory as well; it holds the blastable
          databases and the substitution matrices.
        o BLASTDATADIR or BLASTDB - (either is optional) if you do not
          want to keep the data directory inside the directory that
          BLASTDIR points to, set BLASTDATADIR or BLASTDB to the
          directory where the BLAST database indexes are located.

     Bio::Tools::Run::Alignment::Clustalw
        o CLUSTALDIR - points to the directory where the clustalw
          executable is located.

     Bio::Tools::Run::Alignment::TCoffee
        o TCOFFEEDIR - points to the directory where the t_coffee
          executable is located.

 * Note that most modules will work with earlier versions of Perl.
   The only ones that will not are Bio::SimpleAlign.pm and the
   Bio::Index::* modules.
   If you don't need these modules and you want to install bioperl
   using an earlier version of Perl, edit the "require 5.004;" line in
   Makefile.PL as necessary.

o Installing Bioperl

 THE EASY WAY

   The Bioperl modules are distributed as a tar file in standard perl
   CPAN distribution form, so installation is simple. Unpacking the
   tar distribution creates a directory called bioperl-xx/, which is
   where this file is. Move into that directory (you may well already
   be in the right place!) and issue the following commands:

     perl Makefile.PL   # makes a system-specific makefile
     make               # makes the distribution
     make test          # runs the test code
     make install       # [may need root access for system install.
                        # See below for how to get around this.]

   This should build, test and install the distribution cleanly on
   your system. This installs the main perl part of bioperl, which is
   the majority of the bioperl modules. One module (Bio::Tools::pSW)
   needs a compiled extension and therefore an extra installation
   step. The directions for installing it are given below - it is
   almost as easy as installing the standard distribution, so don't
   worry!

   You may see some errors from the pod2man part of the installation,
   such as

     /usr/bin/pod2man: Unrecognized pod directive in paragraph 168 of Bio/Tools/Blast.pm: head3

   You don't need to worry about them: they do not affect the
   documentation processing.

   To install you need write permission in the perl5/site_perl/ source
   area. Quite often this will require you (or someone else) to become
   root, so talk to your systems manager if you don't have the
   necessary access.

   It is possible to install the package outside of the standard Perl5
   location. See below for details.

   To install the compiled extension for pSW you will need to read the
   next section of the manual.
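
   The write-permission requirement described above can be checked
   before running 'make install'. The following is a minimal sketch,
   not part of the Bioperl distribution: the function name
   can_install_to is hypothetical, and the script assumes a POSIX
   shell and that the perl in your PATH is the one you will install
   into. It asks perl for its site library directory through the
   standard Config module and reports whether you can write there.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Bioperl): report whether the
# given directory exists and is writable by the current user.
can_install_to() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Ask perl where site modules are installed, then check that directory.
if command -v perl >/dev/null 2>&1; then
    site_lib=$(perl -MConfig -e 'print $Config{installsitelib}')
    echo "site library: $site_lib ($(can_install_to "$site_lib"))"
else
    echo "perl not found in PATH"
fi
```

   If the directory is reported as not writable, either run
   'make install' as root or install into a private area as described
   under INSTALLING BIOPERL IN A PERSONAL OR PRIVATE MODULE AREA.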

 INSTALLING THE OPTIONAL COMPILED EXTENSIONS (bioperl-ext package)

   This installation works out of the box on all platforms except
   *BSD and Solaris boxes. For notes on these, read on. For other
   platforms, skip ahead to the next section, BUILDING THE COMPILED
   EXTENSIONS.

 INSTALLING for *BSD and Solaris boxes

   Add -fPIC to the CFLAGS line in Compile/SW/libs/makefile. This
   makes the compiler generate position-independent code, which is
   required for these architectures. In addition, on some Solaris
   boxes the generated Makefile does not use the correct -fPIC/-fpic
   flag for the C compiler that is used. This requires manually
   editing the generated Makefile to switch case. Try the build once,
   and if you get errors, try editing the -fpic line.

 BUILDING THE COMPILED EXTENSIONS (OPTIONAL, ONLY FOR RUNNING LOCAL
 PROTEIN SMITH-WATERMAN ANALYSES FROM PERL)

   Move to the directory bioperl-ext, which holds the C code and XS
   extension for the bp_sw module. It is available as a separate
   package released from ftp://bioperl.org/pub/DIST. Then execute
   these commands (possibly after making the change for *BSD and
   Solaris, as detailed above):

     perl Makefile.PL   # makes the system specific makefile
                        # Solaris/BSD users might need to edit the Makefile here
     make               # builds all the libraries
     make test          # runs a short test
     make install       # installs the package correctly.

   This should install the compiled extension. The Bio::Tools::pSW
   module will work cleanly now.

 INSTALLING BIOPERL IN A PERSONAL OR PRIVATE MODULE AREA

   If you lack permission to install perl modules into the standard
   site_perl/ system area you can configure bioperl to install itself
   anywhere you choose. Ideally this would be a personal perl
   directory or standard place where you plan to put all your 'local'
   or personal perl modules.

   Note: you _must_ have write permission to this area.

   Simply pass a parameter to perl as it builds your system specific
   makefile.

   Example:

     perl Makefile.PL PREFIX=/home/dag/My_Local_Perl_Modules
     make
     make test
     make install

   This will cause perl to install the bioperl modules in:

     /home/dag/My_Local_Perl_Modules/lib/perl5/site_perl/

   And the bioperl man pages will go in:

     /home/dag/My_Local_Perl_Modules/lib/perl5/man/

   To specify a directory besides lib/perl5/site_perl, or if there are
   still permission problems, include an INSTALLSITELIB directive
   along with the PREFIX:

     perl Makefile.PL PREFIX=/home/dag/perl INSTALLSITELIB=/home/dag/perl/lib

   See below for how to use modules that are not installed in the
   standard Perl5 location.

 THE HARD WAY :)

 INSTALLING BIOPERL MODULES: LAST RESORT

   As a last resort, you can simply copy all files in Bio/ to any
   directory in which you have write privileges. This is generally NOT
   recommended since some modules may require special configuration
   (currently none do, but don't rely on this).

   You will need to set "use lib '/path/to/my/bioperl/modules';" in
   your perl scripts so that you can access these modules if they are
   not installed in the standard site_perl/ location. See below for an
   example.

   To get manpage documentation to work correctly you will have to
   configure man so that it looks in the proper directory. On most
   systems this will just involve adding an additional directory to
   your $MANPATH environment variable.

   The installation of the Compile directory can be similarly
   redirected, but execute the make commands from the Compile/SW
   directory.

   If all else fails, or you are unable to access the perl
   distribution directories, ask your system administrator to place
   the files there for you. You can always execute perl scripts in the
   same directory as the location of the modules (Bio/ in the
   distribution) since perl always checks the current working
   directory when looking for modules.

 USING MODULES NOT INSTALLED IN THE STANDARD PERL LOCATION

   You can explicitly tell perl where to look for modules by using the
   lib module which comes standard with perl.

   Example:

     #!/usr/bin/perl

     use lib "/home/users/dag/My_Local_Perl_Modules/";
     use Bio::Seq;

     <...insert whizzy perl code here...>

 THE TEST SYSTEM

   The Bioperl test system is located in the t/ directory and is
   automatically run whenever you execute the 'make test' command.
   Alternatively, if you want to investigate the behavior of a
   specific test such as the SeqIO test, you would type:

     % perl -I. -w t/SeqIO.t

   The -I. tells Perl to use the current directory as the include
   path; this makes sure you are testing the modules in this
   directory, not ones installed elsewhere in your PERL5LIB path. The
   -w tells Perl to print all warnings.

   If you are trying to learn how to use a module, the test suite is
   often a good place to look. All good extreme programmers try to
   write a test BEFORE they write the module to ensure that their
   module behaves the way they expect. You'll notice some 'ok' and
   'skip' commands in a test. These are part of the Perl test suite: a
   passed test is reported as "ok N", where N is the test number, and
   'skip' tells Perl to skip tests. Skipping is useful when, for
   example, your test detects that the network is not present and thus
   should skip, not fail, any tests that require a network connection.

 DEPENDENCIES AND Bundle::BioPerl

   The following packages are used by Bioperl. Not all are required
   for Bioperl to operate properly; however, some functionality will
   be missing without them. You can easily install all of these,
   except srsperl.pm, using the Bundle::BioPerl CPAN bundle. A
   command-line example:

     >perl -MCPAN -e shell
     cpan>install Bundle::BioPerl
     <...installation details...>
     cpan>quit

   or more simply:

     >perl -MCPAN -e "install Bundle::BioPerl"

Module                   Where it is Used
------------------------------------------------------------------
HTTP::Request::Common    GenBank+GenPept sequence retrieval, remote http
                         Blast jobs.
                         Bio::DB::*,Bio::Tools::Run::RemoteBlast

LWP::UserAgent           GenBank+GenPept sequence retrieval, remote http
                         Blast jobs
                         Bio::DB::*,Bio::Tools::Run::RemoteBlast

Ace                      comes from AcePerl
                         Access to ACeDB databases
                         Bio::DB::Ace

IO::Scalar               IO handle to read or write to a scalar/remote
                         http Blast jobs
                         Bio::Tools::Blast::Run::Remote

IO::String               IO handle to read or write to a string.
                         GenBank+GenPept sequence retrieval, Variation code,
                         Bio::DB::*,Bio::Variation::*, Bio::Index::Blast

XML::Parser              Parsing of XML documents
                         Bio::Variation code, GAME parser, SearchIO
                         Bio::SeqIO::game,Bio::Variation::*,
                         Bio::SearchIO::blastxml,
                         Bio::Biblio::IO::medlinexml

XML::Writer              Parsing + writing of XML documents
                         Bio::Variation code, GAME parser
                         Bio::SeqIO::game,Bio::Variation::*

XML::Parser::PerlSAX     Parsing of XML documents
                         Bio::Variation code, GAME parser, SearchIO
                         Bio::SeqIO::game,Bio::Variation::*,
                         Bio::SearchIO::blastxml,
                         Bio::Biblio::IO::medlinexml

XML::Twig                Parsing of XML documents
                         Module Bio::Variation::IO::xml

File::Temp               Temporary File creation
                         Bio::DB::WebDBSeqI, Bio::Seq::LargePrimarySeq
                         All analysis running

SOAP::Lite               SOAP protocol
                         XEMBL Services, Bibliographic queries in Biblio::
                         Bio::DB::XEMBLService

HTML::Parser             HTML parsing of GDB page
                         Bio::DB::GDB

DBD::mysql               MySQL driver
                         loading and querying of MySQL-based GFF
                         feature databases
                         Bio::DB::GFF

GD                       GD graphical drawing library
                         Bio::Graphics
                         Requires GD library from www.boutell.com/gd

srsperl.pm               Sequence Retrieval System (SRS)
                         alternative way of retrieving sequences
                         Bio::LiveSeq::IO::SRS.pm
                         See README in Bio/LiveSeq/IO

Storable                 Persistent object storage & retrieval
                         Bio::DB::FileCache

Parse::RecDescent        Parsing of Unigene files

Text::Shellwords         Text parser
                         Bio::Graphics::FeatureFile.pm

XML::DOM                 XML parser
                         Bio::SeqIO::bsml.pm

DB_File                  Perl access to Berkeley DB
                         Bio::DB::Flat
                         Bio::SeqFeature::Collection
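
   To see which of the modules listed above are already available, you
   can ask perl to load each one in turn. The following is a minimal
   sketch, not part of the Bioperl distribution: the function name
   check_module is hypothetical, and the script assumes a POSIX shell
   with perl in your PATH. A module is reported as "missing" if perl
   cannot load it for any reason, including perl itself not being
   found. srsperl.pm is left out because it is not a CPAN module.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Bioperl): try to load one Perl
# module and print "ok" or "missing" followed by the module name.
check_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        printf '%-8s%s\n' ok "$1"
    else
        printf '%-8s%s\n' missing "$1"
    fi
}

# Module names taken from the dependency table above.
for module in HTTP::Request::Common LWP::UserAgent Ace IO::Scalar \
              IO::String XML::Parser XML::Writer XML::Parser::PerlSAX \
              XML::Twig File::Temp SOAP::Lite HTML::Parser DBD::mysql \
              GD Storable Parse::RecDescent Text::Shellwords XML::DOM \
              DB_File; do
    check_module "$module"
done
```

   Modules reported as missing can then be installed individually from
   CPAN, or all at once with Bundle::BioPerl as described above.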