

Circa is a search engine for your Web site, or for a list of sites. It indexes pages much as AltaVista does. It can read, parse and add every URL found in a page, provided the linked page is on the same server.

Circa is free software, released under the GNU license.

Try it!

Run a search on AlianWebServer, or try the advanced search.


  • Full-text indexing
  • Different weights for the title, keywords, description and the rest of the HTML page can be set in the configuration
  • Boolean query language support: or (default), and ("+"), not ("-"). Example: perl +faq -cgi returns documents that contain faq, possibly perl, and not cgi.
  • Supports the HTTP and FTP protocols
  • Stores the index in MySQL
  • Perl or PHP client
  • Reads HTML and plain text
  • Can index a filesystem directly, without going through the Web server
  • Can browse a site by directory / category.
  • Several kinds of indexing: full, incremental, or restricted to a particular server. Documents that have not been updated are not reindexed. Every request for a file starts with an HTTP HEAD request, to obtain information such as validity, last update, size, etc.
  • The size of documents read can be restricted (e.g. skip all documents > 5 MB). Useful with low-bandwidth connections, or on computers that do not have much memory.
  • The HTML template can easily be customized to your needs.
  • Search by different criteria: news, last modified date, language, URL / site.
  • Admin functions are available through a browser interface or the command line.
  • Full support of the robots exclusion standard (robots.txt). Identifies itself as CircaIndexer/0.1, with a mail address.
  • Requests to the same server are delayed by 8 seconds. "It's not a bug, it's a feature!" This is a basic rule for limiting HTTP server load.
  • Indexes the different links found in a CGI (everything after name_of_file?)
  • Supports HTTP proxies
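
The boolean query rules in the list above can be illustrated with a short sketch. It is written in Python purely for illustration and is not Circa's own Perl code. The function names `parse_query` and `matches` are hypothetical, and a document is assumed to be a plain set of lowercase words. A "+" or "-" may be attached to its word or stand just before it.

```python
def parse_query(query):
    """Split a query into (optional, required, excluded) word lists.

    Hypothetical helper: a word is optional by default ("or"),
    required when marked with "+", excluded when marked with "-".
    """
    groups = {"": [], "+": [], "-": []}
    operator = ""
    for token in query.lower().split():
        if token[0] in "+-":
            # The operator may be glued to the word ("+faq") or alone ("+ faq").
            operator, token = token[0], token[1:]
        if token:
            groups[operator].append(token)
            operator = ""
    return groups[""], groups["+"], groups["-"]


def matches(document_words, query):
    """Return True if a set of lowercase words satisfies the query."""
    optional, required, excluded = parse_query(query)
    if any(word in document_words for word in excluded):
        return False
    if required:
        # Optional words no longer decide the match once a word is required.
        return all(word in document_words for word in required)
    # No required word: plain "or" semantics, any optional word is enough.
    return any(word in document_words for word in optional)


if __name__ == "__main__":
    for words in ({"faq", "perl"}, {"faq"}, {"perl"}, {"faq", "cgi"}):
        print(sorted(words), matches(words, "perl +faq -cgi"))
```

In this sketch optional words only matter when no word is required. A real engine would more likely use them to rank the matching documents, which is left out here.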

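The indexing rules listed above (check a document with a HEAD request first, skip unchanged or oversized documents, wait 8 seconds between requests to the same server) could be sketched as follows. This is a minimal Python illustration under stated assumptions, not Circa's implementation: `should_fetch` and `HostThrottle` are hypothetical names, and the HEAD response is assumed to be already reduced to a small dictionary.

```python
import time
from urllib.parse import urlsplit

MAX_SIZE = 5 * 1024 * 1024  # example limit taken from the text: 5 MB
HOST_DELAY = 8.0            # seconds between two requests to the same server


def should_fetch(head, indexed_last_modified, max_size=MAX_SIZE):
    """Decide from HEAD metadata whether the document body should be read.

    head: dict with optional "size" (bytes) and "last_modified" (timestamp).
    indexed_last_modified: timestamp stored at the previous indexing run,
    or None if the document has never been indexed.
    """
    size = head.get("size")
    if size is not None and size > max_size:
        return False  # too large for this indexer
    last_modified = head.get("last_modified")
    if (indexed_last_modified is not None and last_modified is not None
            and last_modified <= indexed_last_modified):
        return False  # not updated since the last run
    return True


class HostThrottle:
    """Hand out request slots so each host is contacted at most once per delay."""

    def __init__(self, delay=HOST_DELAY, clock=time.monotonic):
        self.delay = delay
        self.clock = clock
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait_time(self, url):
        """Reserve the next slot for the URL's host; return seconds to wait."""
        host = urlsplit(url).netloc
        now = self.clock()
        start = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = start + self.delay
        return start - now
```

A crawler loop would call `time.sleep(throttle.wait_time(url))` before each request. Passing the clock in as a parameter keeps the throttle easy to test without real waiting.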
To do

  • Support NNTP
  • Support of different character sets
  • Support of other databases

Requirements

  • MySQL
  • Perl
  • Modules DBI, DBD::mysql, LWP::RobotUA, HTML::LinkExtor


Memory: indexing: 5.5 MB
Processor: on a Sun SPARCstation 4: (5 seconds at 2%, 2 s at 20%, 1 s at 30%) per URL indexed.
Size in MySQL: 2-5 KB per URL.
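
As a rough worked example of these figures, the sketch below estimates the index size for a given number of URLs. It also gives a lower bound on crawl time for a single server, assuming one request per URL and the 8-second delay between requests. The function name `estimate` is hypothetical, and the result is only an order of magnitude.

```python
def estimate(url_count, kb_per_url=(2, 5), delay_seconds=8):
    """Return a rough index size range (KB) and a minimum crawl time (hours).

    Assumes every URL costs one delayed request to the same server.
    """
    low, high = kb_per_url
    return {
        "index_kb": (url_count * low, url_count * high),
        "min_crawl_hours": url_count * delay_seconds / 3600,
    }


if __name__ == "__main__":
    print(estimate(10_000))
```

For 10,000 URLs on one server this gives 20,000 to 50,000 KB of index and a little over 22 hours of crawling, which is consistent with the advice not to run indexing as a CGI.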

Building the index is heavy work, so it is not suited to running as a CGI. Update the index from a shell session instead; if you do not have telnet access, try launching the process in the background from another CGI. Alternatively, install MySQL on a local machine, build your index there, and export it to your search machine.
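
Launching a long job in the background from a CGI could look like the sketch below. It is a Python illustration only; Circa's own scripts are Perl. The helper name `launch_in_background` is hypothetical and the command shown is a placeholder, so substitute your real indexing command line.

```python
import subprocess
import sys


def launch_in_background(command):
    """Start command without waiting for it and return its process ID."""
    process = subprocess.Popen(
        command,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        # POSIX: run in its own session, detached from the web server's
        # process group.
        start_new_session=True,
    )
    return process.pid


if __name__ == "__main__":
    # Placeholder command: replace with the real indexing command line.
    pid = launch_in_background([sys.executable, "-c", "pass"])
    # A CGI must send a header block, then a blank line, before any body text.
    print("Content-Type: text/plain\n")
    print(f"Indexing started in the background (pid {pid}).")
```

Redirecting the child's standard streams matters here: otherwise the web server may keep the request open until the background job closes them.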


  • Download one of the archive files and uncompress it.
  • Edit the search script (search.cgi) and the admin script (admin.cgi) to set your MySQL parameters: user, password, database, and IP address if different from 'localhost'.
  • Run admin.cgi (CGI interface) or its command-line counterpart to add your URLs, drop or create tables, and so on. The command line is preferable, because indexing can take a long time and is not suited to CGI.
  • Run search.cgi. You can reuse the default form in your own page. Only the 'words' field is required.
  • For customized HTML results, see the file circa.htm.


POD documentation is available; use pod2html > name_of_file.html to read it.


If you have root privileges and can install Perl modules, you can install these two modules: Circa::Search and Circa::Indexer. See the demo directory for how to use them. Install Circa::Indexer first.

Otherwise, you can use this distribution:

ZIP format or tar.gz format





Why?

I had read about the need for such a tool, I needed one for AlianWebServer, and I think other people need one too.

Powered by AlianWebServer