The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
#!/usr/local/bin/perl

use MyCPAN::App::DPAN 1.28;

our $VERSION = '1.28_11';

if( grep /^-?-h(elp)?$/, @ARGV ) {
	require Pod::Usage;

	Pod::Usage::pod2usage(
		-exitval => 0
		);
	}
elsif( grep /^-?-v(ersion)?$/, @ARGV )
	{
	print "dpan $VERSION using MyCPAN::App::DPAN ",
		MyCPAN::App::DPAN->VERSION, "\n";

	exit 0;
	}

$ENV{MYCPAN_LOG4PERL_FILE} = $ENV{DPAN_LOG4PERL_FILE}
	if defined $ENV{DPAN_LOG4PERL_FILE};

$ENV{MYCPAN_LOGLOGLEVEL} = $ENV{DPAN_LOGLEVEL}
	if defined $ENV{DPAN_LOGLEVEL};

my $application = MyCPAN::App::DPAN->activate( @ARGV );

$application->activate_end;

exit( 0 );

=pod

=head1 NAME

dpan - create a DarkPAN from directories

=head1 SYNOPSIS

	# from the command line
	prompt> dpan [-l log4perl.config] [-f config] [directory [directory2]]

	# get some help
	prompt> dpan -h
	prompt> dpan --help

	# see the version
	prompt> dpan -v
	prompt> dpan --version

	# see the configuration (exits after printing config)
	prompt> dpan -c
	prompt> dpan -c -f config

=head1 DESCRIPTION

The C<dpan> script takes a list of directories, indexes any Perl
distributions it finds, and creates the PAUSE index files from what it
finds. Afterward, you should be able to point a CPAN tool at the
directory and install the distributions normally.

If you don't specify any directories, it works with the current
working directory.

At the end, C<dpan> creates a F<modules> directory in the first
directory (or the current working directory) and creates the
F<02package.details.txt.gz> and F<03modlist.data.gz>.

=head2 Command-line processing

=over 4

=item -c

Dump the configuration and exit.

=item -f config_file

Use the named configuration file.

=item -l log4perl_config_file

The path to the log4perl configuration file. You can also set this
with the <DPAN_LOG4PERL_FILE> environment variable or the C<log4perl_file>
configuration directive.

=back

=head2 Configuration options

If you don't specify these values in a configuation file, C<dpan> will
use its defaults. The behavior of these options only apply to the default
components. If you subclass a component or use a different component,
you might see something different.

Some of the configuration options come from C<MyCPAN::Indexer::App::BackPAN>.
The source of each directive is noted.

=over 4

=item alarm

The maximum amount of time allowed to index a distribution, in seconds.

Default: 15

Affected components: Worker

Config level: MyCPAN

=item author_map

C<dpan> will use this filename to add PAUSE IDs it finds in the
repository to F<authors/00whois.xml> and F<authors/01mailrc.txt.gz>
(see also C<pause_id> and C<pause_full_name>). Without this file,
C<dpan> will use the value of C<pause_full_name> for new PAUSE IDs it
finds.

The format is the file are lines that have the PAUSE ID followed by
the full name and email address in angle brackets:

	BUSTER Buster Bean <buster@example.com>

This option helps your DPAN work with C<CPAN::Mini::Webserver>, which
tries to pull values out of  F<authors/00whois.xml> and
F<authors/01mailrc.txt.gz> to insert into its templates.

Default: undef

Affected components: Collator

Config level: DPAN

=item collator_class

The Perl class to use as the Dispatcher component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::App::DPAN::Reporter::Minimal

=item copy_bad_dists

If set to a true value, copy bad distributions to the named directory
so you can inspect them later.

Default: 0

Affected components: Queue

Config level: MyCPAN

=item dispatcher_class

The Perl class to use as the Dispatcher component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::Indexer::Dispatch::Serial

Config level: MyCPAN

=item dpan_dir

The directory where C<dpan> will create the DPAN. It takes a single
directory. 

Default: the current working directory

Affected components: all of them

Config level: DPAN

=item error_report_subdir

The subdirectory under C<report_dir> to use for error reports.

Default: error

Affected components: Reporter, Collator

Config level: MyCPAN

=item extra_reports_dir

You can specify another directory that contains pre-indexed reports.
C<dpan> will add these reports to its queue to create
F<02packages.details.txt.gz>. So far, C<dpan> doesn't check that these
reports correspond to any files in the repository and could cause the
checks against F<02packages.details.txt.gz> to fail.

You probably want to use this as a way to inject information about
distributions that you skipped with C<skip_dists_regexes>. This is
also very handy for problematic distributions that resist indexing.

Default: undef

Affected components: Indexer

Config level: DPAN

=item fresh_start

Delete the report directory before indexing. This cleans out all
previous work in C<report_dir>, so you need to save that on your own.
You can also set this with the C<DPAN_FRESH_START> environment
variable.

Default: 0

Affected components: Queue, Reporter

Config level: DPAN

=item i_ignore_errors_at_my_peril

C<dpan> considers some problems to be too troublesome to continue,
such as when it thinks it created incomplete index files. If you don't
want C<dpan> to stop, set C<i_ignore_errors_at_my_peril> to a true
value. However, you might get an invalid mirror.

Default: 0

Affected components: Collator

Config level: DPAN

=item ignore_missing_dists

C<dpan> aborts the process if it finds distributions in the repository
that don't show up in F<02packages.details.txt.gz>, which it think
signals an incomplete index. If you want to risk an incomplete index,
set this to a true value.

Default: 0

Affected components: Collator

Config level: DPAN

=item ignore_packages

You can tell DPAN to ignore some namespaces. The indexer may still
record them, but they won't show up in F<02packages.details.txt.gz>.
It's a space-separated list of exact package names

Default: main MY MM DB bytes DynaLoader

Affected components: Indexer

Config level: DPAN

=item indexer_class

The Perl class to use as the Indexer component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::App::DPAN::Indexer

=item indexer_id

Give yourself a name so people who who ran C<dpan>.

Default: Joe Example <joe@example.com>

Affected components: Reporter

Config level: MyCPAN

=item interface_class

The Perl class to use as the Interface component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::Indexer::Interface::Text

Config level: MyCPAN

=item log_file_watch_time

The time, in seconds, to wait until C<Log4perl> checks for an
updated configuration file.

Default: 30

Affected components: all

Config level: MyCPAN

=item log4perl_file

The path to the log4perl configuration file. You can also set this
with the <-l> switch or the C<DPAN_LOG4PERL_FILE> environment
variable.

Default: undef

Affected components: all

Config level: MyCPAN

=item merge_dirs

A space separated list of directories containing distributions
to merge into DPAN. These will go into the directory for 
C<pause_id>. The files are copied, not renamed, so it works
across partitions and mount points.

Default: undef

Affected components: Queue

Config level: MyCPAN

=item organize_dists

Take all of the distributions C<dpan> finds and put the into a
PAUSE-like structure under F<authors/id/D/DP/DPAN> under
C<dpan_dir>. You can change the author ID with the C<pause_id>
directive.

Default: 0

Affected components: Queue

Config level: DPAN

=item parallel_jobs

The number of parallel jobs to run. This only matters for dispatcher
classes that can do more than one thing at a time. The default
dispatcher does not use parallel jobs. You need to change the
C<dispatcher_class> to C<MyCPAN::Indexer::Dispatcher::Parallel> (or
another class that supports this).

Default: 0

Affected components: Dispatcher

Config level: MyCPAN

=item pause_id

The author ID to use if organize_dists is set to a true value.

Default: DPAN

Affected components: Queue, Reporter, Collator

Config level: DPAN

=item pause_full_name

The full name to use for the default PAUSE ID and any unclaimed ID
C<dpan> finds in the repository. This shows up in the
F<01mailrc.txt.gz> and F<00whois.xml> files.

Default: "DPAN user <CENSORED>"

Affected components: Collator

Config level: DPAN

=item postflight_class

If defined, after C<dpan> has finished all of its work and is ready to
exit, it will try to load this class and execute its C<run()> method.
It passes C<run()> the application object, so you have full access to
everything. See C<MyCPAN::App::DPAN::NullPostFlight> for an example. To use
the application object, you need to know a little about the guts of 
C<MyCPAN::Indexer>.

Default: undef

Affected components: none

Config level: DPAN

=item prefer_bin

If set to a true value, C<dpan> will try to use an external binary
before it results to a pure Perl option. This matters mostly for
C<Archive::Extract>. Your C<tar> utility may have better luck than
C<Archive::Tar>.

This setting overrides the C<PREFER_BIN> environment variable. You
can also set that environment variable to change how C<Archive::Extract>
decides to extract an archive.

Default: 0

Affected components: Indexer

Config level: MyCPAN

=item queue_class MyCPAN::Indexer::SomeOtherQueue

The Perl class to use as the Queue component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::Indexer::SkipQueue

Config level: MyCPAN

=item relative_paths_in_report

If true, supporting Reporter classes change the path to the
distribution file to be relative to I<authors/id>. Otherwise, the path
in the report is the absolute path to the distribution.

Supported by C<MyCPAN::App::DPAN::Reporter::Minimal>.

Default: true

Affected components: Reporter

Config level: DPAN

=item reporter_class

The Perl class to use as the Reporter component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::App::DPAN::Reporter::Minimal

Config level: MyCPAN

=item report_dir

Where to store the distribution reports.

Default: a directory named F<indexer_reports> in the current working
directory

Affected components: Reporter

Config level: MyCPAN

=item retry_errors

Try to index a distribution even if it was previously tried and had an
error. This depends on previous reports being in C<report_dir>, so if
you don't set that configuration directive, it won't matter.

Default: 1

Affected components: Indexer

Config level: DPAN

=item skip_dists_regexes

You can specify a list of whitespace-separated regexes for C<dpan> to
use to filter the queue of distributions to index. You probably want
to use this to skip very large distributions. You can have a pre-made
index report by setting C<extra_reports_dir>.

This is only supported by C<MyCPAN::Indexer::SkipQueue>.

Default: null

Affected components: Queue

Config level: DPAN

=item skip_perl

When set to a true value, C<skip_perl> cause C<dpan> to ignore
distributions that match C</^(strawberry-?)perl-/>, since these can
take a long time to index. You can have a pre-made index report by
setting C<extra_reports_dir>.

This is only supported by C<MyCPAN::Indexer::SkipQueue>.

Default: 0

Affected components: Queue

Config level: DPAN

=item error_report_subdir

The subdirectory under C<report_dir> to use for success reports.

Default: success

Affected components: Reporter, Collator

Config level: MyCPAN

=item system_id macbookpro

Give the indexing system a name, just to identify the machine. This
shows up in the C<run_info>, although the default Reporter does not
include it in the report.

Default: 'an unnamed machine'

Affected components: Reporter

Config level: MyCPAN

=item temp_dir

Where to unpack the dists or create any temporary files.

Default: a temp directory in the current working directory

Affected components: Worker

Config level: MyCPAN

=item use_real_whois

When C<use_real_whois> is set to a true value, C<dpan> tries to update
the F<authors/00whois.xml> and F<authors/01mailrc.txt.gz> files from a
real CPAN mirror. This will overwrite the files already in the
repository, although it will update the CPAN versions with PAUSE IDs
C<dpan> finds in the repository (see C<author_map>).

The default case for most C<dpan> options is to avoid the real CPAN,
so by default C<dpan> will create stub files for these files instead
of fetching the real ones.

If you are using C<CPAN::Mini>, you can mirror C<00whois.xml> by adding
a line to your F<.minicpanrc>:

	also_mirror authors/00whois.xml

Default: 0

Affected components: Collator

Config level: DPAN

=item worker_class

The Perl class to use as the Worker component. See
C<MyCPAN::Indexer::Tutorial> for details on components.

Default: MyCPAN::Indexer::Worker

=back

=head1 Logging

C<dpan> uses Log4perl if it is available. Without Log4perl, it uses a
small internal logger that prints to the screen.

You can set the Log4perl levels on each of the components separately:

	log4perl.rootLogger               =    FATAL, Null

	log4perl.logger.backpan_indexer   =    DEBUG, File

	log4perl.logger.Indexer           =    DEBUG, File
	log4perl.logger.Worker            =    DEBUG, File

	log4perl.logger.Interface         =    DEBUG, File

	log4perl.logger.Dispatcher        =    DEBUG, File
	log4perl.logger.Queue             =    DEBUG, File

	log4perl.logger.Reporter          =    DEBUG, File
	log4perl.logger.Collator          =    DEBUG, File

=head1 ENVIRONMENT VARIABLES

=over 4

=item DPAN_FRESH_START

Delete the report directory before indexing. This cleans out all previous
work, so you need to save that on your own. You can also set this with
the C<fresh_start> configuration directive.

=item DPAN_LOG4PERL_FILE

The path to the log4perl configuration file. You can also set this with
the <-l> switch of the C<log4perl_file> configuration directive.

=item DPAN_LOGLEVEL

The Log4perl level for easyinit. This won't affect the logging level if
you specify a log configuration file.

=back

=head1 SEE ALSO

MyCPAN::Indexer, MyCPAN::Indexer::DPAN

=head1 SOURCE AVAILABILITY

This code is in Github:

      git://github.com/briandfoy/mycpan-indexer.git
      git://github.com/briandfoy/mycpan--app--dpan.git

=head1 AUTHOR

brian d foy, C<< <bdfoy@cpan.org> >>

=head1 COPYRIGHT AND LICENSE

Copyright (c) 2008-2009, brian d foy, All Rights Reserved.

You may redistribute this under the same terms as Perl itself.

=cut