package RTx::Shredder;
use strict;
use warnings;

=head1 NAME

RTx::Shredder - Cleanup RT database

=head1 SYNOPSIS

=head2 CLI

    rtx-shredder --force --plugin 'Tickets=queue,general;status,deleted'

=head2 API

Same action as in the CLI example, but from a Perl script:

    use RTx::Shredder;
    RTx::Shredder::Init( force => 1 );
    my $deleted = RT::Tickets->new( $RT::SystemUser );
    $deleted->{'allow_deleted_search'} = 1;
    $deleted->LimitQueue( VALUE => 'general' );
    $deleted->LimitStatus( VALUE => 'deleted' );
    while( my $t = $deleted->Next ) {
        $t->Wipeout;
    }

=head1 DESCRIPTION

RTx::Shredder is an extension to the RT API which allows you to delete data
from the RT database. Shredder currently supports wiping out almost all RT
objects (Tickets, Transactions, Attachments, Users...).

=head2 Command line tools (CLI)

The L<rtx-shredder> script that is shipped with the distribution allows you
to delete objects from the command line or with a system task scheduler
(cron or other).

=head2 Web based interface (WebUI)

Shredder's WebUI integrates into RT's WebUI and you can find it
under the Configuration->Tools->Shredder tab. This interface is similar
to the CLI and gives you the same functionality, but it's available
from a browser.

=head2 API

The RTx::Shredder modules are an extension to the RT API which adds (pushes)
methods into the base RT classes. The API is not well documented yet, but you
can find usage examples in the L<rtx-shredder> script code and in F files.

=head1 CONFIGURATION

=head2 $RT::DependenciesLimit

Shredder stops with an error if an object has more than
C<$RT::DependenciesLimit> dependencies. By default this value is 1000.
For example: a ticket has 1000 transactions or a transaction has 1000
attachments. This is protection against bugs in the shredder code, but you
may also hit it when you have big mail loops. You can change the default
value: in F add C

=head2 $ShredderStoragePath

By default shredder saves dumps in the F<data/RTx-Shredder> directory under
C<$RT::VarPath>. With this option you can change the path, but B<note> that
the value should be an absolute path to the directory you want.

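As a hedged illustration of the two options above, the fragment below assumes RT's usual C<Set()> syntax in the site configuration file. The limit value and the backup path are made up for the example and are not recommendations.

```perl
# Site configuration fragment (assumption: RT's standard Set() syntax).

# Raise the dependency limit, e.g. after hitting it on a big mail loop.
Set( $DependenciesLimit, 5000 );

# Hypothetical absolute path; the directory has to exist and be writable,
# otherwise the shredder dies when it tries to open a dump file there.
Set( $ShredderStoragePath, '/var/backups/rt-shredder' );
```
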
=head1 API DESCRIPTION

The RTx::Shredder class implements interfaces to the objects cache, actions
on the objects in the cache and backups storage.

=head2 Dependencies

=cut

our $VERSION = '0.07';
use File::Spec ();

BEGIN {
# I can't use 'use lib' here since it breaks tests
# because the test suite uses the old RTx::Shredder setup from
# the RT lib path

### after:     push @INC, qw(@RT_LIB_PATH@);
    push @INC, qw(/opt/rt3/local/lib /opt/rt3/lib);
    use RTx::Shredder::Constants;
    use RTx::Shredder::Exceptions;

    require RT;

    require RTx::Shredder::Record;

    require RTx::Shredder::ACE;
    require RTx::Shredder::Attachment;
    require RTx::Shredder::CachedGroupMember;
    require RTx::Shredder::CustomField;
    require RTx::Shredder::CustomFieldValue;
    require RTx::Shredder::GroupMember;
    require RTx::Shredder::Group;
    require RTx::Shredder::Link;
    require RTx::Shredder::Principal;
    require RTx::Shredder::Queue;
    require RTx::Shredder::Scrip;
    require RTx::Shredder::ScripAction;
    require RTx::Shredder::ScripCondition;
    require RTx::Shredder::Template;
    require RTx::Shredder::ObjectCustomFieldValue;
    require RTx::Shredder::Ticket;
    require RTx::Shredder::Transaction;
    require RTx::Shredder::User;
}

our @SUPPORTED_OBJECTS = qw(
    ACE
    Attachment
    CachedGroupMember
    CustomField
    CustomFieldValue
    GroupMember
    Group
    Link
    Principal
    Queue
    Scrip
    ScripAction
    ScripCondition
    Template
    ObjectCustomFieldValue
    Ticket
    Transaction
    User
);

=head2 GENERIC

=head3 Init( %options )

Sets shredder defaults, loads the RT config and initializes the RT interface.

B<NOTE> that this is a function and must be called with
C<RTx::Shredder::Init();>.

B<TODO:> describe possible shredder options.

=cut

our %opt = ();

sub Init
{
    %opt = @_;
    RT::LoadConfig();
    RT::Init();
}

=head3 new( %options )

Shredder object constructor. Takes an options hash and returns a new object.

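The constructor stores its options on top of the defaults given to C<Init> (the code does C<{ %opt, @_ }>). The standalone sketch below shows that hash-merge idiom, where later pairs win. C<merge_options> and the C<verbose> option are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

# Defaults, as an Init-style function might store them.
# 'verbose' is a made-up option name for this sketch.
my %defaults = ( force => 0, verbose => 0 );

# Merge per-object options over the defaults; later pairs override earlier ones.
sub merge_options {
    my %args = @_;
    return { %defaults, %args };
}

my $opt = merge_options( force => 1 );
print join( ', ', map { "$_=$opt->{$_}" } sort keys %$opt ), "\n";
# prints "force=1, verbose=0"
```

Because the merge produces a fresh hash reference, changing one object's options does not touch the shared defaults.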
=cut

sub new
{
    my $proto = shift;
    my $self = bless( {}, ref $proto || $proto );
    $self->_Init( @_ );
    return $self;
}

sub _Init
{
    my $self = shift;
    # per-object options override the defaults stored by Init
    $self->{'opt'}      = { %opt, @_ };
    $self->{'cache'}    = {};
    $self->{'resolver'} = {};
}

=head3 CastObjectsToRecords( Objects => undef )

Casts objects to C<RT::Record> objects (or objects of its subclasses).
Objects can be passed as a SCALAR (format C<< <class>-<id> >>),
an ARRAY, an C<RT::Record> subclass object or an C<RT::SearchBuilder>
subclass object.

Most methods that take an C<Objects> argument use this method to
cast the argument value to a list of records.

Returns an array of the records.

For example:

    my @objs = $shredder->CastObjectsToRecords(
        Objects => [             # ARRAY reference
            'RT::Attachment-10', # SCALAR or SCALAR reference
            $tickets,            # RT::Tickets object (isa RT::SearchBuilder)
            $user,               # RT::User object (isa RT::Record)
        ],
    );

=cut

sub CastObjectsToRecords
{
    my $self = shift;
    my %args = ( Objects => undef, @_ );

    my @res;
    my $targets = delete $args{'Objects'};
    unless( $targets ) {
        RTx::Shredder::Exception->throw( "Undefined Objects argument" );
    }

    if( UNIVERSAL::isa( $targets, 'RT::SearchBuilder' ) ) {
        # XXX: don't uncomment next line as it leads to terrible slowdown
        # due to select * from Tickets;
        #$targets->_DoSearch;
        push @res, @{$targets->ItemsArrayRef || []};
        #while( my $tmp = $targets->SUPER::Next ) { push @res, $tmp };
    } elsif ( UNIVERSAL::isa( $targets, 'RT::Record' ) ) {
        push @res, $targets;
    } elsif ( UNIVERSAL::isa( $targets, 'ARRAY' ) ) {
        foreach( @$targets ) {
            push @res, $self->CastObjectsToRecords( Objects => $_ );
        }
    } elsif ( UNIVERSAL::isa( $targets, 'SCALAR' ) || !ref $targets ) {
        $targets = $$targets if ref $targets;
        my ($class, $id) = split /-/, $targets;
        $class = 'RT::'.
            $class unless $class =~ /^RTx?::/i;
        eval "require $class";
        die "Couldn't load '$class' module" if $@;
        my $obj = $class->new( $RT::SystemUser );
        die "Couldn't construct new '$class' object" unless $obj;
        $obj->Load( $id );
        unless ( $obj->id ) {
            $RT::Logger->error( "Couldn't load '$class' object with id '$id'" );
            RTx::Shredder::Exception::Info->throw( 'CouldntLoadObject' );
        }
        die "Loaded object has different id" unless( $id eq $obj->id );
        push @res, $obj;
    } else {
        RTx::Shredder::Exception->throw( "Unsupported type ". ref $targets );
    }
    return @res;
}

=head2 OBJECTS CACHE

=head3 PutObjects( Objects => undef )

Puts objects into the cache.

Returns an array of the cache entries.

See the C<CastObjectsToRecords> method for supported types of the C<Objects>
argument.

=cut

sub PutObjects
{
    my $self = shift;
    my %args = ( Objects => undef, @_ );

    my @res;
    for( $self->CastObjectsToRecords( Objects => delete $args{'Objects'} ) ) {
        push @res, $self->PutObject( %args, Object => $_ )
    }

    return @res;
}

=head3 PutObject( Object => undef )

Puts a record object into the cache and returns its cache entry.

B<NOTE> that this method supports B<only C<RT::Record> objects or objects of
its subclasses>. If you want to put multiple objects, or objects represented
by different classes, use the C<PutObjects> method instead.

=cut

sub PutObject
{
    my $self = shift;
    my %args = ( Object => undef, @_ );

    my $obj = $args{'Object'};
    unless( UNIVERSAL::isa( $obj, 'RT::Record' ) ) {
        RTx::Shredder::Exception->throw( "Unsupported type '". (ref $obj || $obj || '(undef)')."'" );
    }

    # an existing entry is returned unchanged; a new one starts as ON_STACK
    my $str = $obj->_AsString;
    return ($self->{'cache'}->{ $str } ||= { State => ON_STACK, Object => $obj } );
}

=head3 GetObject, GetState, GetRecord( String => ''| Object => '' )

Return the record object from the cache, the cache entry state or the cache
entry, respectively.

All three methods take either a C<String> (format C<< <class>-<id> >>) or an
C<Object> argument, not both: the methods croak if both are set.
C<String> is checked first, and C<Object> is only used when C<String> is empty.

You can read about possible states and their meaning in the
L<RTx::Shredder::Constants> docs.

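To make the cache entry states more concrete, here is a minimal standalone sketch of a cache keyed by a C<< <class>-<id> >> string with bit-flag states. The flag values and the helper names C<put_object> and C<is_done> are assumptions for illustration; the real constants come from the shredder's constants module.

```perl
use strict;
use warnings;

# Hypothetical bit values, chosen only for this sketch.
use constant {
    ON_STACK  => 0x01,
    IN_WIPING => 0x02,
    WIPED     => 0x04,
};

my %cache;

# Insert an entry once; a repeated put returns the existing entry (||=).
sub put_object {
    my ( $key, $object ) = @_;
    return $cache{$key} ||= { State => ON_STACK, Object => $object };
}

# True when the entry is being wiped or has already been wiped.
sub is_done {
    my ($entry) = @_;
    return ( $entry->{State} & ( WIPED | IN_WIPING ) ) ? 1 : 0;
}

my $first  = put_object( 'RT::Ticket-10', { id => 10 } );
my $second = put_object( 'RT::Ticket-10', { id => 10 } );
print "same entry\n" if $first == $second;

# Flags are combined with |= so earlier state bits are kept.
$first->{State} |= IN_WIPING;
print "skip\n" if is_done($second);
```

Keeping states as bit flags lets one test such as C<State & (WIPED | IN_WIPING)> cover several states at once, which is how the wipeout code decides to skip an entry.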
=cut

sub _ParseRefStrArgs
{
    my $self = shift;
    my %args = (
        String => '',
        Object => undef,
        @_
    );
    if( $args{'String'} && $args{'Object'} ) {
        require Carp;
        Carp::croak( "both String and Object args passed" );
    }
    return $args{'String'} if $args{'String'};
    return $args{'Object'}->_AsString if UNIVERSAL::can($args{'Object'}, '_AsString' );
    return '';
}

sub GetObject { return (shift)->GetRecord( @_ )->{'Object'} }
sub GetState { return (shift)->GetRecord( @_ )->{'State'} }
sub GetRecord
{
    my $self = shift;
    my $str = $self->_ParseRefStrArgs( @_ );
    return $self->{'cache'}->{ $str };
}

=head2 DEPENDENCIES RESOLVERS

=cut

sub PutResolver
{
    my $self = shift;
    my %args = (
        BaseClass => '',
        TargetClass => '',
        Code => undef,
        @_,
    );
    unless( UNIVERSAL::isa( $args{'Code'} => 'CODE' ) ) {
        die "Resolver '$args{Code}' is not code reference";
    }

    my $resolvers = (
        (
            $self->{'resolver'}->{ $args{'BaseClass'} } ||= {}
        )->{ $args{'TargetClass'} || '' } ||= []
    );
    unshift @$resolvers, $args{'Code'};
    return;
}

sub GetResolvers
{
    my $self = shift;
    my %args = (
        BaseClass => '',
        TargetClass => '',
        @_,
    );

    my @res;
    if( $args{'TargetClass'} && exists $self->{'resolver'}->{ $args{'BaseClass'} }->{ $args{'TargetClass'} } ) {
        push @res, @{ $self->{'resolver'}->{ $args{'BaseClass'} }->{ $args{'TargetClass'} || '' } };
    }
    if( exists $self->{'resolver'}->{ $args{'BaseClass'} }->{ '' } ) {
        push @res, @{ $self->{'resolver'}->{ $args{'BaseClass'} }->{''} };
    }

    return @res;
}

sub ApplyResolvers
{
    my $self = shift;
    my %args = ( Dependency => undef, @_ );
    my $dep = $args{'Dependency'};

    my @resolvers = $self->GetResolvers(
        BaseClass   => $dep->BaseClass,
        TargetClass => $dep->TargetClass,
    );

    unless( @resolvers ) {
        die "Couldn't find resolver for dependency '".
            $dep->AsString ."'";
    }
    foreach( @resolvers ) {
        eval {
            $_->(
                Shredder     => $self,
                BaseObject   => $dep->BaseObject,
                TargetObject => $dep->TargetObject,
            )
        };
        die "Resolver failed: $@" if $@;
    }

    return;
}

sub WipeoutAll
{
    my $self = $_[0];

    foreach ( values %{ $self->{'cache'} } ) {
        next if $_->{'State'} & (WIPED | IN_WIPING);
        $self->Wipeout( Object => $_->{'Object'} );
    }
}

sub Wipeout
{
    die "Couldn't begin transaction" unless $RT::Handle->BeginTransaction;

    eval { (shift)->_Wipeout( @_ ) };
    if( $@ ) {
        $RT::Handle->Rollback('force');
        die $@ if RTx::Shredder::Exception::Info->caught;
        die "Couldn't wipeout object: $@";
    }

    die "Couldn't commit transaction" unless $RT::Handle->Commit;
}

sub _Wipeout
{
    my $self = shift;
    my %args = ( CacheRecord => undef, Object => undef, @_ );

    my $record = $args{'CacheRecord'};
    $record = $self->PutObject( Object => $args{'Object'} ) unless $record;
    return if $record->{'State'} & (WIPED | IN_WIPING);

    $record->{'State'} |= IN_WIPING;
    my $object = $record->{'Object'};

    unless( $object->BeforeWipeout ) {
        RTx::Shredder::Exception->throw( "BeforeWipeout check returned error" );
    }

    my $deps = $object->Dependencies( Shredder => $self );
    $deps->List(
        WithFlags => DEPENDS_ON | VARIABLE,
        Callback  => sub { $self->ApplyResolvers( Dependency => $_[0] ) },
    );
    $deps->List(
        WithFlags    => DEPENDS_ON,
        WithoutFlags => WIPE_AFTER | VARIABLE,
        Callback     => sub { $self->_Wipeout( Object => $_[0]->TargetObject ) },
    );

    my $insert_query = $object->_AsInsertQuery;
    $object->__Wipeout;
    $self->DumpSQL( Query => $insert_query );
    $record->{'State'} |= WIPED;
    delete $record->{'Object'};

    $deps->List(
        WithFlags    => DEPENDS_ON | WIPE_AFTER,
        WithoutFlags => VARIABLE,
        Callback     => sub { $self->_Wipeout( Object => $_[0]->TargetObject ) },
    );

    return;
}

sub ValidateRelations
{
    my $self = shift;
    my %args = ( @_ );

    foreach my $record( values %{ $self->{'cache'} } ) {
        next if( $record->{'State'} & VALID );
        $record->{'Object'}->ValidateRelations( Shredder => $self );
    }
}

=head2 DATA STORAGE AND
BACKUPS

Shredder allows you to store the data you delete in files as scripts with SQL
commands.

=head3 SetFile( FileName => '<ISO DATETIME>-XXXX.sql', FromStorage => 1 )

Calls the C<GetFileName> method to check and translate the file name, then
warns if the file is not empty and opens it for writing. After this you can
dump records with the C<DumpSQL> method.

Returns the file name and the file handle.

B<NOTE:> If the file already exists its content will be overwritten. In that
case the method also prints a warning to STDERR unless the C<force> shredder
option is used.

Examples:

    # file from storage with default name format
    my ($fname, $fh) = $shredder->SetFile;

    # file from storage with custom name format
    my ($fname, $fh) = $shredder->SetFile( FileName => 'shredder-XXXX.backup' );

    # file with path relative to the current dir
    my ($fname, $fh) = $shredder->SetFile(
        FromStorage => 0,
        FileName => 'backups/shredder.sql',
    );

    # file with absolute path
    my ($fname, $fh) = $shredder->SetFile(
        FromStorage => 0,
        FileName => '/var/backups/shredder-XXXX.sql'
    );

=cut

sub SetFile
{
    my $self = shift;
    my $file = $self->GetFileName( @_ );
    if( -s $file ) {
        print STDERR "WARNING: file '$file' is not empty, content would be overwritten\n" unless $opt{'force'};
    }
    # three-argument open keeps special characters in the name from being
    # interpreted as an open mode
    open my $fh, '>', $file or die "Couldn't open '$file' for write: $!";
    ($self->{'opt'}->{'sqldump_fn'}, $self->{'opt'}->{'sqldump_fh'}) = ($file, $fh);
    return ($file, $fh);
}

=head3 GetFileName( FileName => '<ISO DATETIME>-XXXX.sql', FromStorage => 1 )

Takes the desired C<FileName> and the C<FromStorage> flag, then translates the
file name to an absolute path by the following rules:

=over 4

=item *

the default C<FileName> value is C<< <ISO DATETIME>-XXXX.sql >>;

=item *

if C<FileName> contains C<XXXX> (exactly four uppercase C<X> letters) then the
mask is replaced with a four-digit number in the 0001 to 9999 range, using the
first value for which no file exists yet;

=item *

if the C<FromStorage> argument is true then the resulting path is always
relative to C<StoragePath>;

=item *

if the C<FromStorage> argument is false then the result is relative to the
current dir unless it's already an absolute path.

=back

Returns the absolute path of the file.

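The C<XXXX> mask rule can be sketched on its own as follows. This is a simplified standalone illustration run against a temporary directory, not the module's code path: C<resolve_mask> is a hypothetical helper name and the storage and permission checks are left out.

```perl
use strict;
use warnings;
use File::Spec ();
use File::Temp qw(tempdir);

# Replace the XXXX mask in the last path component with the first
# four-digit counter (0001..9999) for which no file exists yet.
sub resolve_mask {
    my ($file) = @_;
    return $file unless $file =~ /XXXX[^\/\\]*$/;

    my ( $candidate, $i ) = ( $file, 0 );
    do {
        $i++;
        ( $candidate = $file ) =~ s/XXXX([^\/\\]*)$/sprintf("%04d", $i) . $1/e;
    } while ( -e $candidate && $i < 9999 );

    return $candidate;
}

my $dir  = tempdir( CLEANUP => 1 );
my $mask = File::Spec->catfile( $dir, 'shredder-XXXX.sql' );

# Nothing exists yet, so the first free name uses counter 0001.
my $first = resolve_mask($mask);
open my $fh, '>', $first or die "Couldn't create '$first': $!";
close $fh;

# The 0001 file now exists, so the next call moves on to 0002.
my $second = resolve_mask($mask);
print "$first\n$second\n";
```

Note that the loop gives up at 9999 and returns that name even if it exists, so a caller that cares should still check the result.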
See the examples for the C<SetFile> method.

=cut

sub GetFileName
{
    my $self = shift;
    my %args = ( FileName => '', FromStorage => 1, @_ );

    # default value
    my $file = $args{'FileName'};
    unless( $file ) {
        require POSIX;
        $file = POSIX::strftime("%Y%m%dT%H%M%S-XXXX.sql", gmtime );
    }

    # convert to absolute path
    if( $args{'FromStorage'} ) {
        $file = File::Spec->catfile( $self->StoragePath, $file );
    } elsif( !File::Spec->file_name_is_absolute( $file ) ) {
        $file = File::Spec->rel2abs( $file );
    }

    # check mask
    if( $file =~ /XXXX[^\/\\]*$/ ) {
        my( $tmp, $i ) = ( $file, 0 );
        do {
            $i++;
            $tmp = $file;
            $tmp =~ s/XXXX([^\/\\]*)$/sprintf("%04d", $i).$1/e;
        } while( -e $tmp && $i < 9999 );
        $file = $tmp;
    }

    if( -f $file ) {
        unless( -w _ ) {
            die "File '$file' exists, but is read-only";
        }
    } elsif( !-e _ ) {
        unless( File::Spec->file_name_is_absolute( $file ) ) {
            $file = File::Spec->rel2abs( $file );
        }

        # check base dir
        my $dir = File::Spec->join( (File::Spec->splitpath( $file ))[0,1] );
        unless( -e $dir && -d _) {
            die "Base directory '$dir' for file '$file' doesn't exist";
        }
        unless( -w $dir ) {
            die "Base directory '$dir' is not writable";
        }
    } else {
        die "'$file' is not regular file";
    }

    return $file;
}

=head3 StoragePath

Returns the absolute path to the storage dir. By default it's the
F<data/RTx-Shredder> directory under C<$RT::VarPath>
(in default RT install would be F), but you can change this value with the
config option C<$RT::ShredderStoragePath>. See the C<CONFIGURATION> section in
this doc.

See the C<SetFile> and C<GetFileName> methods description.

=cut

sub StoragePath
{
    return $RT::ShredderStoragePath if $RT::ShredderStoragePath;
    return File::Spec->catdir( $RT::VarPath, qw(data RTx-Shredder) );
}

sub DumpSQL
{
    my $self = shift;
    return unless exists $self->{'opt'}->{'sqldump_fh'};

    my %args = ( Query => undef, @_ );
    $args{'Query'} .= "\n" unless $args{'Query'} =~ /\n$/;

    my $fh = $self->{'opt'}->{'sqldump_fh'};
    # check the write before returning; "return print ... or die" returned
    # first, so the die could never run
    print $fh $args{'Query'} or die "Couldn't write to filehandle";
    return 1;
}

1;
__END__

=head1 NOTES

=head2 Database indexes

To speed up shredding you can add several indexes to your DB.

    CREATE INDEX SHREDDER_CGM1 ON CachedGroupMembers(MemberId, GroupId, Disabled);
    CREATE INDEX SHREDDER_CGM2 ON CachedGroupMembers(ImmediateParentId, MemberId);

    CREATE UNIQUE INDEX SHREDDER_GM1 ON GroupMembers(MemberId, GroupId);

    CREATE INDEX SHREDDER_TXN1 ON Transactions(ReferenceType, OldReference);
    CREATE INDEX SHREDDER_TXN2 ON Transactions(ReferenceType, NewReference);
    CREATE INDEX SHREDDER_TXN3 ON Transactions(Type, OldValue);
    CREATE INDEX SHREDDER_TXN4 ON Transactions(Type, NewValue);

If shredding is still slow, you have to get a list of slow queries. MySQL, for
example, has special options that turn on a log of slow queries; queries that
take more than one second can be considered slow. Then send the log to the L.

=head2 Database transactions support

Since RTx-Shredder-0.03_01 the extension uses database transactions and should
be much safer to run on production servers.

=head2 Foreign keys

Mainstream RT doesn't use FKs, but I have posted a DDL script that creates them
in a MySQL DB. Note that if you use FKs then these two valid keys prevent
Tickets from being deleted because of a bug in MySQL:

    ALTER TABLE Tickets ADD FOREIGN KEY (EffectiveId) REFERENCES Tickets(id);
    ALTER TABLE CachedGroupMembers ADD FOREIGN KEY (Via) REFERENCES CachedGroupMembers(id);

L

=head1 BUGS AND HOW TO CONTRIBUTE

I need your feedback in all cases: whether you use it or not, and whether it
works for you or not.

=head2 Testing

Don't skip the test step during installation, and send me reports if it fails.
Add your own tests; it's easy enough if you've written at least one Perl script
that works with RT. Read more about testing in F.

=head2 Reporting

Send reports to L or to the RT mailing lists.

=head2 Documentation

There are many bugs in the docs: insanity, spelling, grammar and so on.
Patches are welcome.

=head2 Todo

Please see the Todo file. It has some technical notes about what I plan to do
and when I'll do it, and it also describes some problems the code has.

=head2 Repository

You can find repository of this project at L

=head1 AUTHOR

Ruslan U.
Zakirov

=head1 COPYRIGHT

This program is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.

The full text of the license can be found in the Perl distribution.

=head1 SEE ALSO

L, L

=cut