TinyWikiHave Fun! - ScottWalters$Id: HomePage,v 1.156 2003/06/21 07:57:24 httpd Exp $
package WebsafeColors; sub new { ... }; sub getIterator { my $parentThis = shift; return eval { package WebsafeColors::Iterator; # this mini sub-package only knows how to iterate over our data structure @ISA=(Iterator); sub new { my $type = shift; my $this = { currentIndex=>0 }; bless $this, $type; } sub hasNext { my $this = shift; return @{$parentThis->{'colors'}} > $this->{'currentIndex'}; } sub getNext { my $this = shift; die unless $this->hasNext(); return $parentThis->{'colors'}->[$this->{'currentIndex'}++]; } __PACKAGE__; }->new(); } # there should be two underscores on either side of PACKAGE. TinyWiki is having a bug. sorry.WebsafeColors::Iterator implements all of the functions required to be an instance of Iterator. If something takes an argument, and insists it implement Iterator, it will accept the result of calling getIterator() on a WebsafeColors object. However, WebsafeColors itself does not implement these methods, or inherit the base abstract class for Iterators. The package that does is contained entirely inside WebsafeColors's getIterator() method. This technique lets you localize the impact of having to provide an interface, and keep code related to supporting that interface together and away from the rest of the code. This supports the basic idea of putting code where it belongs.
my $subtotal;
foreach my $item (@cart) {
$subtotal += $item->query_price();
}
my $weight;
foreach my $item (@cart) {
$weight += $item->query_weight();
}
# and so on
Representing individual objects, when the application is concerned about the general state of several objects, is an ImpedenceMismatch. This is a common mismatch: programmers feel obligated to model the world in minute detail then are pressed with the problem of giving it all a high level interface. LayeringPattern tells us to employ increasing levels of abstraction. package Cart::Basket; use base 'Cart::Item'; sub query_price { my $self = shift; my $contents = $self->{contents}; foreach my $item (@$contents) { } } sub add_item { my $self = shift; my $contents = $self->{contents}; my $item = shift; $item->isa('Cart::Item') or die; push @$contents, $item; return 1; } # query_ routines: sub query_price { my $self = shift; my $contents = $self->{contents}; my $subtotal; foreach my $item (@$contents) { $subtotal += $item->query_price(); } return $subtotal; } sub query_price { my $self = shift; my $contents = $self->{contents}; my $weight; foreach my $item (@$contents) { $weight += $item->query_weight(); } return $weight; }The aggregation logic, in this case, totalling, need only exist in this container, rather than being strewn around the entire program. Less code, less CodeMomentum, fewer depencies, more flexibility.
package LinkedList::Link; sub new { bless { prev => undef, next => undef }, $_[0]; } sub next { $_[0]->{next} } sub set_next { $_[0]->{next} = $_[1] } sub prev { $_[0]->{prev} } sub set_prev { $_[0]->{prev} = $_[1] }See AccessorsPattern for an explanation of this style of code, if you must. The objects place in the sequence makes sense to be part of the object. Each object can point you at the next one, following the LawOfDemeter. Should the object be part of two linked lists, or three linked lists, or an arbitrary number of linked lists, no fixed method can be called to deturmine the "next" object in the sequence, because no assumption can be made about which sequence you're talking about. An access would have to exist for previous and next for each sequence the object is part of. It makes more sense to seperate the linking from the object. Rather than adding the code to do whatever to LinkedList::Link, LinkedList::Link should delegate to it: see DelegationConcept. The object would be bare of any linked list logic, though several LinkedList::Link objects may hold a reference to it, and it might be part of an arbitrary number of linked lists, or other data structures. See ObjectsAndRelationalDatabaseSystems for more on the problems of complex inter-object relationships.
CategoryRefactoring
# in a file named Taco.pm: package Taco; use ImplicitThis; ImplicitThis::imply(); sub new { bless { price=>5.95}, $_[0]; } sub query_price { return $price; } # in a file named TacoWithLettuce.pm: package TacoWithLettuce; use ImplicitThis; ImplicitThis::imply(); @ISA = qw(Taco); sub query_price { return $this->Taco::query_price() + 0.05; } # in a file named TacoWithTomato.pm: package TacoWithTomato; use ImplicitThis; ImplicitThis::imply(); @ISA = qw(Taco); sub query_price { return $this->Taco::query_price() + 0.10; } # in a file named TacoWithTomatoAndLettuce.pm: package TacoWithTomatoAndLettuce; use ImplicitThis; ImplicitThis::imply(); @ISA = qw(Taco); sub query_price { return $this->Taco::query_price() + 0.10; }To do it this way, they would have to create a class for each and every topping, as well as each and every combination of toppings! With two toppings this isn't out of hand. With 8 toppings, you've got 256 possible combinations. With 12 toppings, you've 4096 combinations. Creating a permanent inheritance is the root of the problem, here. If we could do something similar, but on the fly, we wouldn't need to write out all of the possible combinations in advance. We could also make the inheritance chain deeper and deeper as we needed to.
# in a file named Taco.pm: package Taco; use ImplicitThis; ImplicitThis::imply(); sub new { bless { price=>5.95, first_topping=>new Topping::BaseTaco }, $_[0]; } sub query_price { return $first_topping->query_price(); } sub add_topping { my $topping = shift; $topping->isa('Topping') or die "add_topping requires a Topping"; $topping->inherit($first_topping); $first_topping = $topping; } # in a file named Topping.pm: package Topping.pm; # this is just a marker class # in a file named Topping/BaseTaco.pm: package Topping::BaseTaco; @ISA = qw(Topping); sub query_price { return 5.95; } # in a file named Topping/Lettuce.pm: package Topping::Lettuce; @ISA = qw(Topping); use ImplicitThis; ImplicitThis::imply(); sub query_price { return 0.05 + $this->SUPER::query_price(); } sub inherit { my $parent = shift; unshift @ISA, $parent; return 1; } # and so on for each topping...The astute reader will notice that this isn't much more than a linked list. Since inheritance is now dynamic, we've gotten rid of needing to explicit create each combination of toppings. We use inheritance and a recursive query_price() method that calls its parent's version of the method. When we add a topping, we tell it to inherit it from the last topping (possibly the base taco). When someone calls query_price() on the taco, we pass off the request to our first topping. That topping passes it on down the line, adding them up as it goes.
# in a file named Taco.pm: package Taco; use ImplicitThis; ImplicitThis::imply(); sub new { bless { price=>5.95, top_topping=>new Topping::BaseTaco }, $_[0]; } sub query_price { return $price; } sub add_topping { my $new_topping = shift; # put the new topping on top of existing toppings. this new topping is now our top topping. $new_topping->top($top_topping); $top_topping = $new_topping; return 1; } # in a file named Topping.pm: package Topping.pm; use ImplicitThis; ImplicitThis::imply(); sub new { my $type = shift; bless { we_top=>undef }, $type; } sub top { my $new_topping = shift; $new_topping->isa('Topping') or die "top must be passed a Topping"; $we_top = $new_topping; return 1; } # in a file named Topping/BaseTaco.pm: package Topping::BaseTaco; @ISA = qw(Topping); sub query_price { return 5.95; } # in a file named Topping/Lettuce.pm: package Topping::Lettuce; use ImplicitThis; ImplicitThis::imply(); @ISA = qw(Topping); sub query_price { return 0.05 + ($we_top ? $we_top->query_price() : 0); }There! We finally have something that passes as workable! This solution is good for something where we want to change arbitrary features of the object without the containing object (in this case, taco) knowing before hand. We don't make use of this strength in this example. The query_price() method of the taco object just passes the request right along, we any math we want can be done. A two-for-taco-tappings-Tuesday, where all toppings were half price on Tuesdays, would show off the strengths of the DecoratorPattern. With a press of a button, a new object could be pushed onto the front of the list that defined a price method that just returns half of whatever the price_method() in the next object returns. The important thing to note is that we can stack logic by inserting one object in front of another when "has-a" relationships.
package GenericProxy; sub new { my $type = shift; my $this = { }; my $obj = shift; ref $obj or die; $this->{'obj'} = $obj; $type .= '::' . ref $obj; # copy inheritance info. @{ref($this).'::ISA'} = @{ref($obj).'::ISA'}; bless $this, $type; } # bug XXX - autoload is only used after @ISA is searched! sub AUTOLOAD { my $this = shift; (my $methodName) = $AUTOLOAD m/.*::(\w+)$/; return if $methodName eq 'DESTROY'; $this->{'obj'}->$methodName(@_); }This simple idea has many uses:
$foo->do($arg, $str, $bleah, $blurgh);Should the arguments do() accepts be changed, every place it is called would need to be changed as well to be consistent. Failure to do so results in no warning and erratic bugs. TypeSafety helps, but this is still no compile time check - missing an a call can lead a program killing bug.
foreach my $class ( qw(NAME SYNOPSIS CODE) ) {
no strict 'refs';
push @{ "POD::${class}::ISA" }, "POD::POD";
}
Not having to use a different method call in each behavior object is key. That would prvent us from using them interchangably. It would introduce need for hardcoded dependencies. We would no longer be able to easily add new behavior objects. Assuming that each behavior object has exactly one method, each method should have the same name. Something generic like ->go() is okey, I suppose. Naming it after the data type it operators on makes more sense, though. If there is a common theme to the behavior objects, abstract it out into the name. ->top_taco() is a fine name. package Taco::Topper; sub top_taco { my $self = shift; die "we're an abstract class, moron. use one of our subclasses" if ref $self eq __PACKAGE__; die "method strangely not implemented in subclass"; } sub new { my $class = shift; bless [], $class; } package Taco::Topper::Beef; sub top_taco { my $self = shift; my $taco = shift; if($taco->query_flags()) { die "idiot! the beef goes on first! this taco is ruined!"; } $taco->set_flags(0xdeadbeef); $taco->set_cost($taco->query_cost() + 0.95); } package Taco::Topper::Cheese; sub top_taco { my $self = shift; my $taco = shift; if(! $taco->query_flag(0xdeadbeef) and ! $taco->query_flag(0xdeadb14d)) { # user is a vegitarian. give them a sympathy discount because we feel # bad for them for some strange reason, even though they'll outlive us by 10 years $taco->set_cost($taco->query_cost() - 1.70); } $taco->set_flags(0xc43323); $taco->set_cost($taco->query_cost() + 0.95); } package Taco::Topper::Gravy; # and so on...Gravy? On a taco? Yuck! In real life, places in the mall that serve "tacos" also tend to serve fries, burgers, hotdogs, and other dubiously non-quasi-Mexican food. It doesn't make sense to have one vat of cheese for the nachos, another for tacos, and yet another for cheesy-gravy-fries. The topper should be able to apply cheese to any of them. Keep in mind that these behavior classes work on a general class of objects, not merely one object. A burger could be a subclass of a taco. See StateVsClass for some thoughts on what makes a good subclass.
$topping_counter->get_cheese_gun()->top_taco($self);... where $topping_counter holds our different topping guns, and get_cheese_gun() returns a cached instance of Taco::Topper::Cheese. This creates a sort of a cow-milking-itself problem. The taco shouldn't be cheesing itself, some other third party should make the connection. Assuming that the topping counter has been robotized and humans enslaved by the taco craving robots, perhaps the topping counter could cheese the taco. [Verbing nouns is considered bad style - I'm sorry].
# using TypeSafety: sub set_day { die unless $_[0]->isa('Day'); $day = shift; return 1; } # using a plain old hash: sub set_day { die unless exists $daysref->$_[0]; $day = shift; return 1; }Everything from this set passes the "isa" test, so we can use TypeSafety to check our arguments. In any other language, it would be impossible to add to the set after being created this way, but we could do revisit the package (see RevisitingNamespaces) or redefine the constructor in Perl, so this shouldn't be considered secure.
package Day; use ImplicitThis; ImplicitThis::imply(); $mon = new Day 'mon'; $tues = new Day 'tues'; my @days; sub new { die unless caller eq __PACKAGE__; my $me = { id=>$_[1] } bless $me, $_[0]; push @days, $me; return $me; } sub get_id { return $id }; sub get_days { return @days; } # in Apopintment.pm: package Appointment; my $day; sub set_day { die unless $_[0]->isa('Day'); $day = shift; return 1; }XXX examples of use, what you can and cannot do, etc.
$mon eq $mon; # true $mon eq $tues; # falseThis behavior, too, is shared with the SingletonPattern. The same effect could be acheived using OverloadOperators. This approach is simplier and more clear.
package Pocket::Computer; sub record_audio { # implemented in some subclasses but not others } sub take_a_memo { # that we can do } sub make_a_call { die "don't know how, and the FCC would have a cow"; } package Pocket::Phone; sub record_audio { # some do, some don't. most don't. } sub take_a_memo { die "i'm not a PDA"; } sub make_a_call { # this we can do }Some devices can do some things, others can do other things. Each device does not have to check to see if it is the kind of device that can - it just knows, because thats what it is, and identity is a large part of ObjectOrientation.
At a certain level of complexity the concept of a StateChange is introduced. Cars suffer from this complexity. You may go from parked to idling, or you may go from idling to accelerating, but not from parked to accelerating. Going from accelerating to parked is also known as an insurance claim. Each state knows the states that are directly, immediately attainable. BreadthFirstRecurssion or DepthFirstRecurssion is needed to plan out anything more complex.
package Memento;
sub new {
my $type = shift;
my %opts = @_;
die __PACKAGE__ . " requires an object passed on its constructor: new Memento object=>\$obj"
unless $opts{'object'};
my $this = { object=>$opts{'object'}, checkPoint=>undef };
bless $this, $type;
}
sub mementoCheckPoint {
my $this = shift;
$this->{'checkPoint'} = $this->deepCopy($this->{'object'});
}
sub mementoRestore {
my $this = shift;
$this->{'object'} = $this->{'checkPoint'};
}
sub AUTOLOAD {
my $this = shift;
(my $method) = $AUTOLOAD =~ m/.*::(\w+)$/;
return if $method eq 'DESTROY';
return $this->{'object'}->$method(@_);
}
sub deepCopy {
my $this = shift;
my $ob = shift;
die unless caller eq __PACKAGE__; # private
return $ob if(!ref $ob);
if(ref $ob eq 'SCALAR') {
my $value = $$ob; return \$value;
}
if(ref $ob eq 'HASH') {
my %value = %$ob; return \%value;
}
if(ref $ob eq 'ARRAY') {
my @value = @$ob; return \@value;
}
# FILEHANDLE, GLOB, other cases omitted
# assume its an object based on a hash
# XXX man perlfunc say that $ob->isa('HASH') works...?
my $type = ref $ob;
my $newself = { };
foreach my $i (keys %$ob) {
$newself->{$i} = $this->deepCopy($ob->{$i});
}
return $newself;
}
While this is a generic Memento package, it cannot possibly know how to correctly deal with objects contained inside the object given it. A version of this (possibly subclassed) tailored to a specific package would handle this situation correctly. Here, we replicate objects mercilessly. This code also violates the encapsulation rules of OO. Use it as a starting point for something that doesn't. package MountRushmore; my $oneTrueSelf; sub new { if($oneTrueSelf) { return $oneTrueSelf; } else { my $type = shift; my $this = {presidents => ['George Washington', 'Thomas Jefferson', 'Theodore Roosevelt', 'Abraham Lincoln'] }; $oneTrueSelf = bless $this, $type; return $this->new(); } } sub someMethod { ... }Singletons are a special case of StaticObjects+.
package Roulette::Table; sub new { my $class = shift; my $this; # if new() is called on an existing object, we're providing additional # constructors, not creating a new object if(ref $class) { $this = $class; } else { $this = { }; bless $this, $class; } # read any number of and supported type of arguments foreach my $arg (@_) { if($arg->isa('Roulette::Color')) { $this->{'color'} = $arg; } elsif($arg->isa('Roulette::Number')) { push @{$this->{numbers}}, $arg; } elsif($arg->isa('Money')) { if($this->{money}) { $this->{money}->combine($arg); } else { $this->{money} = $arg; } } } return $this; } sub set_color { new(@_); } sub add_number { new(@_); } sub add_wager { new(@_); }The constructor, new(), accepts any number or sort of object of the kinds that it knows about, and skuttles them off to the correct slot in the object. Our set routines are merely aliases for new(). new() may be called multiple times, directly or indirectly, to spread our wager over more numbers, change which color we're betting on, or plunk down more cash. I don't play roulette - I've probably butched the example. Feel free to correct it. Use the little edit link. People won't be doing everything for you your entire life, atleast I hope.
package Roulette::Table; use MessageMethod; sub new { my $class = shift; my $this; my $curry; bless $this, $class; $curry = MessageMethod sub { my $msg = shift; if($msg eq 'spin_wheel') { die "Inconsistent state: not all arguments have been specified"; } if($msg eq 'set_color') { $this->{'color'} = shift; } if($msg eq 'add_number') { $this->{'numbers'} ||= []; my $numbers = $this->{'numbers'}; push @$numbers, $arg; } if($msg eq 'add_add_money') { if($this->{'money'}) { $this->{'money'}->combine($arg); } else { $this->{'money'} = $arg; } } if($msg eq 'is_ready') { return 0; } if($this->{'money'} and $this->{'color'} and $this->{'numbers'}) { return $this; } else { return $curry; } }; return $curry; } sub spin_wheel { # logic here... } sub is_ready { return 1; }This second example doesn't support repeated invocations of new() to further define an unfinished object. It could, but it would detract from the example. Add it for backwards compatability if for any reason. More radically, we don't accept any constructors. We return an entirely new object that has the sole purpose of accepting data before letting us at the actual object.
sub create_roulette_table {
my $color;
my $money;
my $numbers;
return sub {
$color = shift;
return sub {
$money = shift;
return sub {
push @$numbers, shift;
return sub {
# play logic here
};
};
};
};
}
# to use, we might do something like:
my $table = create_roulette_table()->('red')->('500')->(8);
$table->(); # play
$table->(); # play again
# or we might do something like:
my $table_no_money = create_roulette_table()->('red')->('500');
my $table;
$table = $table_no_money->(100);
$table->(); # play
$table->(); # play again -- oops, lost everything
$table = $table_no_money->(50);
$table->(); # play some more
This is stereotypical of currying as you'd see it in a language like Lisp. The arguments are essentially untyped, so we take them one at a time, in a specific order. Also like Lisp, the code quickly migrates across the screen then ends aburptly with a large number of block closes (the curley brace in Perl, paranthesis in Lisp). The Lisp version makes heavy use of RunAndReturnSuccessor. If we wanted to adapt this logic to spew out GeneratedMethods, where each method generated wasn't tied to other generated methods, we would need to explicitly copy the accumulated lexical variables rather than simply binding to them. For example, my $color = $color; my $money = shift; would prevent each anonymous routine returned from sharing the same $color variable, although without further logic, they would all have the same value. This amounts to the distinction between instance and class data.
package Mumble;
sub new { ... }; # standard constructor
sub clone {
my $self = shift;
my $copy = { %$self };
bless $copy, ref $self;
};
Note that this is a ShallowCopy+, not a DeepCopy+: clone() will return an object that holds additional references to things that the object being copied holds onto. If it were a DeepCopy+, the new copy would have it's own private copies of things. This is only an issue when the object being copied refers to other objects, perhaps delegating to them. A DeepCopy+ is a recursive copy. It requires that each and every object in this network implement ->clone(), though we could always fall back on reference sharing and fake it.
my $copy = { %$self };
%$self expands the hash reference, $self, into a hash. This is done in a list context, so all of the key-value pairs are expanded returned out - this is done by value, creating a new list. This happens in side of the { } construct, which creates a new anonymous hash. This is assigned to $copy. $copy will then be a reference to all of the same data as $this, The end result is a duplicate of everything in side of $self. This is the same thing as:
sub clone {
my $self = shift;
my $copy;
foreach my $key (keys %$self) {
$copy->{$key} = $self->{$key};
}
bless $copy, ref $self;
}
If we wanted to do a DeepCopy+, we could modify this slightly:
sub clone {
my $self = shift;
my $copy;
foreach my $key (keys %$self) {
if(ref $self->{$key}) {
$copy->{$key} = $self->{$key}->clone();
} else {
$copy->{$key} = $self->{$key};
}
}
bless $copy, ref $self;
}
This assumes that $self contains no hashrefs, arrayrefs, and so on - only scalar values and other objects. This is hardly a reasonable assumption, but this example illustrates the need for and implementation of recursion when cloning nested object structures. package FooFlyweight; my $objectCache; sub new { my $type = shift; my $value = shift; # just a scalar if(exists $objectCache->{$type}->{$value}) { return $objectCache->{$type}->{$value}; } else { my $this = { value => $value, moonPhase=>'full' }; bless $this, $type; $objectCache->{$type}->{$value} = $this; return $this; } }This example returns an object if we have one for that type and value. If not, it creates one, and caches it. An observant reader will note that if we cache objects, give it to two people, and one person changes it, the other will be affected. There are two solutions: pass out read-only objects, or preferably, use ImmutableObjects+.
package TinyNumberOb; sub new { my $type = shift; my $value = shift; # scalar value my $this = \$value; # scalar reference bless $this, $type; } sub getValue { my $self = shift; return $$self; } sub setValue { my $self = shift; $$self = shift; return 1; }This is kind of like Perl's Autovivication of variables and hash and array entries: things spring into existance at the moment a user asks for them.
$number->add(10);You'll write instead:
$number = $number->add(10);Other modules using the old $number can continue doing so in confidence, while every time you change yours, you get a brand new one all your own. If your class is a blessed scalar, your add() method might look like:
sub add {
my $me = shift;
my $newval = $$me + shift;
return bless \$newval, ref $me;
}
Returning new objects rather than changing ones that someone else might have a reference to avoids the problems of ActionAtADistance with pointers - so long as you're using variables which the correct scope to store the pointers. [Explain this - scope for pointers?] package Car::Factory; sub create_car { my $self = shift; my $passengers = shift; my $topspeed = shift; return new Car::Ford if $topspeed < 100 and $passengers >= 4; return new Car::Honda if $topspeed < 120 and $passengers <= 2; return new Car::Porsche if $topspeed > 160 and $passengers <= 2; # etc } # in main.pl: package main; use Car::Factory; my $car = Car::Factory->create_car(2, 175); $car->isa('Car') or die;To be ObjectOriented "pure", each kind of car should do push @ISA, 'Car', so that they pass the $ob->isa('Car') test required by TypeSafety. This lets programs know that it is a car (reguardless of kind) and can thus be used interchangably. See ObjectOriented, PolymorphismConcept, TypeSafety.
package Car::Factory; sub create_car { # this way we can do Car::Factory->create_car(...) or $carfactoryref->create_car(...) # see NewObjectFromExisting my $package = shift; $package = ref $package if ref $package; my $car = new Car::GenericAmericanCar; my $kind = ucfirst(shift()); push @{$kind.'::ISA'}, 'Car', 'Car::GenericAmericanCar'; return bless $car, 'Car::' . $kind; }There! No matter what kind of car the user requests, we create it - even if it didn't exist before we created it. We set the @ISA array to inherit from Car and Car::GenericAmericanCar. Even if the package was completely empty, it now contains the minimal amount of definition to make it useful: an inheritance. You probably don't want to do exactly this, unless you really want the same product rebadged with a bizarre variety of different names.
if ($topsped < 100 and $passengers >= 4) {
require Car::Ford;
return new Card::Ford ;
}
- WilCooley
my $factory = new FordFactory;
my $wifes_car = $factory->create_car();
$wifes_car->isa('Car') or die;
# later:
$factory = new ChevyFactory;
my $husbands_car = $factory->create_car();
$husbands_car->isa('Car') or die;
Code need not be concerned with where the cars come from, only that a Car materialize upon demand. Having a second source available for things is important. If there were only one auto manufacturer, a lot fewer people would be happy with their ride. Ralph Nader never would have won a law suit against them. The same goes for programs. Hacking up an entire program to change which implementation you use is undesireable. Sometimes you have an implementation you really want to get rid of. CategoryPattern, CategoryIntermediate
# Non ObjectOriented: my $parser = do { my $html; # HTML to parse my $tag; # name of the current HTML tag my $name; # name of current name=value pair we're working on my $namevalues; # hashref of name-value pairs inside of the current tag my $starttag = sub { if($html =~ m{\G(<!--.*?-->)}sgc) { return $starttag; } if($html =~ m{\G<([a-z0-9]+)}isgc) { $tag = $1; $namevalues = {}; return $middletag; } if($html =~ m{\G[^<]+}sgc}) { return $starttag; } return undef; }; my $middletag = sub { if($html =~ m{\G\s+}sgc) { return $middletag; } if($html =~ m{\G<(/[a-z0-9]*)>}isgc) { $name = $1; return $middlevalue; } if($html =~ m{\G>}sgc) { $namevalues->{$name} = 1 if $name; return $starttag; } return undef; }; my $middlevalue = sub { if($html =~ m{\G=\s*(['"])(.*?)\1}isgc) { $namevalues->{$name} = $1 if $name; return $middletag; } if($html =~ m{\G\s+}sgc) { return $middlevalue; } return $middletag; }; return sub { $html = shift; return $starttag; }; }; open my $f, 'page.html' or die $!; read my $f, my $page, -s $f; close $f; $parser = $parser->($page); $parser = $parser->() while($parser);Of course, rather than iterating through $parser and using it as a generator, we could blow the stack and make it do the recursive calls itself. In general, return $foo; would be replaced with return $foo->();.
my $ob = bless { color => 'yellow', size => 'large' }, 'GetAndSet';
Of course, we need to back this up with some implementation:
package GetAndSet;
sub AUTOLOAD {
my $this = shift;
(my $method) = $AUTOLOAD =~ m/::(.*)$/;
return if $method eq 'DESTROY';
(my $request, my $attribute) = $method =~ m/^([a-z]+)_(.*)/;
if($request eq 'set') {
$this->{$attribute} = shift;
return 1;
}
if(request eq 'get') {
return $this->{$attribute};
}
die "unknown operation '$method'";
}
Of course, this is considered BadStyle. You should always use ConstructorPattern. Okey, usually. /* * If you are going to copy this file, in the purpose of changing * it a little to your own need, beware: * * First try one of the following: * * 1. Do clone_object(), and then configure it. This object is specially * prepared for configuration. * * 2. If you still is not pleased with that, create a new empty * object, and make an inheritance of this objet on the first line. * This will automatically copy all variables and functions from the * original object. Then, add the functions you want to change. The * original function can still be accessed with '::' prepended on the name. * * The maintainer of this LPmud might become sad with you if you fail * to do any of the above. Ask other wizards if you are doubtful. * * The reason of this, is that the above saves a lot of memory. */- Comment as seen on core library objects in LPMud 2.4.5
package Parrot;
sub new {
my $type = shift;
my $me = { @_ };
bless $me, $type;
}
sub perch {
my $this = shift;
$this->{perch} = shift;
$this->{perch}->add_weight(38);
return 1;
}
sub squak {
print "Eeeeeeeeeeek!\n";
}
package Parrot::African;
use base 'Parrot';
sub squak {
print "EEEEEEEEEEEEEEEEEEEEEEEEK!\n";
}
package Parrot::Pining;
use base 'Parrot';
sub perch {
my $this = shift;
return SUPER::perch(@_) if $this->{at_fjords};
return undef;
}
sub squak {
my $this = shift;
return SUPER::squak(@_) if $this->{at_fjords};
return undef;
}
A call to squak() in a parrot is a notification that it should squak, or a request that it sqauk, never a garantee that a squak will be emitted.
# example of a switch style arrangement:
sub doCommand {
my $me = shift;
my $cmd = shift; $cmd->isa('BleahCommand') or die;
my $instr = $cmd->getInstructionCode();
if($instr eq 'PUT') {
# PUT logic here
} elsif($instr eq 'GET') {
# GET logic here
}
# etc
}
# example of a variable method call arrangement:
sub doCommand {
my $me = shift;
my $cmd = shift; $cmd->isa('BleahCommand') or die;
my $instr = $cmd->getInstructionCode();
my $func = "process_" . $instr;
return undef unless defined &$func;
return $func->($cmd, @_);
}
# example of a variable subclass arrangement.
# this assumes that %commandHandlers is set up with a list of object references.
sub doCommand {
my $me = shift;
my $cmd = shift; $cmd->isa('BleahCommand') or die;
my $insr = $cmd->getInstructionCode();
my $objectRef = $commandHandlers{$instr};
return $objectRef ? $objectRef->handleCommand($cmd, @_) : undef;
}
Since Perl offers AUTOLOAD, this idea could be emulated. If a package wanted to process an arbitrary and growing collection of commands to the best of its ability, it could catch all undefined method calls using AUTOLOAD, and then attempt to dispatch them (this assumes %commandHandlers is set up with a list of object references keyed by method name):
sub AUTOLOAD {
my $me = shift;
(my $methodName) = $AUTOLOAD m/.*::(\w+)$/;
return if $methodName eq 'DESTROY';
my $objectRef = $commandHandlers{$methodName};
return $objectRef ? $objectRef->handleCommand($methodName, @_) : undef;
}
This converts calls to different methods in the current object to calls to a handleCommand() method is different objects. This is an example of using Perl to shoehorn a Command Object pattern onto a non Command Object interface.
package Iterator;
sub hasNext { die; }
sub getNext { die; }
Other packages can come along and add Iterator to their @ISA list. They will need to redefine these methods. Now we have a uniform way of doing something. If a method in an object is expecting an Iterator as its argument, it has a way of checking to see if its argument really is an Iterator. It can be an Iterator and anything, else, too. This supports Type Safety. package SampleTree; sub new { my $type = shift; my $this = { @_ }; bless $this, $type; } sub getIterator { my $this = shift; return new Foo::Iterator node=>$this; } sub get_left { my $this = shift; return $this->{'leftNode'}; } sub get_right { my $this = shift; return $this->{'rightNode'}; } package SampleTree::Iterator; sub new { my $type = shift; my %opts = @_; my $this = {state=>0, iterator=>undef, node=>$opts{'node'}; bless $this, $type; } sub getNext { my $this = shift; my $result; if($this->{'iterator'}) { $result = $this->{'iterator'}->getNext(); } if($result) { return $result; } elsif($this->{'state'} == 0) { # try the left node $this->{'iterator'} = $this->{'node'}->get_left(); $this->{'state'} = 1; return $this->getNext(); } elsif($this->{'state'} == 1) { # try the right node $this->{'state'} = 2; $this->{'iterator'} = $this->{'node'}->get_right(); return $this->getNext(); } else { # state == 2 return undef; } }This [untested XXX] code allows a network of objects having the getIterator method to cooperatively and transparently work together. Each object in the network may have a different way of iterating. This example represents a tree datastructure. The tree may contain other tree nodes, array objects, queues, and so forth. As long the network consists of objects with a getIterator() method that returns an object that implements the Iterator iterface, we can crawl through the whole thing. Thats composition you can take to the bank and smoke!
# slurp everything into memory, then work on it:
open my $file, 'dataset.cvs' or die $!;
read $file, my $data, -s $file or die $!;
close $file;
foreach my $i (split /\n/, $data) {
# process
}
# process as we read:
my $process = sub {
# process
};
open my $file, 'dataset.cvs' or die $!;
while(my $record = <$file>) {
$process->($record);
}
close $file;
Returning all of the data from a get_ method fosters slurping everything into memory. This fosters programers which are limited by memory in how large of datasets they can work on. You can chuckle and say that virtual memory will take up the slack, but if I can tell you that there are a heck of a lot of multi terrabyte data warehouses kicking around the world. Dealing with data in place, where your storage is essentially at capacity at all times, or having multiple clients process a very large dataset in parallel demands efficiency. There are still a few applications for good programmers and a few applications for good programmers to write. package RecordReader; use ImplicitThis; @ISA = qw(Interface); sub new { my $type = shift; my $file = shift; open my $filehandle, $file or die $!; my $me = { handle => $filehandle, next => undef }; bless $me, $type; } sub getNext { return $next if defined $next; return <$handle>; } sub hasNext { return 1 if defined $next; $next = <$me>; if($next) { return 1; } else { close $fh; return 0; } }Compare this to Java's IO Filters, which will superimpose read of datastructures, international characters, and so forth on top of IO strems: you'll find it to be remarkably similar. It lets users mix and match IO processing facilities.
$a = "aaa"; $a++; print $a, "\n"; # prints "aab"See OverloadOperators for how to create constructs like this yourself in Perl according to this formula:
# note: no package statement use DBI; use CGI; use Mail::Sendmail;Back in the main program:
use config; my $userid = CGI::param('userid'); # etc...my variables are file-global when declared outside of any code blocks, which means that we can't easily declare lexical variables in config.pm and have them show up in the main program. We can co-opt the import() method of config.pm to create local variables in the main program, though:
# back in config.pm:
my %config = (
maxusers => 100,
retriespersecond => 2,
loglevel => 5
);
sub import {
my $caller = caller;
foreach my $i (keys %config) {
local ${$caller.'::'.$i};
*{$caller.'::'.$i} = $config{$i};
}
}
This will atleast squelsh any warnings Perl would otherwise emit and let us return to importing configuration dependent values from a configuration file.
package Preferences;
sub new {
my $class = shift;
my %args = @_;
bless {color=>$args{'color'}, petname=>$args{'petname'}, street=>{'street'} }, $class;
}
sub query_color { return $_[0]->{'color'}; }
sub set_color { return $_[0]->{'color'} = $_[1]; }
# other accessors here
1;
package main;
$| = 1;
print "Whats your favorite color? "; my $color = <STDIN>;
print "Whats your pets name? "; my $petname = <STDIN>;
print "What street did you grow up on? "; my $street = <STDIN>;
my $foo = new Preferences (color=>$color, petname=>$petname, street=>$street);
The string "color" appears ten times. Ten! In Perl, no less. If I wrote out the constructors for the other arguments, this would be repeated for each variable. Shame. If we trust the user to pass in the right things to the constructor, we can get rid of two. Still, even typing each thing eight times is begging for a typo to come rain on your parade.
package main;
$| = 1;
sub get_preferences {
print "Whats your favorite color? "; my $color = <STDIN>;
print "Whats your pets name? "; my $petname = <STDIN>;
print "What street did you grow up on? "; my $street = <STDIN>;
return MessageMethod sub {
my $arg = shift;
({
query_color => sub { return $color; }
set_color => sub { $color = shift; return 1; }
# etc
}->{$arg} || sub { die "Unknown request: $arg" })->(@_);
};
}
my $ob = get_preferences();
print $ob->get_street(), "\n";
First, the { query_name => sub { } }->{$arg}->(@_) is a sort of switch/case statement. It creates an anonymous hash of names to functions, then looks up one of the functions by name, using the first argument passed in. Once we have that code reference, we execute it and pass it our unused arguments. Then we've added a default case to it, so we don't try to execute undef as code. This could have been coded using if/elsif/else just as easily. package MessageMethod; sub new { my $type = shift; return $type->new(@_) if ref $type eq __PACKAGE__; my $ref = shift; ref $ref eq 'CODE' or die; bless $ref, $type; } sub AUTOLOAD { my $me = shift; (my $method) = $AUTOLOAD =~ m/::(.*)$/; return undef if $method eq 'DESTROY'; return wantarray ? ($me->($method, @_)) : scalar $me->($method, @_); } 1;Given a code reference, MessageMethod blesses it into its own package. There are no methods aside from new() and AUTOLOAD(). AUTOLOAD() handles undefined methods for Perl, and since there are no methods, it handles all of them. (There is an exception to that, where new() has to pass off requests). AUTOLOAD() merely takes the name of the function it is standing in for and sends that as the first argument to a call to the code reference, along with the rest of the arguments. We're translating $ob->foo('bar') into $ob->('foo', 'bar'). This does nothing but let us decorate our code reference with a nice OO style syntax.
# place this code in hashclosure.pm
# tell Perl how to find methods in this object - run the lambda closures the object contains
sub AUTOLOAD {
(my $method) = $AUTOLOAD =~ m/::(.*)$/;
return if $method eq 'DESTROY';
our $this = shift;
if(! exists $this->{$method}) {
my $super = "SUPER::$method";
return $this->$super(@_);
}
$this->{$method}->(@_);
}
1;
This code translates method calls into invocations of anonymous subroutines by the same name inside of a blessed hash: when a method is called, we look for a hash element of that name, and if we find it, we execute it as a code reference.
title: "Dispatch Order"
color: lightcyan
manhattan_edges: yes
edge.color: lilac
scale: 90
node: { title:"A" label: "$foo = new Foo(); \n$foo->bar();" }
node: { title:"A1" label: "Foo::new()" }
node: { title:"B" label: "Foo::AUTOLOAD()" }
node: { title:"C" label: "$foo->{'bar'}->() runs" }
edge: { sourcename:"A" targetname:"A1" anchor: 1}
edge: { sourcename:"A" targetname:"B" anchor: 2}
edge: { sourcename:"B" targetname:"C" }
}
package Foo;
sub new {
my $class = shift;
my %args = @_;
our $this;
my $foo;
my $bar;
bless {
get_foo => sub { return $foo },
set_foo => sub { $foo = shift },
get_bar => sub { return $bar },
set_bar => sub { $bar = shift },
get_foo_bar_qux => sub {
return $this->get_foo(), $this->get_bar(), get_qux();
},
dump_args => sub {
foreach my $i (keys %args) {
print $i, '=', $args{$i}, "\n";
}
},
}, $class;
}
sub get_qux { return 300; }
This blesses an anonymous hash reference into our package, Foo. This hash reference contains method names as keys and anonymous subroutines as values. AUTOLOAD() knows how to look into our hash and find methods by name, and run them, rather than looking for methods in the normal place.
my $class = new class sub{
my $field = shift;
$this->field = $field;
$this->arrayref = [1,2,3];
$this->hashref = {a => b, c => d};
$this->method = sub{ return $this->field };
};
...allowing the anonymous, inline construction of classes. package BaseballPlayer::Pitcher; { use vars '@ISA'; @ISA = 'BaseballPlayer'; my (%ERA, %Strikeouts); sub ERA : lvalue {$ERA {+shift}} sub Strikeouts : lvalue {$Strikeouts {+shift}} sub DESTROY { my $self = shift; delete $ERA {$self}, $Strikeouts {$self} } }Taking this apart, lexical data is used instead of nametable variables, which doesn't seem to make any difference. Rather than indexing the blessed reference by a constant field name to come up with a per-object, per-field storage slot, one of these lexicals is indexed by the stringified object reference.
package Constrain;
# component - anonymous functions that exert force on each other.
# these are generated by various functions, much as an
# object in OO Perl would be created.
sub new {
my $type = shift;
my $subtype = shift;
return new Constrain::Adder(@_) if $subtype eq 'adder';
return new Constrain::Constant(@_) if $subtype eq 'constant';
return new Constrain::Probe(@_) if $subtype eq 'prober';
return new Constrain::Connector(@_) if $subtype eq 'connector';
warn "Unknown Constrain subtype: $subtype";
}
package Constrain::Adder;
sub new {
my $type = shift;
my $a1 = shift; # the name of our first connector
my $a2 = shift; # the name of 2nd connector we are tied to
my $sum = shift; # the name of 3rd connector we are tied to
my $obj = { a1=>$a1, a2=>$a2, sum=>$sum };
bless $obj, $type;
$a1->xconnect($obj);
$a2->xconnect($obj);
$sum->xconnect($obj);
return $obj;
}
sub forgetvalue {
my $this = shift;
$a1->forgetvalue($obj);
$a2->forgetvalue($obj);
$sum->forgetvalue($obj);
$this->set_value(undef);
}
sub setvalue {
my $this = shift;
local *a1 = \$this->{a1};
local *a2 = \$this->{a2};
local *sum = \$this->{sum};
if($a1->hasvalue() and $a2->hasvalue()) {
$sum->setvalue($a1->getvalue() + $a2->getvalue(), $this);
} elsif($a1->hasvalue() and $sum->hasvalue()) {
$a2->setvalue($sum->getvalue($sum) - $a1->getvalue($a1), $this);
} elsif($a2->hasvalue() and $sum->hasvalue()) {
$a1->setvalue($sum->getvalue() - $a2->getvalue(), $this);
}
}
sub dump {
my $this = shift;
local *a1 = \$this->{a1};
local *a2 = \$this->{a2};
local *sum = \$this->{sum};
print("a1 has a value: ", $a1->getvalue(), "\n") if $a1->hasvalue();
print("a2 has a value: ", $a2->getvalue(), "\n") if $a2->hasvalue();
print("sum has a value: ", $sum->getvalue(), "\n") if $sum->hasvalue();
}
package Constrain::Constant;
sub new {
my $type = shift;
my $value = shift; # our value. we feed this to anyone who asks.
my $connector = shift; # who we connect to.
my $obj = { value => $value, connector => $connector };
bless $obj, $type;
$connector->xconnect($obj);
$connector->setvalue($value, $obj);
return $obj;
}
sub setvalue {
my $this = shift;
my $value = shift;
$this->{connector}->setvalue($value, $this);
}
sub getvalue {
my $this = shift;
return $this->{value};
}
package Constrain::Probe;
sub new {
my $type = shift;
my $connector = shift;
my $name = shift;
my $obj = { connector => $connector, name => $name };
bless $obj, $type;
$connector->xconnect($obj);
return $obj;
}
sub setvalue {
my $this = shift;
my $name = $this->{name};
print "Probe $name: new value: ", $this->{connector}->getvalue(), "\n";
}
sub forgetvalue {
my $this = shift;
my $name = $this->{name};
print "Probe $name: forgot value\n";
}
package Constrain::Connector;
sub new {
my $type = shift;
my $obj = { informant=>undef, value=>undef, dontreenter=>0, constraints=>[] };
bless $obj, $type;
}
sub hasvalue {
my $this = shift;
return $this->{informant};
}
sub getvalue {
my $this = shift;
return $this->{value};
}
sub setvalue {
my $this = shift;
local *constraints = \$this->{constraints};
my $newval = shift;
my $setter = shift or die;
return if $this->{dontreenter}; $this->{dontreenter} = 1;
$this->{informant} = $setter;
$this->{value} = $newval;
foreach my $i (@$constraints) {
$i->setvalue($newval, $this) unless $i eq $setter;
}
$this->{dontreenter} = 0;
}
sub forgetvalue {
my $this = shift;
local *constraints = \$this->{constraints};
my $retractor = shift;
if($this->{informant} eq $retractor) {
$this->{informant} = undef;
foreach my $i (@$constraints) {
$i->forgetvalue($this) unless $i eq $retractor;
}
}
}
sub xconnect {
my $this = shift;
local *constraints = \$this->{constraints};
local *value = \$this->{value};
my $newconstraint = shift or die;
push @$constraints, $newconstraint;
$newconstraint->setvalue($value, $obj) if $value;
}
package main;
my $a = Constrain::Connector->new();
my $a_probe = Constrain::Probe->new($a, 'a_probe');
my $b = Constrain::Connector->new();
my $b_probe = Constrain::Probe->new($b, 'b_probe');
my $c = Constrain::Connector->new();
my $c_probe = Constrain::Probe->new($c, 'c_probe');
my $a_b_adder = Constrain::Adder->new($a, $b, $c);
my $a_const = Constrain::Constant->new(128, $a);
my $b_const = Constrain::Constant->new(256, $b);
XXX - constraint system example - IK system using X11::Protocol?
*{'ExistingPackage::new_function} = sub {
# new accessor
};
sub ExistingPackage::new_function {
# new accessor
};
Any object created from ExistingPackage+ will instantly have a method, new_function(), after this code is run. Both examples do essentially the same thing. The first is uglier, but allows closures to be taken. Perl still considers the new function to be the package it was defined in [B::Generate example to change newstate ops to fix this.]. This means that we can't use lexical data that was in scope when ExistingPackage+ was originally created, nor can we use UseVars+ and OurVariables+ that exist in ExistingPackage+. sub ExistingPackage::new_function { my $self = shift; local *existing_var = \${ref($self) . '::existing_var'}; # code here that uses $existing_var freely, as if it were in # out package scope. $existing_var++; }The local *glob = selfref idioms is, well, ugly. We compute the name of the variable - find the package that $self was blessed into, concatonated with "::existing_var", and then used as a soft reference. A reference is then taken to that soft reference using the backslash operator. See ComputedReferences.
local $ExistingPackage::new_variable;$new_variable will be static - individual objects won't have their own copy. See StaticVariables. This is usually not the desired result.
do {
my $oldnew = \&ExistingPackage::new;
*ExistingPackage::new = sub {
my $self = $oldnew->(@_);
$self->{new_variable} = compute_value();
$self;
};
};
This defines a new() routine in ExistingPackage+ that invokes the old new() routine using the reference we saved in $oldnew. This reference is passed all of the arguments given to the replacement new() routine. This assumes that the datastructure underlieing objects defined by ExistingPackage+ is a hash reference: $self->{new_variable} would need to be changed to something similar to $self->[num] if it were an array. compute_value() is a place holder for whatever logic you really want to do. We insert this value forcefully, disreguarding AccessorsPattern. Finally, we return the modified $self. The return operator breaks the tieing on perl 5.6.1 and perhaps later, so we just let the last value of the block fall through. package DBI::Record; my $foreign_keys = {}; sub import { # read foreign key information # translates a foreign column name to a table to its table # $foreign_keys{'FooID'} = 'Foo'; while(my $i = shift) { $foreign_keys{$i} = shift; } } sub new { my $type = shift; $type = ref $type if ref $type; my $me = { }; my $usage = 'usage: new DBI::Record $dbh, $sql | ($sth, $sth->fetchrow_hashref())'; my $dbh = shift; ref $dbh or die $usage; my $rs = shift; my $sth; my $sql; die $usage unless @_; if(ref $_[0]) { $sth = shift; $rs = shift or $rs = $sth->fetchrow_hashref(); } else { $sql = shift; $sth = $dbh->prepare($sql); $sth->execute(); $rs = $sth->fetchrow_hashref(); } $me->{'database_handle'} = $dbh; $me->{'record_set'} = $rs; $me->{'statement_handle'} = $sth; # generate accessors foreach my $i (keys %$rs) { *{$i} = sub { my $me = shift; my $sth = $dbh->prepare("select * from $foreign_keys{$i} where $i = $rs->{$i}"); $sth->execute(); my $newrs = $sth->fetchrow_hashref; return $me->new($dbh, $newrs, $sth); } } bless $me, $type; } sub next { my $me = shift; my $sth = $me->{'statement_handle'} or return undef; my $newrs = $sth->fetchrow_hashref() or return undef; return $me->new($me->{'database_handle'}, $sth, $newrs); } package main; use DBI::Record CustomerID => Customers, BillID => Bills; use DBI; my $dbh = DBI->connect("DBI:Pg:dbname=geekpac", 'ingres', '') or die $dbh->errstr; my $customer = new DBI::Record $dbh, "select * from Users limit 1"; my $bill = $customer->BillID(); while($bill) { print $bill->{'BillID'}, " ", $bill->{'Amount'}, "\n"; $bill = $bill->next(); }This makes it easy to navigate relationships in a relational database system, but doesn't do a lot for us in the way of reporting.
select self1 as foo, self2 as bar
from self as self1,
self as self2
where self1.name = self2.param
Note how the table self is being joined against the table self. This is where the name comes from.
foreach my $i (keys %hash) {
if(exists $hash{$i} and exists $hash{$hash{$i}}) {
push @results, [$i, $hash{i}, $hash{$hash{$i}}];
}
}
Ugly, slow, crude, effective. People have been known to write code generators and SQL generators when faced with degenerate cases like these that automate ugliness production. I guess you could categories this as an AntiPattern in the form of a CodeSmell.
select p1.value as email,
p2.value as name,
p3.value as gender
from form
parameters as p1,
parameters as p2,
parameters as p3
where form.formid = ?
and p1.formid = form.formid
and p1.name = 'email'
and p2.formid = form.formid
and p2.name = 'name'
and p3.formid = form.formid
and p3.name = 'gender'
Each additional field requires 4 additional lines in our query. If we were joining the additional tables in, it would take 2:
select emails.email as email,
names.name as name,
genders.gender as gender
from forms, emails, names, genders
where forms.formid = ?
and forms.nameid = names.nameid
and forms.emailid = emails.emailid
and forms.genderid = gender.genderid
Obviously, lumping everything in one table would simplify further in this case, and in this case would be perfectly acceptable. When not all of the columns describe the primary key and only the primary key, the database design degenerates. SelfJoiningData usually comes about as a means to cope with trying to report on such degenerate databases.
<TakeFive> Juerd: I think I'm going to go with multiple tables after all.
It will save me headaches in the future.
And I can pull them (assuming the 'header' record is in %h) with:
"select value from $h{datatype} where id = $h{id} order by sequence"
<Juerd> subqueries, subqueries, subqueries, joins.
subqueries, subqueries, subqueries, joins.
<TakeFive> :)
<Juerd> Ideally, you don't use a query just to have enough information to do the next
<scrottie> Except for meta applications like database admins, usually you
don't want a variable table name.
<Juerd> That is correct.
Same goes for column names.
<Juerd> TakeFive: Think symbolic references
<scrottie> If all of your things are of essentially of the same type, put atleast
the parts of them that describe the primary key in one table. You
can always OutterJoin a lot of other tables, so you get kind of an
ObjectOriented like thing going on - everything "is a" foo, but
you have some MixIns going on as well.
Not that MixIns are encouraged in OO, but it is kind of the same idea.
<TakeFive> scrottie: the problem is (going back to the oceanographic implementation)
right now, with the dataset I have, all the actual data is floating
point numbers --
<scrottie> i've always said we should dump our problems in the ocean =)
<TakeFive> salinities, depths, current speeds and such.
but now i've been told i need to support character fields,
latitude/longitude pairs, and timestamps, and ultimately, I'll
need to be able to generate pictures of buoys as they float, or
purely text output.
If I use a single table, I'll always have to check what kind of data
I have.
<scrottie> n:1 relationships break out into another table,
so if you have a bunch of buoys for one given primary record (what
is the primary record anyway?), then throw them all in another
table.
If you have an arbitrary number of other things of types that you can't
anticipate, you could promote everything to the same object type
and allow recurisve references between objects ;)
Perlers tend to write databases like that... like perlmonks's
codebase... but it is best not to talk of such things
<Juerd> It's called Everything
<scrottie> If you have a lot of different things, you can set up an attribute-value
pairs table. Think of HTML forms. Someone posts a form. That
gets a record in a Posts table, lets say. it has a bunch of name value
pairs. Each of those gets a record in the Attribute table,
where each record references the Posts table entry.
<TakeFive> scrottie: ah, add a column like: "datatype" for each record.
<scrottie> Yeah. You lose the ability to cleanly join at that point -
everything is nested subqueries with another self-join for each
(lag) record you want from Attributes. ugly. So, that way is
sometimes - seldom but sometimes - better.
<Juerd> subqueries, subqueries, subqueries, joins.
<scrottie> Well, the value in the attribute-value pair will always be the
largest thing - if you're holding binary data, it will be a blob.
Few databases index blobs.
<scrottie> You probably don't want SelfJoiningData, and you don't want to
promote all records to the same type. That leaves creating a lot
of tables, one for each type of thing, and doing a lot of joins
and OutterJoins. It kind of sucks, but it is powerful, and a lot
less ugly in the end than any alternative.
The relation between tables is based purely on references
between fields. Never list table names in a database as a means
of creating references.
Juerd is right. Use lots of joins and subqueries to pull your data together from multiple different tables. As Juerd says, ideally, you should get your result in one query. Use only IDs in auxiliary tables or HingeTables+. You can easily create more auxiliary tables and reference the primary table from them. Only queries that want this information will know about and know to ask for it to be joined in. [This conversion essentially took place. However, it has been edited, and everyone and their dog has access to Wiki, so it should be considered ficticious at this point. Characters are real, words are not.] select count(*) as isDongle from Product, Category, ProductToCategory where Product.ProductID = ProductToCategory.ProductID and ProductToCategory.CategoryID = Category.CategoryID and Category.Name = 'Dongle'This query returns the number of dongles in the database. Replacing count(*) with a specific field list would return details of each dongle.
my $output = new Output; my $backend = new Backend($output); $output->set_backend($backend);Or:
my $output = new Output($this);Refactor as a:
my $output = new Output; my $backend = new Backend($output->get_backend_adapter()); $output->set_backend($backend->get_output_adapter());Or...
my $output = new Output($this->get_output_adapter());ModelViewController
sub foo {
my %args = @_;
my $color = $args{color};
my $number = $args{number};
# ...
}
foo(color=>'red', number=>13);
The || operator lets you easily provide defaults and error checking:
sub foo {
my %args = @_;
my $color = $args{color} || 'red';
my $number = $args{number} || die 'number=> paramer required';
# ...
}
Or, you may explicitly list the argument names and defaults, providing a self-documenting framework:
sub foo {
my %args = (
Arg1 => 'DefaultValue',
Blah => 42,
Bleh => 60*60*24,
Hostname => undef,
@_
);
# Handle error-checking here
defined $args{Hostname} or die 'Required parameter "Hostname" not defined';
}
See Also
my $context = {
increment => sub { my $context = shift; $context->{sum}++; return ''; },
currentvalue => sub { my $context = shift; return $context->{sum}; }
};
sub expand_macros {
my $context = shift;
my $text = shift;
my $macro = qr{([A-Z][A-Z0-9]{2,})};
$text =~ s/$macro/$context->{lc($1)}->($context)/ge;
return $text;
}
expand_macros($context, "INCREMENT INCREMENT The current value is: CURRENTVALUE");
This is fairly strightfoward: We can pass $context and some text containing the macros "INCREMENT" and "CURRENTVALUE" to expand_macros(), and the macros will increment the current value of $context->{sum} and return the value. This is a simple template parser that finds escapes in text and replaces them with the result of a peice of code passed in through a hash. However, since we're maintaing our context in a hash reference, we can do this recursively:
$context->{doubleincrement} = sub {
my $context = shift;
expand_macros($context, "INCREMENT INCREMENT CURRENTVALUE");
}
expand_macros($context, "The current value is: DOUBLEINCREMENT");
Maintaining state in a hashref rather than the symbol table only requires us to be vigilent in passing the hash ref around. We have access to the updated state in the hashref after evaluation has finished. We can take this same context and pass it off again. In our example, we could template something else, reusing our same state and collection of macro definitions.
# defining our mini language:
# format of our macro escapes. returns the name of the macro.
$macro = qr{([A-Z][A-Z0-9]{2,})};
sub fetchvalue() {
my $symbol = lc(shift());
my $ob = shift;
return $ob->{$symbol} if defined $ob->{$symbol};
return $symbol->($ob) if defined &{$symbol}; # if its available as a function, recurse into it
return $$symbol; # assume its a scalar
}
sub createtemplate {
my $name = shift;
my $text = shift;
*{$name} = sub {
my $ob = shift;
my $text = $text; # private copy, so we don't ruin the original
$text =~ s{$macro}{ fetchvalue($1, $ob); }oges;
return $text;
};
}
sub createquery {
my $name = shift; # name of function to create
my $sql = shift; # query this function will execute
my $inner = shift; # name of function to call with each result, optional
my @queryargs; $sql =~ s{('?)$macro\1}{push @queryargs, lc($2);'?'}oges;
my $sth = $dbh->prepare($sql, @queryargs);
*{$name} = sub {
my $ob = shift;
my $row;
my $ret;
$sth->execute(map { fetchvalue($1, $ob); } @args);
my @names = @{$sth->{'NAME'}};
while($row = $sth->fetchrow_arrayref()) {
# store each item by its column name
for(my $i=0;$i < @names; $i++) {
$ob->{$names[$i]} = $row->[$i];
}
# if we're supposed to send each record to someone, do so.
$ret .= $inner->($ob) if($inner);
}
$sth->finish();
return $ret;
};
}
# writing code in our mini language:
createquery('readnames', qq{
select Name as name from Users where Name is not null
});
createquery('readnumberbyageinstate', qq{
select count(*) as number, Age as agearoup
from Users where State = STATE
group by Age
}, 'drawbargraph');
createtemplate('drawbargraph', qq{
<div align="left"><img src="reddot.png" height="20" width="NUMBER"></div>
});
print readnames();
print readnumberbyageinstate({state=>'MD'});
Lets take a look at what we've factored out in this example:
print createquery($readnumberbystatesql, {drawpiechart => createpiechart() }, 'drawpiechart');
It is traditional in languages like Lisp and Scheme to skip naming things unless actually necessary.
eval {
run_query();
};
if($@) {
$dbh = DBI->connect("DBI:Pg:dbname=blog;host=localhost;port=5432", 'scott', 'foo');
run_query();
}
See Also
my $lock;
sub notify_all {
if($lock) {
warn "Don't respond to an event with an event!";
$lock++;
}
foreach my $listener (@listeners) {
$listener->send_event(@_);
}
$lock = 0;
}
In most cases, it is never an error to be called back by the object that you just called. Some times re-entry isn't an error at all, and you can silently refuse it. ConstraintSystem uses this idea to propogate values across a network where some nodes are willing to budge and others aren't. Usually this manifests as a list of notification recipients that receive a notification, and one needs to send yet another notice to all of them except the sender of the original message, but doesn't happen to know which originated. This situation crops up with the Gnutella protocol, where nodes replay messages to every peer except the originating one, but the mesh of connections can cause the message to be accidentally routed to the originator anyway. Simpily tracking which messages you originated yours and ignoring requests to forward them again pervents a condition where a host transmits the same message out onto the net over and over.
sub notify_all {
if($testing) {
# never do this in production code!
my $calldepth = 0;
$callerdepth++ while(caller($calldepth));
die "arbitrary limit exceeded: stack depth too deep, possible runaway recursion detected"
if $callerdepth > 100;
}
foreach my $listener (@listeners) {
$listener->send_event(@_);
}
}
Recursion and Locking on User Data
# expand includes in HTML templates
# eg, <!-- #include virtual="/includes/header_logo.html" -->
my $numfound;
FOUNDSOME:
$numfound = 0;
$tmpl =~ s{<!--\s*#include\s*virtual\s*=\s*('")(.*?)\1\s*-->}{
die "invalid include path: '$2'" if $2 =~ m{\/\.\./\/};
open my $f, "$inputdir/$2" or die "include not found: $inputdir/$2 $!";
read $f, my $repl, -s $f;
close $f;
$numfound++;
return $repl;
}gie;
goto FOUNDSOME if($numfound);
This would run indefinately (if permitted by the universe) if a user tried the A includes B, B includes A attack. Preventing reentry into some method wouldn't work. If we created a method, we would need to be able to reenter it to include more than one file deep. Of course, we could make it non-recursive, but it wouldn't do TheRightThing. Things that seem like they should work, don't.
my $numfound;
my %done; # added this
FOUNDSOME:
$numfound = 0;
$tmpl =~ s{<!--\s*#include\s*virtual\s*=\s*('")(.*?)\1\s*-->}{
die "invalid include path: '$2'" if $2 =~ m{\/\.\./\/};
die "file '$2' included entirely too many times" if $done{$2}++ > 30; # added this
open my $f, "$inputdir/$2" or die "include not found: $inputdir/$2 $!";
read $f, my $repl, -s $f;
close $f;
$numfound++;
return $repl;
}gie;
goto FOUNDSOME if($numfound);
Another solution is to maintain a stack, perhaps a SimpleStack, and continiously examine it for repeated sequences. Such attempts are prone to occurance of a RaceCondition, and there is usually an upper limit on how large of the stack segment it will compare to the rest of the stack. For example, if the code only checks for repeated patterns of two through 300 stack frame entries, someone need only create a circulation inclusion attack that 301 pages. D'oh!
my $shbit = 1 << fileno($sh);
my $sibit = 1 << fileno($si);
my $inbitmask = $shbit | $sibit;
# select(readtest, writetest, exceptiontest, max wait)
select($inbitmask, undef, undef, 0);
if($inbitmask & $shbit) {
# $sh is ready for read
}
if($inbitmask & $sibit) {
# $si is ready for read
}
Done in a loop, several sources of input - perhaps the network, a GUI interface, pipes connected to other processes - could all be managed. The last argument to select() is typically 0 or undef, though it is sometimes other numbers. If it is undef, select() will wait indefinately for input. If it is 0, select() will return immediately, input ready or not. Any other number is a number of seconds to wait, floating point numbers accepted. As soon as a any monitored input or output handle becomes ready, select() will return. select() doesn't return a value in the normal sense: it motifies the bit mask, turning off any bits that correspond to fileno() bit positions that aren't ready. Each bit that we set must be tested to see if it is still on. If it is, that filehandle is ready for read or write. Filehandles that we want to monitor for read are passed as a bitmask in the first argument position of select(). The second argument of select() is the filehandles to monitor for write, and the third, for exceptions.
if($inbitmask & $sibit) {
$si->process_input();
}
Filehandles may be blessed into classes [a subclass of IO::Handle would be an intelligence choice], and then methods called to handle the event where input becomes available for read. This is easy to implement, simple, and sane - to implement. Using it is another story. package IO::Network::GnutellaConnection; use base 'IO::Handle'; sub process_input() { my $self = shift; $self->read(...); }Each access must promptly return for other handles to be served. This is a big requirement. Unheaded, a user interface could repeatedly cause network traffic to time out, or one unresponsive process reading on a pipe to lock up the process writing on the pipe - see PerlGotchas for more. These cases are more numerous and insideous than thread CPU starvation issues.
use Socket;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);
use POSIX qw(:errno_h :fcntl_h);
my $proto = getprotobyname('tcp');
socket($server, PF_INET, SOCK_STREAM, $proto) or die "socket: $!";
setsockopt($server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1)) or die "setsockopt: $!";
bind($server, sockaddr_in($port, INADDR_ANY)) or die "bind: $!";
listen($server, SOMAXCONN) or die "listen: $!";
# non blocking listens:
fcntl($client, F_SETFL, fcntl($server, F_GETFL, 0) | O_NONBLOCK) or die "fcntl: $!";
while(1) {
my $paddr = accept($client, $server);
(my $remoteaddress, my $remoteport) = sockaddr_in($paddr);
my $remotehostname = gethostbyaddr($iaddr,AF_INET);
}
XXX - very dubious, could be written cleaner, probably doesn't work.
package Xfor;
sub new {
my $pack = shift;
my $filecache; # holds all of the name->value pairs for each item in each file
my $buffered; # same format: data to write to file yet
bless {
# open a flatfile database. create it if needed.
open => sub {
my $fn = $_[0];
unless(-f $fn) {
open F, '>>'.$fn or return 0;
close F;
}
$self->openorfail($fn);
},
# open a flatfile database. fail if we are unable to open an existing file.
openorfail => sub {
my $file = shift; # which file the data is in
open my $f, $file or die $!;
my $k; my $v;
while(<$f>) {
chomp;
%thingy = split /\||\=/, 'key='.$_;
while(($k, $v) = each %thingy) {
$filecache->{$file}->{$thingy{'key'}}->{$k} = $v;
}
}
close $f;
return 1;
},
# fetch a value for a given key
get => sub {
my $file = shift; # which file the data is in
my $thingy = shift; # which record in the file - row's primary key
my $xyzzy = shift; # which column in that record
$logic->openflatfile($file) unless(exists $filecache->{$file});
return $filecache->{$file}->{$thingy}->{$xyzzy};
},
keys => sub {
my $rec = $filecache;
while(@_) {
$rec = $rec->{$_[0]}; shift;
}
if(wantarray) {
keys %{$rec};
} else {
$rec;
}
},
set => sub {
my $file = shift; # which file the data is in
my $thingy = shift; # which record in the file - row's primary key
my $x = shift; # which column in that record
my $val = shift; # new value to store there
$filecache->{$file}->{$thingy}->{$x} = $val;
$buffered->{$file}->{$thingy}->{$x} = $val;
1;
},
close => sub {
my $file = shift; # which file the data is in
my $thingy; # which record in the file - row's primary key
my $x; # which column in that record
my $val; # new value to store there
my $line; # one line of output to the file
open my $f, '>>'.$file or die "$! file: $file";
foreach $thingy (keys %{$buffered->{$file}}) {
$line = $thingy;
foreach $x (keys %{$buffered->{$file}->{$thingy}}) {
$line .= '|' . $x . '=' . $buffered->{$file}->{$thingy}->{$x};
}
print F $line, "\n";
}
$buffered->{$file} = ();
close $f;
},
recreate => sub {
my $file = shift; # which file the data is in
my $thingy; # which record in the file - row's primary key
my $x; # which column in that record
my $val; # new value to store there
my $line; # one line of output to the file
open my $f, ">$file.$$" or die "$! file: $file.$$";
foreach $thingy (keys %{$filecache->{$file}}) {
$line = $thingy;
foreach $x (keys %{$filecache->{$file}->{$thingy}}) {
$line .= '|' . $x . '=' . $filecache->{$file}->{$thingy}->{$x};
}
print $f $line, "\n";
}
close F;
rename "$file.$$", $file or die "$! on rename $file.$$ to $file";
},
} , $pack;
}
To use, do something like:
use Xfor;
my $hash = new Xfor;
$hash->open('carparts.nvp');
# read:
$hash->get('carparts.nvp', 'xj-11', 'muffler'); # which muffler does the xj-11 use?
# write:
$hash->set('cartparts.nvp', 'xj-11', 'muffler', 'c3p0');
# then later:
$hash->close('carparts.nvp');
# or...
$hash->recreate('carparts.nvp');
Xfor.pm reads files from beginning to end, and goes with the last value discovered. This lets us write by kind-of journeling: we can just tack updated information on to the end. we can also regenerate the file with only the latest data, upon request. Since we read in all data, we're none too speedy. Reading is as slow as Storable or the like, but writing is much faster.
# go out of our way to include sid=$sid:
print qq{<a href="otherprog.cgi?foo=bar&color=red&sid=$sid">Go To Otherprog</a>};
print qq{
<form action="anotherprog.cgi" method="post">
<input type="hidden" name="sid" value="$sid">
Enter answer: <input type="text" name="answer"><br>
<input type="submit">
</form>
};
Forgetting to do this in even one link or form causes the site to forget any and all information about a user as soon as they click it. Additionally, since the sessionid is part of the HTML, it lives on in the browser cache. For this reason, session id tokens should be expired after a period of time by the server. This means having the server simply record the date that it issued a session id number and refusing to honor it after a period of time has elapsed, forcing the user to re-login.
$oOo =~ s/<(a|frame)([^>]*) (src|href)=(['"]?)(?!javascript)([^'"]+\.htm)(l)?(\?)?([^'">]*)?\4(?=\w|>>)/<$1$2 $3="$5$6\?$8\&sid=$sid"/ig;
# $1: 'a' or 'frame'
# $2: any random name=value pairs (exa 'name="mainFrame"')
# $3: 'src' or 'href'
# $4: any begin qouting character, be it ' or "
# $5: whatever.htm
# $6: optional 'l'
# $7: optional '?' (discarded)
# $8: optional cgi get string
# $9: 0-width lookahead assertion: > or space isn't matched but is looked for
# Sample validateuser.pm: use CGI; use CGI::Carp qw/fatalsToBrowser/; use DBI; use lib "/home/scott/cgi-bin/DBD"; BEGIN { $dbh = DBI->connect("DBI:Pg:dbname=sexcantwait;host=localhost;port=5432", 'scott', 'pass') or die $DBI::errstr; } use TransientBaby::Postgres; use TransientBaby; createquery('validateuser', qq{ select UserID as userid from Users where Name = [:username:] and Pass = [:userpass:] }); sub validated { $userid = -1; my $sid=CGI::cookie(-name=>"sid"); return 0 unless $sid; ($username, $userpass) = split /,/, $sid; validateuser(); return $userid == -1 ? 0 : 1; } sub is_guest { return $username =~ /^guest/; } sub offer_login { print qq{ Sorry, you aren't logged in. Please enter your name and password:<br><br> <form action="login.cgi" method="post"> <input type="hidden" name="action" value="login"> User name: <input type="text" name="username"><br> Password: <input type="password" name="password"><br> Are you a new user? <input type="checkbox" name="newuser"><br> <input type="submit" value="Log in"><br> </form> }; exit 0; } 1;Instead of declaring a package and using Exporter, we're merely continuing to operate in the namespace of the module that invoked us. The methods we define - validated(), validateuser(), offer_login() and is_guest() show up in their package, ready for use. As a side effect, we're using CGI.pm and DBI.pm on behalf of our caller, letting us list all of the modules we want in only one place, rather than in every .cgi script. This module could be used with:
print qq{Content-type: text/html\n\n};
use validateuser;
validated() or offer_login();
# rest of the script here, for users only
offer_login() never returns once we call it. It handles exiting the script for us.
#!/usr/bin/perl
# example login/create user script that uses validateuser.pm.
# this should be named login.cgi to match the form in validateuser.pm, unless of course
# that form's action is changed.
use validateuser;
createquery('userexists', qq{
select count(*) as num
from Users
where Users.Name = [:name:]
});
createquery('createuser', qq{
insert into Users
(Name, Pass, CreationIP)
values
([:name:], [:pass:], [:creationip:])
});
my $action = CGI::param('action');
my $newuser = CGI::param('newuser');
if(!$action) {
offer_login();
} elsif($action eq 'login' and !$newuser) {
$username = CGI::param("username");
$userpass = CGI::param("userpass");
validateuser();
if($userid != -1) {
my $cookie=CGI::cookie(
-name=>'sid', -expires=>'+18h', -value=>qq{$username,$userpass},
-path=>'/', -domain=>'.sexcantwait.com'
);
print CGI::header(-type=>'text/html', -cookie=>$cookie);
print qq{Login successful.\n};
} else {
sleep 1; # frustrate brute-force password guessing attacks
print qq{Content-type: text/html\n\n};
print qq{Login failed! Please try again.<br>\n};
offer_login();
}
} elsif($newuser and $action eq 'login') {
local $name = CGI::param("username");
local $pass = CGI::param("userpass");
userexists(); if($num) {
print qq{User already exists. Please try again.<br>\n};
offer_login();
}
local $creationip = $ENV{REMOTE_ADDR};
createuser();
validateuser(); # sets $userid
print qq{Creation successful! Click on "account" above to review your account.<br>\n};
}
These examples make heavy use of my TransientBaby.pm module. That module creates recursive routines that communicate using global variables - ick. I need to change that, and then this example. Then I'll put that code up. XXX.
<gogamoga> well, i`ll ask: how do i fetch attached file from the query?
<scrottie> ask to ask?
<Perl-fu> Don't ask to ask. Don't ask if anybody can help you with x.
Just ask! Omit any irrelevant details. If nobody answers then we
don't know or are busy for a few minutes. Wait and don't bug us.
If you must ask again wait until new people have joined the channel.
<scrottie> my $fh = CGI::upload($fn); my $buffer; while (read($fh,$buffer,length($buffer)) { };
<scrottie> where $fn is the name of the CGI param. make sure the from has the right enctype.
<scrottie> i don't remember the enctype, but "perldoc CGI" will tell you
<scrottie> unless the form uses that special enctype, file uploads won't be uploaded, rather mysteriously
<gogamoga> THANK YOU SOOOOOOOOO MUCH
<gogamoga> i got lost in cgi.pm reference :(
<scrottie> heh, you're welcome. let me know if you get stuck.
<scrottie> yeah, someone really needs to slim that down.
<gogamoga> i use only jpg enctype so i wont even check it
<gogamoga> just fetch the file and save it
<scrottie> you don't understand.
<scrottie> hang on. let me find it.
<gogamoga> ok
<scrottie> if your form doesn't say
<form method="post" enctype="multipart/form-data">, then
<input type="file"> tags wont work. they won't upload the file.
<scrottie> reguardless of the type of the file, the file won't be uploaded.
<scrottie> Netscape 2 introduced the ability to upload files, and in order to
support this feature, they had to introduce a
new format for sending data to the server - the old
application/x-www-form-urlencoded one couldn't handle large
blocks of arbitrary data
<gogamoga> ah
<gogamoga> damn, it wont upload it but it still takes ages as it uploads it :)
<gogamoga> ah, sorry i am dumb
<scrottie> no, we all have to work through the standard mistakes ;)
<gogamoga> dreamweaver adds multipart/form-data by default
<gogamoga> :)
<scrottie> good. no one uses Netscape 1 anymore ;)
use TransientBaby::Forms; use TransientBaby; my $accessor; my %opts; my @table; my $tablerow; my $tablecol = -1; parse_html($document, sub { $accessor = shift; %opts = @_; if($opts{tag} eq 'tr') { # create a new, blank array entry on the end of @table $tablerow++; $table[$tablerow] = []; $tablecol = 0; } elsif($opts{tag} eq 'td') { # store the text following the <td> tag in $table[][] $table[$tablerow][$tablecol] = $accessor->('trailing'); $tablecol++; } });I've gone out of my way to avoid the nasty push @{$table[-1]} construct as I don't feel like looking at it right now. $tablerow and $tablecol could be avoided otherwise. This code watches for HTML table tags and uses those to build a 2 dimentional array.
select table1.a, table2.b, table3.c from table1, table2, table3 where table1.id = table2.id and table2.param = table3.id order by table1.a, table2.b, table3.cWe can't recover the id or param fields from the output of this query, but we can generate our own.
aaa aab aac aad aba aca ada baa bab (And so on...)Add this clause to the if statement in the sub passed to parse_html() above, remembering to declare the introduced variables in the correct scope:
} elsif($opts{tag} eq '/tr') {
if(!$tablerow or $table[$tablerow][0] ne $table[$tablerow-1][0]) {
$dbh->execute("insert into tablea (a) values (?)", $table[$tablerow][0]);
$table_a_id = $dbh->insert_id();
# else $table_a_id will retain its value from the last pass
}
if(!$tablerow or $table[$tablerow][1] ne $table[$tablerow-1][1]) {
$dbh->execute("insert into tableb (b, id) values (?, ?)", $table[$tablerow][1], $table_a_id);
$table_b_id = $dbh->insert_id();
# else $table_b_id will retain its value from the last pass
}
if(!$tablerow or $table[$tablerow][2] ne $table[$tablerow-1][2]) {
$dbh->execute("insert into tablec (c) values (?, ?)", $table[$tablerow][1], $table_b_id);
$table_c_id = $dbh->insert_id();
# else $table_c_id will retain its value from the last pass
}
}
This code depends on $dbh being a properly initialized database connection. I'm using ->insert_id(), a MySQL extention, for clarity. Unlike the previous code, this code is data-source specific. Only a human looking at the data can deturmine how best to break the single table up into normalized, relational tables. We're assuming three tables, each having one column, aside from the id field. Assuming this counting pattern, we insert records into tablec most often, linking them to the most recently inserted tableb record. tableb is inserted into less frequently, and when it is, the record refers to the most recently inserted record in tablea. When a record is inserted into tablea, it isn't linked to any other records.
{
local $/ = undef;
open FH, "<$file";
$data = <FH>;
close FH;
}
Pros: Everyone seems to know this one. Reads in entire file in one gulp without an array intermediary. Cons: $data cannot be declared with my because we have to create a block to localize the record seperator in. Ugly. @ARGV = ($file); my $data = join '', <>;Pros: Short. Sweet. Cons: Clobbers @ARGV, poor error handling, inefficient for large files.
my $data = `cat $file`;Pros: Very short. Makes sense to sh programmers. Cons: Secure problem - shell commands may be buried in filenames. Creates an additional process - poor performance for files small and large. No error handling. Is not portable.
open my $fh, '<', $file or die $!; read $fh, my $data, -s $fh or die $!; close $fh;Pros: Good error handling. Reasonably short. Efficient. Doesn't misuse Perl-isms to save space. Uses lexical scoping for everything. Cons: None.
use Sys::Mmap; new Mmap my $data, -s $file, $file or die $!;Pros: Very fast random access for large files as sectors of the file aren't read into memory until actually referenced. Changes to the variable propogate back to the file making read/write, well, cool. Cons: Requires use of an external module such as Sys::Mmap, file cannot easily be grown. Difficult for non-Unix-C programmers to understand.
require 'config.pl';We've all seen it a million times. It's as old as Perl itself. You make a little Perl program that does nothing but assign values to variables. Users can "easily" edit the file to change the values without having to wade through your source code. It is extremely easy to read configuration files of this format, and you can store complex datastructures in there, along with comments.
# config.pl:
$config = {
widgets=>'max',
gronkulator=>'on',
magic=>'more'
};
# configTest.pl:
use Data::Dumper;
require 'config.pl';
$config->{gronkulator} = 'no, thanks';
open my $conf, '>config.pl' or die $!;
print $conf Data::Dumper->Dump($config);
close $conf;
Data::Dumper.pm comes with Perl, and can even store entire objects. In fact, it can store networks of object. $dumping = "xterm -display $display";You could (if you wanted) make that a closure. That would let you use the multiple arg version of system(), which is good security practice, and the closure would bind to my variables, so if the config changes at run time, they change there too.
$dumping = sub { system 'xterm', $arg, $arg; };
XXX CategoryToDo - dumping active config using B::Deparse open my $f, 'file.txt' or die $!;or die should litterally dot your code. Thats how you communicate to Perl and your readership that it is not okey for the statement to silently fail. Most languages make such error geeration default; in Perl, you must request it. This is no excuse for allowing all errors to sneak by silently.
# from the Fatal.pm perldoc:
use Fatal qw(open close);
sub juggle { . . . }
import Fatal 'juggle';
Fatal.pm will place wrappers around your methods or Perl built in methods, changing their default behavior to throw an error. A module which does heavy file IO on a group of files need not check the return value of each and every open(), read(), write(), and close(). Only at key points - on method entry, entry into worker functions, etc - do you need to handle error conditions. This is a more reasonable request, one easily acheived. Should an error occur and be cought, the text of the error message will be in $@.
use Fatal qw(open close read write flock seek print);
sub update_data_file {
my $this = shift;
my $data = shift;
my $record;
local *filename = \$this->{filename};
local *record = \$this->{record};
eval {
open my $f, '>+', $filename;
flock $f, 4;
seek $f, $record, 0;
print $f, $data;
close $f;
};
return 0 if $@; # update failed
return 1; # success
}
Alternatively, rather than using eval { } ourselves, following AssertPattern, we could trust that someone at some point installed a DIE handler. The most recently installed local handler gets to try to detangle the web.
sub generate_report {
local $SIG{__DIE__} = {
print "Whoops, report generation failed. Tell your boss it was my fault. Reason: ", @_;
}
foreach my $i ($this->get_all_data()) {
$data->update_data_file($i);
}
}
sub checkpoint_app {
local $SIG{__DIE__} = {
print "Whoops, checkpoint failed. Correct problem and save your data. Reason: ", @_;
}
$data->update_data_file($this->get_data());
}
Using local scoped handlers this way allows you to provide context-sensitive recoverory, or atleast diagnostics, when errors are thrown. This is easy to do and all that is required to take full advantage of Fatal.pm. Fatal.pm was written by Lionel.Cons@cern.ch with prototype updates by Ilya Zakharevich ilya@math.ohio-state.edu.
RETRY:
eval {
alarm 30; # send a $SIG{ALRM} after 30 seconds - default is death
# do something that might time-out
alarm 0; # disable alarm
};
if($@) {
# there was an error - error text is in $@ - do what you will - perhaps retry:
goto RETRY;
}
select() provides an alternative for timeouts on I/O, and is especially safe when coupled with non-blocking I/O. See SelectPollPattern.
# don't do this
sub barf {
print "something went wrong!\n", @_;
exit 1;
}
# ...
barf("number too large") if($number > $too_large);
die() has a useful default behavior that depends on no external modules, but can easily be overriden with a handler to do more complex cleanup, reporting, and so on. If you don't use die(), you can't easily localize which handler is used in a given scope.
# send diagnostic output to the end of a log
open my $debug, '>>bouncemail.debug';
$SIG{__WARN__} = sub { print $debug $_, join(" - ", @_); };
$SIG{__DIE__} = sub { print $debug $_, join(" - ", @_); exit 0; };
Some logic will want to handle its own errors - some times a fatal condition in one part of code doesn't really matter a hill of beans on the grand scale of the application. A command line print utility may want to die if the printer is off line [or cought fire - Linux will print "lp: /dev/lp0 on fire!" under some circumstances. See http://www.dorje.com:8080/netstuff/jokes/prog.errors for a great list of error messages.] - a word processor probably does __not__ want to exit with unsavedchanges merely because the document couldn't be printed. So, do this:
local $SIG{__DIE__} = sub {
# yeah, whatever
};
# or...
local $SIG{__DIE__} = 'IGNORE';
...or, do the error processing of your choice. Perhaps set a lexically bound variable flag - see LexicalsMakeSense. # intercept death long enough to scream bloody murder $version = '$Id: ErrorReporting,v 1.20 2003/05/15 09:58:41 phaedrus Exp $'; # CVS will populate this if you use CVS $SIG{qq{__DIE__}} = sub { local $SIG{qq{__DIE__}}; # next die() will be fatal my $err = ''; $err .= "$0 version $version\n\n"; # stack backtrace $err .= join "\n", @_, join '', map { (caller($_))[0] ? sprintf("%s at line %d\n", (caller($_))[1,2]) : ''; } (1..30); $err.="\n"; # report on the state of global variables. this includes 'local' variables # and 'our' variables in scope. see PadWalker for an example of inspecting # lexical 'my' variables as well. foreach my $name (sort keys %{__PACKAGE__.'::'}) { my $value = ${__PACKAGE__.'::'.$name}; if($value and $name ne 'pass' and $name =~ m/^[a-z][a-z0-9_]+$/) { $err .= $name . ' ' . $value . "\n" } } $err .= "\n"; foreach my $name (sort keys %ENV) { $err .= $name . ' ' . $ENV{$name} . "\n"; } $err .= "\n"; # open the module/program that triggered the error, find the line # that caused the error, and report that. if(open my $f, (caller(1))[1]) { my $deathlinenum = (caller(1))[2]; my $deathline; # keep eof() from complaining: <$f>; $deathline = <$f> while($. != $deathlinenum and !eof); $err .= "line $deathline reads: $deathline\n"; close <$f>; } # send an email off explaining the problem # in text mode, errors go to the screen rather than by email require Mail::Sendmail; sendmail(To=>$errorsto, From=>$mailfrom, Subject=>"error", Message=>$err) unless($test); print "<pre>\n", CGI::escapeHTML($err), "</pre>\n" if($test); # reguardless, give the user a way out. in this case, we display what was in their # shopping cart and give them a manual order form that just sends an email, and we # call them back for payment info. $|=1; # print "Those responsible for sacking the people that have just been sacked, have just been sacked.<br><br>\n"; print "A software error has occured. Your order cannot be processed automatically. "; print "At the time of the error, your cart contained:<br><br>\n"; open my $cart, $cartdir.$sid; print "$_<br>\n" while(<$cart>); print qq{ <script language="javascript"> window.open($errororderpage); </script> }; close $cart; # finally, give up exit 0; };A software error has occured. Give the user an out. I wish I could remember what book this was from - the St. Thomas University library in St. Paul, Minnesota had it, but the author quoted a conversation that went something like...
# the config.pl file defines @listeners to contain a list of class names # that should receive notices from an EventListener broadcaster, # referenced by $broadcaster. require 'config.pl'; foreach my $listener (@listeners) { require $listener; my $list_inst = $listener->new(); $broadcaster->add_listener($list_inst); }See EventListener for the broadcaster/listener idiom. This avoids building the names of listener modules into the application. An independent author could write a plug-in to this application: she would need only have the user modify config.pl to include mention of the plug-in. Of course, modification of config.pl could be automated. The install program for the plug-in would need to ask the user where the config.pl is, and use the ConfigFile idiom to update it.
pacakge UserExtention1; # we are expected to have a "run_macro" method sub run_macro { my $this = shift; my $app = shift; $app->place_cursor(0, 0); $app->set_color('white'); $app->draw_circle(radius=>1); $app->set_color('red'); $app->draw_circle(radius=>2); # and so on... make a little bull's eye return 1; }The main application could prompt the user for a module to load, or load all of the modules in a plug-ins directory, then make them available as menu items in an "extentions" menu. When one of the extentions are select from the menu, a reference to the application - or a FacadePattern providing an interface to it - is passed to the run_macro() method of an instance of that package.
place_cursor(0, 0)
set_color(white)
draw_circle(radius=1)
set_color(red)
draw_circle(radius=2)
A few options exist: we can compile this directly to Perl bytecode using B::Generate (suitable for integrating legacy languages without performance loss), or we can munge this into Perl and eval it. Lets turn it into Perl.
# read in the users program
my $input = join '', <STDIN>;
# 0 if we're expecting a function name, 1 if we're expecting an argument,
# 2 if we're expecting a comma to seperate arguments
my $state = 0;
# perl code we're creating
my $perl = '
package UserExtention1;
sub run_macros {
my $this = shift;
my $app = shift;
';
while(1) {
# function call name
if($state == 0 && $input =~ m{\G\s*(\w+)\s*\(}cgs) {
$perl .= ' $app->' . $1 . '(';
$state = 1;
# a=b style parameter
} elsif($state == 1 && $input =~ m{\G\s*(\w+)\s*=\s*([\w0-9]+)}cgs) {
$perl .= qq{$1=>'$2'};
$state = 2;
# simple parameter
} elsif($state == 1 && $input =~ m{\G\s*([\w0-9]+)}cgs) {
$perl .= qq{'$1'};
$state = 2;
# comma to seperate parameters
} elsif($state == 2 && $input =~ m{\G\s*,}cgs) {
$perl .= ', ';
$state = 1;
# end of parameter list
} elsif(($state == 1 || $state == 2) && $input =~ m{\G\s*\)}cgs) {
$perl .= ");\n";
$state = 0;
# syntax error or end of input
} else {
return 1 unless $input =~ m{\G.}cgs;
print "operation name expected\n" if $state == 0;
print "parameter expected\n" if $state == 1;
print "comma or end of parameter list expected\n" if $state == 2;
return 0;
}
}
$perl .= qq<
return 1;
}
>;
eval $perl; if($@) {
# display diagnostic information to user
}
We're using the \G regex metacharacter that matches where the last global regex on that string left off. That lets us take off several small bites from the string rather than having to do it all in one big bite. The flags on the end of the regex are:
$money = $player->query_money();
if($player->query_max_money() < $x + $payout) {
$player->set_money($money + $payout);
$nickels_on_floor = 0;
} else {
$nickels_on_floor = $money + $payout - $player->query_max_money();
$player->set_money($player->query_max_money());
}
No matter which way we make the set_money() function work, we're doomed. If it enforces a ceiling, then we have to query again to see if the ceiling was enforced. If it doesn't enforce a ceiling, then we have to check each and every time we access the value and enforce it ourselves. The result is one or two of these sequences of logic will get strewn around the program. The problem is that the client needs something slightly more complex than the server is prepared to provide. We could, and perhaps should, make the object we're calling, $player, return an array, including success or failure, how much money actually changed hands, how much more they could carry. This would go with the approach of providing way more information than could ever be needed. This leads to bloated code and logic that we aren't sure whether or not is actually being used, leading to untested code going into production and becoming a time-bomb for the future, should anyone actually start using it. Less dramatically, we could modify the target object to return just one more piece of information when we realize we need it. This leads to a sort of feature envy, where the server is going out of its way to provide things in terms of a certain clients expectations, causing an API that is geared towards a particular client and incomprehensible to all else. Also, there is temptation to write something like: package Util;Beware of Utility, Helper, Misc, etc packages. They collect orphan code. The pressure to move things out of them is very low, as they all seem to fit by virtue of not fitting anywhere else. They grow indefinitely in size because the class of things that don't seem to belong anywhere is very large. The effect snowballs as the growth of other objects is stymied while the "Utility" package booms.
package Casino; use ImplicitThis; ImplicitThis::imply(); sub pay_out { # this method would go in $player, since it is mostly concerned with $player's variables, # but we don't want to clutter up $player's package, and we don't know if anyone else # will ever want to use this code. my $player = shift; my $payout = shift; my $money = $player->query_money(); if($player->query_max_money() < $money + $cost) { $player->set_money($money + $payout); $nickels_on_floor = 0; } else { $nickels_on_floor = $money + $payout - $player->query_max_money(); $player->set_money($player->query_max_money()); } }Associating methods with our client object that reasonably belong in the server object ($player, in our case), isn't always the worst solution. In fact, if you put them there and leave them until it is clear that they are needed elsewhere, you'll find that either they are globally applicable, they only apply to this client, they apply to a special case of the client, or they apply to a special case of the server.
10 let a=a+1
20 if a > 10 then goto 50
30 print a:print "\n"
40 goto 10
50 stop
foreach my $a (1..10) {
print "$a\n";
}
Despite the systematic banishment of these languages *, people still find ways to write code that has this problem on a large scale:
while(1) {
if(@queue) {
dosomething();
}
}
This example applies to threaded code, but non threaded code can fall prey as well:
while(! -f $file) { }
# do something with $file here
Both of these examples attempt to use 100% CPU resources. In the best case, you make everything sluggish. Worst case, you never release the CPU so the thing you're waiting for happens. On different systems, the results are different, too. Some threads preempt. Others never take the CPU back until you call yield() or an IO function! Writing code like this is the quickest way to make things work on some systems but not others. # this program attempts to use 100% of CPU time use IO::Socket::INET; my $sock = IO::Socket::INET->new(PeerAddr => 'www.perl.org', PeerPort => 'http(80)', Proto => 'tcp'); $sock->blocking(0); while(1) { read $sock, my $buffer, 8192; do_something_with_data($buffer); }The program should sleep, waiting for data to arrive, rather than looping constantly and trying over and over again. See SelectPollPattern for a solution using the select() call.
while(<$fh>) {
print;
}
This may print zero length strings sometimes, but no one will ever know. while(<$fh>) continues looping.
my $waketime = scalar(time()) + 60*60*8.5; # longer on the weekends
while(scalar time() < $waketime) {
sleep $waketime - scalar time(); # sleep the rest of the duration - probably
}
DebuggingPattern has a tiny example of dumping stack when a signal comes in.
Malak tells you: wee! :-) ok here is the question. if i have two
copies of a script downloading the same set of files (to make it go faster) i
want to make sure that one script doesnt try to download the same file as the
other. right now i'm using a -e check to see if the file exists but im not
sure if this will ever cause a problem if both scripts happen to hit the same
file at the same time
Yes, there is a race condition between the time that you test for the file
with //-e// and when you create the file with //open()//. It could well
happen that you test to see if the file, is there, it isn't, you go to
open it for write and over write another process.
if(! -e $file) {
open my $f, '>', $file;
download($f);
}
You tell Malak: yes. use sysopen(). open for write but not create. if
it returns error status, the race condition bit ya, move on to the next file
Malak tells you: not sure if i can do that. im calling an external
program to actually do the download...
You tell Malak: why don't you use threads, then? then you can create
a hash that is shared between all threads and use it to do locking
use threads::shared;
my :shared %locks;
Malak tells you: i wonder if the race condition matters though...
which ever process finishes downloadig last should write the file and replace
whatever the other file wrote, right?
You tell Malak: yeah
Malak tells you: i dont care if that happens, all i care about is
that no files get corrupted, seemingly downloaded good when they arent
You tell Malak: actually, on unix, what would happen is the same
would be downloaded, twice, at the same time, but only one of the inodes would
actually exist on the filesystem, so when the other processed closes its
filehandle, the filesystem will deallocate the blocks
#!/usr/bin/perl
print "Content-type: image/gif\n";
print "Pragma: no-cache\n";
print "\n";
my $pid = $$; # our pid, not the pid of some shell or something
umask 000;
local $ENV{PATH} = '/usr/local/bin';
open my $f, '+<', 'counter';
flock $f, 2;
$counter=<$f>;
$counter++;
seek $f, 0, 0;
printf $f $counter + "\n";
close $f;
system "pbmtext $counter | pnmcrop 2>/dev/null | pnmenlarge 3 > counter10.$pid.pbm 2>/dev/null ";
for(my $i=10;$i<30;$i++) {
my $j = $i + 1;
system "pbmlife counter$i.$pid.pbm > counter$j.$pid.pbm 2>/dev/null";
}
open my $gif, "ppmtogif -delay 40 -loop 100 counter??.$pid.pbm 2>/dev/null|";
while(read $gif, my $buf, 1024, 0) {
print $buf;
}
close $gif;
# this isn't working :(
for(my $i=10;$i<31;$i++) {
unlink "counter$i.$pid.pbm";
}
Didn't anyone ever tell you web-page hit counters were useless? They don't count number of hits, they're a waste of time, and they serve only to stroke the writer's vanity. It's better to pick a random number; they're more realistic.
$hits = int( (time() - 850_000_000) / rand(1_000) );
If the count doesn't impress your friends, then the code might. :-) When several processes are reading the current value (as it stands at any given moment), and one process is independently generating and storing new values, file I/O still has a race condition where the file may be null, between the time the file is truncated and the new data written. This also requires locking. NetPBM has an example of a multi-player Life game, where locking is not needed. Single bits are modified in the Sys::Mmap 'd image during any hit, and the current board is displayed. Since random memory access is being used rather than file I/O, truncated files aren't a concern. SQL engines want something like this, but the problem is far more complex. They must use generational locks, where each "update" or "insert" represents a generation. Only records marked at or earlier than the current generation at the time a query is started are returned in a query. "update" must add a new record with a newly incremented generation number before removing the old one in order to let currently executing queries run without garbled results. This arrangement lets one "insert"/"update" or other write operation happen at the same time as an arbitrary number of queries. Generational systems like this are also used in garbage collection, to avoid race conditions between the thread that is collecting unreferenced memory and the running program.
# procedural code gives exactly one possible definition for a function call -
# the code and data are mixed all over the place
sub do_a { $thing->[0]++; }
sub do_b { $thing->[1]++; }
sub do_c { $thing->[2]++; }
sub do_stuff {
my $thing = shift;
do_a($thing);
do_b($thing);
do_c($thing);
}
Because any data access methods are handled in functional way, the data is dependent on the module and can't be returned or used seperately.
sub do_a { $thing->[0]++; }
sub do_b { $thing->[1]++; }
sub do_c { $thing->[2]++; }
sub do_a_neg { $thing->[0]--; }
sub do_b_neg { $thing->[1]--; }
sub do_c_neg { $thing->[2]--; }
sub do_stuff {
my $thing = shift;
if($thing->[0] < 0) {
} else {
do_a($thing);
}
do_b($thing);
do_c($thing);
}
This is what has to be untangled [strange that we don't talk about untangling it]
# object oriented code allows method calls to be defined any number of different ways -
# the data has its own code
package ThingOne;
sub new { bless [], shift(); }
sub do_a { my $thing = shift; $thing->[0]++; }
sub do_b { my $thing = shift; $thing->[1]++; }
sub do_c { my $thing = shift; $thing->[2]++; }
package ThingTwo;
sub new { bless [], shift(); }
sub do_a { my $thing = shift; $thing->[0]--; }
sub do_b { my $thing = shift; $thing->[1]--; }
sub do_c { my $thing = shift; $thing->[2]--; }
package main;
sub do_stuff {
my $thing = shift;
$thing->do_a();
$thing->do_b();
$thing->do_c();
}
my $t1 = new ThingOne();
do_stuff($t1);
my $t2 = new ThingTwo();
do_stuff($t2);
Refactoring - Or, Getting From Here to There
my $treasure_location_x = 3;
my $treasure_location_y = 10;
my $treasure_value = 1000;
my $treasure_weight = 30;
# is better written:
my $treasure = {};
$treasure->{location}->{x} = 3;
$treasure->{location}->{y} = 10;
$treasure->{value} = 1000;
$treasure->{weight} = 30;
# or even:
($treasure->{location}->{x}, $treasure->{location}->{y}, $treasure->{value},
$treasure->{weight}) = (3, 10, 1000, 30);
There is almost never a time to use symbolic references. Unless you are doing meta-muddling across modules, use hashes instead. You can compute a hash index just as easily as a name table entry, and two different hashes will never step on each other.
${$varname} = $value;
# is better written:
$table->{$varname} = $value;
Promote References to Objects my @array = (1, 2, 3); $array[2]++; # all occurances of @array manually changed: my $array = new Array::Wrapper(); # our container for something that was an array my $value = $array->get_element(2); $array->set_element(2, $value+1);Failing to rewrite all code that accesses this variable to use the object oriented syntax violates I
# in your code: my @array; my $array = new Array::Wrapper(\@array); $array[3] = 10; # this works $array->set(3, 10); # so does this # in Array::Wrapper.pm: sub new { my $type = shift; my $self = shift; bless $self, $type; } sub get { my $self = shift; my $index = shift; return $self->[$index]; } sub set { my $self = shift; my $index = shift; $self->[$index] = shift; return $self; }Array::Wrapper::new() blesses an existing array reference into its package. The object returned and the existing array both represent the same data: changes to one are mirrored by the other. Subclass C<Array::Wrapper> and add more methods to do things specific to a given datastructure, or just copy the code and modify it.
Oregon Health & Science University holds a programming competition every year: the International Functional Programming Competition. http://icfpcontest.cse.ogi.edu/ has details. Last years competition involved writing a client application to control a robot in a multi user maze. The robot had to find and deliver packages, avoid water, and interact with other robots. I got a late start, having missed a day for lack of seeing the announcement in time, and fresh from studying object orientation, spent two dreary sleepless days writing Perl. The code is quick, dirty, and ugly. It was hastily written and serves as a fine example of how not to write Perl. Mired down with a mesh of arrays, and without time to rewrite code, I shoehored objects on top of the arrays. Reality tends to be closer to a programming game than anything you'd call "real".
# At the top of the main program:
my @treasure; # [x, y, bot, weight, destx, desty]
# Logic that generates a status report with map, written before I knew what
# I was getting into:
foreach my $i (grep {ref $_} @treasure) {
push @objs, [$i->[0], $i->[1], 't'] unless $i->[2];
}
# In logic that reads updates from server, I was overcome with complexity,
# created a helpful wrapper, and blessed my array into it:
my $treasure = new Treasure::Chest(\@treasure) or die;
# ... later in same area...
$treasure->get($t)->set_bot($bots[$curbot]);
# In logic that considers possible actions:
foreach my $i ($treasure->by_bot($bot)) {
my $dropoff = $i->get_dropoff();
my @coords = ($bot->[0], $bot->[1], $dropoff->x(), $dropoff->y());
my $route = [$dropoff, sub { plot_route(@coords) }, undef ];
$route->[2] = approx_cost(@coords);
$route->[2] -= 100 if($cap < $carry / 2);
$route->[2] -= 100 if($cap < $carry / 3);
$route->[2] -= 100 if($cap < $carry / 4);
push @routes, $route;
};
# In another possible action agent:
@goodies = SortPackages::go(
$treasure, $blockbuilder, $carry-$bot->[2], $bot->[0], $bot->[1]
);
# Deliver everything destined for this spot:
foreach my $i ($treasure->by_bot($bot)) {
if($i->destx == $bot->[0] and $i->desty == $bot->[1]) {
my $id = $i->id();
return [90, sub {
$treasure->zero($id);
"1 Drop $id"
}];
}
}
[XXX rewrite this mixed bag as a consistent OO example of going back and polishing the code if so desired] my $treasure = new Treasure::Chest(\@treasure) or die;... is the heart of this trick. An object is a blessed reference. @treasure can't be blessed because it isn't a reference. We, can, however take a reference to that. We could say bless \@treasure, Treasure::Chest, but it is better to leave those details to Treasure::Chest itself. We just pass the array off there for blessing and immediate return.
$fooLogic->doAnotherThing($stuff, $morestuff);
# or...
$fooLogic->doAnotherThing($stuff->{some_thing}, $morestuff->{some_other_thing});
$blockbuilder->plot_route($bot, $i);
# or...
$blockbuilder->plot_route($bot->[0], $bot->[1], $i->[0], $i->[1]);
# or a mixture...
@goodies = SortPackages::go(
$treasure, $blockbuilder, $carry-$bot->[2], $bot->[0], $bot->[1]
);
Go with your instricts, but be prepaired to change your mind later. To paraphrase Arthur Riel, sometimes awkward interfaces are new abstractions waiting to be discovered. Large numbers of seemingly unreleated things passed together might suggest some unseen relationship that can be used to further organize the data. Don't sweat it though.
# other functions in the god object still think plot_route() is a local
# function rather than a method in another object
sub plot_route {
my $sx = shift; my $sy = shift;
my $fx = shift; my $fy = shift;
# logic moved into $blockbuilder object
return $blockbuilder->plot_route($sx, $sy, $fx, $fy);
}
Perl throws a monkey wrench into the works with the requirement that object instance data (data that is independent from one object of that type to another) be accessed with a funky syntax. This effectively prevents cutting and pasting code out from the former God Object into a clean, new object.
# square foo when foo is a regular variable:
$foo = $foo * $foo;
# square foo when foo is an object instance variable:
$this->{'foo'} = $this->{'foo'} * this->{'foo'};
See Alias.pm or ImplicitThis.pm on CPAN, or live on the wild side of life (syntactically speaking) and make your own alias for $foo:
sub setvalue {
no strict 'refs';
my $this = shift;
local *foo = \$this->{foo};
$foo = $foo * $foo;
}
Don't tell anyone you got this from me, what ever you do. I'd never hear the end of it.
sub new {
my $class = shift;
my $hashref = shift;
my $self = {
foo => 'bar',
num => 10,
hashref => $hashref,
};
bless $self, $class;
}
sub dog_years {
my $self = shift;
my $age = shift;
# no need to take $hashref as an argument, we have it stashed in $self
return $age * $self->{hashref}->{'dog_factor'};
}
This hash or array reference will be passed to the constructor call of that object. It will be passed exactly once instead of for every call into that object. This is not always the good and correct thing to do - when not frequently used by that object, it is best to pass data only when it is needed.
sub new {
my $class = shift;
my %options = @_;
my $self = {
foo => 'bar',
num => 10,
thing => $options{'thing'},
};
bless $self, $class;
}
Godless Objects # which bot is holding treasure $x, if any? $treasure = $treasure_chest->get($x)->bot(); # which map region is bot $green standing on? $block = $block_builder->bounded_by($bots->get($green)->x(), $bot->get($green)->y()); # what objects stand between us and where we want to be? @route = $block->solve_maze($x, $y);Build healthy relationships between objects:
die "Invalid" unless $obj->isa('FOO');
This will let BARs by, if BAR inherits from FOO, or if FOO is an AbstractClass that BAR implements. It is much more forgiving, and it correctly handles the idea that types are complex structures, composed of other types.
my $appointment = $sunday->query_scheduler()->schedule_appointment($sunday, '9:00am');
if(!$appointment) {
warn "failed to book appointment. sorry, boss.";
}
foreach $num ($[ .. $#entry) {
print " $num\t'",$entry[$num],"'\n";
}
And of course you could set $[ to 17 to have arrays start at 17 instead of at 0 or 1. This was a great way to sabotage module authors.
package WhineyScalar;
sub new { tie $_[1], $_[0]; }
sub TIESCALAR {
bless \my $a, shift;
}
sub FETCH {
my $me = shift;
$$me;
}
sub STORE {
my $me = shift;
my $oldval = $$me;
$$me = shift;
(my $package, my $filename, my $line) = caller;
print STDERR "value changed from $oldval to $$me at ", join ' ', $package, $filename, $line, "\n";
}
1;
[XXX untested code] use WhineyScalar; new WhineyScalar my $foo; $foo = 30; # this generates diagnostic output print $foo, "\n"; $foo++; # this generates diagnostic outputUsing tie on a scalar, we can intercept attempts to store data, and generate diagnostics to help us track down what unexpected sequence of events is taking place.
x*5 + 10 = x*2 + 32Refactored:
15 = 5 * 3When programming, the simplest thing you can break things down into is a matter of opinion. Or rather, it is a matter of opinion what programs are composed of. Instructions? Expressions? Functions? Objects? Modules? Some languages represent everything with an object (Smalltalk, for instance). This lets us abide by an executive ruling that objects are the fundamental building block, which pleasantly closes the question. Perl being pragmatic, programs are built in strata. Packages are built on objects are built on functions are built on expressions. Just like a polynomial expression, these combine in different ways to create something more complex.
print 10+32;You move on to write reusable pieces of code needed to build things just more complex than the simplest structures.
sub indent {
print ' ' x $_[0], $_[1], "\n";
}
Functions let you repeat a bit of logic without having to repeat it in the program. This lets you repeat it an unknown number of times, and makes it easy to run it or not run it under different variable conditions. 100 OLDA=A 110 A=50 120 GOSUB 200 130 PRINT "The result is: ";PRINT $A 140 A=OLDA ... 200 A=B*100 210 RETURNWhat seems like the simple solution, stashing data somewhere and giving it a name, turns out to be a nightmare. Subroutines couldn't safely be written that that would perform a given operation on any give piece of data. Later versions of BASIC of course changed this, but not until a few new generations of langauges came and whooped it one.
opendir my $d, "arts/" or die $!;
my $processedflag = 0;
my $file;
FILE: while($file = readdir($d)) {
# attempt to process the file, set $processedflag if successful
handle_delete();
}
sub handle_delete {
unlink $file if $processedflag;
$processedflag = 0;
}
Later on, we decide to add the ability to handle all of the subdirectories of the given directory, a change of heart brought on by an interaction with an individual who expects you to code things he can't even remember, much less communicate.
sub process_directory {
my $dir = shift;
opendir my $d, $dir or die $!;
my $processedflag = 0;
my $file;
FILE: while(my $file = readdir($d)) {
if(-d $file) {
process_directory($file);
} else {
# attempt to process the file, set $processedflag if successful
# we now have to explicitly pass our arguments!
handle_delete($file, $processedflag);
}
}
sub handle_delete {
my $file = shift;
my $processedflag = shift;
unlink $file if $processedflag;
$processedflag = 0;
}
process_directory('arts/');
If we hadn't changed the call to handle_delete() to pass arguments, and modified the subroutine handle_delete to use local variables instead of global variables, $processedflag could hold left over data, altered by a call made from process_directory() to process_directory() that returned. It used to be that each instance of a running programmer had one main loop that used these variables. Now, this is a function that could be called several times, and even have several copies running at the same time, as it calls itself. Data that was specific to the program is now specific to an individual function call. Our variable definitions reflect this change.
# Before:
use vars qw/$foo/;
sub bar {
do_something_with($foo);
}
sub baz {
print $foo, "\n";
}
# more code...
$foo is visiable anywhere in that package, even though it is only used in those two methods. Restrict its scope:
# After:
do {
my $foo;
sub bar {
do_something_with($foo);
};
sub baz {
print $foo, "\n";
};
};
The enclosing block syntax is needed so that the functions are generated in a sterile environment of our own device, and there tied to their surrounding context - including the lexical variables. Only bar() and baz() can see $foo now. No other functions can. You could say that they are the only things in $foo's scope. The reduced scope makes the grouping visible, which makes the association obvious.
# store things in the correct array - ugly
print "Who should get a cookie? "; my $name = <STDIN>;
print "What kind of cookie should they get? "; my $cookie = <STDIN>;
push @{'cookies_for_'.$name}, $cookie;
# store things in the correct array - clean
my $peoples_cookies = {};
print "Who should get a cookie? "; my $name = <STDIN>;
print "What kind of cookie should they get? "; my $cookie = <STDIN>;
push @{$peoples_cookies->{$name}}, $cookie;
The already confusing reference syntax becomes even more confusing when you want to refer to something other than a scalar:
# scalars are easy:
my $cookie = $peoples_cookies->{'fred'}->[0];
# but refering to an array or hash inside of a data structure is confusing:
my @cookies = @{$peoples_cookies->{'fred'}};
The syntax for using datastructures is remarkably like the syntax for accessing the symbol table directly. The difference is what goes inside of the request to dereference:
@{...something...} # this is how you dereference something as an array
%{...something...} # this is how you dereference something as a hash
A "soft reference" or "symbolic reference" is a reference used this way with an expression that contains the name of the variable that contains the data.
open my $wordsfile, '/usr/share/dict/words' or die $!;
my @words = <$words>;
close $wordsfile;
my $something = \@words;
print "The words I know are: ", @{$something}, "\n";
The "my" in this example is important - otherwise our variable will be overwritten if we do this in a loop, and if we exit out of a block, it may vanish entirely.
$peoples_cookies->{'fred'}->[0] = 'sugar cookie';
print *{'cookies_for_fred'}, "\n"; # theres nothing there, and no warning
Perl will stop you if you use "soft" references (directly access the symbol table) while use strict is on.
<Spike[y]> does anyone know if theres a way to name a Hash like this: %hash$i{$a} = $blah; ?
<hull> i dont understand:P
<hull> cant you use %hash{$i}{$a} in your program?
<Spike[y]> can you? i'm trying to make the name of a new hash go up each time it hits a certian thing (ie. go from %hash1 to %hash2)
<hull> hmm
<hull> like, in a for loop?
<hull> for (my $i=0; $i<$k; $i++) { $hashname{$i}{$k} = "R0xx0R!"; }
<hull> you can do it like that:P
<Spike[y]> yeah!
<Spike[y]> i can?!
<Spike[y]> wierd
<Spike[y]> it tells me its an error
<Yaakov> $ not %
<cp5> ${"hash$i"}{$a} = $blah
<Spike[y]> hm .. ok
<Yaakov> NO!
<Yaakov> NO NO NO
* cp5 runs
<Yaakov> DON'T USE SYMREFS!
<perl-fu> ew... the green apple squirts
<Yaakov> DEATH
<perl-fu> AAAAAAAAAh!!
<Spike[y]> ??
<Yaakov> Use a hash of hashes
<Yaakov> read perldoc perldsc
<hull> Yaakov: hash of hashes... sorta like multi-dimensional hash, uh?
<scrottie> http://www.perldesignpatterns.com/?SoftrefsToHash
<scrottie> hull, you've been here before
XXX do a writeup on English::Reference - lot of people don't understand reference syntax, and for good reason. Ugh. First documented by TomChristiansenSee Also
my $tests_ski