package HTML::Seamstress;
use strict;
use warnings;
use Carp qw(confess);
use Cwd;
use Data::Dumper;
use File::Slurp;
use File::Spec;
use HTML::Element::Library;
use HTML::Element::Replacer;
use base qw/HTML::TreeBuilder HTML::Element/;
our $VERSION = '6.0' ;
sub bless_tree {
my ($node, $class) = @_;
if (ref $node) {
# warn "root node($class): ", $node->as_HTML;
bless $node, $class ;
foreach my $c ($node->content_list) {
bless_tree($c, $class);
}
}
}
sub new_from_file { # or from a FH
my ($class, $file) = @_;
$class = ref $class ? ref $class : $class ;
my $new = HTML::TreeBuilder->new_from_file($file);
bless_tree($new, $class);
#warn "CLASS: $class TREE:", $new;
# warn "here is new: $new ", $new->as_HTML;
$new;
}
sub new_file { # or from a FH
my ($class, $file, %args) = @_;
-e $file or die "File $file does not exist";
my $new = HTML::TreeBuilder->new;
for my $k (keys %args) {
next if $k =~ /guts/ ; # scales for more actions later
$new->$k($args{$k});
}
-e $file or die "$file does not exist";
$new->parse_file($file);
bless_tree($new, $class);
if ($args{guts}) {
$new->guts;
} else {
$new;
}
}
sub html {
my ($class, $file, $extension) = @_;
$extension ||= 'html';
my $pm = File::Spec->rel2abs($file);
$pm =~ s!pm$!$extension!;
$pm;
}
sub eval_require {
my $module = shift;
return unless $module;
eval "require $module";
confess $@ if $@;
}
sub HTML::Element::xepand_replace {
my $node = shift;
my $seamstress_module = ($node->content_list)[0] ;
eval "require $seamstress_module";
die $@ if $@;
$node->replace_content($seamstress_module->new) ;
}
1;
__END__
=head1 NAME
HTML::Seamstress - HTML::Tree subclass for HTML templating via tree rewriting
=head1 SYNOPSIS
=head2 Text substitution via replace_content() API call.
In our first example, we want to perform simple text substitution on
the HTML template document. The HTML file html/hello_world.htm has
klass attributes which serve as compiler (kompiler?) hints to Seamstress:
Hello World
Hello World
Hello, my name is dummy_name.
Today's date is dummy_date.
=head3 Seamstress compiles HTML to C<html::hello_world>
shell> seamc html/hello_world.htm
Seamstress v2.91 generating html::hello_world from html/hello_world.htm
Now you simply use the "compiled" version of HTML with API calls to
HTML::TreeBuilder, HTML::Element, and HTML::Element::Library:
use html::hello_world;
my $tree = html::hello_world->new;
$tree->look_down(id => 'name')->replace_content('terrence brannon');
$tree->look_down(id => 'date')->replace_content('5/11/1969');
print $tree->as_HTML;
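
For readers without a compiled module at hand, the same substitution can be tried directly against L<HTML::TreeBuilder>. This is a minimal sketch, not what F<seamc> generates: the inline markup and its C<span> elements are assumptions, the C<set_text> helper is hypothetical, and C<delete_content> plus C<push_content> stand in for C<replace_content> so that only HTML::Tree itself is required.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Assumed markup: the ids match the look_down() calls in the synopsis,
# but the surrounding tags are illustrative only.
my $html = <<'HTML';
<html><head><title>Hello World</title></head>
<body><h1>Hello World</h1>
<p>Hello, my name is <span id="name">dummy_name</span>.</p>
<p>Today's date is <span id="date">dummy_date</span>.</p>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Hypothetical helper: replace the children of the element with the
# given id by a plain text node.
sub set_text {
    my ($root, $id, $text) = @_;
    my $node = $root->look_down(id => $id)
        or die "no element with id '$id'";
    $node->delete_content;
    $node->push_content($text);
    return $node;
}

set_text($tree, name => 'terrence brannon');
set_text($tree, date => '5/11/1969');

my $output = $tree->as_HTML;
print $output;
$tree->delete;    # release the tree when done
```

The helper dies when an id is missing, which surfaces template and code drifting apart early instead of failing later on an undefined value.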
=head2 If-then-else with the highlander API call
(But also see C<< $tree->passover() >> in L<HTML::Element::Library>).
Hello, does your mother know you're
using her AOL account?
Sorry, you're not old enough to enter
(and too dumb to lie about your age)
Welcome
=head3 Compile and use the module:
use html::age_dialog;
my $tree = html::age_dialog->new;
$tree->highlander
(age_dialog =>
[
under10 => sub { $_[0] < 10} ,
under18 => sub { $_[0] < 18} ,
welcome => sub { 1 }
],
$age
);
print $tree->as_HTML;
# will only output one of the 3 dialogues based on which closure
# fires first
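
The selection rule described in that comment (the first closure to return true wins) can be sketched in plain Perl. The helper name C<first_matching_id> is hypothetical, and this shows only how a branch is chosen; pruning the losing elements from the tree is left to the library.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of any library: walk (id => predicate)
# pairs in order and return the id of the first predicate that accepts
# the arguments.
sub first_matching_id {
    my ($branches, @args) = @_;
    my @pairs = @$branches;    # copy so the caller's list is untouched
    while (my ($id, $test) = splice @pairs, 0, 2) {
        return $id if $test->(@args);
    }
    return;                    # no branch fired
}

my @branches = (
    under10 => sub { $_[0] < 10 },
    under18 => sub { $_[0] < 18 },
    welcome => sub { 1 },
);

for my $age (5, 15, 40) {
    printf "%2d => %s\n", $age, first_matching_id(\@branches, $age);
}
# prints:
#  5 => under10
# 15 => under18
# 40 => welcome
```

Because the pairs live in an array reference rather than a hash, their order is preserved, which is what makes the catch-all C<welcome> branch safe to list last.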
The following libraries are always available for more complicated
manipulations:
=over
=item * L
=item * L
=item * L
=item * L
=back
=head1 PHILOSOPHY and MOTIVATION of HTML::Seamstress
Welcome to push-style dynamic HTML generation!
When looking at HTML::Seamstress, we are looking at a uniquely
positioned 4th-generation HTML generator. Seamstress offers two sets
of advantages: those common to all 4th generation htmlgens and those
common to a subclass of L<HTML::Tree>.
I think a Perlmonks node:
L
sums up the job of Seamstress quite well:
Monks,
I'm tired of writing meta code in templating languages.
I'm really good at writing Perl, and good at writing HTML,
but I'm lousy at the templating languages (and I'm not too
fired up to learn more about them).
=head2 Reap 4th generation dynamic HTML generation benefits
What advantages does this fourth way of HTML manipulation offer? Let's
take a look:
=head3 Guarantee yourself well-formed HTML
Because lower-generation dynamic HTML generators treat HTML as a
string, there is no insurance against poorly formed HTML.
Take a look at these two Mason components, from
L :
=over
=item * Example 5-3. /autohandler
% $m->call_next;
<%method .body_tag>
<%args>
$bgcolor => 'white'
$textcolor => 'black'
</%args>
</%method>
=item * Example 5-4. /important_advice.mas
A Blue Page With Red Text
<& SELF:.body_tag, bgcolor=>'blue', textcolor=>'red' &>
Never put anything bigger than your elbow into your ear.
=back
There is nothing guaranteeing that open tags will match close tags or
that close tags will even exist.
To make the correspondence between open and close tags even more troublesome,
they are in different files. And it is not easy for an HTML designer and/or
design tool to manipulate things once they have been shredded apart
like this.
With the tree-based approach of Seamstress, the end tag will exist
and it will match the open tag. Well-formedness is job 1 in tree-based
HTML rewriting!
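
As a rough illustration of that guarantee, the sketch below builds a body element with L<HTML::Element> and serializes it. The colours and text echo the Mason example; the choice of C<h1> and C<div> elements is an assumption made for the sketch.

```perl
use strict;
use warnings;
use HTML::Element;

# Build the <body> programmatically; colours mirror the Mason example.
my $body = HTML::Element->new('body', bgcolor => 'blue', text => 'red');

my $heading = HTML::Element->new('h1');
$heading->push_content('A Blue Page With Red Text');

my $advice = HTML::Element->new('div');
$advice->push_content(
    'Never put anything bigger than your elbow into your ear.');

$body->push_content($heading, $advice);

# The serializer walks the tree, so each start tag gets its end tag.
# The empty hash ref asks as_HTML to treat no end tag as optional.
my $markup = $body->as_HTML(undef, undef, {});
print $markup;
$body->delete;
```

The open and close tags never exist as separate strings here, so they cannot drift apart across files the way the two Mason components can.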
=head4 HTML will be properly escaped
=head3 Separate HTML development and its programmatic modification
Software engineers refer to this as B.
The contents of the document remain legal HTML/XML that can be
developed using standard interactive design tools. The flow of control
of the code remains separate from the page. Technologies that mix
content and data in a single file result in code that is often
difficult to understand and has trouble taking full advantage of the
object oriented programming paradigm.
=head3 Work at meta-level instead of object-level
The book "Godel, Escher, Bach: An Eternal Golden Braid" by Douglas R
Hofstadter makes it clear what it means to operate at object-level as
opposed to meta-level. When you buy into earlier-generation HTML
generation systems you are working at object-level: you can only speak
and act I<in> the HTML with no ability to speak I<about> the HTML.
Compare a bird's eye view of a city with standing on a city block and
you have the difference between the 4th generation of HTML development
versus all prior generations.
=head3 Reduced learning curve
If you have a strong hold on
object-oriented Perl and a solid understanding of the tree-based nature
of HTML, then all you need to do is read the manual pages showing how
Seamstress and related modules offer tree manipulation routines and
you are done.
Extension just requires writing new Perl methods - a snap for any
object oriented Perler.
=head3 Static validation and formatting
Mixing Perl and HTML (by any of the generation 1-3 approaches)
makes it impossible to use standard validation and formatting tools
for either Perl or HTML.
=head3 Two full-strength programming languages: HTML and Perl
Perl and HTML are solid technologies with years of effort behind
making them robust and flexible enough to meet real-world
technological demands.
=head3 Object-oriented reuse and extension of HTML
Class-based object-oriented programming makes use of inheritance and
other techniques to achieve maximum code reuse. This typically
happens by a certain base/superclass method containing common actions
and a derived/subclass/mixin method containing extra actions.
A genuine tree-based approach (such as HTML::Seamstress) to HTML
generation is supportive of all methods of object-oriented reuse:
because manipulator and manipulated are separate and manipulators are
written in oo Perl, we can compose manipulators as we please.
This is in contrast to inline simple object systems (as in Mason) and
also in contrast to the if-then approach of tt-esque systems.
=head4 Per-page stereotyped substitution
[ FYI: you can run the two Seamstress approaches. They are in
F<$DISTRO/samples/perpage> ]
In the HTML::Mason book by O'Reilly:
L
we see a technique for doing simple text insertion which varies per
page:
Welcome to Wally World!
# homepage.html
<%attr>
head => "Wally World Home"
</%attr>
Here at Wally World you'll find all the finest accoutrements.
# productpage.html
<%attr>
head => "Wally World Products"
</%attr>
...
So, how would we do this using Seamstress' pure Perl approach to HTML
refinement?
Welcome to Wally World!
# homepage.pm
package html::homepage;
use base qw( HTML::Seamstress ) ;
sub new {
my ($class, $c) = @_;
my $html_file = 'html/base.html';
my $tree = __PACKAGE__->new_from_file($html_file);
$tree;
}
sub process {
my ($tree, $c, $stash) = @_;
$tree->content_handler(head => 'Wally World Home');
$tree->content_handler(body =>
"Here at Wally World you'll find all the finest accoutrements.");
}
# productpage.pm
package html::productpage;
use base qw( HTML::Seamstress ) ;
sub new {
my ($class, $c) = @_;
my $html_file = 'html/base.html';
my $tree = __PACKAGE__->new_from_file($html_file);
$tree;
}
sub process {
my ($tree, $c, $stash) = @_;
$tree->content_handler(head => 'Wally World Products');
$tree->content_handler(body => html::productpage::body->new->guts);
}
We have solved our problem. However, we can create even more reuse
because both of these classes are very similar. They vary in only
two things: the particular head and body they provide.
You can abstract this with whatever methodmaker you like. I tend to
prefer prototype-based OOP
over class-based, so with L<Class::Prototyped>,
here's how we might do it:
package html::abstract::common;
use base qw(HTML::Seamstress Class::Prototyped);
sub head { 'ABSTRACT BASE METHOD' }
sub body { 'ABSTRACT BASE METHOD' }
__PACKAGE__->reflect->addSlots(
html_file => 'html/base.html',
);
sub new {
my $self = shift;
my $tree = $self->new_from_file($self->html_file);
}
sub process {
my ($tree, $c, $stash) = @_;
$tree->content_handler(head => $tree->head);
$tree->content_handler(body => $tree->body);
}
1;
and then have both of the above classes instantiate and
specialize this common class accordingly.
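
For comparison, here is a sketch of the same specialization using plain class-based inheritance rather than Class::Prototyped. Every package name below is hypothetical, the product body text is a placeholder, and a hash of slots stands in for the HTML tree so the example runs without any templates on disk.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the shared behaviour. A real class would
# inherit from HTML::Seamstress and call content_handler(); a hash of
# slot => text keeps this sketch dependency-free.
package My::Page::Common;

sub new  { my $class = shift; return bless { slots => {} }, $class }
sub head { 'ABSTRACT BASE METHOD' }
sub body { 'ABSTRACT BASE METHOD' }

sub process {
    my $self = shift;
    # Subclasses supply head() and body(); the flow lives here once.
    $self->{slots}{head} = $self->head;
    $self->{slots}{body} = $self->body;
    return $self;
}

package My::Page::Home;
our @ISA = ('My::Page::Common');
sub head { 'Wally World Home' }
sub body { "Here at Wally World you'll find all the finest accoutrements." }

package My::Page::Products;
our @ISA = ('My::Page::Common');
sub head { 'Wally World Products' }
sub body { 'Product listing goes here' }    # placeholder text

package main;

for my $class (qw(My::Page::Home My::Page::Products)) {
    my $page = $class->new->process;
    print "$class: $page->{slots}{head}\n";
}
```

Each page class shrinks to the two methods that actually differ, while C<new> and C<process> are written once in the common class.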
[ Again: you can run the two Seamstress approaches. They are in
F<$DISTRO/samples/perpage> ]
=head3 Parallel generation of a single page is natural
A tree of HTML usually contains subtrees with no
interdependence. They can therefore be manipulated in parallel. If a
page contains 5 such areas, each of which takes C<N> time, then
generating them in parallel could take roughly C<N> time instead of
C<5N>: up to a 5-fold speedup.
=head2 Reap the benefits of using HTML::Tree
=head3 Pragmatic HTML instead of strict X(HT)ML
The real world is unfortunately more about getting HTML to work with
IE and maybe 1 or 2 other browsers. Strict XHTML may not be acceptable
under time and corporate pressures to get things to work with quirky
browsers.
=head3 Rich API and User Contributions
L has a nice large set of accessor/modifier functions. If
that is not enough, then take a gander at Matthew Sisk's
contributions: L as well as
L.
=head1 Seamstress contains no voodoo elements whatsoever
If you know object-oriented Perl and know how to rewrite trees, then
everything that Seamstress offers will make sense: it's just various
boilerplates and scripts that allow your mainline code to be very
succinct: think of it as Class::DBI for HTML::Tree.
=over
=item * unifying HTML and the HTML processing via a Perl class
Seamstress contains two scripts, F<spkg.pl> and F<sbase.pl>, which
together make it easy to access and modify an HTML file in very few
lines of startup code. If you have a file named
F<html/hello_world.html>, Seamstress makes it easy for that to become
the Perl module C<html::hello_world> with a C<new> method that
loads and parses the HTML into an L<HTML::TreeBuilder> tree.
=item * a Catalyst View class with meat-skeleton processing
The meat-skeleton HTML production concept is discussed below.
L<Catalyst::View::Seamstress> is all ready
to go for rendering simple or more complex pages.
=item * Loading in the HTML::Tree support classes
Once a Perl class has been built for your HTML, it has
L and
L as superclasses, ready
for you to use to rewrite the tree.
=back
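
The naming convention visible in the examples (F<html/hello_world.htm> becoming C<html::hello_world>) can be sketched as a small path-to-package mapping. The helper C<path_to_package> is hypothetical; the scripts themselves may apply different or additional rules.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a package name from an HTML file path
# given relative to the component root. The scripts may apply
# different rules; this only illustrates the naming convention.
sub path_to_package {
    my ($relative_path) = @_;
    (my $name = $relative_path) =~ s/\.html?\z//;    # drop .htm / .html
    return join '::', grep { length } split m{/}, $name;
}

print path_to_package('html/hello_world.html'), "\n";    # html::hello_world
print path_to_package('www/moose.html'), "\n";           # www::moose
```

Deriving the name from a path relative to the component root is what lets calling code say C<< html::hello_world->new >> without ever mentioning an absolute path.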
=head2 Seamstress is here to help you use HTML::Tree, that's all.
=head2 Unify HTML and the processing of the HTML via a Perl class
Let's see why this is a good idea. In Mason, your Perl and HTML are
right there together in the same file.
Same with Template. Now, since Seamstress
operates on the HTML without touching the HTML, the operations and
the HTML are not in the same file. So we create a Perl module to
glue the HTML file to the operations we plan to perform on it.
This module (auto-created by F<spkg.pl> and perhaps F<sbase.pl>)
has a constructor C<new>, which grabs the HTML file,
constructs an L<HTML::TreeBuilder> tree from it, and
returns it to you.
It also contains a C<process> subroutine which processes the
HTML in some way: text substitutions, unrolling list elements,
building tables, and whatnot.
Finally, it contains a C<fixup> subroutine. This subroutine is
designed to support the meat-skeleton paradigm, discussed below.
The C<process> subroutine generates the C<$meat>. After C<$meat>
has been placed in C<$skeleton>, there may be some page-specific
processing of the whole HTML page that you want to do: pop in some
javascript, remove a copyright notice, whatever. That's what
this routine is for.
Now that I've said all that, please understand that you are perfectly
free to call C<new> and do what you want with the HTML tree. You
don't have to use C<process> and C<fixup>. But they are there and
are used by L<Catalyst::View::Seamstress> to make meat-skeleton
dynamic HTML development quick-and-easy (and non-greasy).
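
The order in which a view might call these three entry points can be sketched as follows. C<My::Page> and C<render> are hypothetical names, and a string stands in for the HTML tree purely to keep the example standalone; a real page class would operate on the tree returned by its constructor.

```perl
use strict;
use warnings;

# Hypothetical page class with the three entry points described above.
# A string stands in for the HTML tree to keep the sketch standalone.
package My::Page;

sub new {
    my $class = shift;
    return bless { html => '<p id="name">dummy_name</p>' }, $class;
}

sub process {
    my ($self, $c, $stash) = @_;
    $self->{html} =~ s/dummy_name/$stash->{name}/;
    return $self;
}

sub fixup {
    my $self = shift;
    $self->{html} .= '<!-- fixed up -->';
    return $self;
}

package main;

# Hypothetical driver: the order in which a view might call them.
sub render {
    my ($page_class, $c, $stash) = @_;
    my $page = $page_class->new;     # load the template
    $page->process($c, $stash);      # produce the meat
    # (a real view would place the meat into the skeleton here)
    $page->fixup;                    # page-specific final touches
    return $page->{html};
}

print render('My::Page', undef, { name => 'terrence brannon' }), "\n";
```

Keeping the driver ignorant of what each page does inside the three methods is what allows one view class to render every page in an application.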
=head3 A Perl class created by spkg.pl
Here is our venerable little HTML file:
metaperl@pool-71-109-151-76:/ernest/dev/catalyst-simpleapp/MyApp/root/html$ cat hello_world.html
Hello World
Hello World
Hello, my name is dummy_name.
Today's date is dummy_date.
Now let's abstract this as a Perl class:
metaperl@pool-71-109-151-76:/ernest/dev/catalyst-simpleapp/MyApp/root/html$ spkg.pl --base_pkg=MyApp::View::Seamstress --base_pkg_root=`pwd`/../../lib hello_world.html
comp_root........ /ernest/dev/catalyst-simpleapp/MyApp/root/
html_file_path... /ernest/dev/catalyst-simpleapp/MyApp/root/html/
html_file........ hello_world.html
html_file sans... hello_world
hello_world.html compiled to package html::hello_world
metaperl@pool-71-109-151-76:/ernest/dev/catalyst-simpleapp/MyApp/root/html$
Now let's see what html::hello_world looks like. Everything other than
C<process> was auto-generated:
package html::hello_world;
use strict;
use warnings;
use HTML::TreeBuilder;
use base qw(MyApp::View::Seamstress);
our $tree;
sub new {
my $file = __PACKAGE__->comp_root() . 'html/hello_world.html' ;
-e $file or die "$file does not exist. Therefore cannot load";
$tree =HTML::TreeBuilder->new;
$tree->parse_file($file);
$tree->eof;
bless $tree, __PACKAGE__;
}
sub process {
my ($self, $c, $stash) = @_;
$tree->look_down(id => $_)->replace_content($stash->{$_})
for qw(name date);
}
sub fixup { $tree }
1;
=head2 The meat-skeleton paradigm
This section is written to help understanding of
L<Catalyst::View::Seamstress> for people who want to use Seamstress as
the view for their L<Catalyst> apps.
HTML pages typically have meat and a skeleton. The meat varies from page
to page while the skeleton is fairly (though not completely)
static. For example, the skeleton of a webpage is usually a header, a
footer, and a navbar. The meat is what shows up when you click on a
link on the page somewhere. While the meat will change with each
click, the skeleton is rather static.
Mason accommodates the meat-skeleton paradigm via
an C<autohandler> and C<< $m->call_next() >>. Template
accommodates it via its C<WRAPPER> directive.
And Seamstress? Well, here's what you _can_ do:
=over
=item 1 generate the meat, C<$meat>
This is typically what you see in the C<body> part of an HTML page
=item 2 generate the skeleton, C<$skeleton>
This is typically the html, head, and maybe some body
=item 3 put the meat in the skeleton
=back
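
The three steps above can be sketched with L<HTML::TreeBuilder> alone. The markup, the C<content> id marking the slot, and the sample text are all assumptions made for illustration; they are not conventions that Seamstress or its Catalyst view imposes.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Element;

# 1. Generate the meat (hypothetical content).
my $meat = HTML::Element->new('div', id => 'meat');
$meat->push_content('Today we have a sale on accoutrements.');

# 2. Generate the skeleton. The "content" id is an assumed convention
#    marking where the meat belongs.
my $skeleton = HTML::TreeBuilder->new_from_content(
    '<html><head><title>Site</title></head>'
  . '<body><div id="header">Header</div>'
  . '<div id="content"></div>'
  . '<div id="footer">Footer</div></body></html>'
);

# 3. Put the meat in the skeleton.
my $slot = $skeleton->look_down(id => 'content')
    or die 'skeleton has no content slot';
$slot->push_content($meat);

my $page = $skeleton->as_HTML(undef, undef, {});
print $page;
$skeleton->delete;
```

Since the meat is attached as a subtree rather than pasted in as text, the combined page is serialized in one pass from a single tree.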
So, nothing about this is forced. This is just how I typically do
things and that is why
L<Catalyst::View::Seamstress> has support
for this.
In all honesty, the meat-skeleton paradigm should be supported here
and called from C<Catalyst::View::Seamstress>. But the problem is, I
don't want to create an abstract API here until I have used the
meat-skeleton paradigm from at least one other framework besides
Catalyst. Then I will have a good idea of how to refactor it so any
framework can make good use of the paradigm.
=head1 USAGE
The best example of usage is the F directory in this
distribution. You can read L and
actually run the code in that directory at the same time. After doing
so, the following sections are additional instruction.
=head2 Understand that HTML is a tree
The best representation of this fact is this slide right here:
L
If you understand this (and maybe the rest of the slides), then you
have a good grip on seeing HTML as a tree.
L does also teach this, but it takes a while
before he gets to what matters to us. It's a fun read nonetheless.
Now that we've got this concept under our belts let's try some full examples.
=head2 Install and Setup Seamstress
The first thing to remember is that Seamstress is really just
convenience functions for L<HTML::Tree>. You can do
entirely without
Seamstress. It's just that my daily real-world obligations have led
to a set of library functions (HTML::Element::Library) and a
convenient way to locate "templates" (C) that work well on
top of L
=over
=item * move spkg.pl and sbase.pl onto your execution C<$PATH>
C<spkg.pl> and C<sbase.pl> are used to simplify the process of
parsing an HTML file into an HTML::TreeBuilder object. In other words,
instead of having to do this in your Perl programs:
use HTML::TreeBuilder;
my $tree = HTML::TreeBuilder->new_from_file('/usr/htdocs/hello.html');
You can do this:
use htdocs::hello;
my $tree = htdocs::hello->new;
The number of lines of code is not much different, but abstracting away absolute
paths is important in production environments where the absolute path
may come from who knows where via who knows how.
=item * run sbase.pl
sbase.pl will ask you 2 very simple questions. Just answer them.
When it is finished, it will have installed a package named
C on your C<@INC>. This module contains one
function, C<comp_root>, which points to a place you wouldn't
typically have on your C<@INC> but which you must have because your
HTML file and corresponding C<.pm> abstracting it are going to be
there.
=item * run spkg.pl
In the default setup,
no options need be supplied to this script. They
are useful in cases where you have more than one document root or want
to inherit from more than one place.
metaperl@pool-71-109-151-76:~/www$ spkg.pl moose.html
comp_root........ /home/metaperl/
html_file_path... /home/metaperl/www/
html_file........ moose.html
html_file sans... moose
moose.html compiled to package www::moose
=item * load your abstracted HTML and manipulate it
Now, from Perl, to get the TreeBuilder object
representing this HTML file, we simply do this:
use www::moose;
my $tree = www::moose->new;
# manipulate tree...
$tree->as_HTML;
In a mod_perl setup, you would want to pre-load your HTML and
L was designed for this very purpose. But
that's a topic for another time.
In a setup with HTML files in numerous places, I recommend setting up
multiple C,
C for each file root. To do this, you
will need to use the C<--base_pkg> and C<--base_pkg_root> options to
spkg.pl.
=item * That's it!
Now you are ready to abstract away as many files as you want with the
same C call. Just supply it with a different HTML file to
create a different package. Then C