package GraphViz::Data::Structure;

use strict;
use Carp;
use lib '..';
use GraphViz 2.01;
use Devel::Peek;
use Scalar::Util qw(refaddr reftype blessed);

our $Debug = 0;

# Print debugging messages to STDERR, but only when $Debug is true.
sub _debug(@) {
    return unless $Debug;
    return unless @_;
    print STDERR @_;
    # Terminate the message on STDERR as well, not on STDOUT.
    print STDERR "\n" unless $_[-1] =~ /\n/;
}

our $VERSION = '0.17';

# The currently-supported color palettes.
our %palettes = (
    Pastel => {Scalar=>'lightyellow', Array =>'palevioletred',
               Hash  =>'paleturquoise', Glob  =>'lavender',
               Font  =>'black'},
    Bright => {Scalar=>'yellow', Array =>'tomato',
               Hash  =>'cyan',   Glob  =>'purple',
               Font  =>'white'},
    Deep   => {Scalar=>'gold',      Array =>'firebrick2',
               Hash  =>'turquoise', Glob  =>'mediumpurple1',
               Font  =>'white'},
    Plain  => {Scalar=>'white', Array =>'white',
               Hash  =>'white', Glob  =>'white',
               Font  =>'black'}
);

=head1 NAME

GraphViz::Data::Structure - Visualise data structures

=head1 SYNOPSIS

  use GraphViz::Data::Structure;

  my $gvds = GraphViz::Data::Structure->new($data_structure);
  print $gvds->graph()->as_png;

=head1 DESCRIPTION

This module makes it easy to visualise data structures, even recursive
or circular ones.

It is provided as an alternative to GraphViz::Data::Grapher. Differences:

=over 4

=item C<GraphViz::Data::Structure> handles structures of arbitrary depth and complexity, automatically following links using a standard graph traversal algorithm.

=item C<GraphViz::Data::Structure> creates graphics of individual substructures (arrays, scalars, hashes) which keep the substructure type and data together; C<GraphViz::Data::Grapher> does this by shape alone.

=item C<GraphViz::Data::Structure> encapsulates object info (if any) directly into the node being used to represent the class.

=item C<GraphViz::Data::Structure> colors its graphs; C<GraphViz::Data::Grapher> doesn't by default.

=item C<GraphViz::Data::Structure> can parse out globs and CODE references (almost as well as the debugger does).

=back

=head1 REPRESENTING DATA STRUCTURES AS GRAPHS

C<GraphViz::Data::Structure> tries to draw data structure diagrams with a
minimum of complexity and a maximum of elegance. To this end, the following
design choices were made:

=over 4

=item Strings, scalars, filehandles, and code references are represented as plain text.

=item Empty hashes and arrays are represented as Perl represents them in code: hashes as C<{}>, and arrays as C<[]>, except if they are blessed (see below).

=item Arrays are laid out as sets of boxes, in the order in which they were found in the existing data structure (left-to-right or top-to-bottom, depending on overall graph layout).

=item Hashes are laid out as pairs of sets of boxes, with the keys in alphabetically-sorted order top-to-bottom or left-to-right.

=item Blessed items have a box added to them in parallel, containing the name of the class and its type (scalar/array/hash).

=item Code references are decoded to determine their fully-qualified package name and are output as plaintext nodes.

=item Globs pointed to by references are disassembled and their individual parts dumped.

=back

=head1 ALGORITHM

The algorithm is a standard recursive depth-first treewalk; we determine how
the current node should be added to the current graph, add it, and then call
ourselves recursively to determine how all nodes below this one should be
visualized. Edges are added after the subnodes are added to the graph.

Items "within" the current subnode (array and hash elements which are I<not>
references) are rendered inside a cell in the aggregate corresponding to their
position. References are represented by an edge linking the appropriate
position in the aggregate to the appropriate subnode.

This code does its data-structure unwrapping in a manner very similar to that
used by C<dumpvar.pl>, the code used by the debugger to display data
structures as text. The initial structure treewalk was written in isolation;
the C<dumpvar.pl> code was integrated only after it was recognized that there
was more to life than hashes, arrays, and scalars. The C<dumpvar.pl> code to
decode globs and code references was used almost as-is.

Code was added to attempt to spot references to array or hash elements, but
this code still does not work as desired. Array and hash I<element> references
still appear to be scalars to the current algorithm.

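The treewalk described above can be sketched roughly as follows. This is a
minimal illustration only, not the module's own C<init()> code: the helper
name C<walk>, the node naming scheme, and the C<%seen> address cache are all
assumptions made for the example, and it records node names and edges in
plain arrays instead of calling C<GraphViz>.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr reftype);

# Hypothetical helper (not part of GraphViz::Data::Structure):
# walk a structure depth-first, collecting node names and edges.
sub walk {
    my ($item, $seen, $nodes, $edges) = @_;

    # A non-reference at the top level becomes a plain-text leaf.
    unless (ref $item) {
        my $leaf = 'atom' . scalar @$nodes;
        push @$nodes, $leaf;
        return $leaf;
    }

    # A reference we have already visited: reuse its node, so that
    # recursive or circular structures terminate.
    my $addr = refaddr $item;
    return $seen->{$addr} if exists $seen->{$addr};

    my $type = reftype $item;
    my $name = lc($type) . scalar @$nodes;
    $seen->{$addr} = $name;    # cache before descending
    push @$nodes, $name;

    # Collect the elements held by this aggregate; hash values are
    # visited in sorted-key order.
    my @children;
    if ($type eq 'ARRAY') {
        @children = @$item;
    }
    elsif ($type eq 'HASH') {
        @children = map { $item->{$_} } sort keys %$item;
    }
    elsif ($type eq 'SCALAR' or $type eq 'REF') {
        @children = ($$item);
    }

    for my $child (@children) {
        # Non-reference elements live inside the parent's own cell.
        next unless ref $child;
        my $child_name = walk($child, $seen, $nodes, $edges);
        # The edge is added only after the subnode exists.
        push @$edges, [$name, $child_name];
    }
    return $name;
}

# A directly self-referential array inside a hash.
my @a = (1, 2);
push @a, \@a;
my $data = { list => \@a, name => 'x' };

my (@nodes, @edges);
my $top = walk($data, {}, \@nodes, \@edges);
print "top node: $top\n";
print "$_->[0] -> $_->[1]\n" for @edges;
```

Caching each reference's address before descending into it is what lets the
walk stop when it meets the same reference again; without it, the
self-referential array in the example would recurse forever.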
=head1 GLOBAL SETTINGS

=head2 C<$Debug>

Set this to a true value to turn on some debugging messages output to STDERR.
Defaults to false, and should probably be left that way unless you're
reworking init().

    # Turn on GraphViz::Data::Structure debugging.
    $GraphViz::Data::Structure::Debug = 1;

=head1 CLASS METHODS

=head2 C<new()>

This is the constructor. It takes one mandatory argument, which is the data
structure to be visualised. In list context, a C<GraphViz::Data::Structure>
object, the name of the top node, and a list defining the 'to' port for this
top node (if there is a 'to' port; if none, an empty list) are all returned.
In scalar context, only the object is returned.

    # Graph a data structure, creating a GraphViz object.
    # The new GraphViz::Data::Structure object, the name of
    # the top node in the structure, and the "in" port are returned.
    my ($gvds, $top_name, @port) =
        GraphViz::Data::Structure->new($structure);
    $gvds->graph()->as_png("my.png");

If you so desire, you can use the returned information to join other graphs
up to the top of the graph contained in this object by calling C<graph()> to
extract the C<GraphViz> object and calling other C<GraphViz> primitives on
that object. Most of the time you'll only care about the
C<GraphViz::Data::Structure> object and not the additional info.

=head3 Optional parameters

You can specify any, none, or all of the following optional keyword
parameters:

=over 4

=item C<GraphViz>

You can specify your own C<GraphViz> object, in which the graph will be built.
C<GraphViz::Data::Structure> node names all start with a common prefix string;
if you avoid using nodes with similar names, you should not have any nodename
collisions.

    # Create a graph of a data structure, using your own GraphViz object
    my ($gvds, $top_name, @port) =
        GraphViz::Data::Structure->new($structure,
                                       GraphViz=>GraphViz->new());
    $gvds->graph()->as_png("my.png");

=item C<Depth>

If the C<Depth> parameter is supplied, C<GraphViz::Data::Structure> stops at
the designated level. If any references are found at this level, plaintext
C<...> nodes are constructed for them. The default is B<no> limit.

    # Stop after reaching level 7.
    # (The constructor reads this limit from the Depth key.)
    my ($gvds, $top_name, @port) =
        GraphViz::Data::Structure->new($structure, Depth=>7);
    $gvds->graph()->as_png("my.png");

This can be useful if you have a very large data structure, but showing just
the upper levels is sufficient for your purposes.

=item C<Fuzz>

If your data structure has large pieces of text in it, you will probably want
to limit the size of the text displayed to keep C<GraphViz> from creating huge
unwieldy nodes. C<Fuzz> allows you to specify the maximum length of any text
to be inserted into blocks; the default value is B<40> characters.

    # Trim any text to 20 characters or less.
    my ($gvds, $top_name, @port) =
        GraphViz::Data::Structure->new($structure, Fuzz=>20);
    $gvds->graph()->as_png("my.png");

Be aware: large values for C<Fuzz> will result in long character strings
being passed to C<dot>, which will eventually segfault if the strings are
long enough.

=item C<Orientation>

You can choose to have your records laid out so that arrays and hashes are
either laid out horizontally, with class labels at the top, or vertically,
with class labels on the left. Default is C<horizontal>.

    # Stack items vertically.
    my ($gvds, $top_name, @port) =
        GraphViz::Data::Structure->new($structure, Orientation=>'vertical');
    $gvds->graph()->as_png("my.png");

You cannot mix horizontal and vertical layouts in the same graph.

=item C<Colors>

You can choose how you want the different kinds of nodes colored by passing a
reference to a hash of type-to-color mappings as the value of the C<Colors>
parameter, or by choosing the name of any of the predefined palettes.

If you're making up your own set of colors, you can use any of the colors
listed in the C<dot> manual. The names of the types are C<Scalar>, C<Array>,
C<Hash>, C<Glob>, and C<Font>. At present, C<GraphViz::Data::Structure>
doesn't allow you to color the plain-text items (strings, scalar values,
coderefs).

The predefined palettes are:

=over 4

=item Colors=>'Pastel' - this is a pale, rather subtle set of colors.

    Colors=>{Scalar=>'lightyellow', Array=>'palevioletred',
             Hash  =>'paleturquoise', Glob =>'lavender',
             Font  =>'black'};

=item Colors=>'Bright' - this is a brightly-colored set.

    Colors=>{Scalar=>'yellow', Array=>'tomato',
             Hash  =>'cyan',   Glob =>'purple',
             Font  =>'white'};

=item Colors=>'Deep' - this is a darker set.

    Colors=>{Scalar=>'gold',      Array=>'firebrick2',
             Hash  =>'turquoise', Glob =>'mediumpurple1',
             Font  =>'white'};

=item Colors=>'Plain' - this is the same as no coloring at all, and is the default behaviour.

    Colors=>{Scalar=>'white', Array=>'white',
             Hash  =>'white', Glob =>'white',
             Font  =>'black'};

=back

    # Graph a structure, with the "bright" palette:
    my $gvds = GraphViz::Data::Structure->new($structure, Colors=>'Bright');

    # Graph a structure, creating your own palette:
    my $gvds = GraphViz::Data::Structure->new($structure,
                 Colors=>{Scalar=>'VioletRed1',
                          Array =>'SeaGreen1',
                          Hash  =>'tan1',
                          Glob  =>'goldenrod1',
                          Font  =>'white'
                         }
               );

It should be noted that the optional palettes are simply a demonstration set
of colors; someone with a better eye for graphic design will, I hope, submit
better ones. The "rainbow" effect caused by using the alternate palettes on a
data structure with a lot of different node types in it is rather jarring -
sort of like an explosion in a Jello factory.

=item Other parameters

C<GraphViz> supports a number of other parameters at the graph level; any
parameters that C<GraphViz::Data::Structure> doesn't understand itself will
be passed on to C<GraphViz>.

    # Add a title and change the default font:
    my ($gvds, $top_name, @port) =
        GraphViz::Data::Structure->new($structure,
                                       graph=>{label=>'My graph',
                                               fontname=>'Helvetica'}
                                      );

=back

=cut

sub new {
    my $proto = shift;
    my $class = ref($proto) || $proto;
    my $data_structure = shift;
    my %params = @_;

    # GraphViz::Data::Structure object. Internal use only.
    # Do not write code that depends on any fields within this object!
    # They are subject to change without notice in future releases.
    my $self = {};
    bless $self, $class;

    # Parameters we understand.
    $self->{Fuzz}        = $params{Fuzz}        || 40;
    $self->{Depth}       = $params{Depth}       || undef;
    $self->{Label}       = $params{Label}       || 'left';
    $self->{Orientation} = $params{Orientation} || "horizontal";

    # Handle Colors. This will be either a color set name, or a reference
    # to a hash which defines the colors.

    # Begin by defaulting the palette. Work on a copy, so that per-object
    # overrides below never modify the shared %palettes entry.
    $self->{Colors} = { %{ $palettes{Plain} } };

    if (defined $params{Colors}) {
        # Color parameter set was specified. Override the defaults with whatever
        # was specified. Note that specifying everything is the same as defining
        # a completely new palette.
        if (ref $params{Colors}) {
            foreach my $item (sort keys %{$params{Colors}}) {
                $self->{Colors}->{$item} = $params{Colors}->{$item};
            }
        }
        else {
            # Color set name was provided. Choose from the supported
            # palettes; if not there, generate a monochrome palette with black text.
            $self->{Colors} =
                defined $palettes{$params{Colors}}
                    ? $palettes{$params{Colors}}
                    : {Scalar=>$params{Colors},
                       Hash  =>$params{Colors},
                       Array =>$params{Colors},
                       Glob  =>$params{Colors},
                       Font  =>'black'
                      };
        }
    }

    # Carry over the remaining parameters to GraphViz, if possible.
    # If we've got an old GraphViz object, it's too late.
    delete @params{qw(Fuzz Depth Label Orientation Colors)};
    my @gvparams = %params;
    push @gvparams,
         ($self->{Orientation} eq 'vertical' ? ('rankdir'=>1) : ());
    $self->{Graph} = $params{GraphViz} || (GraphViz->new(@gvparams));

    # Initialize the node and address caches.
    $self->{NodeCache} = {};
    $self->{Addresses} = {};

    # Counters for name generation.
    $self->{Atoms}   = 0;
    $self->{Scalars} = 0;
    $self->{Arrays}  = 0;
    $self->{Dummies} = 0;
    $self->{Subs}    = 0;
    $self->{Undefs}  = 0;
    $self->{Globs}   = 0;

    # Recursive descent, depth-first search.
    my ($top, @port) = $self->init($data_structure, 0);

    # Done. Return GraphViz::Data::Structure object, or list as appropriate.
    wantarray() ? ($self, $top, @port) : $self;
}

=head2 C<add()>

C<add()>, called as a class method, simply calls C<new()>, supporting all of
the C<new()> parameters as usual.

    # Create a graph (replicates the new() call). Parameters default.
    my ($gvds, $top_name, @ports) =
        GraphViz::Data::Structure->add($structure);

=head1 INSTANCE METHODS

=head2 C<graph()>

C<graph()> returns a C<GraphViz> object, loaded with the nodes and edges
corresponding to any data structure passed in via C<new()> and/or C<add()>.
You can make any of the standard C<GraphViz> calls to this object. Methods
include the C<as_*> family of output methods. See the C<GraphViz>
documentation for more information.

The most common methods are:

    # Print out a PNG-format file
    print $gvds->graph->as_png();

    # Print out a PostScript-format file
    print $gvds->graph->as_ps();

    # Print out a dot file, in "canonical" form:
    print $gvds->graph->as_canon();

=cut

sub graph {
    my $self = shift;
    $self->{Graph};
}

=head2 C<was_null()>

C<was_null()> checks to ensure that your data structure didn't generate a
graph that was too complex for C<dot> to handle. Directly self-referential
structures (e.g., C<@a = (1,\@a,3)>) seem to be the only offenders in this
area; if your structure isn't directly self-referential -- by far the most
likely situation -- you won't need to use C<was_null()> at all.

C<was_null()> forces a C<dot> run to get the "canonical" form of the graph
back, which can be computationally expensive; avoid it if possible.

=cut

sub was_null {
    my $self = shift;
    # An empty canonical form means dot could not render the graph.
    return $self->graph->as_canon() eq "";
}

=head2 C<add()>

C<add()>, called as an instance method, simply adds new nodes and edges
(corresponding to a new data structure) to an existing
C<GraphViz::Data::Structure> object. You can specify the C, C