NAME
String::Dump - Dump strings of characters (or bytes) for printing and
debugging
VERSION
This document describes String::Dump version 0.08.
SYNOPSIS
use String::Dump qw( dump_hex dump_bin );
say 'hex: ', dump_hex($string);
say 'bin: ', dump_bin($string);
DESCRIPTION
When debugging or examining strings containing non-ASCII or non-printing
characters, String::Dump is your friend. It provides simple functions to
return a dump of the code points for Unicode strings or the bytes for
byte strings in several different formats, such as hex, binary, Unicode
names, and more.
For using this module from the command line, see the bundled dumpstr
script. For tips on debugging Unicode or byte strings with this module,
see the document String::Dump::Debugging.
FUNCTIONS
These functions all accept a single argument: the string to dump, which
may either be a Unicode string or a byte string. Each function has to be
explicitly exported or they can all be exported with the ":all" tag.
use String::Dump qw( :all );
dump_hex($string)
Hexadecimal (base 16) mode.
use utf8;
# string of 6 characters
say dump_hex('Ĝis! ☺'); # 11C 69 73 21 20 263A
no utf8;
# series of 9 bytes
say dump_hex('Ĝis! ☺'); # C4 9C 69 73 21 20 E2 98 BA
For a lowercase hex dump, simply pass the response to "lc".
say lc dump_hex('Ĝis! ☺'); # 11c 69 73 21 20 263a
dump_dec($string)
Decimal (base 10) mode. This is mainly useful when referencing 8-bit
code pages like ISO-8859-1 or 7-bit ones like ASCII variants.
use utf8;
say dump_dec('Ĝis! ☺'); # 284 105 115 33 32 9786
no utf8;
say dump_dec('Ĝis! ☺'); # 196 156 105 115 33 32 226 152 186
dump_oct($string)
Octal (base 8) mode. This is mainly useful when referencing 7-bit code
pages like ASCII variants.
use utf8;
say dump_oct('Ĝis! ☺'); # 434 151 163 41 40 23072
no utf8;
say dump_oct('Ĝis! ☺'); # 304 234 151 163 41 40 342 230 272
dump_bin($string)
Binary (base 2) mode.
use utf8;
say dump_bin('Ĝis! ☺');
# 100011100 1101001 1110011 100001 100000 10011000111010
no utf8;
say dump_bin('Ĝis! ☺');
# 11000100 10011100 1101001 1110011 100001 100000 11100010 10011000 10111010
dump_names($string)
Unicode character name mode. Unlike the various numeral modes above,
this mode uses “, ” <comma, space> for the delimiter and it only makes
sense for Unicode strings, not byte strings.
use utf8;
say dump_names('Ĝis! ☺');
# LATIN CAPITAL LETTER G WITH CIRCUMFLEX, LATIN SMALL LETTER I,
# LATIN SMALL LETTER S, EXCLAMATION MARK, SPACE, WHITE SMILING FACE
The output in the example above has been manually split into multiple
lines for the layout of this document.
dump_codes($string)
Unicode code point mode. This is similar to "dump_hex" except it follows
the standard Unicode code point notation. The hex value is 4 to 6
digits, padded with “0” <digit zero> when less than 4, and prefixed with
“U+” <latin capital letter u, plus sign>. As with "dump_names", this
function only makes sense for Unicode strings, not byte strings.
use utf8;
say dump_codes('Ĝis! ☺'); # U+011C U+0069 U+0073 U+0021 U+0020 U+263A
SEE ALSO
* dumpstr - Dump strings of characters on the command line
* String::Dump::Debugging - String debugging tips with String::Dump
* Template::Plugin::StringDump - String::Dump plugin for TT
* Data::HexDump - Simple hex dumping using the default output of the
Unix "hexdump" utility
* Data::Hexdumper - Advanced formatting of binary data, similar to
"hexdump"
AUTHOR
Nick Patch <patch@cpan.org>
COPYRIGHT AND LICENSE
© 2011–2012 Nick Patch
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.