NAME
Unicode::Util - Unicode grapheme-level versions of built-in Perl
functions
VERSION
This document describes Unicode::Util version 0.07.
SYNOPSIS
use v5.10; # enables say
use Unicode::Util qw( grapheme_length grapheme_reverse );
# grapheme cluster ю́ (Cyrillic small letter yu, combining acute accent)
my $grapheme = "\x{044E}\x{0301}";
say length($grapheme); # 2 (length in code points)
say grapheme_length($grapheme); # 1 (length in grapheme clusters)
# Spın̈al Tap; n̈ = Latin small letter n, combining diaeresis
my $band = "Sp\x{0131}n\x{0308}al Tap";
say scalar reverse $band; # paT länıpS
say grapheme_reverse($band); # paT lan̈ıpS
DESCRIPTION
This module provides Unicode grapheme cluster–level versions of Perl’s
built-in string functions, tailored to work on grapheme clusters as
opposed to code points or bytes.
This is an early release and major revisions are planned for the near
future.
FUNCTIONS
Each function may be exported explicitly by name, or all of them at
once with the ":all" tag.
grapheme_length($string)
Returns the length of the given string in grapheme clusters. This is
the closest approximation to the number of “characters” that most
people would count in a printed string.
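To make the difference between code points and grapheme clusters concrete, here is a minimal sketch that uses only core Perl. The helper name count_graphemes is hypothetical and not part of Unicode::Util, and matching with \X is an assumption about one reasonable approach, not necessarily how the module is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Unicode::Util): count the extended
# grapheme clusters in a string by matching \X repeatedly.
sub count_graphemes {
    my ($string) = @_;
    # The empty list in the middle forces list context, so the scalar
    # receives the number of matches rather than a true/false value.
    my $count = () = $string =~ /\X/g;
    return $count;
}

# Cyrillic small letter yu followed by a combining acute accent
my $grapheme = "\x{044E}\x{0301}";

print length($grapheme), "\n";          # 2 (code points)
print count_graphemes($grapheme), "\n"; # 1 (grapheme clusters)
```

The input must be a decoded character string; on raw UTF-8 bytes both counts would be wrong.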
grapheme_chop($string)
Returns the given string with the last grapheme cluster chopped off.
Does not modify the original value, unlike the built-in "chop".
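The contrast with the built-in "chop" can be sketched in core Perl as follows. The helper chop_grapheme is hypothetical and only illustrates the documented behaviour (return a shortened copy, leave the argument alone); it is an assumption, not the module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Unicode::Util): return a copy of the
# string without its final grapheme cluster.
sub chop_grapheme {
    my ($string) = @_;
    my @clusters = $string =~ /\X/g;  # split into grapheme clusters
    pop @clusters;                    # drop the last cluster, if any
    return join '', @clusters;
}

# "Spın̈": the final n is followed by a combining diaeresis
my $word = "Sp\x{0131}n\x{0308}";

my $chopped = chop_grapheme($word);
print length($chopped), "\n";  # 3: the whole cluster n + diaeresis is gone
print length($word), "\n";     # 5: the original is untouched

# The built-in chop works in place and removes a single code point,
# here only the combining diaeresis.
my $copy = $word;
chop $copy;
print length($copy), "\n";     # 4
```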
grapheme_reverse($string)
Returns the given string with its grapheme clusters in reverse
order.
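The effect can be illustrated with a short core-Perl sketch. The helper reverse_graphemes is hypothetical and not part of Unicode::Util; it assumes that splitting on \X and reversing the resulting list is an acceptable stand-in for what the module does.

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';  # avoid "Wide character" warnings

# Hypothetical helper (not part of Unicode::Util): reverse a string one
# grapheme cluster at a time, so combining marks stay with their base.
sub reverse_graphemes {
    my ($string) = @_;
    return join '', reverse( $string =~ /\X/g );
}

my $band = "Sp\x{0131}n\x{0308}al Tap";

# Code point reversal moves the diaeresis onto the "a".
print scalar reverse($band), "\n";

# Cluster reversal keeps the diaeresis on the "n".
print reverse_graphemes($band), "\n";
```

Reversing by code point leaves the combining mark after a different base letter, which is why the two output lines differ.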
TODO
"grapheme_substr", "grapheme_index", "grapheme_rindex", "canonical_eq",
"compatibility_eq"
SEE ALSO
Unicode::GCString, String::Multibyte, Perl6::Str,
<http://perlcabal.org/syn/S32/Str.html>
AUTHOR
Nick Patch <patch@cpan.org>
COPYRIGHT AND LICENSE
© 2011–2013 Nick Patch
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.