=encoding utf-8

=head1 NAME

Locale::Country::Multilingual::Unicode - Recommended Usage with Unicode

=head1 SYNOPSIS

  use utf8;
  use Encode::StdIO;
  use Locale::Country::Multilingual {use_io_layer => 1};

  my $lcm = Locale::Country::Multilingual->new;
  $lcm->set_lang('de');
  print $lcm->code2country('gb'), "\n";

I

You are on a modern computer system that uses C<UTF-8> encoding by default. L<Locale::Country::Multilingual> uses language data that is in C<UTF-8> too. Everything is fine... Really?

Try this in your favorite terminal:

  > perl -le 'print "bäh!"'
  bäh!

Uppercase it:

  > LANG=en_US perl -Mlocale -le 'print uc "bäh!"'
  BäH!

Wrong! It should have been C<BÄH!>, though on C<ISO-8859-1> systems it works. The same goes for C<Österreich>, the German and native name for C<Austria>: if you run C<lc> on it, it won't change.

What happened is that you write files (and code) in C<UTF-8>, a multi-byte encoding, but by default Perl expects C<ISO-8859-1> (C<Latin-1>), a single-byte encoding. Provided you C<use locale> together with an appropriate locale (here C<en_US>) in your Perl program, a lowercase C<ISO-8859-1> C<ä> (C<0xe4>) is turned into an uppercase C<Ä> (C<0xc4>) - but only if your input arrives as C<ISO-8859-1>. A C<UTF-8> C<ä> is encoded as the two bytes C<0xc3, 0xa4>. Therefore C<uc> does not recognize the two-byte C<ä> as a letter that could be uppercased.

Language files in C<Locale::Country::Multilingual> are in C<UTF-8>. To make everything work, the correct workflow is:

=over 4

=item use utf8;

This pragma tells Perl that all text in your code is actually in C<UTF-8>, so the Perl interpreter converts it into its internal string format correctly. It is only necessary when your code contains literals with non-ASCII characters, e.g. when you write:

  print "Dürüm Döner Kebap\n";

Even if your system does not use C<UTF-8> by default, your Perl programs should be encoded in C<UTF-8>. Use an editor where you can set the encoding.

=item Set encoding for input and output

By default Perl converts the internal string representation into C<ISO-8859-1> for input and output. So the above C<print> output would be broken on a non-C<ISO-8859-1> system.
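
The effect of that default can be made visible with a small sketch. It is not taken from the module: it assumes a Perl with PerlIO in-memory filehandles (5.8 or later), and the helper name C<bytes_written> is hypothetical. It writes the single character C<ä> once without any layer and once through an encoding layer, then dumps the bytes that were produced:

  use strict;
  use warnings;

  my $char = "\x{e4}";    # "ä" as one character, written as an escape

  # Hypothetical helper: print $char to an in-memory filehandle opened
  # with the given layer and return the resulting bytes as hex.
  sub bytes_written {
      my ($layer) = @_;
      my $buffer = '';
      open my $fh, ">$layer", \$buffer or die "open: $!";
      print {$fh} $char;
      close $fh;
      return join ' ', map { sprintf '%02x', ord } split //, $buffer;
  }

  print bytes_written(''), "\n";                  # e4
  print bytes_written(':encoding(UTF-8)'), "\n";  # c3 a4

Without a layer the character leaves Perl as the single ISO-8859-1 byte, which a UTF-8 terminal cannot display; with the layer it leaves as the two-byte UTF-8 sequence.
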
For switching C<STDIN>, C<STDOUT> and C<STDERR> to C<UTF-8>, you can write:

  binmode STDIN, ':utf8';
  binmode STDOUT, ':utf8';
  binmode STDERR, ':utf8';

If your system uses another encoding, e.g. C<"euc-jp">, you can switch a filehandle to that encoding with:

  binmode FH, ':encoding(euc-jp)';

In a web application don't forget to set the output MIME type as well!

If output goes to a terminal:

  use Encode::StdIO;

This module determines your terminal's encoding - even if it is something other than C<UTF-8> - and sets the appropriate IO layers for the three standard IO handles.

=item Set C<< use_io_layer => 1 >>

There are two places where this option can be specified: either in C<use> or in C<new>:

  use Locale::Country::Multilingual {use_io_layer => 1};

  my $lcm = Locale::Country::Multilingual->new(
    lang => 'de',
    use_io_layer => 1,
  );
  print uc $lcm->code2country('gb'), "\n";

That should print

  VEREINIGTES KÖNIGREICH GROSSBRITANNIEN UND NORDIRLAND

Wow! Even the C<"ß"> has been converted correctly into C<"SS">.

=back

=head1 SEE ALSO

L, L

=head1 AUTHOR

Bernhard Graf C

=head1 COPYRIGHT & LICENSE

This text is in the public domain.