NAME
HTML::Entities::ImodePictogram - encode / decode i-mode pictogram
SYNOPSIS
use HTML::Entities::ImodePictogram;
$html = encode_pictogram($rawtext);
$rawtext = decode_pictogram($html);
$cleantext = remove_pictogram($rawtext);
use HTML::Entities::ImodePictogram qw(find_pictogram);
$num_found = find_pictogram($rawtext, \&callback);
DESCRIPTION
HTML::Entities::ImodePictogram handles HTML entities for i-mode
pictogram (emoji), which are assigned in Shift_JIS private area.
See http://www.nttdocomo.co.jp/i/tag/emoji/index.html for details about
i-mode pictogram.
FUNCTIONS
In all functions in this module, input/output strings are asssumed as
encoded in Shift_JIS. See the Jcode manpage for conversion between
Shift_JIS and other encodings like EUC-JP or UTF-8.
This module exports following functions by default.
encode_pictogram
$html = encode_pictogram($rawtext);
Encodes pictogram characters in raw-text into HTML entities.
decode_pictogram
$rawtext = decode_pictogram($html);
Decodes HTML entities for pictogram into raw-text.
remove_pictogram
$cleantext = remove_pictogram($rawtext);
Removes pictogram characters in raw-text.
This module also exports following functions on demand.
find_pictogram
$num_found = find_pictorgram($rawtext, \&callback);
Finds pictogram characters in raw-text and executes callback when
found. It returns the total numbers of charcters found in text.
The callback is given two arguments. The first is a found pictogram
character itself, and the second is a decimal number which
represents codepoint of the character. Whatever the callback returns
will replace the original text.
Here is an implementation of encode_pictogram(), which will be the
good example for the usage of find_pictogram().
sub encode_pictogram {
my $text = shift;
find_pictogram($text, sub {
my($char, $number) = @_;
return '' . $number . ';';
});
return $text;
}
CAVEAT
This module works so slow, because regex used here matches "ANY"
characters in the text. This is due to the difficulty of extracting
character boundaries of Shift_JIS encoding.
AUTHOR
Tatsuhiko Miyagawa
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
SEE ALSO
the HTML::Entities manpage,
http://www.nttdocomo.co.jp/i/tag/emoji/index.html