Sympa documentation licensed under GPL
Sympa::Tools::Text - Text-related functions
This package provides some text-related functions.
addrencode ( $addr, [ $phrase, [ $charset, [ $comment ] ] ] )
Returns formatted (and encoded) name-addr as defined in RFC 5322, section 3.4.
canonic_email ( $email )
Function. Returns canonical form of e-mail address.
Leading and trailing white spaces are removed. Latin letters without accents are lower-cased.
For malformed inputs, returns undef.
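The normalization described above can be illustrated with a minimal standalone sketch. The name sketch_canonic_email is hypothetical and this is not Sympa's own code; in particular, treating only undefined or empty input as malformed is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for canonic_email(); not Sympa's actual code.
sub sketch_canonic_email {
    my ($email) = @_;
    return undef unless defined $email;

    # Remove leading and trailing white spaces.
    $email =~ s/\A\s+//;
    $email =~ s/\s+\z//;

    # Lower-case unaccented Latin letters only; other characters are kept.
    $email =~ tr/A-Z/a-z/;

    # Assumption: an empty result counts as malformed input.
    return length($email) ? $email : undef;
}

# Prints "john.doe@example.org".
print sketch_canonic_email('  John.Doe@Example.ORG '), "\n";
```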
canonic_message_id ( $message_id )
Returns canonical form of message ID, without leading or trailing whitespace or the enclosing < and >.
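As a rough illustration, the stripping described above might look like the following. The function name sketch_canonic_message_id is hypothetical, and the exact rules of the real function may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for canonic_message_id(); not Sympa's actual code.
sub sketch_canonic_message_id {
    my ($msg_id) = @_;
    return undef unless defined $msg_id;

    # Drop leading whitespace and an optional opening "<".
    $msg_id =~ s/\A\s*<?//;
    # Drop an optional closing ">" and trailing whitespace.
    $msg_id =~ s/>?\s*\z//;

    return $msg_id;
}

# Prints "abc.123@example.org".
print sketch_canonic_message_id(' <abc.123@example.org> '), "\n";
```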
canonic_text ( $text )
Canonicalizes text.
$text should be a binary string encoded by UTF-8 character set or a Unicode string.
Forbidden sequences in binary string will be replaced by U+FFFD REPLACEMENT CHARACTERs, and Normalization Form C (NFC) will be applied.
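The two steps named above (replacement of invalid sequences, then NFC) can be sketched with the core modules Encode and Unicode::Normalize. The name sketch_canonic_text is hypothetical, and the real function may apply further normalization that is not described here.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize ();

# Hypothetical stand-in for canonic_text(); not Sympa's actual code.
sub sketch_canonic_text {
    my ($text) = @_;
    return undef unless defined $text;

    # A binary string is decoded as UTF-8. With the default CHECK value,
    # Encode substitutes U+FFFD for each malformed sequence.
    $text = Encode::decode('UTF-8', $text) unless Encode::is_utf8($text);

    # Apply Normalization Form C.
    return Unicode::Normalize::NFC($text);
}

# "e" followed by the UTF-8 bytes of U+0301 composes to the single
# character U+00E9.
my $composed = sketch_canonic_text("e\xCC\x81");
printf "U+%04X\n", ord $composed;
```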
decode_filesystem_safe ( $str )
Function. Decodes a string encoded by encode_filesystem_safe().
Parameter:
$str
String to be decoded.
Returns:
Decoded string, with the utf8 flag stripped if it was set.
decode_html ( $str )
Function. Decodes HTML entities in a string encoded by UTF-8 or a Unicode string.
Parameter:
$str
String to be decoded.
Returns:
Decoded string, with the utf8 flag stripped if it was set.
encode_filesystem_safe ( $str )
Function. Encodes a string $str to be suitable for filesystem.
Parameter:
$str
String to be encoded.
Returns:
Encoded string, with the utf8 flag stripped if it was set.
All bytes except '-', '+', '.', '@' and alphanumeric characters are encoded to sequences of '_' followed by two hexdigits.
Note that '/' will also be encoded.
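The encoding rule above, together with its inverse, can be sketched as follows. Both function names are hypothetical, and the use of lowercase hex digits is an assumption, since the description does not state the case.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for encode_filesystem_safe(); not Sympa's code.
sub sketch_encode_filesystem_safe {
    my ($str) = @_;
    return '' unless defined $str and length $str;

    # Work on bytes: a Unicode string is first encoded as UTF-8.
    $str = Encode::encode('UTF-8', $str) if Encode::is_utf8($str);

    # Every byte outside - + . @ and ASCII alphanumerics becomes "_hh".
    # "_" itself is encoded too, which keeps decoding unambiguous.
    # Assumption: lowercase hex digits.
    $str =~ s/([^-+.\@A-Za-z0-9])/sprintf '_%02x', ord $1/eg;
    return $str;
}

# Hypothetical stand-in for decode_filesystem_safe().
sub sketch_decode_filesystem_safe {
    my ($str) = @_;
    return '' unless defined $str and length $str;

    # Turn each "_hh" sequence back into the byte it stands for.
    $str =~ s/_([0-9A-Fa-f]{2})/chr hex $1/eg;
    return $str;
}

# Prints "a_2fb_20c".
print sketch_encode_filesystem_safe('a/b c'), "\n";
```

Because '/' never survives encoding, the result can be used as a single file name component.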
encode_html ( $str, [ $additional_unsafe ] )
Function. Encodes characters in a string $str to HTML entities.
By default '<', '>', '&' and '"' are encoded.
Parameter:
$str
String to be encoded.
$additional_unsafe
Character or range of characters additionally encoded as entity references.
This optional parameter was introduced on Sympa 6.2.37b.3.
Returns:
Encoded string. The utf8 flag, if any, is not stripped.
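A minimal core-Perl sketch of this behavior follows. The name sketch_encode_html is hypothetical. Emitting decimal numeric references for additional characters is an assumption, as is interpreting $additional_unsafe as the content of a regular-expression character class.

```perl
use strict;
use warnings;

# Named entities for the four characters encoded by default.
my %ENTITY = (
    '<' => '&lt;',
    '>' => '&gt;',
    '&' => '&amp;',
    '"' => '&quot;',
);

# Hypothetical stand-in for encode_html(); not Sympa's actual code.
sub sketch_encode_html {
    my ($str, $additional_unsafe) = @_;
    return undef unless defined $str;

    # Assumption: $additional_unsafe is character-class content,
    # i.e. single characters or ranges such as "a-z".
    my $class = '<>&"' . (defined $additional_unsafe ? $additional_unsafe : '');

    # Assumption: characters without a named entity in %ENTITY are
    # written as decimal numeric references.
    $str =~ s/([$class])/exists $ENTITY{$1} ? $ENTITY{$1} : sprintf('&#%d;', ord $1)/eg;
    return $str;
}

# Prints "&lt;b&gt;T&amp;C&lt;/b&gt;".
print sketch_encode_html('<b>T&C</b>'), "\n";
```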
encode_uri ( $str, [ omit => $chars ] )
Function. Encodes potentially unsafe characters in the string using “percent” encoding suitable for URIs.
Parameters:
$str
String to be encoded.
omit => $chars
By default, all characters except those defined as “unreserved” in RFC 3986 are encoded, that is, [^-A-Za-z0-9._~].
If this parameter is given, the characters it lists are additionally left unencoded.
Returns:
Encoded string, with the utf8 flag stripped if it was set.
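The percent-encoding rule and the effect of omit can be sketched in core Perl as below. The name sketch_encode_uri is hypothetical. Treating the omit value as a list of literal characters is an assumption; the real function may accept other forms.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for encode_uri(); not Sympa's actual code.
sub sketch_encode_uri {
    my ($str, %options) = @_;
    return '' unless defined $str;

    # Percent-encoding works on bytes, so a Unicode string is first
    # encoded as UTF-8.
    $str = Encode::encode('UTF-8', $str) if Encode::is_utf8($str);

    # "Unreserved" characters of RFC 3986 are always kept.
    my $keep = '-A-Za-z0-9._~';
    # Assumption: omit lists literal characters to keep as well.
    $keep .= quotemeta $options{omit} if defined $options{omit};

    $str =~ s/([^$keep])/sprintf '%%%02X', ord $1/eg;
    return $str;
}

print sketch_encode_uri('a b/c'), "\n";                 # prints "a%20b%2Fc"
print sketch_encode_uri('a b/c', omit => '/'), "\n";    # prints "a%20b/c"
```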
escape_chars ( $str )
Escape weird characters.
ToDo: This should be obsoleted in a future release. It would be better to use “encode_filesystem_safe”.
escape_url ( $str )
DEPRECATED. Use “encode_uri” or “mailtourl” instead.
foldcase ( $str )
Function. Returns “fold-case” string suitable for case-insensitive match. For example, the code below looks for a needle in a haystack regardless of case, even if they are non-ASCII UTF-8 strings.
$haystack = Sympa::Tools::Text::foldcase($HayStack);
$needle = Sympa::Tools::Text::foldcase($NeedLe);
# Parentheses are required: without them ">= 0" would bind to $needle.
if (index($haystack, $needle) >= 0) {
    ...
}
Parameter:
$str
A string.
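The same idea can be tried without Sympa by using Perl's built-in fc (available since Perl 5.16) in place of foldcase(). This swaps in a different function and says nothing about how foldcase() is implemented; the helper name sketch_contains_nocase is hypothetical.

```perl
use strict;
use warnings;
use feature 'fc';

# Hypothetical helper: case-insensitive substring test on Unicode strings.
sub sketch_contains_nocase {
    my ($haystack, $needle) = @_;
    # Fold both sides, then search. Note the parentheses around index().
    return index(fc($haystack), fc($needle)) >= 0;
}

# Greek capital sigma and final small sigma fold to the same character,
# so these two spellings match even though lc() would not equate them.
my $upper = "\x{39F}\x{394}\x{39F}\x{3A3}";    # capital letters
my $lower = "\x{3BF}\x{3B4}\x{3BF}\x{3C2}";    # small letters, final sigma
print sketch_contains_nocase($upper, $lower) ? "match\n" : "no match\n";
```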
guessed_to_utf8( $text, [ lang, … ] )
Function. Guesses text charset considering language context and returns the text reencoded by UTF-8.
Parameters:
$text
Text to be reencoded.
lang, …
Language tag(s) which may be given by “implicated_langs” in Sympa::Language.
Returns:
Reencoded text.
If no charset could be guessed, iso-8859-1 will be used as the last resort, simply because it covers the full 8-bit range.
mailtourl ( $email, [ decode_html => 1 ], [ query => {key => val, …} ] )
Function. Constructs a mailto: URL for given e-mail.
Parameters:
$email
E-mail address.
decode_html => 1
If set, arguments are assumed to include HTML entities.
query => {key => val, …}
Optional query.
Returns:
Constructed URL.
pad ( $str, $width )
Pads a string with spaces so that the result will not be narrower than the given width.
Parameters:
$str
A string.
$width
If $width is a false value or the width of $str is not less than $width, does nothing.
If $width is less than 0, pads right.
Otherwise, pads left.
Returns:
Padded string.
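The three cases above can be sketched as follows. The name sketch_pad is hypothetical, and counting one column per character with length() is a simplifying assumption that does not hold for wide or combining characters. The sketch also compares against the absolute value of $width, which is assumed from the description of negative widths.

```perl
use strict;
use warnings;

# Hypothetical stand-in for pad(); not Sympa's actual code.
sub sketch_pad {
    my ($str, $width) = @_;

    # A false width (0, '' or undef) means "do nothing".
    return $str unless $width;

    # Assumption: one display column per character.
    my $missing = abs($width) - length $str;
    return $str if $missing <= 0;

    # Negative width: spaces go on the right. Otherwise on the left.
    return $width < 0
        ? $str . (' ' x $missing)
        : (' ' x $missing) . $str;
}

print '[', sketch_pad('ab', 5),  "]\n";    # prints "[   ab]"
print '[', sketch_pad('ab', -5), "]\n";    # prints "[ab   ]"
```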
qdecode_filename ( $filename )
Q-Decodes web file name.
ToDo: This should be obsoleted in a future release. It would be better to use “decode_filesystem_safe”.
qencode_filename ( $filename )
Q-Encodes web file name.
ToDo: This should be obsoleted in a future release. It would be better to use “encode_filesystem_safe”.
slurp ( $file )
Get entire content of the file.
Normalization by canonic_text() is applied.
$file is the path to the text file.
unescape_chars ( $str )
Unescape weird characters.
ToDo: This should be obsoleted in a future release. It would be better to use “decode_filesystem_safe”.
valid_email ( $string )
Basic check of an email address.
weburl ( $base, \@paths, [ decode_html => 1 ], [ fragment => $fragment ], [ query => \%query ] )
Constructs a http: or https: URL under given base URI.
Parameters:
$base
Base URI.
\@paths
Additional path components.
decode_html => 1
If set, arguments are assumed to include HTML entities. Exception is $base: It is assumed not to include entities.
fragment => $fragment
Optional fragment.
query => \%query
Optional query.
Returns:
A URI.
wrap_text ( $text, [ $init_tab, [ $subsequent_tab, [ $cols ] ] ] )
Function. Returns line-wrapped text.
Parameters:
$text
The text to be folded.
$init_tab
Indentation prepended to the first line of paragraph.
Default is '', no indentation.
$subsequent_tab
Indentation prepended to each subsequent line of folded paragraph.
Default is '', no indentation.
$cols
Max number of columns of folded text.
Default is 78.
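The parameters and defaults above can be tried with the core module Text::Wrap. This is a swapped-in technique for illustration and its line-breaking rules may differ from those of wrap_text(). The name sketch_wrap_text is hypothetical, and reading $cols as the maximum line length including indentation is an assumption.

```perl
use strict;
use warnings;
use Text::Wrap ();

# Hypothetical stand-in for wrap_text(), built on Text::Wrap.
sub sketch_wrap_text {
    my ($text, $init_tab, $subsequent_tab, $cols) = @_;
    $init_tab       = '' unless defined $init_tab;
    $subsequent_tab = '' unless defined $subsequent_tab;
    $cols           = 78 unless defined $cols;

    # Text::Wrap keeps lines no longer than $columns - 1 characters,
    # so add one to allow lines of up to $cols characters.
    local $Text::Wrap::columns = $cols + 1;
    # Keep indentation as spaces instead of converting it to tabs.
    local $Text::Wrap::unexpand = 0;

    return Text::Wrap::wrap($init_tab, $subsequent_tab, $text);
}

print sketch_wrap_text('aaa bbb ccc ddd eee fff ggg', '* ', '  ', 12), "\n";
```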
Sympa::Tools::Text appeared on Sympa 6.2a.41.
decode_filesystem_safe() and encode_filesystem_safe() were added on Sympa 6.2.10.
decode_html(), encode_html(), encode_uri() and mailtourl() were added on Sympa 6.2.14, and escape_url() was deprecated.
guessed_to_utf8() and pad() were added on Sympa 6.2.17.
canonic_text() and slurp() were added on Sympa 6.2.53b.
The content of this page is automatically generated from the source distribution of Sympa. For details about this document see original source file.