Description: | Many modern text editors automatically save files using UTF-8 codification.
However, the perl interpreter does not expect it by default. Whilst this does
not represent a big deal on (most) backend-oriented programs, Web framework
(Catalyst, Mojolicious) based applications will suffer so-called Mojibake
(literally: "unintelligible sequence of characters"). Even worse: if an editor
saves BOM (Byte Order Mark, U+FEFF character in Unicode) at the start of a
script with the executable bit set (on Unix systems), it won't execute at all,
due to shebang corruption.
Avoiding codification problems is quite simple:
* Always use utf8/use common::sense when saving source as UTF-8
* Always specify =encoding utf8 when saving POD as UTF-8
* Do neither of above when saving as ISO-8859-1
* Never save BOM (not that it's wrong; just avoid it as you'll barely
notice its presence when in trouble)
However, if you find yourself upgrading old code to use UTF-8 or trying to
standardize a big project with many developers, each one using a different
platform/editor, reviewing all files manually can be quite painful, especially
in cases where some files have multiple encodings (note: it all started when I
realized that gedit and derivatives are unable to open files with character
conversion tables).
Enter the Test::Mojibake ;) |