use utf8;
use locale;
my $s = "ü";
# With this comparison, "use locale" seems to break something.
if ($s =~ /\w/) {
    print "match\n";
}
else {
    print "no match\n";
}
>locale
LANG=de_DE.UTF-8
LANGUAGE=de_DE
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
Quote: Starting in Perl 5.16, a hybrid mode for this pragma is available,
use locale ':not_characters';
which enables only the portions of locales that don't affect the character set (that is, all except LC_COLLATE and LC_CTYPE). This is useful when mixing Unicode and locales, including UTF-8 locales.
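As a minimal sketch of what the quoted passage suggests, the test script above can be rerun with the hybrid pragma in place of plain `use locale`. This assumes Perl 5.16 or later; the variable `$result` is introduced here only for illustration. With `':not_characters'`, LC_CTYPE no longer influences `\w`, so the match should follow Unicode rules.

```perl
use strict;
use warnings;
use utf8;                         # the source file itself is UTF-8 encoded
use locale ':not_characters';     # locale for everything except LC_CTYPE and LC_COLLATE

my $s = "ü";

# \w is now decided by Unicode rules rather than by the locale's character tables.
my $result = ($s =~ /\w/) ? "match" : "no match";
print "$result\n";
```

Note that LC_COLLATE is also switched off in this mode, so `sort` and `cmp` will not use the locale's collation order; sorting umlauts needs a separate solution such as Unicode::Collate.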
Guest Raphael: since I want to sort umlauts correctly with functions like sort
use Unicode::Collate;
my $alphasorter_modul = Unicode::Collate->new();
my $alphasorter_regex = qr{[^0-9 a-z!"§$%&/()=?\{\[\]\}\]><|_\-+*,.:;#'~\^]}ix;
my $alphasorter = sub {
    $_[0] =~ $alphasorter_regex || $_[1] =~ $alphasorter_regex
        ? $alphasorter_modul->cmp($_[0], $_[1])
        : lc $_[0] cmp lc $_[1]
};
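For comparison, here is a simpler sketch that lets Unicode::Collate handle every string instead of switching between two comparisons. It is not the poster's routine: the word list and the names `@words` and `@sorted` are made up for illustration, and it uses the module's default collation rather than any German-specific tailoring.

```perl
use strict;
use warnings;
use utf8;                              # string literals below contain umlauts
use Unicode::Collate;

binmode(STDOUT, ':encoding(UTF-8)');   # encode output, avoids "Wide character" warnings

my @words = ("Zebra", "Äpfel", "Öl", "Apfel", "Ofen");   # hypothetical sample data

# The collator compares base letters first and accents second,
# so "Äpfel" sorts next to "Apfel" instead of after "Zebra".
my $collator = Unicode::Collate->new();
my @sorted   = $collator->sort(@words);

print join(", ", @sorted), "\n";
```

A plain `sort @words` would compare code points and put both umlaut words after "Zebra". The collator can also be used inside a custom sort block as `sort { $collator->cmp($a, $b) } @words`.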