use utf8 and the behavior of string operations (solved) - #150984 (General Perl topics)

2011-08-02 21:06


As you saw from my example, that is not the case.

use bytes works like this:

Quote
The "use bytes" pragma disables character semantics for the rest of the lexical scope in which it appears.

use utf8 does not work that way, though. It only affects strings whose content is written literally in the source code.
The whole source code is treated as UTF-8, which is why the following works as well:

Code (perl):

use utf8;
my $var_ä = 23;
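
To make the effect on literals visible, here is a minimal sketch that compiles the same string literal once with and once without use utf8. It assumes the script file itself is saved as UTF-8; the sub name literal_lengths is made up for illustration.

```perl
use strict;
use warnings;

# Returns the length of the same literal, compiled once under
# "use utf8" and once under "no utf8". Assumes this file is saved as UTF-8.
sub literal_lengths {
    my ($with, $without);
    {
        use utf8;                 # literal is parsed as UTF-8 characters
        $with = length("€");
    }
    {
        no utf8;                  # literal is taken as raw source bytes
        $without = length("€");
    }
    return ($with, $without);
}

my ($with, $without) = literal_lengths();
print "with use utf8: $with, without: $without\n";
```

With a UTF-8 source file the euro sign is one character but three bytes, so the two lengths should differ. Nothing else in the program changes: the pragma only touches how the source text is read.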

So substr() changes its behavior under use bytes, but substr() does not care about use utf8. It only goes by the encoding in which the variable's content is stored. If the (non-ASCII) content of the variable was written in the script under use utf8, the variable carries the UTF-8 flag, and that flag changes how substr() behaves.
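
A small sketch of that difference, under the assumption that a string containing a code point above 255 is enough to get the UTF-8 flag set. The string is built with an escape, so the example does not depend on how the file is saved. The sub names are hypothetical.

```perl
use strict;
use warnings;

# "\x{20ac}" is the euro sign. A code point above 255 forces Perl to
# store the string UTF-8 encoded internally (UTF-8 flag on).
my $str = "\x{20ac}uro";

sub first_char_semantics {
    my ($s) = @_;
    return substr($s, 0, 1);      # one character: the euro sign
}

sub first_byte_semantics {
    my ($s) = @_;
    use bytes;                    # byte semantics for the rest of this sub
    return substr($s, 0, 1);      # first byte of the internal UTF-8 form
}

printf "characters: %d\n", length($str);
printf "utf8 flag:  %s\n", utf8::is_utf8($str) ? "on" : "off";
printf "char semantics, first unit: U+%04X\n", ord(first_char_semantics($str));
printf "byte semantics, first unit: 0x%02X\n", ord(first_byte_semantics($str));
```

There is no use utf8 anywhere in this script, and substr() still works on characters outside the use bytes scope. That is the point: the behavior follows the string, not the pragma.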

With CGI scripts and database operations you therefore need Encode to get string manipulation right.
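
Roughly like this: decode the incoming bytes once, work on characters, and encode again on the way out. This is only a sketch. The byte string stands in for a CGI parameter or database value and is assumed to be UTF-8 encoded, and first_chars is a made-up helper.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for bytes from a CGI parameter or a database column:
# the UTF-8 encoding of "Grüße", written as explicit bytes (hypothetical input).
my $bytes = "Gr\xC3\xBC\xC3\x9Fe";

# Decode at the boundary, cut by characters, encode again for output.
sub first_chars {
    my ($octets, $count) = @_;
    my $chars = decode('UTF-8', $octets);     # bytes -> characters
    my $head  = substr($chars, 0, $count);    # counts characters now
    return encode('UTF-8', $head);            # characters -> bytes
}

my $naive   = substr($bytes, 0, 3);    # cuts the two-byte "ü" in half
my $correct = first_chars($bytes, 3);  # "Grü" as UTF-8 bytes

printf "naive:   %d bytes\n", length($naive);
printf "correct: %d bytes\n", length($correct);
```

The naive substr() returns three bytes that end in the middle of a character, while the decoded version returns three whole characters (four bytes once encoded again).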

edit:

Quote
The "use utf8" pragma tells the Perl parser to allow UTF-8 in the
program text in the current lexical scope (allow UTF-EBCDIC on EBCDIC
based platforms). The "no utf8" pragma tells Perl to switch back to
treating the source text as literal bytes in the current lexical scope.

Do not use this pragma for anything else than telling Perl that your
script is written in UTF-8.

Last edited: 2011-08-02 21:07:55 +0200 (CEST)
