2016-07-21T11:24:08 GwenDragon: It depends on how wrongly it is done for it to become slow ;)
2016-07-21T13:52:29 clms: What? They wrote the garbage into the DB as well? Then it is no wonder that something like this crawls. 2. The actual problem lies in the flow of the CMS. The system collapsed because, every time the homepage was rendered, the problematic input was fetched from the database and only then was the filter applied to it. Why don't they filter the stuff once on input, before writing it to the database? That would have caused a higher server load for a single request only. The server might then have slowed down briefly, and the input might have failed, but the server itself would have kept running.
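The idea of filtering once on input rather than on every output can be sketched roughly as follows. This is only an illustration, not the CMS's actual code: `%db`, `filter_text`, `store_post`, and `render_homepage` are hypothetical names, and a plain hash stands in for the database.

```perl
use strict;
use warnings;

my %db;                 # hypothetical stand-in for the database
my $filter_calls = 0;   # counts how often the (possibly expensive) filter runs

# Placeholder filter (an assumption): trims leading and trailing whitespace.
sub filter_text {
    my ($text) = @_;
    $filter_calls++;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;
    return $text;
}

# Input path: filter once, then store the cleaned text.
sub store_post {
    my ($id, $raw) = @_;
    $db{$id} = filter_text($raw);
    return;
}

# Output path: reads the stored text and does no filtering at all.
sub render_homepage {
    return join "\n", map { $db{$_} } sort keys %db;
}

store_post(1, "  hello world \t ");
render_homepage() for 1 .. 1000;
print "filter calls: $filter_calls\n";   # prints "filter calls: 1"
```

With this arrangement a pathological input can slow down or break only the one request that submits it; the thousand homepage views never touch the filter.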
2016-07-21T14:09:30 GwenDragon: What? They wrote the garbage into the DB as well? Then it is no wonder that something like this crawls.
Quote: The post was in the homepage list, and that caused the expensive regular expression to be called on each home page view.
.*\S(\s+)$
use strict;
use warnings;
use 5.010;
use Benchmark qw(:all :hireswallclock) ;

# Note: code strings passed to timethis() are eval'd inside Benchmark and
# cannot see these "my" variables; the later version declares them with "our".
my $inputA = "X".(" \t" x 200000)."A";
my $inputB = "X".(" \t" x 200000)."B";

my $c1A = <<'CODE';
my $s = $inputA;
$s =~ s/[\s\u200c]+A/ A/;
CODE

my $c1B = <<'CODE';
my $s = $inputB;
$s =~ s/[\s\u200c]+A/ A/;
CODE

my $c2A = <<'CODE';
my $s = $inputA;
$s =~ s/(?<=\S)[\s\u200c]+A/ A/;
CODE

my $c2B = <<'CODE';
my $s = $inputB;
$s =~ s/(?<=\S)[\s\u200c]+A/ A/;
CODE

timethis(50_000_000,$c1A);
timethis(50_000_000,$c1B);
timethis(50_000_000,$c2A);
timethis(50_000_000,$c2B);
use strict;
use warnings;
use Benchmark qw(:all :hireswallclock) ;

our $inputA = "X".(" \t" x 10000)."A";
our $inputB = "X".(" \t" x 10000)."B";
our $m1A = 0;
our $m1B = 0;
our $m2A = 0;
our $m2B = 0;

my $c1A = <<'CODE';
my $s = $inputA;
$m1A++ if ($s =~ s/[\s\u200c]+A/ A/g);
CODE

my $c1B = <<'CODE';
my $s = $inputB;
$m1B++ if ($s =~ s/[\s\u200c]+A/ A/g);
CODE

my $c2A = <<'CODE';
my $s = $inputA;
$m2A++ if ($s =~ s/(?<=\S)[\s\u200c]+A/ A/g);
CODE

my $c2B = <<'CODE';
my $s = $inputB;
$m2B++ if ($s =~ s/(?<=\S)[\s\u200c]+A/ A/g);
CODE

timethis(100_000,$c1A);
timethis(100_000,$c1B);
timethis(100_000,$c2A);
timethis(100_000,$c2B);

print "M1A: $m1A\n";
print "M1B: $m1B\n";
print "M2A: $m2A\n";
print "M2B: $m2B\n";
timethis 100000: 3.81796 wallclock secs ( 3.82 usr + 0.00 sys = 3.82 CPU) @ 26178.01/s (n=100000)
timethis 100000: 1.70095 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
timethis 100000: 2.86398 wallclock secs ( 2.87 usr + 0.00 sys = 2.87 CPU) @ 34843.21/s (n=100000)
timethis 100000: 1.70098 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
M1A: 100000
M1B: 0
M2A: 100000
M2B: 0
Subroutine Benchmark::mytime redefined at /usr/lib/perl5/5.8.8/Benchmark.pm line 459.
timethis 100000: 23.9664 wallclock secs (23.95 usr + 0.01 sys = 23.96 CPU) @ 4173.62/s (n=100000)
timethis 100000: 1.69792 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
timethis 100000: 23.9532 wallclock secs (23.94 usr + 0.02 sys = 23.96 CPU) @ 4173.62/s (n=100000)
timethis 100000: 1.6998 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
M1A: 100000
M1B: 0
M2A: 100000
M2B: 0
use strict;
use warnings;
use re 'debug';

my $inputA = "X".(" \t" x 10000)."A";

(my $t1 = $inputA) =~ s/[\s\u200c]+A/ A/;
<STDIN>;    # pause between the two debug traces
(my $t2 = $inputA) =~ s/[\s\u200c]*$/ A/;    # renamed: a second "my $t1" would mask the first
use v5.10;
use Time::HiRes qw(time);

my $inputA = "X".(" \t" x 10000)."A " . ( " " x 10000);

# Perl's built-in regex engine
my $start = time;
(my $t1 = $inputA) =~ s/[\s\u200c]*$/ A/;
say time - $start;

# From here on, regexes in this lexical scope use RE2
use re::engine::RE2;

$start = time;
(my $t2 = $inputA) =~ s/[\s\u200c]*$/ A/;
say time - $start;

say "equal: " . ( $t1 eq $t2 );
2016-07-26T12:42:08 renee: Try changing +A to *$ in your regex... After all, the problem is not the matches but the failures.
use re 'debug';

my $inputB = "X".(" \t" x 10000)."B";
$inputB =~ s/[\s\u200c]+A/ A/g;
$ perl clms_regex.pl
Compiling REx "[\s200c]+A"
Final program:
1: PLUS (13)
2: ANYOF[\x{09}-\x{0D} 02c][{utf8}0085 00A0 1680 2000-200A 2028-2029 202F 205F 3000] (0)
13: EXACT <A> (15)
15: END (0)
floating "A" at 1..9223372036854775807 (checking floating) stclass ANYOF[\x{09}-\x{0D} 02c][{utf8}0085 00A0 1680 2000-200A 2028-2029 202F 205F 3000] plus minlen 2
Matching REx "[\s200c]+A" against "X %t %t %t %t %t %t %t %t %t %t %t %t %t %t %t %t %t %t %t "...
Intuit: trying to determine minimum start position...
Did not find floating substr "A"...
Match rejected by optimizer
Freeing REx: "[\s200c]+A"
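The trace above shows the character class compiled as `[\s200c]` rather than with U+200C in it. In a Perl pattern `\u` uppercases the following character instead of introducing a code point, so `\u200c` ends up as the literal characters `2`, `0`, `0`, `c`. The following small check is a sketch of that behaviour; `\x{200c}` as the intended spelling is an assumption about what the pattern was meant to say.

```perl
use strict;
use warnings;

my $zwnj = "\x{200c}";   # ZERO WIDTH NON-JOINER

# As written in the thread: \u uppercases the next character ("2" stays "2"),
# so the class contains whitespace plus the literal characters 2, 0 and c.
my $as_posted = qr/[\s\u200c]/;

# Assumed intention: Perl's code point escape.
my $intended = qr/[\s\x{200c}]/;

printf "as posted matches ZWNJ: %s\n", $zwnj =~ $as_posted ? 'yes' : 'no';
printf "as posted matches '2':  %s\n", '2'   =~ $as_posted ? 'yes' : 'no';
printf "intended matches ZWNJ:  %s\n", $zwnj =~ $intended  ? 'yes' : 'no';
```

This does not change the timing behaviour discussed in the thread, since the test inputs consist only of spaces and tabs.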
use re 'debug';

my $inputB = "X".(" \t" x 10000)."B";
$inputB =~ s/[\s\u200c]*$/ A/;
2016-07-26T15:05:01 renee: That in turn has to do with the fact that you are using a "string" (the A). The regex engine recognizes that this substring does not occur and then bails out in no time...
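The quick rejection described here can be imitated by hand. The sketch below is only an analogy for the "floating substr" check reported by `use re 'debug'`, not Perl's actual implementation; `can_possibly_match` is a hypothetical helper that looks for the literal `A` with `index` before the regex is tried at all.

```perl
use strict;
use warnings;

# Hypothetical helper: a pattern that requires the literal "A" cannot match
# a string in which "A" does not occur anywhere.
sub can_possibly_match {
    my ($str, $literal) = @_;
    return index($str, $literal) >= 0;
}

my $inputA = "X" . (" \t" x 10_000) . "A";
my $inputB = "X" . (" \t" x 10_000) . "B";

for my $pair ([A => $inputA], [B => $inputB]) {
    my ($name, $str) = @$pair;
    if (!can_possibly_match($str, 'A')) {
        print "input${name}: rejected without running the regex\n";
        next;
    }
    my $hit = $str =~ /\s+A/ ? 'match' : 'no match';
    print "input${name}: regex tried, $hit\n";
}
```

A pattern such as `\s*$` contains no required literal, so no shortcut of this kind is available and the failing attempts have to be worked through.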
timethis 100000: 2.16451 wallclock secs ( 2.17 usr + 0.00 sys = 2.17 CPU) @ 46082.95/s (n=100000)
M1A: 100000
timethis 100000: 1.70105 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
M1B: 0
timethis 100000: 1.70482 wallclock secs ( 1.71 usr + 0.00 sys = 1.71 CPU) @ 58479.53/s (n=100000)
M1S: 0
timethis 100000: 1.7609 wallclock secs ( 1.76 usr + 0.00 sys = 1.76 CPU) @ 56818.18/s (n=100000)
M2A: 100000
timethis 100000: 1.70132 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
M2B: 0
timethis 100000: 1.70033 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58823.53/s (n=100000)
M2S: 0
timethis 100000: 7.1719 wallclock secs ( 7.17 usr + 0.00 sys = 7.17 CPU) @ 13947.00/s (n=100000)
M3A: 0
timethis 100000: 7.17198 wallclock secs ( 7.17 usr + 0.00 sys = 7.17 CPU) @ 13947.00/s (n=100000)
M3B: 0
timethis 100000: 2.04794 wallclock secs ( 2.05 usr + 0.00 sys = 2.05 CPU) @ 48780.49/s (n=100000)
M3S: 100000
timethis 100000: 155.864 wallclock secs (155.84 usr + 0.00 sys = 155.84 CPU) @ 641.68/s (n=100000)
M4A: 0
timethis 100000: 155.899 wallclock secs (155.86 usr + 0.01 sys = 155.87 CPU) @ 641.56/s (n=100000)
M4B: 0
timethis 100000: 1.75641 wallclock secs ( 1.76 usr + 0.00 sys = 1.76 CPU) @ 56818.18/s (n=100000)
M4S: 100000
timethis 100000: 7.17382 wallclock secs ( 7.18 usr + 0.00 sys = 7.18 CPU) @ 13927.58/s (n=100000)
M5A: 0
timethis 100000: 7.18063 wallclock secs ( 7.18 usr + 0.00 sys = 7.18 CPU) @ 13927.58/s (n=100000)
M5B: 0
timethis 100000: 1.91674 wallclock secs ( 1.92 usr + 0.00 sys = 1.92 CPU) @ 52083.33/s (n=100000)
M5S: 100000
timethis 100000: 5.43372 wallclock secs ( 5.43 usr + 0.00 sys = 5.43 CPU) @ 18419.60/s (n=100000)
timethis 100000: 1.66317 wallclock secs ( 1.67 usr + 0.00 sys = 1.67 CPU) @ 59916.12/s (n=100000)
timethis 100000: 5.62105 wallclock secs ( 5.59 usr + 0.00 sys = 5.59 CPU) @ 17905.10/s (n=100000)
timethis 100000: 1.7069 wallclock secs ( 1.70 usr + 0.00 sys = 1.70 CPU) @ 58788.95/s (n=100000)
M1A: 100000
M1B: 0
M2A: 100000
M2B: 0
timethis 100000: 12.2518 wallclock secs (12.18 usr + 0.00 sys = 12.18 CPU) @ 8208.16/s (n=100000)
timethis 100000: 5.47224 wallclock secs ( 5.48 usr + 0.00 sys = 5.48 CPU) @ 18261.50/s (n=100000)
timethis 100000: 195.541 wallclock secs (194.83 usr + 0.00 sys = 194.83 CPU) @ 513.27/s (n=100000)
timethis 100000: 5.57992 wallclock secs ( 5.55 usr + 0.00 sys = 5.55 CPU) @ 18005.04/s (n=100000)
M1A: 0
M1B: 100000
M2A: 0
M2B: 100000
$str =~ y/\x20//d;
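Note that the transliteration above deletes every space in the string, not only leading or trailing ones. For comparison, here is a minimal sketch of a trailing-whitespace trim that walks backwards from the end of the string and so involves no regex backtracking. `rtrim` is a hypothetical helper name, and treating only space, tab, CR, and LF as whitespace is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: remove trailing whitespace in a single backward pass.
sub rtrim {
    my ($str) = @_;
    my $end = length $str;
    # Step back while the character before $end is space, tab, CR or LF.
    $end-- while $end > 0 && index(" \t\r\n", substr($str, $end - 1, 1)) >= 0;
    return substr($str, 0, $end);
}

my $str = "foo bar \t ";

(my $no_spaces = $str) =~ y/\x20//d;   # deletes every space, keeps the tab
my $trimmed = rtrim($str);             # removes trailing whitespace only

print "[$no_spaces]\n";
print "[$trimmed]\n";
```

The work done by `rtrim` is bounded by the length of the trailing whitespace run, regardless of what precedes it.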