2010-09-26T11:28:32 lin: Hello community, I'm new here.
I started with Linux a few months ago, and now I want to (or rather have to) solve a Perl task.
2010-09-26T11:28:32 lin: So, I want to get started with Perl on openSUSE 11.3.
I have already added a few modules with the package manager.
Question: how do I start a Perl script?
2010-09-26T11:28:32 lin: The task: I want to run an HTML parsing job. I have to parse several hundred HTML files and extract a piece of text from each, always in the same format.
How do I do that in principle? Specifically:
a. The HTML files (always in the same format, as mentioned) are all in one folder. Where should I store them?
And how do I then access them?
#!/usr/bin/perl

# strict and warnings should be mandatory in every program
use strict;
use warnings;

# Library for finding files easily;
# may need to be installed first
use File::Find::Rule;

# Library for parsing HTML files
use HTML::TreeBuilder::LibXML;

# Directory in which the HTML files are stored
my $html_dir = '/path/to/dir/with/html.files';

# Get all .html files from the directory
my @html_files = File::Find::Rule->file->name( '*.html' )->in( $html_dir );

for my $file ( @html_files ) {
    # parse the file
    # save the text from the HTML to a file
}
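The two comments in the loop body of the script above could be filled in roughly as follows. This is only a sketch under a few assumptions: it uses plain HTML::TreeBuilder (from the perl-HTML-Tree package) instead of HTML::TreeBuilder::LibXML, it takes the whole visible text of each page rather than one specific element, and it writes the result to a .txt file next to each .html file. The names extract_text, convert_dir and html2txt.pl are made up for illustration, and character encodings are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use File::Find::Rule;
use HTML::TreeBuilder;    # assumption: used here instead of HTML::TreeBuilder::LibXML

# Return the visible text of one HTML file as a single string.
# (hypothetical helper; as_text skips <script> and <style> content)
sub extract_text {
    my ($file) = @_;
    my $tree = HTML::TreeBuilder->new_from_file($file);
    my $text = $tree->as_text;
    $tree->delete;        # release the parse tree
    return $text;
}

# Convert every .html file below $html_dir into a .txt file beside it.
# (hypothetical helper; the output naming is an assumption)
sub convert_dir {
    my ($html_dir) = @_;
    my @html_files = File::Find::Rule->file->name('*.html')->in($html_dir);

    for my $file (@html_files) {
        (my $out = $file) =~ s/\.html\z/.txt/;
        open my $fh, '>', $out or die "Cannot write $out: $!";
        print {$fh} extract_text($file), "\n";
        close $fh or die "Cannot close $out: $!";
    }
    return scalar @html_files;    # number of files processed
}

# Run only when a directory is given on the command line, e.g.
#   perl html2txt.pl /path/to/dir/with/html.files
convert_dir($ARGV[0]) if @ARGV;
```

Saved under a name such as html2txt.pl, the script is started from the console with perl html2txt.pl followed by the directory. The HTML files can stay in any folder the user can read, since the path is simply passed in. To pull out one particular element instead of the whole text, the look_down method of HTML::Element would be the usual starting point.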
2010-09-26T11:28:32 lin: HTML::TreeBuilder::LibXML
The command on the console
zypper in perl-HTML-Tree
returned:
perl-HTML-Tree is already installed
But I don't think that is the same thing as HTML::TreeBuilder::LibXML.
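The suspicion is reasonable: perl-HTML-Tree provides HTML::TreeBuilder, while HTML::TreeBuilder::LibXML is a separate CPAN distribution. One way to check what this perl can actually load is a small script like the one below. It is a sketch only, and module_available is a made-up helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if the named module can be loaded by this perl, 0 otherwise.
# (hypothetical helper)
sub module_available {
    my ($module) = @_;
    (my $path = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $path; 1 } ? 1 : 0;
}

# Modules mentioned in this thread
for my $module (qw(HTML::TreeBuilder HTML::TreeBuilder::LibXML File::Find::Rule)) {
    printf "%-28s %s\n", $module,
        module_available($module) ? 'installed' : 'missing';
}
```

Any module reported as missing can then be installed from the cpan shell with install followed by the module name.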
cpan[1]> install File::Find::Rule
Going to read '/root/.cpan/Metadata'
Database was generated on Sun, 26 Sep 2010 17:28:39 GMT
Running install for module 'File::Find::Rule'
Running make for R/RC/RCLAMP/File-Find-Rule-0.32.tar.gz
Fetching with LWP:
ftp://ftp.hosteurope.de/pub/CPAN/authors/id/R/RC/RCLAMP/File-Find-Rule-0.32.tar.gz
Fetching with LWP:
ftp://ftp.hosteurope.de/pub/CPAN/authors/id/R/RC/RCLAMP/CHECKSUMS
Checksum for /root/.cpan/sources/authors/id/R/RC/RCLAMP/File-Find-Rule-0.32.tar.gz ok
Scanning cache /root/.cpan/build for sizes
............................................................................DONE
CPAN.pm: Going to build R/RC/RCLAMP/File-Find-Rule-0.32.tar.gz
Checking if your kit is complete...
Looks good
Writing Makefile for File::Find::Rule
cp lib/File/Find/Rule.pm blib/lib/File/Find/Rule.pm
cp lib/File/Find/Rule/Extending.pod blib/lib/File/Find/Rule/Extending.pod
cp lib/File/Find/Rule/Procedural.pod blib/lib/File/Find/Rule/Procedural.pod
cp findrule blib/script/findrule
/usr/bin/perl -MExtUtils::MY -e 'MY->fixin(shift)' -- blib/script/findrule
Manifying blib/man1/findrule.1
Manifying blib/man3/File::Find::Rule.3pm
Manifying blib/man3/File::Find::Rule::Extending.3pm
Manifying blib/man3/File::Find::Rule::Procedural.3pm
RCLAMP/File-Find-Rule-0.32.tar.gz
make -- OK
Running make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/File-Find-Rule.t .. 1/44 Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 30.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 31.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 32.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 33.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 34.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 35.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 30.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 31.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 32.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 33.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 34.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 35.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 30.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 31.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 32.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 33.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 34.
Use of uninitialized value $magnitude in lc at /usr/lib/perl5/vendor_perl/5.12.1/Number/Compare.pm line 35.
t/File-Find-Rule.t .. ok
t/findrule.t ........ ok
All tests successful.
Files=2, Tests=50, 2 wallclock secs ( 0.09 usr 0.01 sys + 0.86 cusr 0.14 csys = 1.10 CPU)
Result: PASS
RCLAMP/File-Find-Rule-0.32.tar.gz
make test -- OK
Running make install
Installing /usr/lib/perl5/site_perl/5.12.1/File/Find/Rule.pm
Installing /usr/lib/perl5/site_perl/5.12.1/File/Find/Rule/Procedural.pod
Installing /usr/lib/perl5/site_perl/5.12.1/File/Find/Rule/Extending.pod
Installing /usr/share/man/man1/findrule.1
Installing /usr/share/man/man3/File::Find::Rule.3pm
Installing /usr/share/man/man3/File::Find::Rule::Extending.3pm
Installing /usr/share/man/man3/File::Find::Rule::Procedural.3pm
Appending installation info to /usr/lib/perl5/5.12.1/i586-linux-thread-multi/perllocal.pod
RCLAMP/File-Find-Rule-0.32.tar.gz
make install -- OK