regexp - matching a regexp multiple times (Perl/CGI)



chakal

2005-06-03 23:53

User since
2005-06-03
1 post
User

My problem is this:
I have a web page that contains several links, and I want to extract them. How do I do that?

Code:

$doc=~ m/<a class="listentry" target=(.*)>(.*)<\/a>//gi;

That only gives me one link, doesn't it? I'm still missing some understanding here.

Keanuf

2005-06-04 01:39

User since
2005-06-02
9 posts
User

Hmm, I don't mean to offend you,
but I didn't really understand your question.

Maybe you could describe it in a bit more detail :)

I have a web page = ok, understood
that contains several links = ok, understood too
and I want to extract them = ?????? no idea what you mean by that.

pKai

2005-06-04 02:10

User since
2005-02-18
357 posts
User

From perldoc perlop (//www.perldoc.com/perl5.8.0/pod/perlop.html):

Quote
The "/g" modifier specifies global pattern matching--that is,
matching as many times as possible within the string. How it
behaves depends on the context. In list context, it returns a
list of the substrings matched by any capturing parentheses in
the regular expression. If there are no parentheses, it returns
a list of all the matched strings, as if there were parentheses
around the whole pattern.

In scalar context, each execution of "m//g" finds the next
match, returning true if it matches, and false if there is no
further match. The position after the last match can be read or
set using the pos() function; see the pos entry in the perlfunc
manpage. A failed match normally resets the search position to
the beginning of the string, but you can avoid that by adding
the "/c" modifier (e.g. "m//gc"). Modifying the target string
also resets the search position.

Emphasis mine.

In other words, you can use a match with /g as the condition of a loop.
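To make the two contexts from the quote concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the thread.
my $text = 'a=1, b=22, c=333';

# List context: a single evaluation returns every capture at once.
my @all = $text =~ /(\d+)/g;

# Scalar context: each evaluation finds the next match,
# so the match itself can serve as the loop condition.
my @seen;
while ($text =~ /(\w)=(\d+)/g) {
    push @seen, "$1:$2";
}

print "@all\n";    # prints: 1 22 333
print "@seen\n";   # prints: a:1 b:22 c:333
```

The loop form is usually the more convenient one when each match has several captures that belong together.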

Notes on your expression:
- The doubled "/" at the end is a syntax error.
- The greedy construct ".*" means that with more than one such a tag on a line you match nonsense. (Matching non-greedily with ".*?" would prevent that.)
- Line breaks between <a..> and </a> prevent the match in your version. The /s modifier helps here, provided the complete page is in the variable.
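Putting the three points together, the match might look roughly like the sketch below. The sample page in $doc is invented for illustration. The pattern still assumes that the attributes appear in exactly this order and that target is the last attribute in the tag.

```perl
use strict;
use warnings;

# Hypothetical sample page; the real $doc would hold the fetched HTML.
my $doc = <<'HTML';
<a class="listentry" target="_blank">First
link</a> <a class="listentry" target="_self">Second link</a>
HTML

my @links;
# Single closing "/", non-greedy .*?, and /s so that "." also matches newlines.
while ($doc =~ m/<a class="listentry" target=(.*?)>(.*?)<\/a>/gis) {
    my ($target, $text) = ($1, $2);
    push @links, [$target, $text];
}

printf "%s => %s\n", @$_ for @links;
```

Note that the first capture includes the quotes around the attribute value, because the pattern captures everything between "target=" and ">".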


Dubu

2005-06-04 17:17

User since
2003-08-04
2145 posts
Moderator + Editor


I suggest HTML::LinkExtor or HTML::SimpleLinkExtor.
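A minimal sketch of the HTML::LinkExtor route, assuming the module is installed (it ships with the HTML-Parser distribution on CPAN). The sample page and the href values are invented for illustration.

```perl
use strict;
use warnings;
use HTML::LinkExtor;   # part of the HTML-Parser distribution on CPAN

# Hypothetical sample page, not from the original post.
my $doc = '<a class="listentry" href="/one.html">One</a>'
        . '<a class="listentry" href="/two.html">Two</a>';

# Without a callback the parser collects the links internally.
my $parser = HTML::LinkExtor->new;
$parser->parse($doc);
$parser->eof;

my @hrefs;
for my $link ($parser->links) {
    # Each entry is an array ref: tag name followed by attribute/URL pairs.
    my ($tag, %attr) = @$link;
    push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
}

print "$_\n" for @hrefs;
```

This only yields link attributes such as href. It does not report the class attribute or the link text, so it fits best when the URLs alone are needed.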

renee

2005-06-05 14:32

User since
2003-08-04
14371 posts
Moderator

In the wiki you will also find an article on HTML::Parser that describes how to fish out links...
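This is not the wiki article's code, just a rough sketch of how HTML::Parser could collect the target and the link text of the "listentry" links from the original question. The sample page, the href values and all variable names are assumptions made for illustration.

```perl
use strict;
use warnings;
use HTML::Parser;   # CPAN module, not part of the Perl core

# Hypothetical sample page, not from the original post.
my $doc = '<p><a class="listentry" target="_blank" href="/one.html">One</a> '
        . '<a href="/other.html">skip</a> '
        . '<a class="listentry" target="_self" href="/two.html">Two</a></p>';

my (@links, $current);

my $parser = HTML::Parser->new(
    api_version => 3,
    # Start tag: open a new record for every <a class="listentry">.
    start_h => [ sub {
        my ($tag, $attr) = @_;
        return unless $tag eq 'a';
        return unless defined $attr->{class} && $attr->{class} eq 'listentry';
        $current = { target => $attr->{target}, text => '' };
    }, 'tagname, attr' ],
    # Text: append to the open record, ignore text outside such links.
    text_h => [ sub {
        my ($text) = @_;
        $current->{text} .= $text if $current;
    }, 'dtext' ],
    # End tag: close and store the open record.
    end_h => [ sub {
        my ($tag) = @_;
        return unless $tag eq 'a' && $current;
        push @links, $current;
        undef $current;
    }, 'tagname' ],
);
$parser->parse($doc);
$parser->eof;

print "$_->{target}: $_->{text}\n" for @links;
```

Unlike the regexp, this does not depend on the order of the attributes or on line breaks inside the tag.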

