Looking for a tricky regular expression

AlexZ71 · 23 November 2005 at 21:20

Hello everyone,

… well, tricky for me at least.

So: I have a string, e.g.:

$text = "Teil1\nTeil2\nRestRest\nRestRest…";

and I would like to split this string into 3 separate parts, namely:

Teil1
Teil2
RestRest\nRestRest…

(Note the newline character in the third string; any number of them should be allowed there.)
So far I have tried all sorts of variations on the following solution:

$text =~ /(.+?)\n(.+?)\n(.+)/;
$Teil1 = $1;
$Teil2 = $2;
$Rest = $3;

But that doesn't quite work, because this way everything after the \n in "Rest" gets cut off, leaving only:
RestRest

If I add $ to mark the end:
$text =~ /(.+?)\n(.+?)\n(.+)$/;

then "Rest" is complete, but "Teil2" disappears.

I suspect this is child's play for the experienced, and I'm hoping for your help!

Many thanks in advance!
Alexander

Moritz_fe1b4f · 24 November 2005 at 00:00

Hello,

$text = "Teil1\nTeil2\nRestRest\nRestRest…";

Does it absolutely have to be a regex?

my $teil1;
my $teil2;
my $rest;
{
 my @tmp = split /\n/, $text;
 $teil1 = shift @tmp;
 $teil2 = shift @tmp;
 $rest = join("\n", @tmp);
}

and I would like to split this string into 3 separate parts,
namely:

Teil1
Teil2
RestRest\nRestRest…

(Note the newline character in the third string; any number
of them should be allowed there.)
So far I have tried all sorts of variations on the following solution:

$text =~ /(.+?)\n(.+?)\n(.+)/;
$Teil1 = $1;
$Teil2 = $2;
$Rest = $3;

The problem is that in the last group the dot '.' does not match a newline.
Quote:
Because . doesn’t match \n. [\0-\377] is the most efficient way to match
everything currently. Maybe \e should match everything. And \E would
of course match nothing.
– Larry Wall in
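
Applied to the regex from the question, that suggests adding the /s modifier. A minimal sketch, assuming the sample string without its elided tail; the leading ^ and the "or die" check are my own additions so the match cannot slide to a later start position and a failed match is noticed:

```perl
use strict;
use warnings;

# Sample string from the question, without the elided tail.
my $text = "Teil1\nTeil2\nRestRest\nRestRest";

# /s lets "." match "\n". The first two groups are non-greedy,
# so each of them still stops at the first newline it reaches.
my ($teil1, $teil2, $rest) = $text =~ /^(.+?)\n(.+?)\n(.+)/s
    or die "text has fewer than three lines";

print "teil1: $teil1\n";
print "teil2: $teil2\n";
print "rest:  $rest\n";
```

One caveat: with /s an empty first or second line would let (.+?) run across a newline, so [^\n]* in place of .+? is the safer choice if empty lines can occur.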

And from perldoc -q match:

Found in /usr/share/perl/5.8/pod/perlfaq6.pod
 I'm having trouble matching over more than one line. What's wrong?

 Either you don't have more than one line in the string you're looking
 at (probably), or else you aren't using the correct modifier(s) on your
 pattern (possibly).

 There are many ways to get multiline data into a string. If you want
 it to happen automatically while reading input, you'll want to set $/
 (probably to '' for paragraphs or "undef" for the whole file) to allow
 you to read more than one line at a time.

 Read perlre to help you decide which of "/s" and "/m" (or both) you
 might want to use: "/s" allows dot to include newline, and "/m" allows
 caret and dollar to match next to a newline, not just at the end of the
 string. You do need to make sure that you've actually got a multiline
 string in there.

 For example, this program detects duplicate words, even when they span
 line breaks (but not paragraph ones). For this example, we don't need
 "/s" because we aren't using dot in a regular expression that we want
 to cross line boundaries. Neither do we need "/m" because we aren't
 wanting caret or dollar to match at any point inside the record next to
 newlines. But it's imperative that $/ be set to something other than
 the default, or else we won't actually ever have a multiline record
 read in.

 $/ = ''; # read in more whole paragraph, not just one line
 while ( <> ) {
     while ( /\b([\w'-]+)(\s+\1)+\b/gi ) { # word starts alpha
         print "Duplicate $1 at paragraph $.\n";
     }
 }

 Here's code that finds sentences that begin with "From " (which would
 be mangled by many mailers):

 $/ = ''; # read in more whole paragraph, not just one line
 while ( <> ) {
     while ( /^From /gm ) { # /m makes ^ match next to \n
         print "leading from in paragraph $.\n";
     }
 }

 Here's code that finds everything between START and END in a paragraph:

 undef $/; # read in whole file, not just one line or paragraph
 while ( <> ) {
     while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries
         print "$1\n";
     }
 }
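
To make the difference between the modifiers concrete, here is a small sketch that runs the "$" variant from the question with no modifier, with /m and with /s. The shortened sample string and the array names are only assumptions for illustration:

```perl
use strict;
use warnings;

my $text = "Teil1\nTeil2\nRestRest\nRestRest";

# No modifier: "." stops at "\n" and "$" only matches at the very end,
# so the whole match slides down until the last group is the final line.
my @anchored = $text =~ /(.+?)\n(.+?)\n(.+)$/;

# /m: "$" may also match just before an embedded "\n",
# so the match starts at the top but the last group is a single line.
my @multi = $text =~ /(.+?)\n(.+?)\n(.+)$/m;

# /s: "." may match "\n", so the last group runs to the end of the string.
my @single = $text =~ /(.+?)\n(.+?)\n(.+)$/s;

for my $case ( [ 'none', \@anchored ], [ '/m', \@multi ], [ '/s', \@single ] ) {
    my ($label, $groups) = @$case;
    my $shown = join ' | ', @$groups;
    $shown =~ s/\n/\\n/g;    # make embedded newlines visible
    print "$label: $shown\n";
}
```

On this sample only the /s variant yields the three parts asked for; the unmodified pattern loses the first line instead.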

Any more questions?

Regards,
Moritz

AlexZ71 · 24 November 2005 at 00:38

Hello Moritz,

no, it doesn't have to be a regex. I didn't know the join function, and that looks very nice and compact!

No more questions, thank you very much!!

Regards
Alexander

Dominic_Neumann_dce771 · 20 February 2006 at 00:02

my $teil1;
my $teil2;
my $rest;
{
 my @tmp = split /\n/, $text;
 $teil1 = shift @tmp;
 $teil2 = shift @tmp;
 $rest = join("\n", @tmp);
}

It can be done even more concisely and idiomatically like this:

my ($teil1, $teil2, $rest) = split(/\n/, $text, 3);
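
The two variants are not quite equivalent when the text ends in blank lines, because split without a limit discards trailing empty fields. A small sketch to illustrate this; the input with two trailing newlines and the _a/_b variable names are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical input that ends in blank lines, to show where the variants differ.
my $text = "Teil1\nTeil2\nRestRest\nRestRest\n\n";

# Variant 1: split everything, then glue the remainder back together.
# Without a limit, split drops the trailing empty fields.
my @tmp = split /\n/, $text;
my $teil1_a = shift @tmp;
my $teil2_a = shift @tmp;
my $rest_a  = join("\n", @tmp);

# Variant 2: a limit of 3 stops splitting after the second newline,
# so the third field keeps everything that follows, untouched.
my ($teil1_b, $teil2_b, $rest_b) = split /\n/, $text, 3;

print "join:  ", length($rest_a), " characters in rest\n";
print "limit: ", length($rest_b), " characters in rest\n";
```

With the limit, the remainder keeps its trailing newlines, which may or may not be what is wanted.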