Author |
Message |
Lee M Horowit #1 / 23
|
 regexp for strings of chars
I can't help thinking there's a "neat" way to do this, but somehow I can't think of it... I'd like a regexp that matches a string containing a "run" of, let's say, THREE of the SAME letter or number. Any letter or number would do. In other words, xaaabcds would match, but so would abcxxx or abc111qzd or .... you get it.
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Craig Ber #2 / 23
|
 regexp for strings of chars
: I can't help thinking there's a "neat" way to do this, but somehow I can't
: think of it...
: I'd like a regexp that matches a string containing a "run" of let's say
: THREE of the SAME
: letter or number. Any letter or number would do. In other words xaaabcds
: would match
: but so would abcxxx or abc111qzd or .... you get it.

First, you might wish to configure your newsreader to wrap your columns at 78 characters, or switch to another newsreader if this isn't possible on the one you have.

Now, here are some possibilities, which we'll match using /g in a list context to see what pops up from this test string:

$_ = 'aaaaaabbb,,,xyfooffff';

There are lots of other possibilities, but this should convey the basic idea. Hope this helps!

---------------------------------------------------------------------
--*-- Home Page: http://www.cinenet.net/users/cberry/home.html | Member of The HTML Writers Guild: http://www.hwg.org/ "Every man and every woman is a star."
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Mike St #3 / 23
|
 regexp for strings of chars
Quote:
>I can't help thinking there's a "neat" way to do this, but somehow I can't
>think of it...
>I'd like a regexp that matches a string containing a "run" of let's say
>THREE of the SAME
>letter or number. Any letter or number would do. In other words xaaabcds
>would match
>but so would abcxxx or abc111qzd or .... you get it.

You might consider backreferences (which make Perl's regexes not regular), e.g.

    if ($string =~ /(.)\1{2}/) { ... }

You should consult the perlre man page to find out more about how Perl's regexes work (are \1 and $1 different? what does . match? etc. :-).

Hope this helps,

Mike
--
http://www.stok.co.uk/~mike/ | PGP fingerprint FE 56 4D 7D 42 1A 4A 9C http://www.tiac.net/users/stok/ | 65 F3 3F 1D 27 22 B7 41
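
A minimal sketch of the backreference approach above, run against the strings from the original question. The loop and the extra test string 'abcabc' are assumptions added for illustration. Inside the pattern, \1 means "whatever group 1 just matched"; after a successful match, $1 holds that captured text. Without the /s modifier, . matches any character except a newline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @results;
for my $string (qw(xaaabcds abcxxx abc111qzd abcabc)) {
    # (.) captures one character; \1{2} requires that same character twice more
    if ($string =~ /(.)\1{2}/) {
        push @results, "$string: run of '$1'";
    }
    else {
        push @results, "$string: no run";
    }
}
print "$_\n" for @results;
```

Note that 'abcabc' is reported as having no run: a repeated substring is not the same thing as one character repeated three times in a row.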
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Tom Christianse #4 / 23
|
 regexp for strings of chars
[courtesy cc of this posting sent to cited author via email]
:I cant help thinking there's a "neat" way to do this, but somehow I cant
:think of it...                                          <<<<BAD WRAP>>>
:I'ld like a regexp that matches a string containing a "run" of lets say
:THREE of the SAME                                       <<<<BAD WRAP>>>
:letter or number. Any letter or number would do. In other words xaaabcds
:would match                                             <<<<BAD WRAP>>>
:but so would abcxxx or abc111qzd or .... you get it.

I'm afraid that your newsreader has used MSFMH line-wrapping your posting. Better check its settings.

--tom
--
"If Dennis Ritchie were the man who developed Modula-2 then C would be long forgotten." --Tarjei Jensen
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Tom Phoeni #5 / 23
|
 regexp for strings of chars
Quote:
> I'd like a regexp that matches a string containing a "run" of let's say
> THREE of the SAME letter or number. Any letter or number would do.

I think you want a backreference, which is one of those things that looks like '\8' in a pattern. What have you tried that didn't work?

--
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Tad McClell #6 / 23
|
 regexp for strings of chars
: I can't help thinking there's a "neat" way to do this, but somehow I can't
: think of it...
: I'd like a regexp that matches a string containing a "run" of let's say
: THREE of the SAME
: letter or number. Any letter or number would do. In other words xaaabcds
: would match
: but so would abcxxx or abc111qzd or .... you get it.

    /([a-zA-Z0-9])\1\1/;

--
Tad McClellan SGML Consulting
Fort Worth, Texas
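
A short sketch of how the character-class version behaves. The strings 'a,,,b' and 'aAa' are assumptions added to show what the class excludes; the other three come from the original question. Because the class only admits letters and digits, a run of commas does not count, and because \1 must repeat the captured character exactly, mixed case does not count either.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One letter or digit, followed by the same character two more times
my $run = qr/([a-zA-Z0-9])\1\1/;

my %has_run;
for my $string ('xaaabcds', 'abcxxx', 'abc111qzd', 'a,,,b', 'aAa') {
    $has_run{$string} = $string =~ $run ? 1 : 0;
}

for my $string (sort keys %has_run) {
    printf "%-10s %s\n", $string, $has_run{$string} ? 'match' : 'no match';
}
```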
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Craig Ber #7 / 23
|
 regexp for strings of chars
: : I can't help thinking there's a "neat" way to do this, but somehow I can't
: : think of it...
: : I'd like a regexp that matches a string containing a "run" of let's say
: : THREE of the SAME
: : letter or number. Any letter or number would do. In other words xaaabcds
: : would match
: : but so would abcxxx or abc111qzd or .... you get it.
:
: Now, here are some possibilities, which we'll match using /g in a list
: context to see what pops up from this test string:
:
: $_ = 'aaaaaabbb,,,xyfooffff';

Right about here my brain took a long lunch. See corrected regexes (using backreferences, as others have suggested) below each of my clue-deficient originals:

m/(.)\1{2}/g;
m/(\w)\1{2}/g;
m/(.)\1{2,}/g;
m/(\w)\1{2,}/g;

Moral: Never ever answer questions on clpm during little half-minute breaks from trying to figure out why a major product make that worked yesterday perfectly well is dying horribly today. :)

---------------------------------------------------------------------
--*-- Home Page: http://www.cinenet.net/users/cberry/home.html | Member of The HTML Writers Guild: http://www.hwg.org/ "Every man and every woman is a star."
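
For comparison, a sketch that runs the four corrected regexes against the test string in list context. The array names are assumptions made up for this illustration. With a single capture group, m//g in list context returns the captured character of each match, so the result says which characters form runs rather than showing the runs themselves.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'aaaaaabbb,,,xyfooffff';

# Exactly three at a time: six a's count as two separate runs
my @any3      = $string =~ m/(.)\1{2}/g;
my @word3     = $string =~ m/(\w)\1{2}/g;

# Three or more: each maximal run is reported once
my @any3plus  = $string =~ m/(.)\1{2,}/g;
my @word3plus = $string =~ m/(\w)\1{2,}/g;

print "(.)\\1{2}    : @any3\n";       # a a b , f
print "(\\w)\\1{2}   : @word3\n";     # a a b f
print "(.)\\1{2,}   : @any3plus\n";   # a b , f
print "(\\w)\\1{2,}  : @word3plus\n"; # a b f
```

The \w variants skip the run of commas because a comma is not a word character.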
|
Sat, 28 Oct 2000 03:00:00 GMT |
|
 |
Jim Mont #8 / 23
|
 regexp for strings of chars
Quote:
> $_ = 'aaaaaabbb,,,xyfooffff';
> m/(.)\1{2}/g;
I'm a newbie trying to understand, so pardon me if I'm missing something obvious, but...

    $ cat test.pl
    #!/usr/bin/perl -w

        print "$_\n";
    }
    $ ./test.pl
    a
    a
    b
    ,
    f
    $

Is this the intended match/output?

--
Jim Monty
Tempe, Arizona USA
|
Sun, 29 Oct 2000 03:00:00 GMT |
|
 |
Tom Christianse #9 / 23
|
 regexp for strings of chars
[courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc,

You should be able to use:

But that's not helping. I smell a bug. This also fails:

    while('aaaaaabbb,,,xyfooffff' =~ m/((.)\1{2})/g) {
        print "$1\n"
    }

But this works:

    while('aaaaaabbb,,,xyfooffff' =~ m/(.)\1{2}/g) {
        print "$&\n"
    }

To produce output like:

    aaa
    aaa
    bbb
    ,,,
    fff

I'll play with the bug some more.

--tom
--
char program[1]; /* Unwarranted chumminess with compiler. */
    --Larry Wall in the Perl source code (quoting Henry Spencer (quoting Dennis Ritchie (quoting Brian Kernighan)))
|
Mon, 30 Oct 2000 03:00:00 GMT |
|
 |
Larry Rosle #10 / 23
|
 regexp for strings of chars
[posted and emailed]
Quote:
> [courtesy cc of this posting sent to cited author via email]
>In comp.lang.perl.misc,
>You should be able to use:
>But that's not helping. I smell a bug.
>This also fails:
>    while('aaaaaabbb,,,xyfooffff' =~ m/((.)\1{2})/g) {
>        print "$1\n"
>    }
>But this works:
>    while('aaaaaabbb,,,xyfooffff' =~ m/(.)\1{2}/g) {
>        print "$&\n"
>    }
>To produce output like:
>    aaa
>    aaa
>    bbb
>    ,,,
>    fff
>I'll play with the bug some more.

Gosh, no bug. Try this regex (with \2 instead of \1):

    while('aaaaaabbb,,,xyfooffff' =~ m/((.)\2{2})/g) {
        print "$1\n"
    }

--
Larry Rosler
Hewlett-Packard Laboratories
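
The fix above comes down to how capture groups are numbered: by the position of their opening parenthesis, counted from the left. A small sketch (the @runs array is an assumption added so the result can be inspected) showing that the outer group is 1 and the inner group is 2, so the backreference for "the same character again" has to be \2:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'aaaaaabbb,,,xyfooffff';

my @runs;
# Group 1 opens first and wraps the whole run; group 2 is the single character.
while ($string =~ m/((.)\2{2})/g) {
    push @runs, $1;    # $1 is the three-character run, $2 the repeated character
}
print "$_\n" for @runs;    # aaa, aaa, bbb, ",,,", fff on separate lines
```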
|
Mon, 30 Oct 2000 03:00:00 GMT |
|
 |
Tom Christianse #11 / 23
|
 regexp for strings of chars
[courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc,

:>I'll play with the bug some more.

That's what I get for posting before coffee. I had been going back and forth with (, (?:, and (?= for some tests and at some point didn't propagate the required \1 and \2 changes through.

You know, I don't think that (foo\1stuff) can be reasonable when \1 hasn't completed yet. I wonder whether that shouldn't warn you.

--tom
--
"There is no idea so sacred that it cannot be questioned, analyzed... and ridiculed." --Cal Keegan
|
Mon, 30 Oct 2000 03:00:00 GMT |
|
 |
Craig Ber #12 / 23
|
 regexp for strings of chars
: I'm a newbie trying to understand, so pardon me if I'm missing something
: obvious, but...
:
:     $ cat test.pl
:     #!/usr/bin/perl -w
:
:
:         print "$_\n";
:     }
:     $ ./test.pl
:     a
:     a
:     b
:     ,
:     f
:     $
:
: Is this the intended match/output?

Yes; my regex above tells you which chars are repeated three times. To get the actual three-char strings is a bit trickier, since you need two sets of capturing parens, which blows the accumulate to list with /g strategy.

---------------------------------------------------------------------
--*-- Home Page: http://www.cinenet.net/users/cberry/home.html | Member of The HTML Writers Guild: http://www.hwg.org/ "Every man and every woman is a star."
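
A sketch of why two sets of capturing parens complicate the list-context approach, plus one possible workaround. The slice-based filter is an assumption chosen for illustration, not code from any poster. In list context, m//g returns every group of every match, so the runs and the single characters come back interleaved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'aaaaaabbb,,,xyfooffff';

# Two groups per match: the run (group 1) and the repeated character (group 2)
my @all = $string =~ m/((.)\2{2})/g;
print "@all\n";     # aaa a aaa a bbb b ,,, , fff f

# Keep only the even-numbered elements, which are the group 1 captures
my @runs = @all[ grep { $_ % 2 == 0 } 0 .. $#all ];
print "@runs\n";    # aaa aaa bbb ,,, fff
```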
|
Mon, 30 Oct 2000 03:00:00 GMT |
|
 |
Tom Christianse #13 / 23
|
 regexp for strings of chars
[courtesy cc of this posting sent to cited author via email] In comp.lang.perl.misc,
:Yes; my regex above tells you which chars are repeated three times. To
:get the actual three-char strings is a bit trickier, since you need two
:sets of capturing parens, which blows the accumulate to list with /g
:strategy.

You can use a while loop:

    my $string = 'aaaaaabbb,,,xyfooffff';

Or you can use careful greps:

    my $count = 0;
    my $string = 'aaaaaabbb,,,xyfooffff';

For overlapping matches, do this:

    my $string = 'aaaaaabbb,,,xyfooffff';

or

    my $count = 0;
    my $string = 'aaaaaabbb,,,xyfooffff';

--tom -- I wish there was a knob on the TV to turn up the intelligence. There's a knob called "brightness", but it doesn't work. --Gallagher
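
A hedged sketch of one way to get overlapping matches, using a lookahead. This is an assumption about the general technique and is not a reconstruction of the code from the post above. Because the lookahead (?=...) consumes no characters, each successful match is zero-length, and with /g Perl then retries from the next position, so every starting point of a run is reported.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = 'aaaaaabbb,,,xyfooffff';

# Non-overlapping count: the empty-list assignment forces list context
my $count = () = $string =~ m/(.)\1{2}/g;

# Overlapping: captures made inside the lookahead are still available as $1
my @overlapping;
while ($string =~ m/(?=((.)\2{2}))/g) {
    push @overlapping, $1;
}

print "non-overlapping runs: $count\n";                     # 5
print "overlapping runs:     ", scalar(@overlapping), "\n"; # 8
print "@overlapping\n";    # aaa aaa aaa aaa bbb ,,, fff fff
```

The six a's yield four overlapping runs (starting at offsets 0 to 3) and the four f's yield two.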
|
Mon, 30 Oct 2000 03:00:00 GMT |
|
 |
Ilya Zakharevi #14 / 23
|
 regexp for strings of chars
[A complimentary Cc of this posting was sent to Tom Christiansen
Quote:
> For overlapping matches, do this:
>     my $string = 'aaaaaabbb,,,xyfooffff';

After the almost-successful //h debate (hierarchical match) on p5p, I decided to postpone implementing the (?.REX) construct (which would hide parens/backrefs from the enclosing RE, and would start counting groups from 1). With such a construct, this might have been optimized to

and it might have been beautified to

    $triple = qr/(?.(.)\1{2})/;

Note that I would be a little bit wary of using //g with a zero-length expression unless necessary, though it looks hard to avoid in the above construct. This

is completely overboard. :-(

Ilya
|
Mon, 30 Oct 2000 03:00:00 GMT |
|
 |
Jari Aalto+mail.pe #15 / 23
|
 regexp for strings of chars
| > For overlapping matches, do this:
| >
| >     my $string = 'aaaaaabbb,,,xyfooffff';
|
| After the almost-successful //h debate (hierarchical match) on p5p, I
| decided to postpone implementing the (?.REX) construct (which would hide
| parens/backrefs from the enclosing RE, and would start counting groups
| from 1). With such a construct, this might have been optimized to
|
|
| and it might have been beautified to
|
|     $triple = qr/(?.(.)\1{2})/;
|
|
| Note that I would be a little bit wary of using //g with a zero-length
| expression unless necessary, though it looks hard to avoid in
| the above construct. This
|

...Eh, would someone translate these magic regexps into something understandable? They are way out of my league. Thank you.

jari
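
By way of a partial translation, here is a sketch of the lookahead form of the triple pattern written with the /x modifier, which lets whitespace and comments sit inside the regex. The variable names and the short test string 'xaaaay' are assumptions for illustration only, and the (?.REX) construct discussed above is not used because it was only a proposal.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $overlapping_triple = qr/
    (?=             # lookahead: check what follows without consuming it
        (           # group 1: the whole run of three
            (.)     # group 2: any one character
            \2{2}   # the character from group 2, exactly twice more
        )
    )
/x;

my $string = 'xaaaay';
my @runs;
while ($string =~ /$overlapping_triple/g) {
    push @runs, $1;
}
print "@runs\n";    # aaa aaa
```

The four a's give two overlapping runs, one starting at the first a and one starting at the second.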
|
Sat, 04 Nov 2000 03:00:00 GMT |
|
|