regexp for strings of chars 

I can't help thinking there's a "neat" way to do this, but somehow I can't
think of it...
I'd like a regexp that matches a string containing a "run" of, let's say,
THREE of the SAME
letter or number. Any letter or number would do. In other words, xaaabcds
would match,
but so would abcxxx or abc111qzd or ... you get it.



Sat, 28 Oct 2000 03:00:00 GMT  


: I cant help thinking there's a "neat" way to do this, but somehow I cant
: think of it...
: I'ld like a regexp that matches a string containing a "run" of lets say
: THREE  of the SAME
: letter or number. Any letter or number would do. In other words xaaabcds
: would match
: but so would abcxxx or abc111qzd or .... you get it.

First, you might wish to configure your newsreader to wrap your columns
at 78 characters, or switch to another newsreader if this isn't possible
on the one you have.

Now, here are some possibilities, which we'll match using /g in a list
context to see what pops up from this test string:

  $_ = 'aaaaaabbb,,,xyfooffff';





There are lots of other possibilities, but this should convey the basic
idea.  Hope this helps!

---------------------------------------------------------------------

 --*--    Home Page: http://www.cinenet.net/users/cberry/home.html
   |      Member of The HTML Writers Guild: http://www.hwg.org/  
       "Every man and every woman is a star."



Sat, 28 Oct 2000 03:00:00 GMT  



Quote:
>I cant help thinking there's a "neat" way to do this, but somehow I cant
>think of it...
>I'ld like a regexp that matches a string containing a "run" of lets say
>THREE  of the SAME
>letter or number. Any letter or number would do. In other words xaaabcds
>would match
>but so would abcxxx or abc111qzd or .... you get it.

You might consider backreferences (which make perl's regexes not regular)
e.g.

  if ($string =~ /(.)\1{2}/) {
     ...
  }

You should consult the perlre man page to find out more about how perl's
regexes work (are \1 and $1 different? what does . match? etc. :-).
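A minimal sketch of the backreference test, run against the examples from the original question. The helper name has_run is made up for illustration. Inside the pattern, \1 means "whatever group 1 captured"; $1 is the same capture seen from Perl code after the match. Since . matches any character except a newline, a run of punctuation counts too.

```perl
use strict;
use warnings;

# has_run is a hypothetical helper name, not something from the thread.
# (.) captures one character; \1{2} then demands two more copies of it.
sub has_run {
    my ($string) = @_;
    return $string =~ /(.)\1{2}/ ? 1 : 0;
}

for my $s (qw(xaaabcds abcxxx abc111qzd abcabc)) {
    printf "%-10s %s\n", $s, has_run($s) ? 'match' : 'no match';
}
```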

Hope this helps,

Mike

--

http://www.stok.co.uk/~mike/       |   PGP fingerprint FE 56 4D 7D 42 1A 4A 9C
http://www.tiac.net/users/stok/    |                   65 F3 3F 1D 27 22 B7 41



Sat, 28 Oct 2000 03:00:00 GMT  

 [courtesy cc of this posting sent to cited author via email]


:I cant help thinking there's a "neat" way to do this, but somehow I cant
:think of it...  <<<<BAD WRAP>>>
:I'ld like a regexp that matches a string containing a "run" of lets say
:THREE  of the SAME  <<<<BAD WRAP>>>
:letter or number. Any letter or number would do. In other words xaaabcds
:would match  <<<<BAD WRAP>>>
:but so would abcxxx or abc111qzd or .... you get it.

I'm afraid that your newsreader has used MSFMH line-wrapping your posting.
Better check its settings.

--tom
--
"If Dennis Ritchie were the man who developed Modula-2 then C would be long forgotten."
    --Tarjei Jensen



Sat, 28 Oct 2000 03:00:00 GMT  

Quote:

> I'ld like a regexp that matches a string containing a "run" of lets say
> THREE of the SAME letter or number. Any letter or number would do.

I think you want a backreference, which is one of those things that looks
like '\8' in a pattern. What have you tried that didn't work?

--
Tom Phoenix       Perl Training and Hacking       Esperanto
Randal Schwartz Case:     http://www.rahul.net/jeffrey/ovs/



Sat, 28 Oct 2000 03:00:00 GMT  


: I cant help thinking there's a "neat" way to do this, but somehow I cant
: think of it...
: I'ld like a regexp that matches a string containing a "run" of lets say
: THREE  of the SAME
: letter or number. Any letter or number would do. In other words xaaabcds
: would match
: but so would abcxxx or abc111qzd or .... you get it.

   /([a-zA-Z0-9])\1\1/;
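A short comparison, as a sketch, of how the choice of capture class changes what counts as a run. The labels and sample strings are made up for illustration. The explicit class sticks to what the question asked for (letters and digits), . also accepts punctuation, and \w additionally accepts the underscore.

```perl
use strict;
use warnings;

# Labels are illustrative only.
my %pattern = (
    'any char'    => qr/(.)\1\1/,
    'alnum class' => qr/([a-zA-Z0-9])\1\1/,
    'word char'   => qr/(\w)\1\1/,
);

for my $s ('abc111qzd', 'a,,,b', 'a___b') {
    for my $name (sort keys %pattern) {
        printf "%-10s %-12s %s\n", $s, $name,
            $s =~ $pattern{$name} ? 'match' : 'no match';
    }
}
```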

--
    Tad McClellan                          SGML Consulting

    Fort Worth, Texas



Sat, 28 Oct 2000 03:00:00 GMT  


: : I cant help thinking there's a "neat" way to do this, but somehow I cant
: : think of it...
: : I'ld like a regexp that matches a string containing a "run" of lets say
: : THREE  of the SAME
: : letter or number. Any letter or number would do. In other words xaaabcds
: : would match
: : but so would abcxxx or abc111qzd or .... you get it.
:
: Now, here are some possibilities, which we'll match using /g in a list
: context to see what pops up from this test string:
:
:   $_ = 'aaaaaabbb,,,xyfooffff';

Right about here my brain took a long lunch.  See corrected regexes
(using backreferences, as others have suggested) below each of my
clue-deficient originals:


         m/(.)\1{2}/g;


         m/(\w)\1{2}/g;


         m/(.)\1{2,}/g;


         m/(\w)\1{2,}/g;
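For comparison, here is a sketch that runs the four corrected patterns against the test string in list context. The array names are invented for illustration. With {2} each match is exactly three characters, so a run of six yields two matches; with {2,} the whole run is swallowed in one match.

```perl
use strict;
use warnings;

my $test = 'aaaaaabbb,,,xyfooffff';

# In list context, m//g with one capture group returns every capture.
my @exact_any   = $test =~ m/(.)\1{2}/g;     # a a b , f
my @exact_word  = $test =~ m/(\w)\1{2}/g;    # a a b f
my @greedy_any  = $test =~ m/(.)\1{2,}/g;    # a b , f
my @greedy_word = $test =~ m/(\w)\1{2,}/g;   # a b f

print "exact, any char:   @exact_any\n";
print "exact, word char:  @exact_word\n";
print "greedy, any char:  @greedy_any\n";
print "greedy, word char: @greedy_word\n";
```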

Moral:  Never ever answer questions on clpm during little half-minute
breaks from trying to figure out why a major product make that worked
yesterday perfectly well is dying horribly today. :)

---------------------------------------------------------------------

 --*--    Home Page: http://www.cinenet.net/users/cberry/home.html
   |      Member of The HTML Writers Guild: http://www.hwg.org/  
       "Every man and every woman is a star."



Sat, 28 Oct 2000 03:00:00 GMT  

Quote:

> $_ = 'aaaaaabbb,,,xyfooffff';

>      m/(.)\1{2}/g;

I'm a newbie trying to understand, so pardon me if I'm missing something
obvious, but...

$ cat test.pl
#!/usr/bin/perl -w


    print "$_\n";

}

$ ./test.pl
a
a
b
,
f
$

Is this the intended match/output?

--
Jim Monty

Tempe, Arizona USA



Sun, 29 Oct 2000 03:00:00 GMT  

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc,

You should be able to use:

But that's not helping.  I smell a bug.

This also fails:

    while('aaaaaabbb,,,xyfooffff' =~ m/((.)\1{2})/g) {
        print "$1\n"
    }

But this works:

    while('aaaaaabbb,,,xyfooffff' =~ m/(.)\1{2}/g) {
        print "$&\n"
    }

To produce output like:

    aaa
    aaa
    bbb
    ,,,
    fff

I'll play with the bug some more.

--tom
--
    char program[1];        /* Unwarranted chumminess with compiler. */
            --Larry Wall in the Perl source code
    (quoting Henry Spencer (quoting Dennis Ritchie (quoting Brian Kernighan)))



Mon, 30 Oct 2000 03:00:00 GMT  

[posted and emailed]


Quote:
> [courtesy cc of this posting sent to cited author via email]

>In comp.lang.perl.misc,


>You should able to use:


>But that's not helping.  I smell a bug.

>This also fails:

>    while('aaaaaabbb,,,xyfooffff' =~ m/((.)\1{2})/g) {
>        print "$1\n"
>    }

>But this works:

>    while('aaaaaabbb,,,xyfooffff' =~ m/(.)\1{2}/g) {
>        print "$&\n"
>    }

>To produce output like:

>    aaa
>    aaa
>    bbb
>    ,,,
>    fff

>I'll play with the bug some more.

Gosh, no bug.  Try this regex (with \2 instead of \1):

    while('aaaaaabbb,,,xyfooffff' =~ m/((.)\2{2})/g) {
        print "$1\n"
    }
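The reason \2 is needed: capture groups are numbered by the position of their opening parenthesis, so the outer group is 1 and the inner (.) is 2. A small sketch that prints both captures per match; the variable names are illustrative.

```perl
use strict;
use warnings;

my $test = 'aaaaaabbb,,,xyfooffff';
my @triples;

# Group 1 is the outer pair (the whole triple); group 2 is the inner (.)
# and is the one the backreference has to name.
while ($test =~ m/((.)\2{2})/g) {
    print "triple: $1   repeated char: $2\n";
    push @triples, $1;
}
```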

--
Larry Rosler
Hewlett-Packard Laboratories



Mon, 30 Oct 2000 03:00:00 GMT  

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc,

:>I'll play with the bug some more.

That's what I get for posting before coffee.  I had been going back
and forth with (, (?:, and (?= for some tests and at some point didn't
propagate the required \1 and \2 changes through.

You know, I don't think that (foo\1stuff) can be reasonable
when \1 hasn't completed yet.  I wonder whether that shouldn't
warn you.

--tom
--
"There is no idea so sacred that it cannot be questioned, analyzed...
 and ridiculed." --Cal Keegan



Mon, 30 Oct 2000 03:00:00 GMT  

: I'm a newbie trying to understand, so pardon me if I'm missing something
: obvious, but...
:
: $ cat test.pl
: #!/usr/bin/perl -w
:

:

:     print "$_\n";
: }
: $ ./test.pl
: a
: a
: b
: ,
: f
: $
:
: Is this the intended match/output?

Yes; my regex above tells you which chars are repeated three times.  To
get the actual three-char strings is a bit trickier, since you need two
sets of capturing parens, which blows the accumulate to list with /g
strategy.
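To illustrate the difficulty: with two sets of capturing parens, list-context /g returns both captures for every match, interleaved in one flat list. A sketch, with made-up variable names, that keeps only every other element:

```perl
use strict;
use warnings;

my $test = 'aaaaaabbb,,,xyfooffff';

# Two groups per match, so the list comes back as
# (triple, char, triple, char, ...).
my @flat = $test =~ m/((.)\2{2})/g;

# Keep the elements at even indices, which are the whole triples.
my @triples = @flat[ grep { $_ % 2 == 0 } 0 .. $#flat ];

print "flat:    @flat\n";
print "triples: @triples\n";
```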

---------------------------------------------------------------------

 --*--    Home Page: http://www.cinenet.net/users/cberry/home.html
   |      Member of The HTML Writers Guild: http://www.hwg.org/  
       "Every man and every woman is a star."



Mon, 30 Oct 2000 03:00:00 GMT  

 [courtesy cc of this posting sent to cited author via email]

In comp.lang.perl.misc,

:Yes; my regex above tells you which chars are repeated three times.  To
:get the actual three-char strings is a bit trickier, since you need two
:sets of capturing parens, which blows the accumulate to list with /g
:strategy.

You can use a while loop

    my $string = 'aaaaaabbb,,,xyfooffff';

Or you can use careful greps:

    my $count  = 0;
    my $string = 'aaaaaabbb,,,xyfooffff';

For overlapping matches, do this:

    my $string = 'aaaaaabbb,,,xyfooffff';

or

    my $count  = 0;
    my $string = 'aaaaaabbb,,,xyfooffff';
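
The code for these variants is not preserved here. As one possible sketch of the overlapping case, assuming a lookahead is used so that each match consumes no characters and the next attempt starts one position further along:

```perl
use strict;
use warnings;

my $string = 'aaaaaabbb,,,xyfooffff';
my @overlapping;

# (?=...) looks ahead without consuming anything, so a triple is
# reported at every position where one begins. The captures inside
# the lookahead are still set: $1 is the triple, $2 the character.
while ($string =~ m/(?=((.)\2{2}))/g) {
    push @overlapping, $1;
}

print "$_\n" for @overlapping;
```

With this string the six a's give four overlapping triples and the four f's give two, alongside one each for the b's and the commas.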

--tom
--
I wish there was a knob on the TV to turn up the intelligence.  There's
a knob called "brightness", but it doesn't work.
                --Gallagher



Mon, 30 Oct 2000 03:00:00 GMT  

[A complimentary Cc of this posting was sent to Tom Christiansen


Quote:
> For overlapping matches, do this:

>     my $string = 'aaaaaabbb,,,xyfooffff';


After almost-successful //h debate (hierarchical match) on p5p I
decided to postpone implementing (?.REX) construct (which would hide
parens/backrefs from the enclosing RE, and would start counting groups
from 1).  With such a construct, this might have been optimized to


and it might have been beautified to

    $triple = qr/(?.(.)\1{2})/;


Note that I would be a little bit wary to use //g with zero-length
expression - unless necessary.  Though it looks hard to avoid it in
the above construct.  This


is completely overboard.  :-(
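
The (?.REX) construct above is a proposal, so it cannot be shown running. As a related sketch only, and assuming Perl 5.10 or later, a relative backreference \g{-1} lets a qr// fragment refer to its own group no matter how many groups the enclosing pattern adds in front of it. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# \g{-1} means "the group that opened most recently before this point",
# so the fragment keeps working when embedded after other groups.
my $triple_rel = qr/(.)\g{-1}{2}/;

# With an absolute \1, embedding changes which group is referenced.
my $triple_abs = qr/(.)\1{2}/;

my $string = 'id=xaaab';

my $rel_ok = $string =~ /(\w+)=\w*?$triple_rel/ ? 1 : 0;
my $abs_ok = $string =~ /(\w+)=\w*?$triple_abs/ ? 1 : 0;

print "relative backreference: ", ($rel_ok ? 'match' : 'no match'), "\n";
print "absolute backreference: ", ($abs_ok ? 'match' : 'no match'), "\n";
```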

Ilya



Mon, 30 Oct 2000 03:00:00 GMT  


| > For overlapping matches, do this:
| >
| >     my $string = 'aaaaaabbb,,,xyfooffff';

|
| After almost-successful //h debate (heirarchical match) on p5p I
| decided to postpone implementing (?.REX) construct (which would hide
| parens/backrefs from the enclosing RE, and would start counting groups
| from 1).  With such a construct, this might have been optimized to
|

|
| and it might have been beautified to
|
|     $triple = qr/(?.(.)\1{2})/;
|

|
| Note that I would be a little bit wary to use //g with zero-length
| expression - unless necessary.  Though it looks hard to avoid it in
| the above construct.  This
|

...Eh, would someone translate these magic regexps to something
understandable? They are way out of my league. Thank you.

jari



Sat, 04 Nov 2000 03:00:00 GMT  
 