Regular Expressions/Pattern Matching/Unordered pattern 
Author Message
 Regular Expressions/Pattern Matching/Unordered pattern

I've been trying to find a regular expression or something to do some
pattern matching.

Basically I have a group of letters say "dtseupx".  Now I want to
apply words to this group of letters and see if it's possible to spell
that word with that group of letters.

So first I would check like "setup"  which is valid but "setups" is in
valid because there's only 1 "s" in the group of letters.  Also, "sex"
is valid but "axe" is invalid.

Is there some sort of regular expression to do this or would I have to
order the list of letters alphabetically first?  Has anybody come
across this before?

Cristov



Sun, 12 Dec 2004 16:55:55 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern


Quote:
> I've been trying to find a regular expression or something to do some
> pattern matching.

> Basically I have a group of letters say "dtseupx".  Now I want to
> apply words to this group of letters and see if it's possible to spell
> that word with that group of letters.

> So first I would check like "setup"  which is valid but "setups" is in
> valid because there's only 1 "s" in the group of letters.  Also, "sex"
> is valid but "axe" is invalid.

> Is there some sort of regular expression to do this or would I have to
> order the list of letters alphabetically first?  Has anybody come
> across this before?

Here's a quick solution. I am not aware of a pure-regex solution, but I am
no re guru. "Quick" as in quickly written, not fast not efficient.

#!/usr/bin/perl
my $chars = 'sadletjluaskpdjlekjsasd';

my %chars;
foreach ( split //, $chars ) {
   $chars{$_}++;

Quote:
}



   my %copy = %chars;


   print "$word can "
         . ( ( grep { $_ < 0 } values %copy ) ? 'not ' : '' )
         . "be made of the characters $chars.\n";

Quote:
}

Steffen






Sun, 12 Dec 2004 17:12:20 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:

>Basically I have a group of letters say "dtseupx".  Now I want to
>apply words to this group of letters and see if it's possible to spell
>that word with that group of letters.

>So first I would check like "setup"  which is valid but "setups" is in
>valid because there's only 1 "s" in the group of letters.  Also, "sex"
>is valid but "axe" is invalid.

>Is there some sort of regular expression to do this or would I have to
>order the list of letters alphabetically first?  Has anybody come
>across this before?

Ooh, there's a thought. Sort your letters of both words alphabetically.
Next, try to match the larger word with the smaller word, with .*
inserted between the letters.  So:

        $pool = join '', sort split //, 'dtseupx';
        $word = 'setup';
        $pattern = join '.*', sort split //, $word;
        if($pool =~ /$pattern/) {
            print "We've got a winner!\n";
        } else {
            print "Better luck next time.\n";
        }

Now, that works. I wouldn't think it's impossible that it would
occasionally take a long time, because of a lot of backtracking in the
search, thanks to the many occurrences of ".*". Now, perhaps the "do not
backtrack here" assertion might help, I don't know. Perhaps somebody
with a better understanding of those, could jump in. (See /(?>PATTERN)/
in perlre.)

Another approach would be to use hashes. You make a pool of available
letters, with a count for each. Now, for all letters in the word,
subtract one from its hash value. It fails as soon as you find one
negative result. Like this:

        $pool = 'dtseupx';
        $word = 'setup';
        foreach (split //, $pool) {
            $present{$_}++;
        }
        my $fail;
        foreach (split //, $word) {
            if(--$present{$_} < 0) {
                $fail = 1;
                last;
            }
        }
        unless($fail) {
            print "We've got a winner!\n";
        } else {
            print "Better luck next time.\n";
        }

--
        Bart.



Sun, 12 Dec 2004 17:15:49 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern


[...]

Quote:
> $pool = 'dtseupx';
> $word = 'setup';
> foreach (split //, $pool) {
>     $present{$_}++;
> }
> my $fail;
> foreach (split //, $word) {
>     if(--$present{$_} < 0) {
>         $fail = 1;
>         last;
>     }

Maybe add this line here to speed things up:
last if $fail;

Quote:
> }
> unless($fail) {
>     print "We've got a winner!\n";
> } else {
>     print "Better luck next time.\n";
> }

Steffen





Sun, 12 Dec 2004 17:27:16 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:

>>     if(--$present{$_} < 0) {
>>         $fail = 1;
>>         last;
>>     }

>Maybe add this line here to speed things up:
>last if $fail;

Won't help. last is executed anyway, as soon as fail is set.

You could simplify it to

        $fail = --$present{$_} < 0 and last;

--
        Bart.



Sun, 12 Dec 2004 17:29:01 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern


Quote:

> >>     if(--$present{$_} < 0) {
> >>         $fail = 1;
> >>         last;
> >>     }

> >Maybe add this line here to speed things up:
> >last if $fail;

> Won't help. last is executed anyway, as soon as fail is set.

Yes, of course. I somehow misread the inner if block to be a loop out of
which the last would jump. I'll shut down my brain for an overhaul now and
go play squash. :)

Steffen





Sun, 12 Dec 2004 17:46:33 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:

> Basically I have a group of letters say "dtseupx".  Now I want to
> apply words to this group of letters and see if it's possible to spell
> that word with that group of letters.
> So first I would check like "setup"  which is valid but "setups" is in
> valid because there's only 1 "s" in the group of letters.  Also, "sex"
> is valid but "axe" is invalid.

This can be easily accomplished with a simple substitution
operator and Perl's length () function.

Godzilla!
--
 My primary web site enjoys an average twelve-thousand hits per month.
 How popular is your web site?

TEST SCRIPT:
____________

#!perl

print "Content-type: text/plain\n\n";

$input = "setup";
$test = $input;
&Compare;

$input = "setups";
$test = $input;
&Compare;

$input = "sex";
$test = $input;
&Compare;

$input = "axe";
$test = $input;
&Compare;

sub Compare
 {
  local ($") = "";

   {
    $test =~ s/$_//;
    if (length ($test) == 0)

   }
  if (length ($test) > 0)

 }

PRINTED RESULTS:
________________

setup can be spelled with dtseupx

setups cannot be spelled with dtseupx

sex can be spelled with dtseupx

axe cannot be spelled with dtseupx



Sun, 12 Dec 2004 18:00:17 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern
Chris wrote at Wed, 26 Jun 2002 17:55:55 +0200:

Quote:
> I've been trying to find a regular expression or something to do some pattern matching.

> Basically I have a group of letters say "dtseupx".  Now I want to apply words to this group of
> letters and see if it's possible to spell that word with that group of letters.

> So first I would check like "setup"  which is valid but "setups" is in valid because there's only
> 1 "s" in the group of letters.  Also, "sex" is valid but "axe" is invalid.

> Is there some sort of regular expression to do this or would I have to order the list of letters
> alphabetically first?  Has anybody come across this before?

Year, in fact there's a one liner:

Let's assume that

my $letters = "dtseupx";
my $word    = "setups";

Then
$word =~ s/$_// for split(//, $letters);

removes as much (valid) letters from word.
So now $word is empty iff it had been an valid word.
E.g.
print $word ? "In-Valid" : "Valid";

Greetings,
Janek



Sun, 12 Dec 2004 19:13:36 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:

> I've been trying to find a regular expression or something to do some
> pattern matching.

> Basically I have a group of letters say "dtseupx".  Now I want to
> apply words to this group of letters and see if it's possible to spell
> that word with that group of letters.

> So first I would check like "setup"  which is valid but "setups" is
> invalid because there's only 1 "s" in the group of letters.  Also,
> "sex" is valid but "axe" is invalid.

for my $str ( qw( setup setups sex axe ) ) {
   print "$str is ";
   print "in" if $str =~ tr/dtseupx//c;
   print "valid\n";

Quote:
}
> Is there some sort of regular expression to do this or would I have to
> order the list of letters alphabetically first?  Has anybody come
> across this before?

> Cristov

--

pack 'u', pack 'H*', 'ab5cf4021bafd28972030972b00a218eb9720000';


Sun, 12 Dec 2004 22:10:58 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

(snipped)

Quote:
> for my $str ( qw( setup setups sex axe ) ) {
>    print "$str is ";
>    print "in" if $str =~ tr/dtseupx//c;
>    print "valid\n";
> }

You would be prudent to test your code before posting.

Godzilla!
--
 My primary web site enjoys an average twelve-thousand hits per month.
 How popular is your web site?



Sun, 12 Dec 2004 21:32:17 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:

>> So first I would check like "setup"  which is valid but "setups" is
>> invalid because there's only 1 "s" in the group of letters.  Also,
>> "sex" is valid but "axe" is invalid.

>for my $str ( qw( setup setups sex axe ) ) {
>   print "$str is ";
>   print "in" if $str =~ tr/dtseupx//c;
>   print "valid\n";
>}

First of all, on 5.6.1, this prints:

        Modification of a read-only value attempted

in the tr///, of course. I thought this got fixed.

Second, you're doing something wrong:

        setup is valid
        setups is valid
        sex is valid
        axe is invalid

The second one is wrong. You're not counting instances of the same
letter.

--
        Bart.



Sun, 12 Dec 2004 21:32:40 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:


> >> So first I would check like "setup"  which is valid but "setups" is
> >> invalid because there's only 1 "s" in the group of letters.  Also,
> >> "sex" is valid but "axe" is invalid.

> >for my $str ( qw( setup setups sex axe ) ) {
> >   print "$str is ";
> >   print "in" if $str =~ tr/dtseupx//c;
> >   print "valid\n";
> >}

> First of all, on 5.6.1, this prints:

>         Modification of a read-only value attempted

> in the tr///, of course. I thought this got fixed.

It did, in 5.8.0.

Quote:
> Second, you're doing something wrong:

>         setup is valid
>         setups is valid
>         sex is valid
>         axe is invalid

> The second one is wrong. You're not counting instances of the same
> letter.

Oops, you're right.

Hmm.

for my $string ( qw( setup setups sex axe ) ) {
   my $scopy = $string;
   # First, check with tr///, since that's fast.
   print("$string is invalid\n"), next
      if $scopy =~ tr/dtseupx//c;

   $scopy =~ s/\Q$_// for qw[d t s e u p x];
   print("$string is invalid\n"), next
      if length $scopy;
   print("$string is valid\n");

Quote:
}

[still untested, though]

The second part might be done better with a hash, as others have
suggested.

--

pack 'u', pack 'H*', 'ab5cf4021bafd28972030972b00a218eb9720000';



Sun, 12 Dec 2004 23:22:40 GMT  
 Regular Expressions/Pattern Matching/Unordered pattern

Quote:

> Ooh, there's a thought. Sort your letters of both words alphabetically.

An idea along similar lines:

#!/usr/bin/perl -w
use strict;

my $letters_pattern = '^' . join('?', sort split //, 'dtseupx') .
"?\$";

foreach my $word (<DATA>)
{
    chomp($word);
    my $sorted_word = join '', sort split //, $word;
    print "$word: ", ($sorted_word =~ /$letters_pattern/)? "yes\n":
"no\n";

Quote:
}

__DATA__
setup
setups
axe
sex
miscellaneous

This version sorts the allowed letters, and makes each optional in the
regex, i.e. the matching string must consist of some subset of the
allowed set. Of course the input word must be sorted also.

Quote:
> Now, that works. I wouldn't think it's impossible that it would
> occasionally take a long time, because of a lot of backtracking in the
> search, thanks to the many occurrences of ".*".

There's still a bit of backtracking possible in this version, but it's
greatly restricted since there are only "?"s, and no ".*"s.

It would be possible to generate a regex for an allowed set that did
not require sorting of the input, but the regex would still need to be
generated beforehand, so it would be of marginal usefulness.

Two approaches to this are:

. recursively make a regex that is the union of regexes starting with
each letter and continuing with a regex that takes the same form for
the remaining letters (optionally). Basically it would describe all
languages consisting of all permutations of any number of the allowed
letters

. Using lookahead assertions that each allowed letter appears in the
string no more times than in the allowed list.

#!/usr/bin/perl -w
use strict;

my $allowed = 'dtseuspx';
my $letters_pattern = "^(?!.*[^$allowed])";
foreach my $letter (split //, $allowed)
{
    my $count = ($allowed =~ s/$letter//g) + 1;
    $letters_pattern .= "(?!(?:.*$letter){$count})" if ($count > 1);

Quote:
}

print "PATTERN: $letters_pattern\n";

foreach my $word (<DATA>)
{
    chomp($word);
    print "$word: ", ($word =~ /$letters_pattern/)? "yes\n": "no\n";

Quote:
}

__DATA__
setup
setups
setupss
axe
sex
miscellaneous

This constructs a regex that forbids disallowed letters, and adds an
assertion for each allowed letter that it does not appear too many
times.

I guess this might be useful, in that any string can then simply be
matched against the (compiled) pattern at a later stage.

Note that these techniques will work with letters, but any regex
metachars would need special handling.

Regards,

Ian



Fri, 17 Dec 2004 10:22:52 GMT  
 
 [ 13 post ] 

 Relevant Pages 

1. Regular Expressions: Matching words contained within character patterns?

2. PERLFUNC: m// - match a string with a regular expression pattern

3. PERLFUNC: m// - match a string with a regular expression pattern

4. Pattern matching and regular expression

5. pattern as a sentence in pattern matching

6. help with pattern match (repeated patterns)

7. Search the pattern with Perl Regular Expression

8. regular expression - html tag-esque pattern

9. Regular Expressions: Excluding Pattern

10. Regular Pattern Matching

11. Regular Matching pattern help

12. loops, conditional expressions, and pattern match variables

 

 
Powered by phpBB® Forum Software