Need an expression to strip non-alphanumeric except space 

Quote:

>I'm not an re guru.  Can someone give me a line to strip out
>non-alphanumeric except space from a line.  I tried \W but it takes
>out the spaces i.e.;

>$string='abcd efg+-_)';
>$string=~s/\W||_//gi;
>print "$string\n";
>abcdefg

>Can someone give me an re that'll result in:
>abcd efg

Here's one way, by negating a character class:

$string =~ s/[^a-z\d\s]//gi;
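
One thing worth knowing about that class: \s is any whitespace, not just the space character, so tabs and newlines survive too. A minimal sketch, assuming a made-up test string with a tab and a newline in it ($string and $clean are just illustration names):

```perl
use strict;
use warnings;

# Hypothetical input: the original example plus a tab, an 'x' and a newline.
my $string = "abcd efg+-_)\tx\n";

# Copy first, then strip everything that is not a letter, digit or whitespace.
(my $clean = $string) =~ s/[^a-z\d\s]//gi;

# The tab and the newline are kept because \s covers them.
print $clean;
```

If you only want to keep the literal space, put a plain space in the class instead of \s.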

--
-- Steve __



Mon, 07 Jun 2004 00:19:24 GMT  

Quote:
> I'm not an re guru.  Can someone give me a line to strip out
> non-alphanumeric except space from a line.  I tried \W but it takes
> out the spaces i.e.;

> $string='abcd efg+-_)';
> $string=~s/\W||_//gi;

               ^
               ^
That's an extra alternation symbol.  Plus, you are substituting one
character at a time instead of strings of characters.  The real
problem is that \W *includes* the space character, so it gets
eliminated.  

Quote:
> print "$string\n";
> abcdefg

> Can someone give me an re that'll result in:
> abcd efg

Re-phase your question and it will appear obvious:

I want to eliminate all but alphanumeric and space.  That asks for a
negated character class:

  my $string = 'abcd efg+-_)';
  $string =~ s/[^A-Za-z0-9 ]+//g;
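
To see what the + buys you, here is a rough sketch that counts the substitutions with and without it. It relies on s/// returning the number of substitutions it made; the variable names are only for illustration:

```perl
use strict;
use warnings;

my $one_at_a_time = 'abcd efg+-_)';
my $by_runs       = 'abcd efg+-_)';

# Without +, each unwanted character is a separate substitution.
my $n1 = ($one_at_a_time =~ s/[^A-Za-z0-9 ]//g);

# With +, the whole run "+-_)" is removed in a single substitution.
my $n2 = ($by_runs =~ s/[^A-Za-z0-9 ]+//g);

print "$one_at_a_time ($n1 substitutions)\n";
print "$by_runs ($n2 substitution)\n";
```

Both strings end up the same; the second form just does less work on long runs of junk.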

--
Garry Williams



Mon, 07 Jun 2004 01:31:45 GMT  

Quote:

> I'm not an re guru.  Can someone give me a line to strip out
> non-alphanumeric except space from a line.  I tried \W but it takes
> out the spaces i.e.;

> $string='abcd efg+-_)';
> $string=~s/\W||_//gi;
> print "$string\n";
> abcdefg

> Can someone give me an re that'll result in:
> abcd efg

You don't need a regular expression to do this:

$string =~ tr/a-zA-Z0-9 //cd;
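
In case the two modifiers are unfamiliar: c complements the search list (so it means "everything except these characters") and d deletes whatever is found, since the replacement list is empty. tr also returns how many characters it matched. A small sketch, with $deleted as an illustration name only:

```perl
use strict;
use warnings;

my $string = 'abcd efg+-_)';

# c: complement the list, d: delete matched characters.
# The return value is the number of characters matched (here, deleted).
my $deleted = ($string =~ tr/a-zA-Z0-9 //cd);

print "$string\n";
print "deleted $deleted characters\n";
```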

John
--
use Perl;
program
fulfillment



Mon, 07 Jun 2004 02:30:35 GMT  

Quote:


> > I'm not an re guru.  Can someone give me a line to strip out
> > non-alphanumeric except space from a line.  I tried \W but it takes
> > out the spaces i.e.;

> > $string='abcd efg+-_)';
> > $string=~s/\W||_//gi;
>                ^
>                ^
> That's an extra alternation symbol.  Plus, you are substituting one
> character at a time instead of strings of characters.  The real
> problem is that \W *includes* the space character, so it gets
> eliminated.  

> > print "$string\n";
> > abcdefg

> > Can someone give me an re that'll result in:
> > abcd efg

> Re-phase your question and it will appear obvious:

Re-phase or rephrase?

Quote:

> I want to eliminate all but alphanumeric and space.  That asks for a
> negated character class:

>   my $string = 'abcd efg+-_)';
>   $string =~ s/[^A-Za-z0-9 ]+//g;

I don't see what rephrasing my question has to do with anything.
Maybe I'm not using the term "regular expression" properly -- like I
said, not a guru (at least not in Perl)

Anyway, I did come up with the solution (identical to what you posted)
and tried to remove my original post from the newsgroup, but
apparently it was too late.

Thanks for the reply, though.



Mon, 07 Jun 2004 21:26:20 GMT  

Quote:



> > > I'm not an re guru.  Can someone give me a line to strip out
> > > non-alphanumeric except space from a line.  I tried \W but it takes
> > > out the spaces i.e.;

> > > $string='abcd efg+-_)';
> > > $string=~s/\W||_//gi;

> > Re-phase your question and it will appear obvious:

> > I want to eliminate all but alphanumeric and space.  That asks for a
> > negated character class:

> >   my $string = 'abcd efg+-_)';
> >   $string =~ s/[^A-Za-z0-9 ]+//g;

> I don't see what rephrasing my question has to do with anything.

Because your original regex said "remove everything that *matches* foo"
whereas Garry's says "remove everything that *doesn't match* foo".  Your
English phrasing of the question was actually closer to the mark than
your regex.  It sometimes helps to learn the logic of regexen if you
phrase the question in English the best way you can and then make the
regex follow the same logic.
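
A quick side-by-side of the two kinds of logic, as a sketch only (the variable names are mine). The first removes whatever matches \W, the second removes whatever is not in the list of things to keep:

```perl
use strict;
use warnings;

my $string = 'abcd efg+-_)';

# "Remove everything that matches \W": the space goes, and the
# underscore stays, because _ counts as a word character.
(my $matches = $string) =~ s/\W//g;

# "Remove everything that is not a letter, digit or space":
# this follows the English wording of the question directly.
(my $negated = $string) =~ s/[^A-Za-z0-9 ]//g;

print "[$matches]\n";
print "[$negated]\n";
```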

--
Jeff



Mon, 07 Jun 2004 21:33:33 GMT  

Quote:


>> > I'm not an re guru.  Can someone give me a line to strip out
>> > non-alphanumeric except space from a line.  I tried \W but it takes
>> > out the spaces i.e.;

>> > $string='abcd efg+-_)';
>> > $string=~s/\W||_//gi;

[snip]

Quote:
>> Re-phase your question and it will appear obvious:

> Re-phase or rephrase?

:-)

Quote:
>> I want to eliminate all but alphanumeric and space.  That asks for a
>> negated character class:

>>   my $string = 'abcd efg+-_)';
>>   $string =~ s/[^A-Za-z0-9 ]+//g;

> I don't see what rephrasing my question has to do with anything.

Well, I thought "give me a line to strip out non-alphanumeric except
space from a line" was ambiguous.

Quote:
> Maybe I'm not using the term "regular expression" properly

No, I was just confused and thought rephrasing the question would
help.  

Quote:
> -- like I
> said, not a guru (at least not in Perl)

So what kind of a guru are you?   :-)

--
Garry Williams



Mon, 07 Jun 2004 21:39:18 GMT  
 