regexp matches unicode char in string 

Hello,

I have a web form where the user enters data.

The code page of the web page is windows-1257, and the user can enter an
international character in the form's text box. The character that causes
trouble in my particular case is 0x101 (charCodeAt returns 257).

The problem is that a string containing this character, for example
myStr = "\u0101bcde", validates against the regex /\w+/.

How can I strictly check for these international characters, so that I can
generate an error message asking the user to change the string? I would
prefer a regex to a loop that checks char codes.

thanks,

-- Pavils



Fri, 14 Jan 2005 22:45:12 GMT  
Your /\w+/ pattern matches the 'bcde' portion of the string, so the test
succeeds even though the first character is not a word character. If you
don't want any characters except [A-Za-z0-9_] in your string, you have to
either anchor the pattern to the beginning and end of the string (/^\w+$/)
and accept the string on a match, or test for any other character (/\W/)
and reject the string on a match.

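  // Compare the unanchored, anchored, and negated patterns on both strings.
  var badStr   = "\u0101bcde";
  var goodStr  = "abcde";
  var patterns = ['\\w+', '^\\w+$', '\\W'];

  // Index loop: for...in on an array can also visit inherited properties.
  for (var i = 0; i < patterns.length; i++) {
    var regex = new RegExp(patterns[i]);
    // Columns: pattern, result for badStr, result for goodStr
    WScript.Echo(
      regex.source + "\t" +
      regex.test(badStr) + "\t" +
      regex.test(goodStr)
    );
  }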
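
For the form itself, the two approaches described above could be wrapped up roughly as follows. This is only a sketch: the function names isPlainWord and findRejectedChars are made up for illustration and do not come from this thread, and it assumes the regex is used without the i or u flags, so that \w means exactly [A-Za-z0-9_].

```javascript
// Hypothetical helpers, not part of the original thread.

// Accept the string only if every character is in [A-Za-z0-9_].
// The ^ and $ anchors force the match to cover the whole string.
function isPlainWord(str) {
  return /^\w+$/.test(str);
}

// Collect the distinct characters that \w does not accept,
// so the error message can tell the user what to change.
function findRejectedChars(str) {
  var matches = str.match(/\W/g) || [];
  var seen = {};
  var result = [];
  for (var i = 0; i < matches.length; i++) {
    if (!seen[matches[i]]) {
      seen[matches[i]] = true;
      result.push(matches[i]);
    }
  }
  return result;
}
```

A form handler might call isPlainWord on the text box value first and, only when it returns false, use findRejectedChars to list the offending characters in the message. Note that an empty string is also rejected, because + requires at least one character.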

--
You cannot shake hands with a clenched fist. -Indira Gandhi

=-=-=
Steve
-=-=-





Sat, 15 Jan 2005 01:10:41 GMT  
Shame on me, Steve,

You are right, of course! How could I forget to add "^" and "$"...

Thanks for enlightening me!

-- Pavils





Sat, 15 Jan 2005 16:23:29 GMT  
 