regexp matches unicode char in string
Your /\w+/ pattern is not anchored, so it succeeds by matching just the
'bcde' portion of the string. If you don't want any characters except
[A-Za-z0-9_] in your string, you have two options: anchor the pattern to
the beginning and end of the string (/^\w+$/) and accept the string when it
matches, or test for any other character (/\W/) and reject the string when
it matches.
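var badStr = "\u0101bcde";   // starts with U+0101, which \w does not match
var goodStr = "abcde";
var patterns = ['\\w+', '^\\w+$', '\\W'];
// Print each pattern followed by its test result for badStr and goodStr.
for (var i = 0; i < patterns.length; i++) {
    var regex = new RegExp(patterns[i]);
    WScript.Echo(
        regex.source + "\t" +
        regex.test(badStr) + "\t" +
        regex.test(goodStr)
    );
}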
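
For the form check itself, either approach could be wrapped in a small helper. The following is only a sketch: the function names isPlainWord and hasNoNonWordChars are made up for illustration and do not come from your code. It assumes the default meaning of \w, which is [A-Za-z0-9_].

```javascript
// Hypothetical helpers; the names are illustrative, not from the original post.

// Accept approach: true only when every character is in [A-Za-z0-9_].
// The anchors force the match to cover the whole string, so one stray
// character anywhere makes the test fail. An empty string also fails.
function isPlainWord(str) {
    return /^\w+$/.test(str);
}

// Reject approach: true when no character outside [A-Za-z0-9_] is found.
// Unlike the anchored form, this lets an empty string through.
function hasNoNonWordChars(str) {
    return !/\W/.test(str);
}
```

The two differ only for the empty string, so pick the anchored form if an empty text box should also produce the error message.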
--
You cannot shake hands with a clenched fist. -Indira Gandhi
=-=-=
Steve
-=-=-
Quote:
> Hallo,
> I have a web form where user inputs data.
> The codepage of webpage is windows-1257, and user can enter an international
> character in the form's text box. The character what makes trouble in my
> particular case, is 0x101 ( charCodeAt returns 257).
> The problem is, that string containing this character, for example,
> myStr = "\u0101bcde" validates against regex /\w+/
> How can I strictly check against these international chars, so I can
> generate an error message asking user to change the string? I would like to
> have a regex rather than loop checking charCodes.