Internationalization/I18n: Char.IsDigit() matches more than just "0" through "9"
Raymond Chen just gave me a "duh" moment by pointing out something that is obvious only once you think about it. Char.IsDigit() doesn't mean 'IsZeroToNineInEnglish', it means 'is a decimal digit, 0 to 9, in any script', and darn it if there aren't other ways (other than 0,1,2,3,4,5,6,7,8,9) to write those digits! :)
So let's run an experiment.
    class Program {
      public static void Main(string[] args) {
        System.Console.WriteLine(
          System.Text.RegularExpressions.Regex.Match(
            "\x0661\x0662\x0663", // "١٢٣"
            "^\\d+$").Success);
        System.Console.WriteLine(
          System.Char.IsDigit('\x0661'));
      }
    }

The characters in the string are Arabic digits, but they are still digits, as evidenced by the program output:

    True
    True

Uh-oh. Do you have this bug in your parameter validation? (More examples..)
If you use a pattern like @"^\d$" to validate that you receive only digits, and then later use System.Int32.Parse() to parse it, then I can hand you some Arabic digits and sit back and watch the fireworks. The Arabic digits will pass your validation expression, but when you get around to using it, boom, you throw a System.FormatException and die. [Raymond Chen]
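For what it's worth, here is a minimal sketch of a few ways to keep the validation and the parsing in agreement. None of this comes from Raymond's post: the DigitValidation class and its method names are hypothetical, made up for illustration. The sketch assumes you only want to accept the ASCII digits 0 through 9 and relies on three things from the framework: the explicit [0-9] character class, RegexOptions.ECMAScript (which restricts \d to [0-9]), and Int32.TryParse with NumberStyles.None.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original post.
static class DigitValidation
{
    // [0-9] matches only the ASCII digits, unlike \d, which by default
    // matches any Unicode decimal digit in .NET.
    static readonly Regex AsciiDigits = new Regex("^[0-9]+$");

    // RegexOptions.ECMAScript makes \d equivalent to [0-9].
    static readonly Regex EcmaDigits =
        new Regex(@"^\d+$", RegexOptions.ECMAScript);

    public static bool IsAsciiDigits(string s)
    {
        return AsciiDigits.IsMatch(s);
    }

    public static bool IsEcmaDigits(string s)
    {
        return EcmaDigits.IsMatch(s);
    }

    // Let the parser be the validator: NumberStyles.None permits digits
    // only (no sign, no whitespace), and TryParse reports failure by
    // returning false instead of throwing.
    public static bool TryParseDigits(string s, out int value)
    {
        return int.TryParse(s, NumberStyles.None,
                            CultureInfo.InvariantCulture, out value);
    }
}

class Demo
{
    static void Main()
    {
        string arabic = "\u0661\u0662\u0663"; // "١٢٣"
        int value;

        Console.WriteLine(Regex.IsMatch(arabic, @"^\d+$"));         // True
        Console.WriteLine(DigitValidation.IsAsciiDigits(arabic));   // False
        Console.WriteLine(DigitValidation.IsEcmaDigits(arabic));    // False
        Console.WriteLine(DigitValidation.TryParseDigits(arabic, out value)); // False
        Console.WriteLine(DigitValidation.TryParseDigits("123", out value));  // True
        Console.WriteLine(value);                                   // 123
    }
}
```

Of the three, TryParseDigits is probably the least surprising, since the same routine that validates the input also produces the number, so the two can't disagree. The ECMAScript option is also the closest match to how \d behaves in JavaScript regular expressions, where it only ever means [0-9].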
Arabic speakers (مرحبًا, كيف حالك ؟ and forgive me, I haven't studied Arabic since college): how do you handle numeric validation in JavaScript AND guarantee that the JavaScript you use on the client side is semantically equivalent to the server-side code?
Either way, my friends, read, grok, and be enlightened. Muy interesante.
About Scott
Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. He is a failed stand-up comic, a cornrower, and a book author.
Would there be any remote possibility of adding a feature that would allow posters to delete their own stupid comments? Or edit them? Some sort of method for self-moderation...
Just a thought...
Yes, you have to also guarantee that all the users use none other than IE. :-)
If we all spoke English and used ASCII, we wouldn't have this problem.
Feet
Inches
$0.02
Tower of Babel