We didn't split up, because we were in Japan (aka They have an opening that you might fit)

by Michael S. Kaplan, published on 2007/11/24 03:01 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2007/11/24/6494996.aspx


No, this is not a post about a past relationship! 

The question that came up yesterday was:

Hi.

The following lines of WSF script generate a script error.  The character in the string is Katakana letter Pe (U+30DA):

    formatline = "first=ペ"
    arr1 = Split(formatline, "=", 2, 1)

The error that is returned:

    test.wsf(14, 2) Microsoft VBScript runtime error: Invalid procedure call or argument: 'Split'

The failing case occurs when standards and formats on my Vista OS is set to Korean.

I suspect that this function should have used binary comparison where it does not fail:

    arr1 = Split(formatline, "=", 2, 0)

Both function calls work fine when user locale is set to Japanese, matching the Language of the character.

Why is the call with textual comparison treated as illegal?  What are the preferences between textual and binary comparison?

Thanks.

Now the person who asked the question is someone I have worked with before. I like his instincts.

Anyway the former script dev pointed out the basic things to consider here:

A possible source of the problem is the encoding of the script file. All bets are off if your WSF file is not Unicode.

Assuming that your script file IS Unicode:

Text compare uses current locale information for special text processing, so it's not too surprising that anomalies arise when the text and the locale don't match up.  Using SetLocale("ja") may make it work as expected. Unfortunately, it looks like the code has some inconsistencies, using the system locale for certain processing and the current engine locale for others, so setting the locale may not make it work.

It looks to me that this should work without errors, though. A bug should be filed.
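For anyone who wants to try the SetLocale suggestion, here is a minimal sketch rather than a verified fix. It makes a few assumptions of my own: the numeric LCID 1041 (Japanese) stands in for the "ja" string mentioned above, the character is built with ChrW so the encoding of the script file does not matter, and a binary compare is used as a fallback. As noted above, changing the locale may not actually avoid the error.

```vbscript
Option Explicit

Dim formatLine, parts, previousLocale

' ChrW(&H30DA) is KATAKANA LETTER PE; building it this way avoids
' depending on the encoding of the script file.
formatLine = "first=" & ChrW(&H30DA)

' SetLocale returns the locale that was in effect, so it can be restored.
' 1041 is the Japanese LCID (an assumption standing in for "ja").
previousLocale = SetLocale(1041)

On Error Resume Next
parts = Split(formatLine, "=", 2, vbTextCompare)
If Err.Number <> 0 Then
    ' The locale change did not help; report it and fall back.
    WScript.Echo "Text compare still failed: " & Err.Description
    Err.Clear
    parts = Split(formatLine, "=", 2, vbBinaryCompare)
End If
On Error GoTo 0

' Put the engine locale back the way we found it.
SetLocale previousLocale

WScript.Echo parts(0)
```

Run it with cscript.exe so the WScript.Echo output goes to the console rather than to message boxes.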

Both of them are basically right here!

There are two important issues in this case, both of which he pointed out in his message:

  1. The fact that this call ever fails is a bug -- it should not be happening;
  2. The call should be done with a binary comparison since there is no "linguistic" sense of the equals sign that would make sense here for comparison purposes.
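The second point can be sketched as follows. This is only an illustration under a couple of assumptions: the variable names are mine, and the string is built with ChrW instead of a literal so that the script file's encoding is taken out of the picture. The named constant vbBinaryCompare has the value 0, the same value passed in the workaround from the original question.

```vbscript
Option Explicit

Dim formatLine, parts

' "first=" followed by KATAKANA LETTER PE (U+30DA).
formatLine = "first=" & ChrW(&H30DA)

' A binary compare looks for the delimiter without applying any
' locale-specific rules, which is all that is needed to find "=".
parts = Split(formatLine, "=", 2, vbBinaryCompare)

WScript.Echo UBound(parts)             ' 1 (two elements)
WScript.Echo parts(0)                  ' first
WScript.Echo Hex(AscW(parts(1)))       ' 30DA
```

Echoing the code point with Hex(AscW(...)) rather than the character itself keeps the output readable even in a console that cannot display Katakana.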

So both of them are batting 1000. :-)

It is interesting how often valid globalization bugs are found in the context of usage that is essentially incorrect.

The interesting challenge is to deliver both of these messages without coming across as the jerk who treats a bug report as an opportunity to slam the person doing the reporting, if you know what I mean....

Not so much in this case (since it was someone else's script bug and he was just the person tracking down the failure found with a different language configuration on a test machine), but it is a pretty common scenario, probably more common with people reporting an error in their own code. And it is an important skill to master.

This is a relevant point for you, the reader, as there is yet another opportunity for a job in my former team that deals with these kinds of issues!

They are looking for another tester -- an SDET, to be precise.

Kieran mentioned the position in her recent post entitled Shameless self-promotion, and now that I have somewhere between 7 and 8.5 people (depending on how you count!) who have come to work for MS after reading this blog I have to consider the non-zero percent probability that the person who applies for and gets this job might be a regular reader. And this can be a very valuable skill to have for this position in particular. :-)

Both Kieran's post and the job description it links to have hints about future functionality that are much broader than anything I would be posting here (it's their team, and that gives them a lot more leeway on such matters!), for people who like to watch that kind of thing. Some exciting stuff....

If you end up interviewing be sure to drop me a line, maybe I could get you a Limonata or something. :-)

 

This post brought to you by = and ペ (U+003d and U+30da, aka EQUALS SIGN and KATAKANA LETTER PE)


MSDNArchive on 28 Nov 2007 3:57 AM:

I am the (almost former) script dev referred to in the post. The discussion on this issue was pleasing and productive on a number of levels:

The poster of the problem gave a minimal and complete working repro for the issue. I usually have to probe for that. He also gives the workaround - valuable for the discussion list. Using binary compare is going to be quite a bit faster, in addition to avoiding the script bug.

There is a third good point the poster had: that documentation on global text support in the script engine/runtime is essentially nonexistent.

Unlike some devs, I am pleased to get bug reports in what I own, especially as well stated as this one was. It's easier to take, because I didn't write the code in question. It is a real issue in vbscript, for which I posted Windows 7 bug 125295. I didn't debug in detail, since I don't own script anymore, except as backup for the new team that does. I winced when I looked at the underlying code to see if I could spot the problem. Smoke and junk blew by when I popped the hood, and there were some obvious loose wires. This is an old codebase, and shows it in places.

Lastly, Michael gave me a nice public compliment. At least, I took it that way. And he did it again here in this post (I'm not a jerk after all :-). Michael is demonstrating another skill or two worth mastering.

We've had occasional interactions over the years in mail, but I can't remember if we've ever met in the physical world. It's always been a pleasure working a problem with him. We should and will meet in person one of these days.

See you 'round, Michael.

-- Paul

Michael S. Kaplan on 28 Nov 2007 10:37 AM:

Definitely intended as compliment(s), Paul.

In a former lifetime (for both of us) I have a vague recollection of the Uniscribe support within VS? :-)

I agree with the "non-existent documentation" issue, though that particular issue has always seemed "by design" of somebody unfortunately -- that and its cousins the "misleading documentation" and "incorrect documentation". If the docs were present and right and clear more often, I'm sure I wouldn't feel the need to blog so much....

