I daresay it is often <= -1

by Michael S. Kaplan, published on 2006/04/11 03:06 -04:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/04/11/572943.aspx

A lot of the documentation for functions in the NLS API makes the point that for a length parameter you can either pass the length or you can pass -1 to signal that the string is NULL terminated.

(I have mentioned this previously.)

As it turns out, most of the functions will treat any negative value this way, and it has been that way since the functions were first written (in many/most cases, over a decade ago).

So, let's look at them one at a time....

CompareString <= -1

CompareStringOrdinal == -1

FindNLSString == -1

FoldString <= -1

GetStringTypeEx <= -1

GetStringType[A|W] <= -1

IsNormalizedString == -1

LCMapString <= -1

MultiByteToWideChar <= -1

NormalizeString == -1

WideCharToMultiByte <= -1

Believe it or not, some of the comments specifically talk about the parameter being equal to -1 even though the check underneath the comment is for less than or equal to -1.
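A minimal sketch of what such a mismatch looks like. This is a hypothetical model in portable C, not the actual NLS source; the name `resolve_length_lenient` and the use of `wcslen` are assumptions made for illustration.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical model of the older-style length handling; not the real NLS code. */
static size_t resolve_length_lenient(const wchar_t *str, int cch)
{
    /* If cch == -1, the string is NULL terminated. */
    if (cch <= -1) {
        /* ...but the test accepts every negative value, not just -1. */
        return wcslen(str);
    }
    /* Zero or positive: the caller supplied an explicit length. */
    return (size_t)cch;
}
```

In this model, `resolve_length_lenient(L"abc", -1)` and `resolve_length_lenient(L"abc", -2)` both return 3, while a non-negative `cch` is passed through unchanged.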

Of course we can't change the behavior due to the backcompat concerns, though maybe the comments should be updated someday. Though the fact that no one has pointed it out before makes me wonder if anyone ever reads them anyway? :-)

If you look at the functions, you may even see the pattern -- it is only functions added later that have the "equal to -1" behavior. So that is something, right? We are consistent, in our own way.

In any case, it is safer to just pass -1 since some functions work that way. And it is an easier rule to remember than a big list of functions....

This post brought to you by "-" (U+002d, a.k.a. HYPHEN-MINUS)

# Phylyp on 11 Apr 2006 6:13 AM:

Warning: some wild ass guesses follow:

I think the whole <= -1 thing probably came about because someone decided that zero and positive values would be a value, and negative values would be a code/indicator. In the initial implementation, -1 was for a NULL terminated string, and other negatives reserved for future use.

Ideally, this should be done by separate parameters, one indicating if a string is NULL terminated, another parameter for the length (if not NULL terminated). This approach is verbose, and that is both a good and a bad thing!
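A rough sketch of the separate-parameter idea, assuming a hypothetical helper named `resolve_length_explicit`; no NLS function has this signature.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical signature with the indicator split out from the value. */
static size_t resolve_length_explicit(const wchar_t *str,
                                      bool null_terminated,
                                      size_t cch)
{
    /* cch is ignored when the flag says the string carries its own terminator. */
    return null_terminated ? wcslen(str) : cch;
}
```

The length can then be an unsigned type, since it no longer has to carry a negative sentinel, at the cost of one more argument at every call site.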

# Michael S. Kaplan on 11 Apr 2006 6:47 AM:

Hi Phylyp,

But we can't change the behavior once it has been created -- so if that were the case then they should all have been made to check for == -1....

(for an example of a case where we DID do something like this with the return value, take a look at NormalizeString!)

I won't disagree about the multiple param idea, other than to point out that the most commonly cited reason for people to wrap our functions is that people find them to be too complicated as they stand!

# Maurits [MSFT] on 11 Apr 2006 11:18 AM:

What do the "== -1" functions do if you pass, say, -2?

# Michael S. Kaplan on 11 Apr 2006 11:58 AM:

They fail with an ERROR_INVALID_PARAMETER....
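A minimal sketch of that stricter behavior, as a hypothetical portable model rather than the real implementation. The name `resolve_length_strict` and the convention of returning the error code directly are assumptions; the real functions report the error through the last-error mechanism. Only the numeric value 87 is taken from `ERROR_INVALID_PARAMETER` in winerror.h.

```c
#include <stddef.h>
#include <wchar.h>

/* Same numeric value as ERROR_INVALID_PARAMETER in winerror.h. */
#define MODEL_ERROR_INVALID_PARAMETER 87

/* Hypothetical model of a newer "== -1" function; not the real NLS code.
   Returns 0 on success and stores the resolved length in *out_len. */
static int resolve_length_strict(const wchar_t *str, int cch, size_t *out_len)
{
    if (cch == -1) {
        *out_len = wcslen(str);   /* -1: measure up to the terminator */
        return 0;
    }
    if (cch < 0) {
        /* -2, -3, ... are rejected instead of being treated like -1. */
        return MODEL_ERROR_INVALID_PARAMETER;
    }
    *out_len = (size_t)cch;       /* explicit length supplied by the caller */
    return 0;
}
```

In this model, passing -1 succeeds and yields the measured length, while passing -2 returns 87 and leaves `*out_len` untouched.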

# Maurits [MSFT] on 11 Apr 2006 12:02 PM:

Ah, good. Kudos for that.

As an aside, it's odd that your comments to this post have an overescaped &lt; in their subject, but ours (mine and Phylyp's) don't... must be due to the "HTML"-ized blog admin tools?

# Maurits [MSFT] on 11 Apr 2006 12:04 PM:

> overescaped &lt;

Weird... they seem to wander, and appear only intermittently. Right now I'm looking at a page where only my last comment has an &lt;, and all of the other comments have a normal <

# Michael S. Kaplan on 11 Apr 2006 2:25 PM:

Actually, I have been editing them as I see them. :-)

# Phylyp on 12 Apr 2006 9:42 AM:

Mike,
Yup, I agree that multiple params are only an 'idealistic' solution; practically it's probably better to merge them, especially in scenarios like this, where -ve lengths are meaningless.

Heh, I used to wonder how people programmed Windows before IntelliSense, IntelliTips and all those keystroke savers in VS 6.

Also, no, I wasn't suggesting changing the behaviour (UK spelling, ha!) of existing functions.

I was merely pointing out the fact that one parameter is actually for two purposes: as a flag/indicator, when -ve and as a value when +ve.

On a related note: I work in data warehousing. We often have a need to store values and/or indicators pertaining to an attribute/entity. In these cases too we often encounter a similar debate about merging values and indicators in the same column.

Since it's a database, and especially since it's a data warehouse (where space isn't a constraint), we go for the purist approach, and have a column for the indicator, another for the value. The biggest advantage of the purist approach is that it is quite self-documenting.

# Maurits [MSFT] on 12 Apr 2006 7:58 PM:

> one parameter is actually for two purposes: as a flag/indicator, when -ve and as a value when +ve

Pushing the idea of parameter reuse a bit further, LCMapString has a parameter (lpDestStr) that is a pointer to a byte array when LCMAP_SORTKEY is passed, and a pointer to a string when LCMAP_SORTKEY is not passed.

It gets a little weird when LCMAP_SORTKEY | LCMAP_BYTEREV is passed, depending on whether the number of bytes in the sort key is odd or even... see
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/nls_5s2v.asp

But what are the reserved parameters for in LCMapStringEx? Anything you can let us in on?

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/nls_LCMapStringEx.asp
...
LPNLSVERSIONINFO lpVersionInformation,
LPVOID lpReserved,
LPARAM lParam
...
lpVersionInformation
[in] Reserved, must be NULL.
lpReserved
[in] Reserved, must be NULL.
lParam
[in] Reserved, must be zero.

# Michael S. Kaplan on 12 Apr 2006 11:47 PM:

Think of those extra parameters as leaving room for future expansion. :-)
