Converting doubles to Unicode strings

by Michael S. Kaplan, published on 2006/12/13 06:45 -05:00, original URI: http://blogs.msdn.com/b/michkap/archive/2006/12/13/1274163.aspx


The question Andy asked:

Are there no wide-character methods of converting a double to a string? I see methods to convert wide strings to doubles (e.g. _wtof) but not the reverse.

He is 100% correct -- there is no CRT function (safe or otherwise) that does this. So there is conversion in one direction in _wtof/_wtof_l but not the other.

Of course, this does not tell the whole story: functions like swprintf/_swprintf_l and their more secure cousins (swprintf_s/_swprintf_s_l) can take an e or E field type character with a double as the parameter. But as Phil Lucido pointed out:

All forms of printf, way down deep, call _cfltcvt_l (or some variant) for FP conversion.  That always outputs to a char buffer, not wchar_t.  For the wide printfs, that buffer is then run, eventually, through MultiByteToWideChar.
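
The path Phil describes can be pictured with a minimal, portable sketch. It is not the CRT's actual code: `widen_ascii` and `double_to_wide` are hypothetical names, standard `snprintf` stands in for the internal `_cfltcvt_l`, and a per-character cast stands in for `MultiByteToWideChar`, on the assumption that the formatted output is plain ASCII.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: widen a narrow string one char at a time.
// Assumes every character is ASCII, which holds for the digits, sign,
// '.' and 'e' produced in the default "C" locale.
std::wstring widen_ascii(const std::string& narrow) {
    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char c : narrow) {
        wide.push_back(static_cast<wchar_t>(c));
    }
    return wide;
}

// Hypothetical helper: format into a char buffer first, then widen,
// mirroring the two-step route described above.
std::wstring double_to_wide(double value, int precision) {
    char buf[512];
    int n = std::snprintf(buf, sizeof buf, "%.*e", precision, value);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof buf) {
        return std::wstring();  // encoding error or truncated output
    }
    return widen_ascii(std::string(buf, static_cast<std::size_t>(n)));
}
```

Calling `double_to_wide(1.5, 3)` yields a wide string in exponent notation; the number of exponent digits depends on the runtime.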

It is interesting that there is no specific set of functions to do this directly (and if the CRT were ever going to support digits other than the ASCII ones, it would probably be a much bigger problem), but for now this is the best you can do.

Now the managed side has a better story here, with the explicit interface implementation of System.IConvertible.ToDouble on System.String, not to mention all of the Double.ToString overloads that exist. But I guess it is easy to point out that .NET only has methods that produce Unicode strings here, and that if you want non-Unicode strings you have to convert them. So I guess we have symmetry here. :-)

 

This post brought to you by E (U+0045, a.k.a. LATIN CAPITAL LETTER E)


Mike Dimmick on 13 Dec 2006 8:34 AM:

You don't mention _ecvt, _fcvt or _gcvt (and their secure variants) which at least avoid the overhead of parsing the format string. However, none of these have Unicode versions either.

Adam on 13 Dec 2006 8:54 AM:

"All forms of printf, way down deep, call _cfltcvt_l (or some variant) for FP conversion.  That always outputs to a char buffer, not wchar_t.  For the wide printfs, that buffer is then run, eventually, through MultiByteToWideChar."

First, YUCK! :-)

Second, that's an implementation detail. There should be no reason why that couldn't be changed and wide variants of _cfltcvt_l added to the system, and called directly from swprintf()

sprintf()/snprintf()/swprintf() are the standard functions to use for converting numbers to strings in C. That's how it's done. See also the C FAQ, question 13.1:

http://c-faq.com/lib/itoa.html

Michael S. Kaplan on 13 Dec 2006 9:05 AM:

Yes, an implementation detail -- the key point being that no version of Microsoft's VC runtime implements it? :-)

Lars Viklund on 13 Dec 2006 9:09 AM:

If you're using C++ and want a safe way to convert from double to a wide string, you can always use std::wstringstream. It resides in the <sstream> header.

std::wstringstream wss;

wss << 42.0;

std::wstring ws = wss.str();

Or if you have the Boost libraries installed, you can use Boost.Lexical_cast to do your conversion.

std::wstring ws = boost::lexical_cast<std::wstring>(42.0);

Adam on 13 Dec 2006 9:24 AM:

Sorry if I was unclear:

There should be no reason why MS could not release updates to the various VC runtimes that *do* implement wide variants of _cfltcvt_l, and include versions of sprintf()/snprintf()/swprintf() that call them.

Michael S. Kaplan on 13 Dec 2006 9:25 AM:

Certainly. And if you are using COM then VarBstrFromR8 can handle things nicely. Whether one has Boost or the std library or COM is a separate question, and each can be a good answer depending on what one is already dependent on (and whether one is using C or C++, of course!).

Heck, there is that whole swprintf idea. There are lots of ways to do it. The key here is that there is nothing built into the CRT library of functions that MS provides....

Michael S. Kaplan on 13 Dec 2006 9:26 AM:

Hi Adam -- I agree. They are not blocked from doing this.

But they don't do it at the moment. :-)

Adam on 13 Dec 2006 2:44 PM:

OK, we're agreed on that.

But how can you claim "that there is nothing built into the CRT library of functions that MS provides" that is meant for "converting a double to a string"?

Just because the current MS implementation of swprintf() happens to suck, that does not mean that it doesn't exist, or isn't a recommended way to do the conversion in /all/ C implementations.

*shrug*

Martin v. Löwis on 13 Dec 2006 5:00 PM:

What is the problem with %E calling an ASCII function inside? Conversion of a floating point number will produce output that is *always* in the range U+0020..U+007E, so there is really no semantic difference here. Also, conversion of that to Unicode will be quite efficient, as ASCII can be converted efficiently to Unicode (it's just a byte-for-byte widening).

So I fail to see a problem. The initial statement "there is no method of converting doubles to wchar_t" is simply false: there is such a method, and it works perfectly well.

Michael S. Kaplan on 13 Dec 2006 5:27 PM:

Hi Martin,

One could claim (by that logic) that all of the functions that convert different data types to strings are not needed since there is a method to do it.

But that is a somewhat specious argument, and many people find the functions to be useful. Having some of them not be there is definitely something missing in one case but not the other.

Martin v. Löwis on 14 Dec 2006 1:34 AM:

Michael,

First, the logic "we don't need a function because we can do it" doesn't apply. The point is that it is really straight-forward to write

 wchar_t buf[_CVTBUFSIZE];

 swprintf(buf, _CVTBUFSIZE, L"%.3f", value);

There can't really be a better way to do it; indeed, Double.ToString(String) works the same way. You don't have to implement the conversion yourself - there is a ready-to-use method, in the library, at your disposal.

More importantly, there isn't any other good procedure for char (byte) strings, either, so it's not that just Unicode strings "suffer". There is _ecvt, but that has many flaws. There is _ecvt_s, which is better (although the example sucks, as it does malloc). However, it isn't standard C, so I wouldn't consider using it (as there is a standard way to do it just as well).

Michael S. Kaplan on 14 Dec 2006 2:42 AM:

um, huh?

I am talking about how there is _wtoi and _itow, _wtoi64 and _i64tow, and so on. Yet there is only ONE direction for floats.

swprintf is a workaround for this lack, but there is a missing type of function in a clearly intentional matrix of functionality that is provided.

If swprintf was enough, then there would be no reason for _itow. Clearly someone sees benefit in having different functions (and I see even more benefit in an _ftow where parameters guiding the rounding could provide more granularity than swprintf currently provides).
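
As a rough illustration of the gap, an `_ftow`-style helper could be layered on the standard `swprintf`. The name `ftow_sketch` and its parameters are hypothetical; no such function exists in the CRT, and this sketch adds no rounding control beyond what `swprintf` already offers.

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical counterpart to _itow for doubles. Writes `value` with
// `precision` digits after the decimal point into `buf` (capacity
// `count` wide characters) and returns `buf`, or nullptr if the
// arguments are invalid or the result does not fit.
wchar_t* ftow_sketch(double value, wchar_t* buf, std::size_t count, int precision) {
    if (buf == nullptr || count == 0 || precision < 0) {
        return nullptr;
    }
    // Standard swprintf takes the buffer size and returns a negative
    // value when the output plus terminator would not fit.
    int written = std::swprintf(buf, count, L"%.*f", precision, value);
    if (written < 0) {
        buf[0] = L'\0';  // leave a valid empty string on failure
        return nullptr;
    }
    return buf;
}
```

Paired with the standard `wcstod`, which plays the role of `_wtof`, this would give conversion in both directions, although it remains a thin wrapper over `swprintf` rather than a direct wide-character conversion.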

Adam on 14 Dec 2006 4:33 AM:

_itow() (and friends) ??? Where did that come from? I never even knew they existed!

If you were talking about them before, why not *mention* them before?

I thought you were saying that that whole *class* of functionality was missing from the CRT.

OK, that all makes a bit more sense now.

Martin v. Löwis on 14 Dec 2006 6:23 PM:

Like Adam, I didn't know you were talking about itow, as you didn't mention it. I still don't find the reasoning convincing:

- it's not that just Unicode is suffering; there isn't an ftoa function either, although there is atof.

- there is a rationale for having ito[aw] over sprintf, which is that itoa supports alternative radices (more than just 8, 10, and 16) - although it is debatable whether the ability to format in base 11 deserves support in the standard library.

- I fail to see a similar rationale for fto[aw]. You are giving .NET's ToDouble function as a rationale, but I fail to see how it provides features that sprintf doesn't provide (except for extended format codes, like "G", which could be added to sprintf if there is a need). You are giving rounding as an example, but it is unclear to me what you mean by that: if you want to support different rounding modes, wouldn't it be better to support the C99 mechanisms (e.g. fesetround) so that developers can write portable code?

Sergei on 15 Dec 2006 2:58 PM:

There is another drawback to sprintf/swprintf compared with the C++ Standard Library: sprintf/swprintf don't respect the thousands separator.

This code:

wostringstream ss;

ss.imbue(locale("Russian"));

ss.precision(3);

ss<<fixed<<12345.;

produces "12 345,000" (note the space).

But this code:

_wsetlocale(LC_ALL,L"Russian");

wchar_t s[_CVTBUFSIZE];

swprintf(s,_CVTBUFSIZE,L"%.3f",12345.);

produces "12345,000" (no space).
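
Sergei's comparison depends on a locale named "Russian" being installed. A portable way to see the same difference is to supply the punctuation through a custom facet. The facet `SpaceGrouping` and both helper functions below are hypothetical names, and the separator and decimal characters are chosen only to mimic the output quoted above.

```cpp
#include <cstddef>
#include <cwchar>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: space as thousands separator, comma as decimal
// point, digits grouped in threes.
struct SpaceGrouping : std::numpunct<wchar_t> {
    wchar_t do_thousands_sep() const override { return L' '; }
    wchar_t do_decimal_point() const override { return L','; }
    std::string do_grouping() const override { return "\3"; }
};

// Stream formatting consults the numpunct facet of the imbued locale.
std::wstring format_with_stream(double value) {
    std::wostringstream ss;
    ss.imbue(std::locale(std::locale::classic(), new SpaceGrouping));
    ss << std::fixed << std::setprecision(3) << value;
    return ss.str();
}

// swprintf uses the global C locale and inserts no grouping by default.
std::wstring format_with_swprintf(double value) {
    wchar_t buf[64];
    int n = std::swprintf(buf, 64, L"%.3f", value);
    return n < 0 ? std::wstring() : std::wstring(buf, static_cast<std::size_t>(n));
}
```

With the default "C" locale in effect, `format_with_stream(12345.)` gives "12 345,000" while `format_with_swprintf(12345.)` gives "12345.000".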

