Aug 98 Factory Floor
Volume Number: 14 (1998)
Issue Number: 8
Column Tag: From The Factory Floor
The New C++ Standard: Locales
by Howard Hinnant and Dave Mark, ©1998 by Metrowerks, Inc., all rights reserved.
In last month's column, we took our first pass at the Final Draft International Standard for C++. Specifically, we covered the topic of namespaces. In this month's column, Howard Hinnant is back, and will take us through the concepts behind C++ locales.
Howard Hinnant is a software engineer on the MSL team at Metrowerks, and is responsible for the C++ and EC++ libraries. Howard is a refugee from the aerospace industry where FORTRAN still rules. He has extensive experience in scientific computing including C++ implementations of linear algebra, finite difference and finite element solvers.
Dave: What is a C++ locale and why would I want to use one?
Howard: A locale is a C++ class in the standard library that allows you to easily customize I/O so that users of your software feel at home, wherever that home may be. For example, let's say you're writing some high finance report software. This stuff is for executives, and you know how much they like their formatting. So when you print out ten thousand bucks, it has to look like:
$10,000.00
OK, so you've put 9 months into this software, and your boss is really pleased with it. In fact, your boss likes it so much that the company wants to use it in its offices worldwide. Except that the formatting needs to be customized for each country... (you did plan for this, didn't you?). The French executives might want to see:
10.000,00 Franc
And the German executives might want to see:
10.000,00 Mark
Does this mean you need separate formatting routines for every country? No. This is precisely the problem locale was meant to solve. In fact, with locale it is not difficult to have several formats current at the same time (think international client/server).
Dave: But wait a minute, US$10,000 is not equivalent to 10.000,00 German Marks!
Howard: Good point. An important thing to understand about locale is that it only helps you with formatting and I/O. Any calculations (like exchange rates) are totally up to you. This all might become more clear with a specific example. Let's say you wrote a class called Money in your killer app that looked something like:
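class Money
{
public:
    // The amount is stored in the smallest unit (cents).
    Money(double amount) : amount_(100*amount) {}
    double to_double() const {return amount_;}   // amount in cents
private:
    double amount_;
};
std::istream& operator>> (std::istream& is, Money& amount);
std::ostream& operator<< (std::ostream& os,
                          const Money& amount);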
Your top level code might look like:
Money amount(0);  // Your Money class (it has no default constructor)
is >> amount;     // input from stream (perhaps from US)
os << amount;     // output to stream (perhaps to France)
Dave: How do these streams know which country to format for? Where is the formatting logic?
Howard: You (the programmer) do have to do some more work here. In the previous example, you have to "imbue" each stream with a locale that knows how to format your Money.
This might look like:
std::locale US_Locale(...);
is.imbue(US_Locale);
std::locale Fr_Locale(...);
os.imbue(Fr_Locale);
Dave: The "...": Is this an exercise left to the reader?
Howard: Always hated those! A locale consists of a collection of facets. Each facet has a specialized responsibility. One facet knows how to format money. We need to create a locale with a money-formatting-facet that knows how we want our Money formatted. Here is how you might write a French franc formatter:
class Fr_Moneypunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const {return ',';}
    virtual char do_thousands_sep() const {return '.';}
    // Grouping and fraction digits are spelled out rather than
    // inherited; not every library's base class defaults to them.
    virtual std::string do_grouping() const {return "\3";}
    virtual int do_frac_digits() const {return 2;}
    virtual std::string do_curr_symbol() const
        {return "Franc";}
    virtual pattern do_pos_format() const {
        pattern result = {{char(sign), char(value), char(space),
                           char(symbol)}};
        return result;
    }
    virtual pattern do_neg_format() const {
        pattern result = {{char(sign), char(value), char(space),
                           char(symbol)}};
        return result;
    }
};
Then you can install it in a locale with:
std::locale Fr_Locale(std::locale(), new Fr_Moneypunct);
That ought to do it. Ok, I know it may look a little unfamiliar, but you've got to admit, it's not much code.
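To see the facet at work without any of the Money machinery, here is a minimal sketch that writes an amount straight through the money_put facet. The names FrancPunct and format_francs are made up for this illustration. The sketch also overrides do_grouping and do_frac_digits, on the assumption that your library's base class may not group digits or show two decimal places by default.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet for this sketch, modeled on Fr_Moneypunct above.
class FrancPunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const { return ','; }
    virtual char do_thousands_sep() const { return '.'; }
    virtual std::string do_grouping() const { return "\3"; }  // groups of three digits
    virtual std::string do_curr_symbol() const { return "Franc"; }
    virtual int do_frac_digits() const { return 2; }
    virtual pattern do_pos_format() const
    {
        pattern p = {{char(sign), char(value), char(space), char(symbol)}};
        return p;
    }
    virtual pattern do_neg_format() const { return do_pos_format(); }
};

// Formats an amount given in centimes (the smallest unit).
std::string format_francs(long double centimes)
{
    std::ostringstream os;
    // The locale takes ownership of the facet and deletes it when done.
    os.imbue(std::locale(std::locale::classic(), new FrancPunct));
    os.setf(std::ios_base::showbase);  // without showbase the symbol is left out
    const std::money_put<char>& fmt =
        std::use_facet<std::money_put<char> >(os.getloc());
    fmt.put(std::ostreambuf_iterator<char>(os), false, os, os.fill(), centimes);
    return os.str();
}
```

Calling format_francs(1000000.0L) should give back the "10.000,00 Franc" string from the start of the column, since the amount is passed in centimes.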
Dave: Agreed. How about a little more detail?
Howard: The base class std::moneypunct is a standard facet. Its default behavior is to format US currency. However, all of this behavior lives in virtual methods that you can override. The behaviors most likely to be changed are the most easily overridden. For example, changing the decimal point character to ',' is demonstrated above with the do_decimal_point method.
The do_pos_format and do_neg_format methods are responsible for dictating the relative positions of the sign, value and currency symbol. The default order is "symbol, sign, none, value". A "none" means that there may be zero or more spaces. A "space" means that there are one or more spaces. In our example, it was not enough to change the currency symbol from "$" to "Franc"; we also wanted it printed after the amount instead of before. That is why we changed the pattern order to "sign, value, space, symbol".
There is a do_negative_sign() method that returns a string. The first character of this string gets put in the "sign" spot designated by the pattern returned from do_neg_format(). If the string has more than one character, the remaining characters get put at the end of the entire formatting sequence. Thus if do_negative_sign() returns "()", then negative amounts can be formatted as:
(10.000,00 Franc)
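The parenthesized form can be tried with a small sketch along these lines. ParenFrancPunct and format_francs_parens are hypothetical names, and the grouping and fraction-digit overrides are again an assumption about what the base class leaves unset.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: like the franc formatter, but with "()" as the negative sign.
class ParenFrancPunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const { return ','; }
    virtual char do_thousands_sep() const { return '.'; }
    virtual std::string do_grouping() const { return "\3"; }
    virtual std::string do_curr_symbol() const { return "Franc"; }
    virtual std::string do_positive_sign() const { return ""; }
    // '(' goes in the sign slot of the pattern; ')' is appended at the very end.
    virtual std::string do_negative_sign() const { return "()"; }
    virtual int do_frac_digits() const { return 2; }
    virtual pattern do_pos_format() const
    {
        pattern p = {{char(sign), char(value), char(space), char(symbol)}};
        return p;
    }
    virtual pattern do_neg_format() const { return do_pos_format(); }
};

// Formats an amount in centimes; negative amounts come out in parentheses.
std::string format_francs_parens(long double centimes)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new ParenFrancPunct));
    os.setf(std::ios_base::showbase);
    const std::money_put<char>& fmt =
        std::use_facet<std::money_put<char> >(os.getloc());
    fmt.put(std::ostreambuf_iterator<char>(os), false, os, os.fill(), centimes);
    return os.str();
}
```

Positive amounts are unaffected, because do_positive_sign returns an empty string and nothing is placed in the sign slot.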
Dave: There's that 10.000,00 Franc again. Where does the exchange rate go in?
Howard: Let's take a closer look at our Money class. We could put the exchange logic in there as a method called us_to_france:
class Money
{
public:
    Money(double amount) : amount_(100*amount) {}
    Money& us_to_france() {amount_ /= .1685; return *this;}
    double to_double() const {return amount_;}
private:
    double amount_;
};
So now, assuming you had read in US $10,000.00, you might print a French version like:
os << amount.us_to_france();
and "59.347,18 Franc" will be printed.
I'm going to throw my standard disclaimer in here: The Money class presented here is just to demo locale. I really haven't put any serious design thought into Money itself.
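The arithmetic behind that figure can be checked with a sketch like the one here. The rate 0.1685 dollars per franc is the one used in us_to_france; the helper name and the choice to round to whole centimes are assumptions made for this illustration, not part of the Money class.

```cpp
#include <cmath>

// The illustrative rate from us_to_france: 1 franc = 0.1685 US dollars.
const double kDollarsPerFranc = 0.1685;

// Hypothetical helper: converts US cents to French centimes,
// rounded to the nearest whole centime.
long us_cents_to_centimes(long us_cents)
{
    return static_cast<long>(std::floor(us_cents / kDollarsPerFranc + 0.5));
}
```

For US $10,000.00, that is 1,000,000 cents, and the result is 5,934,718 centimes, which the franc formatter prints as 59.347,18 Franc.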
Dave: Ok, so we've created a customized locale, with a customized facet. How does our Money class make use of this machinery?
Howard: The work we've done so far will affect both input and output. To keep this discussion short, let's just look at the output routine for our example Money class:
std::ostream&
operator<< (std::ostream& os, const Money& amount)
{
    using namespace std;
    ostream::sentry ok(os);
    if (ok)
    {
        const money_put<char>& fmt =
            use_facet<money_put<char> > (os.getloc());
        ios_base::fmtflags saveflags = os.flags();
        showbase(os);
        if (fmt.put(os, false, os, os.fill(),
                    amount.to_double()).failed())
            os.setstate(ios_base::badbit);
        os.flags(saveflags);
    }
    return os;
}
The bad news is that this code may look strange to you. The good news is that it is pretty much boilerplate code. Most locale-savvy output routines you write will look similar to this.
The first line with the sentry animal prepares the stream for output, and ensures that the stream is in good shape. Next the ostream is queried for its locale (which you imbued earlier). Then the locale is searched for its money_put facet via the global function template use_facet. The money_put facet is a standard facet that will use your Fr_Moneypunct class through the magic of virtual functions. The call to showbase asks money_put to include the currency symbol, and the saved flags are restored afterward. Finally, you simply ask the money_put facet to "put" your Money on the ostream. money_put knows how to handle amounts expressed as long doubles or as strings of digits, counted in the smallest unit of the currency, so you must be able to convert to one of those. That is why the to_double method has been part of the Money class from the beginning, and why the constructor multiplies by 100.
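The input direction is not shown in this column, but it mirrors the output routine, with money_get in place of money_put. Here is a hedged sketch under the assumption of a US facet written out by hand; UsMoneypunct and parse_us_cents are hypothetical names, and a real operator>> would add an istream::sentry and set the stream state on failure.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical US facet; every setting is spelled out rather than inherited.
class UsMoneypunct : public std::moneypunct<char, false>
{
protected:
    virtual char do_decimal_point() const { return '.'; }
    virtual char do_thousands_sep() const { return ','; }
    virtual std::string do_grouping() const { return "\3"; }
    virtual std::string do_curr_symbol() const { return "$"; }
    virtual std::string do_positive_sign() const { return ""; }
    virtual std::string do_negative_sign() const { return "-"; }
    virtual int do_frac_digits() const { return 2; }
    virtual pattern do_pos_format() const
    {
        pattern p = {{char(symbol), char(sign), char(none), char(value)}};
        return p;
    }
    virtual pattern do_neg_format() const { return do_pos_format(); }
};

// Parses text such as "$10,000.00" into cents. Returns false if parsing failed.
bool parse_us_cents(const std::string& text, long double& cents)
{
    std::istringstream is(text);
    is.imbue(std::locale(std::locale::classic(), new UsMoneypunct));
    const std::money_get<char>& fmt =
        std::use_facet<std::money_get<char> >(is.getloc());
    std::ios_base::iostate err = std::ios_base::goodbit;
    fmt.get(std::istreambuf_iterator<char>(is),
            std::istreambuf_iterator<char>(),
            false, is, err, cents);
    return (err & std::ios_base::failbit) == 0;
}
```

Like money_put, money_get works in the smallest unit, so "$10,000.00" comes back as one million cents.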
Note that I've specialized Money to only output to narrow (char) streams. We could have easily templated everything on the character type in order to handle both narrow and wide streams. This is in fact what the standard library does.
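As a rough idea of what templating on the character type might look like, here is a sketch with a facet and a formatting function that both take the character type as a parameter. BasicFrancPunct and format_francs_as are invented names, and the currency name is assumed to be plain ASCII so that it can be widened by a simple copy.

```cpp
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical franc facet templated on the character type.
template <class CharT>
class BasicFrancPunct : public std::moneypunct<CharT, false>
{
    typedef std::basic_string<CharT> string_type;
protected:
    virtual CharT do_decimal_point() const { return CharT(','); }
    virtual CharT do_thousands_sep() const { return CharT('.'); }
    virtual std::string do_grouping() const { return "\3"; }
    virtual string_type do_curr_symbol() const
    {
        const char name[] = "Franc";         // ASCII only, copied character by character
        return string_type(name, name + 5);
    }
    virtual int do_frac_digits() const { return 2; }
    virtual std::money_base::pattern do_pos_format() const
    {
        std::money_base::pattern p = {{char(std::money_base::sign),
                                       char(std::money_base::value),
                                       char(std::money_base::space),
                                       char(std::money_base::symbol)}};
        return p;
    }
    virtual std::money_base::pattern do_neg_format() const { return do_pos_format(); }
};

// Formats centimes into a string of the requested character type.
template <class CharT>
std::basic_string<CharT> format_francs_as(long double centimes)
{
    std::basic_ostringstream<CharT> os;
    os.imbue(std::locale(std::locale::classic(), new BasicFrancPunct<CharT>));
    os.setf(std::ios_base::showbase);
    typedef std::money_put<CharT> put_type;
    const put_type& fmt = std::use_facet<put_type>(os.getloc());
    fmt.put(std::ostreambuf_iterator<CharT>(os), false, os, os.fill(), centimes);
    return os.str();
}
```

The same source then serves both stream flavors: format_francs_as<char> for narrow output and format_francs_as<wchar_t> for wide output.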
Dave: Can you give me some other possibilities for locale?
Howard: I think space will prohibit us from discussing all the facets in this much detail, but I'll try to hit some highlights.
A collate facet is available for sorting characters. The default behavior is the case sensitive stuff you've come to know and love.
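A small sketch of the collate facet in use, with the classic "C" locale standing in for the default behavior. The function names are hypothetical. A std::locale object can itself be passed as the comparison to std::sort, because its operator() compares two strings through the collate facet.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Sorts in place, comparing strings through the locale's collate facet.
void sort_with_locale(std::vector<std::string>& words, const std::locale& loc)
{
    std::sort(words.begin(), words.end(), loc);
}

// Calls the collate facet directly; the result is negative, zero, or positive.
int compare_with_locale(const std::string& a, const std::string& b,
                        const std::locale& loc)
{
    const std::collate<char>& coll = std::use_facet<std::collate<char> >(loc);
    return coll.compare(a.data(), a.data() + a.size(),
                        b.data(), b.data() + b.size());
}
```

With the classic locale, "Zebra" sorts ahead of "apple", because uppercase letters come before lowercase ones in the case-sensitive ordering.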
A collection of ctype facets wrap the functionality found in the C header <ctype.h> (and <wctype.h>).
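For instance, a ctype facet can stand in for toupper and isdigit from <ctype.h>. The helper names in this sketch are hypothetical, and only the narrow (char) facet is shown.

```cpp
#include <locale>
#include <string>

// Uppercases a copy of the text using the locale's ctype facet.
std::string to_upper(std::string text, const std::locale& loc)
{
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    if (!text.empty())
        ct.toupper(&text[0], &text[0] + text.size());  // converts the range in place
    return text;
}

// Returns true if every character is classified as a digit by the locale.
bool is_all_digits(const std::string& text, const std::locale& loc)
{
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    for (std::string::size_type i = 0; i < text.size(); ++i)
        if (!ct.is(std::ctype_base::digit, text[i]))
            return false;
    return true;
}
```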
Numeric facets control the formatting of both integers and floating point numbers (input and output). There are several similarities here with the monetary facets described above, including the ability to specify decimal point characters and thousands separators.
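A numpunct override looks much like the moneypunct one. This sketch uses a hypothetical facet, ContinentalNumpunct, that swaps the decimal point and thousands separator and asks for groups of three digits; format_number is likewise a made-up helper.

```cpp
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical facet: comma as decimal point, period between groups of three.
class ContinentalNumpunct : public std::numpunct<char>
{
protected:
    virtual char do_decimal_point() const { return ','; }
    virtual char do_thousands_sep() const { return '.'; }
    virtual std::string do_grouping() const { return "\3"; }
};

// Formats a value in fixed notation with the given number of decimals.
std::string format_number(double value, int decimals)
{
    std::ostringstream os;
    os.imbue(std::locale(std::locale::classic(), new ContinentalNumpunct));
    os << std::fixed << std::setprecision(decimals) << value;
    return os.str();
}
```

Ordinary operator<< picks up the facet on its own, so no special output routine is needed for built-in numeric types.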
A collection of time facets is available to help you format times and dates. If you need to read or write "janvier" instead of "January", you can do it here. CodeWarrior offers a timepunct facet similar to moneypunct and numpunct. This facet is non-standard, but so common as to have become a de facto standard. It makes it much easier to customize the names of the days of the week, or the names of the months.
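The standard time_put facet can be exercised with a sketch like this one, which asks a locale for a full month name. The helper month_name is hypothetical. Only the classic locale is used here; getting "janvier" would take either a French locale installed on the system (its name varies by platform, which is an assumption you would need to check) or a facet such as the non-standard timepunct.

```cpp
#include <ctime>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Returns the full name of a month (0 = January) as the locale's time_put writes it.
std::string month_name(int month, const std::locale& loc)
{
    std::tm when = std::tm();   // zero-initialize every field
    when.tm_mon = month;
    when.tm_mday = 1;
    std::ostringstream os;
    os.imbue(loc);
    const std::time_put<char>& fmt = std::use_facet<std::time_put<char> >(loc);
    // 'B' is the strftime conversion for the full month name.
    fmt.put(std::ostreambuf_iterator<char>(os), os, os.fill(), &when, 'B');
    return os.str();
}
```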
Dave: locale sounds both powerful and flexible. But I can still safely ignore it if I want to, right?
Howard: Yes, the default locale will give you the traditional formatting that we are all accustomed to.
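A quick way to convince yourself is a sketch like the following, which assumes the program has not installed a different global locale. The helper names are hypothetical.

```cpp
#include <locale>
#include <sstream>
#include <string>

// Formats a number on a stream that was never imbued with anything.
std::string format_with_default_locale(long value)
{
    std::ostringstream os;   // starts out with a copy of the global locale
    os << value;
    return os.str();
}

// Reports the name of the current global locale.
std::string global_locale_name()
{
    return std::locale().name();
}
```

Until a program changes it, the global locale is the classic "C" locale, so numbers come out with no thousands separators at all.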