Introduction
Messages like "%d file(s) found" are notoriously hard to localize. English has only 2 forms: singular (1 file) and plural (0, 2, or more files), but other languages use up to 4 plural forms. For example, there are 3 forms in Polish:
0 plików
1 plik
2-4 pliki
5-21 plików
22-24 pliki
25-31 plików
etc.
Other languages (French, Russian, Czech, etc.) also use rules different from English and from each other.
The gettext library extracts a rule for plural form selection from the localization file. The rule is a C language expression, which is evaluated for each message. It's a universal solution, but an expression evaluator is probably overkill for this task.
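To make the gettext approach concrete, here is the expression commonly given for Polish in a Plural-Forms header (nplurals=3), hard-coded as a C function. The function name polish_plural_form is hypothetical and only for illustration; gettext itself parses and evaluates the expression at run time instead of compiling it.

```c
/* Hypothetical helper: the Polish plural expression from a gettext
   Plural-Forms header, written out as a plain C function.
   Returns 0 for "plik", 1 for "pliki", 2 for "plików". */
unsigned polish_plural_form(unsigned long n)
{
    return n == 1 ? 0
         : (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) ? 1
         : 2;
}
```

The nested conditions on n % 10 and n % 100 are exactly the "repeat every 10, except for the teens" pattern that the rest of this article encodes with a few numbers instead of an expression.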
A Simpler Solution
Here are some observations about the languages mentioned on the gettext page:
- All additional plural forms are used for some range of numbers, e.g., from 2 to 4 in Slovak and Czech.
- The pattern is often repeated for each 10 or 100 items. In Russian, it sounds like "twenty-one file", not "twenty-one files", because the noun agrees with the last digit, "one". The same pattern repeats for 31, 41, etc.
- The numbers from 10 to 19 are often an exception to the rules, just as 16 is spelled differently from 26, 36, 46, etc. in English: "sixteen" vs. "twenty-six", "thirty-six", and "forty-six".
- Zero is treated differently in some languages, e.g. Romanian.
So, the rule for each plural form will consist of these components:
range_start range_end modulo_for_repetition skip_teens_flag
Here are some examples:
English
  singular: range_start = 1, range_end = 1
  plural: all other numbers
Polish
  singular: range_start = 1, range_end = 1
  plural1: range_start = 2, range_end = 4, modulo = 10, skip_teens = true
  plural2: all other numbers
Irish
  singular: range_start = 1, range_end = 1
  plural1: range_start = 2, range_end = 2
  plural2: all other numbers
Lithuanian
  singular: range_start = 1, range_end = 1, modulo = 10, skip_teens = true
  plural1: range_start = 2, range_end = 9, modulo = 10, skip_teens = true
  plural2: all other numbers (those ending in 0, and from 10 to 19)
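The way such rules select a form can be sketched in a few lines of C. This is not the code from plurals.c: the PluralRule struct, the match_plural_rule function, and the two example tables are hypothetical, and the sketch assumes that modulo = 0 means "compare the number itself" and that skip_teens tests the last two digits (n % 100).

```c
#include <stddef.h>

/* Hypothetical rule layout; the real PLURAL_INFO in plurals.h may differ. */
typedef struct {
    unsigned range_start;  /* first number (or remainder) covered by the form */
    unsigned range_end;    /* last number (or remainder) covered by the form */
    unsigned modulo;       /* 0 = compare n itself, otherwise compare n % modulo */
    int      skip_teens;   /* nonzero: never match when n % 100 is 10..19 */
} PluralRule;

/* Returns the index of the first rule that matches n, or rule_count
   for the implicit "all other numbers" form. */
size_t match_plural_rule(const PluralRule *rules, size_t rule_count, unsigned n)
{
    size_t i;
    unsigned last_two = n % 100;

    for (i = 0; i < rule_count; i++) {
        unsigned value = rules[i].modulo ? n % rules[i].modulo : n;

        if (rules[i].skip_teens && last_two >= 10 && last_two <= 19)
            continue;  /* the teens fall through to a later form */
        if (value >= rules[i].range_start && value <= rules[i].range_end)
            return i;
    }
    return rule_count;
}

/* Example tables taken from the rules listed above. */
static const PluralRule polish_rules[] = {
    { 1, 1, 0, 0 },   /* singular: exactly 1 */
    { 2, 4, 10, 1 },  /* plural1: ends in 2..4, except 12..14 */
};

static const PluralRule lithuanian_rules[] = {
    { 1, 1, 10, 1 },  /* singular: ends in 1, except 11 */
    { 2, 9, 10, 1 },  /* plural1: ends in 2..9, except 12..19 */
};
```

With these tables, match_plural_rule(polish_rules, 2, 22) selects plural1 ("pliki"), while 12 and 25 fall through to the last form ("plików").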
The rules for each language could be written to a short string, which is stored in the language file (e.g., for Lithuanian, the string is "1 1 10 t; 2 9 10 t").
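Reading such a string takes little code. The parser below is a minimal sketch, not the actual PluralsReadCfg: the names PluralRule and parse_plural_rules are hypothetical, and it assumes that rules are separated by semicolons, that a trailing "t" sets the skip-teens flag, and that omitted fields default to range_end = range_start, no modulo, and no skipping.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical rule layout; the real PLURAL_INFO in plurals.h may differ. */
typedef struct {
    unsigned range_start;
    unsigned range_end;
    unsigned modulo;      /* 0 = no repetition */
    int      skip_teens;  /* nonzero = numbers ending in 10..19 are excluded */
} PluralRule;

/* Parses up to max_rules rules from text, e.g. "1 1 10 t; 2 9 10 t".
   Returns the number of rules stored in the rules array. */
size_t parse_plural_rules(const char *text, PluralRule *rules, size_t max_rules)
{
    size_t count = 0;

    while (text != NULL && count < max_rules) {
        unsigned start = 0, end = 0, modulo = 0;
        char flag = 'f';
        /* sscanf stops at the ';' because it is not part of a number. */
        int fields = sscanf(text, "%u %u %u %c", &start, &end, &modulo, &flag);

        if (fields >= 1) {
            rules[count].range_start = start;
            rules[count].range_end   = fields >= 2 ? end : start;
            rules[count].modulo      = fields >= 3 ? modulo : 0;
            rules[count].skip_teens  = (fields == 4 && flag == 't');
            count++;
        }
        /* Move on to the text after the next semicolon, if there is one. */
        text = strchr(text, ';');
        if (text != NULL)
            text++;
    }
    return count;
}
```

Under these assumptions, the Lithuanian string yields two rules, and a string containing only "1" yields a single rule that matches exactly the number 1.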
Using the Code
Include plurals.h and plurals.c in your project. The interface consists of two functions. First, you call PluralsReadCfg to read the rules from the string. Next, you pass a number to PluralsGetForm. It returns the index of the correct plural form for this number, which you use to read the string from your language file:
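PLURAL_INFO plurals;
char lng_str_name[16], message[128];
/* Load the plural rules for the current language. */
PluralsReadCfg(&plurals, ReadFromLngFile("PluralRules"));
/* Build the name of the string for this number's form, e.g. "FilesFound1". */
sprintf(lng_str_name, "FilesFound%d", PluralsGetForm(&plurals, number));
/* Look up the translated format string and insert the number into it. */
sprintf(message, ReadFromLngFile(lng_str_name), number);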
In the language file, you have strings for each plural form:
PluralRules = "1"
FilesFound0 = "%d file found"
FilesFound1 = "%d files found"
ReadFromLngFile is your own function. You could wrap the two sprintf calls in a higher-level function (and, of course, use a bounds-checked function instead of sprintf to protect your program from buffer overflows).
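One possible shape for such a wrapper is sketched below. FormatPluralMessage, LngLookup, and DemoLookup are hypothetical names that are not part of plurals.h. The sketch assumes a C99 snprintf is available (older MSVC versions offer only _snprintf, which behaves differently on truncation), takes the form index from PluralsGetForm as a parameter, and receives your lookup function as a pointer.

```c
#include <stdio.h>
#include <string.h>

/* Type of the application's lookup function (ReadFromLngFile in the text). */
typedef const char *(*LngLookup)(const char *name);

/* Hypothetical wrapper around the two formatting steps. form is the index
   returned by PluralsGetForm. Returns 0 on success, or -1 if the string is
   missing or one of the buffers is too small. */
int FormatPluralMessage(char *out, size_t out_size, LngLookup lookup,
                        const char *base_name, int form, int number)
{
    char name[64];
    const char *format;
    int written;

    written = snprintf(name, sizeof name, "%s%d", base_name, form);
    if (written < 0 || (size_t)written >= sizeof name)
        return -1;

    format = lookup(name);
    if (format == NULL)
        return -1;

    /* The format string comes from the language file and is expected to
       contain exactly one %d, so treat language files as trusted input. */
    written = snprintf(out, out_size, format, number);
    if (written < 0 || (size_t)written >= out_size)
        return -1;
    return 0;
}

/* Stand-in for ReadFromLngFile, for demonstration only. */
const char *DemoLookup(const char *name)
{
    if (strcmp(name, "FilesFound0") == 0) return "%d file found";
    if (strcmp(name, "FilesFound1") == 0) return "%d files found";
    return NULL;
}
```

A call would then look like FormatPluralMessage(message, sizeof message, ReadFromLngFile, "FilesFound", PluralsGetForm(&plurals, number), number), provided that ReadFromLngFile has a compatible signature.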
An even better solution is to implement a custom formatting function, so that you can write something like "%d %(file|files) found" in the language file. Scott Rippey devised this technique and implemented it in VB .NET.
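In C, the idea could look roughly like the sketch below. It is not a port of Scott Rippey's implementation: FormatPlural is a hypothetical function that understands only "%d" and "%(a|b|...)", picks the alternative whose position equals the form index (falling back to the last alternative if the index is too large), and copies everything else literally. It assumes a C99 snprintf.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter. Expands "%d" to number and "%(a|b|c)" to the
   alternative selected by form (0-based). Returns 0 on success, or -1 if
   the output buffer is too small or a "%(" group is not closed. */
int FormatPlural(char *out, size_t out_size, const char *fmt,
                 int number, int form)
{
    size_t len = 0;  /* invariant: len < out_size, leaving room for '\0' */

    if (out_size == 0)
        return -1;
    while (*fmt != '\0') {
        if (fmt[0] == '%' && fmt[1] == 'd') {
            int n = snprintf(out + len, out_size - len, "%d", number);
            if (n < 0 || (size_t)n >= out_size - len)
                return -1;
            len += (size_t)n;
            fmt += 2;
        } else if (fmt[0] == '%' && fmt[1] == '(') {
            const char *p = fmt + 2;
            const char *start = p;
            const char *close = strchr(p, ')');
            size_t alt_len;
            int index = 0;

            if (close == NULL)
                return -1;
            /* Find the alternative number form: [start, p). */
            for (; p < close; p++) {
                if (*p == '|') {
                    if (index == form)
                        break;
                    index++;
                    start = p + 1;
                }
            }
            alt_len = (size_t)(p - start);
            if (alt_len >= out_size - len)
                return -1;
            memcpy(out + len, start, alt_len);
            len += alt_len;
            fmt = close + 1;
        } else {
            if (len + 1 >= out_size)
                return -1;
            out[len++] = *fmt++;
        }
    }
    out[len] = '\0';
    return 0;
}
```

With this approach, the language file needs only one string per message, e.g. "%d %(plik|pliki|plików)" for Polish, and the form index still comes from PluralsGetForm.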
Conclusion
Two functions, PluralsReadCfg and PluralsGetForm, take 500 bytes in your executable file when compiled with MSVC++. A small price to pay for spelling your messages correctly in any language.