Messages like "%d file(s) found" are notoriously hard to localize. English has only 2 forms: singular (1 file) and plural (0, 2, or more files), but other languages use up to 4 plural forms. For example, there are 3 forms in Polish:
- 0 plików
- 1 plik
- 2-4 pliki
- 5-21 plików
- 22-24 pliki
- 25-31 plików
- etc.
Other languages (French, Russian, Czech, etc.) also use rules different from English and from each other.
The gettext library reads a rule for plural form selection from the localization file. The rule is a C language expression, which is evaluated for each message. It's a universal solution, but an expression evaluator is probably overkill for this task.
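For comparison, the expression that the gettext manual lists for Polish can be written as an ordinary C function. This is only an illustration of what such an expression computes; the function name is made up for this example, and gettext itself evaluates the expression at run time rather than compiling it.

```c
/* Illustration only: the Polish plural expression from the gettext manual,
   written as a plain C function. The function name is hypothetical. */
unsigned long polish_plural_index(unsigned long n)
{
    return n == 1 ? 0
         : (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) ? 1
         : 2;
}
```

The result is the index of the form to use: 0 for "plik", 1 for "pliki", and 2 for "plików".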
Here are some observations about the languages mentioned on the gettext page:
- All additional plural forms are used for some range of numbers, e.g., from 2 to 4 in Slovak and Czech.
- The pattern often repeats every 10 or 100 items. In Russian, it sounds like "twenty-one file", not "twenty-one files", because the noun agrees with the last digit, "one". The same pattern repeats for 31, 41, etc.
- The numbers from 10 to 19 are often an exception to the rules, much as 16 is spelled differently from 26, 36, 46, etc. in English: "sixteen" vs. "twenty-six", "thirty-six", and "forty-six".
- Zero is treated differently in some languages, e.g., Romanian.
So, the rule for each plural form will consist of these components:
range_start range_end modulo_for_repetition skip_teens_flag
Here are some examples:
English
    singular: range_start = 1, range_end = 1
    plural: all other numbers

Polish
    singular: range_start = 1, range_end = 1
    plural1: range_start = 2, range_end = 4, modulo = 10, skip_teens = true
    plural2: all other numbers

Irish
    singular: range_start = 1, range_end = 1
    plural1: range_start = 2, range_end = 2
    plural2: all other numbers

Lithuanian
    singular: range_start = 1, range_end = 1, modulo = 10, skip_teens = true
    plural1: range_start = 2, range_end = 9, modulo = 10, skip_teens = true
    plural2: all other numbers (from 10 to 19, and numbers ending in 0)
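The rules above can be checked with a few lines of C. This is a minimal sketch, not the code from plurals.c: the PluralRule layout and the function names are assumptions made for this example, and the real PLURAL_INFO structure may look different.

```c
#include <stddef.h>

/* Hypothetical rule layout; the real PLURAL_INFO in plurals.h may differ. */
typedef struct {
    unsigned start;      /* range_start */
    unsigned end;        /* range_end */
    unsigned modulo;     /* 0 = no repetition, otherwise 10 or 100 */
    int      skip_teens; /* nonzero: numbers ending in 10..19 never match */
} PluralRule;

/* Returns nonzero if n falls under the given rule. */
int rule_matches(const PluralRule *r, unsigned n)
{
    if (r->skip_teens && n % 100 >= 10 && n % 100 <= 19)
        return 0;
    if (r->modulo != 0)
        n %= r->modulo;
    return n >= r->start && n <= r->end;
}

/* Index of the first matching rule; count means "all other numbers". */
unsigned plural_form(const PluralRule *rules, size_t count, unsigned n)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (rule_matches(&rules[i], n))
            return (unsigned)i;
    return (unsigned)count;
}

/* The Polish and Lithuanian examples from the text. */
const PluralRule polish[] = {
    { 1, 1, 0, 0 },   /* singular */
    { 2, 4, 10, 1 },  /* plural1 */
};
const PluralRule lithuanian[] = {
    { 1, 1, 10, 1 },  /* singular */
    { 2, 9, 10, 1 },  /* plural1 */
};
```

With these tables, plural_form(polish, 2, 22) selects plural1 ("22 pliki"), while plural_form(polish, 2, 12) falls through to plural2 because of the teens flag.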
The rules for each language can be written as a short string, which is stored in the language file (e.g., for Lithuanian, the string is "1 1 10 t; 2 9 10 t").
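A string like this could be parsed roughly as follows. The grammar is an assumption inferred from the two examples in this article ("1" and "1 1 10 t; 2 9 10 t"): rules are separated by semicolons, each rule is "start [end [modulo [t]]]", and a missing end defaults to start. The ParsedRule type and the function name are hypothetical; PluralsReadCfg itself may work differently.

```c
#include <stdio.h>
#include <string.h>

#define MAX_RULES 4

/* Hypothetical layout; the real PLURAL_INFO in plurals.h may differ. */
typedef struct {
    unsigned start, end, modulo;
    int skip_teens;
} ParsedRule;

/* Parses "start [end [modulo [t]]]" groups separated by ';'.
   Returns the number of rules read, or -1 on a malformed group. */
int parse_plural_rules(const char *cfg, ParsedRule *out, int max_rules)
{
    int count = 0;

    while (*cfg != '\0' && count < max_rules) {
        ParsedRule r = { 0, 0, 0, 0 };
        char flag = '\0';
        char group[64];
        size_t len = strcspn(cfg, ";");   /* length of the current group */
        int fields;

        if (len >= sizeof group)
            return -1;
        memcpy(group, cfg, len);
        group[len] = '\0';

        fields = sscanf(group, "%u %u %u %c",
                        &r.start, &r.end, &r.modulo, &flag);
        if (fields < 1)
            return -1;
        if (fields == 1)
            r.end = r.start;              /* "1" means the single number 1 */
        r.skip_teens = (fields == 4 && flag == 't');
        out[count++] = r;

        cfg += len;                       /* move past the group ... */
        if (*cfg == ';')
            cfg++;                        /* ... and its separator */
    }
    return count;
}
```

Any number not covered by a parsed rule belongs to the last form ("all other numbers"), so that form needs no entry in the string.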
Using the Code
Include plurals.h and plurals.c in your project. The interface consists of two functions. First, call PluralsReadCfg to read the rules from the string. Next, pass a number to PluralsGetForm. It returns the index of the correct plural form for this number, which you use to read the string from your language file:
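    PLURAL_INFO plurals;
    /* Buffer sizes are illustrative; pick ones that fit your strings. */
    char lng_str_name[32], message[256];

    /* Read the rules once, then build the key for the right form. */
    PluralsReadCfg(&plurals, ReadFromLngFile("PluralRules"));
    sprintf(lng_str_name, "FilesFound%d", PluralsGetForm(&plurals, number));
    sprintf(message, ReadFromLngFile(lng_str_name), number);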
In the language file, you have strings for each plural form:
    PluralRules = "1"
    FilesFound0 = "%d file found"
    FilesFound1 = "%d files found"
ReadFromLngFile is your own function. You could wrap the two sprintf calls in a higher-level function (and, of course, use a bounded function such as snprintf instead of sprintf to protect your program from buffer overflows).
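Such a wrapper might look like the sketch below. To keep it compilable on its own, DemoReadFromLngFile is a stub standing in for your ReadFromLngFile, and DemoGetForm is a trivial English-only stand-in for PluralsGetForm; FormatPlural is a hypothetical name, not part of plurals.h.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for your own ReadFromLngFile: two hard-coded strings. */
const char *DemoReadFromLngFile(const char *name)
{
    if (strcmp(name, "FilesFound0") == 0) return "%d file found";
    if (strcmp(name, "FilesFound1") == 0) return "%d files found";
    return NULL;
}

/* Stand-in for PluralsGetForm, hard-wired to the English rule. */
int DemoGetForm(int n)
{
    return n == 1 ? 0 : 1;
}

/* Hypothetical wrapper: builds "<base><form>", looks it up, formats n.
   Returns 0 on success, -1 if the string is missing or does not fit. */
int FormatPlural(char *out, size_t size, const char *base, int n)
{
    char name[64];
    const char *fmt;
    int len;

    len = snprintf(name, sizeof name, "%s%d", base, DemoGetForm(n));
    if (len < 0 || (size_t)len >= sizeof name)
        return -1;

    fmt = DemoReadFromLngFile(name);
    if (fmt == NULL)
        return -1;

    /* The format string is assumed to come from a trusted language file. */
    len = snprintf(out, size, fmt, n);
    return (len < 0 || (size_t)len >= size) ? -1 : 0;
}
```

A call such as FormatPlural(message, sizeof message, "FilesFound", number) then replaces both sprintf calls, and a truncated result is reported instead of overflowing the buffer.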
An even better solution is to implement a custom formatting function, so you could write something like "%d %(file|files) found" in the language file. Scott Rippey devised this technique and implemented it in VB .NET.
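In C, the idea could be sketched as follows. This is not a port of Scott Rippey's implementation: the function name, the handling of only "%d" and "%(a|b|...)", and the choice to fall back to the last alternative when the form index is too large are all assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter: copies fmt to out, replacing "%d" with n and
   "%(a|b|...)" with the alternative at index form (the last one if form
   is too large). Returns 0 on success, -1 on overflow or bad syntax. */
int FormatWithPlural(char *out, size_t size, const char *fmt, int n, int form)
{
    size_t pos = 0;

    while (*fmt != '\0') {
        if (fmt[0] == '%' && fmt[1] == 'd') {
            int len = snprintf(out + pos, size - pos, "%d", n);
            if (len < 0 || (size_t)len >= size - pos)
                return -1;
            pos += (size_t)len;
            fmt += 2;
        } else if (fmt[0] == '%' && fmt[1] == '(') {
            const char *alt = fmt + 2;
            const char *close = strchr(alt, ')');
            size_t len;
            int i;

            if (close == NULL)
                return -1;
            /* Skip to the requested alternative, stopping at the last one. */
            for (i = 0; i < form; i++) {
                const char *bar = memchr(alt, '|', (size_t)(close - alt));
                if (bar == NULL)
                    break;
                alt = bar + 1;
            }
            len = strcspn(alt, "|)");
            if (len >= size - pos)
                return -1;
            memcpy(out + pos, alt, len);
            pos += len;
            fmt = close + 1;
        } else {
            if (pos + 1 >= size)
                return -1;
            out[pos++] = *fmt++;
        }
    }
    if (pos >= size)
        return -1;
    out[pos] = '\0';
    return 0;
}
```

The form argument would be the value returned by PluralsGetForm, so the language file needs only one string per message instead of one per plural form.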
Two functions, PluralsReadCfg and PluralsGetForm, take 500 bytes in your executable file when compiled with MSVC++. A small price to pay for spelling your messages correctly in any language.