Language Support Overview and Issues

by Dale M.A. Johnson

 

created 4/17/06

 

This document goes over the basic issues that multilanguage support raises, and strategies for dealing with them.

 

Displaying Text

None of the text that the engine uses can be stored internally. Otherwise we would have to compile a completely different version of the engine for each language and, what's worse, these different versions would likely be incompatible with each other.

 

To get around this, all text is stored in external files in the system\language folder. Each language has a code that tells the engine which text it needs, and the engine looks in the language folder for files that have the name of the given code. Once it reads and stores the text, it can display it on the screen. The functionality need not change, just the words.
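
As a rough illustration of the lookup described above, here is a minimal Python sketch. The exact layout is an assumption: it supposes one subfolder per language code under system\language, and a ".txt" extension on each file. The names LANGUAGE_ROOT, language_file and read_language_file are hypothetical and not part of the engine.

```python
from pathlib import Path

# Assumed layout (not confirmed by this document):
#   system/language/<code>/<file name>.txt
LANGUAGE_ROOT = Path("system") / "language"


def language_file(code: str, name: str, root: Path = LANGUAGE_ROOT) -> Path:
    """Build the path of one text file for the given language code."""
    return root / code / f"{name}.txt"


def read_language_file(code: str, name: str, root: Path = LANGUAGE_ROOT) -> str:
    """Read the raw text of one language file, failing loudly if it is missing."""
    path = language_file(code, name, root)
    if not path.is_file():
        raise FileNotFoundError(f"no '{name}' text for language '{code}': {path}")
    return path.read_text(encoding="utf-8")
```

Switching languages then only means passing a different code; the calling code stays the same.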

 

How Text is Stored

First off, it would be wasteful to load all the text the engine could possibly need at the same time. That memory is best left for other, more important things such as graphics and sound. Therefore, each language's folder has several different text files for different situations, and only one file needs to be referred to at a time. For example, the character editor has no need for text like "Click Here to Change the Map Size."

 

The only exceptions to this are some general text strings that are used throughout the engine, and error messages. These need to be loaded into memory at the beginning and should stay there until the engine is exited.
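
The loading strategy above could look something like the following Python sketch. It is only an outline under assumptions: the class name TextStore, the load_table callable, and the rule that the current file is checked before the resident strings are all hypothetical.

```python
class TextStore:
    """Holds resident strings for the whole session plus one situational file."""

    def __init__(self, load_table):
        # load_table is a hypothetical callable: file name -> {string name: text}
        self._load_table = load_table
        self._resident = {}        # general strings and error messages
        self._current = {}         # text for the situation the engine is in now
        self._current_name = None

    def load_resident(self, *names):
        """Load the always-needed files once, at startup."""
        for name in names:
            self._resident.update(self._load_table(name))

    def switch_to(self, name):
        """Replace the situational text; the previous file's text is dropped."""
        if name != self._current_name:
            self._current = self._load_table(name)
            self._current_name = name

    def get(self, string_name):
        """Look in the current file first, then in the resident strings."""
        if string_name in self._current:
            return self._current[string_name]
        return self._resident[string_name]
```

With this arrangement the map editor's strings are no longer held once the engine switches to the character editor, while general strings and error messages remain available throughout.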

 

Text Data Format

The idea behind data storage is quite simple. The engine has a generic "messages" class/struct object in which it stores the text from the file. The object stores two pieces of data for each string of text: the text itself, and a "name" that the engine uses to refer to it. Because the engine is written in English, the names must remain in English. If someone gets carried away and translates the names in the text file, the engine will still look for the English name and will not be able to find the string.
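
To make the name/text pairing concrete, here is a small Python sketch of such a "messages" object. The class and method names are hypothetical, and raising an error on a missing name is only one possible choice.

```python
class Messages:
    """Stores each string of text under the English name the engine uses."""

    def __init__(self):
        self._strings = {}

    def add(self, name, text):
        """Store one string; the name comes from the file, never translated."""
        self._strings[name] = text

    def text(self, name):
        """Return the text for a name, with a clear error if it is missing."""
        try:
            return self._strings[name]
        except KeyError:
            raise KeyError(f"no string named '{name}' in the loaded text") from None
```

The engine would always ask for text("save"), whatever the language. If a translator renamed that entry in the file, the same call would fail, which is the problem described above.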

 

Here is how data is stored in the text file:

 

[save] Click here to save your game.^

 

Anything inside the square brackets is the name of the string, and must not be translated. String names never contain spaces. The names are how the engine tells which text is which.

 

The text itself begins immediately after the string name and continues until the ^ sign, which marks the end of the string. This is convenient because a long string may be hard to fit on one line in the text editor. The parser simply treats a line break inside a string as a space.

 

Also worthy of note: lines beginning with # are comments. Translators can use these to leave notes for themselves or others. Each file also needs to begin with some comments in English saying what file it is, so that it doesn't get lost in the shuffle or accidentally renamed.
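
The rules above are enough to sketch a parser. The following Python version is an illustration, not the engine's actual parser. It assumes that comment lines are dropped wherever they appear, and that whitespace around and inside a string is collapsed to single spaces; the function name parse_messages is hypothetical.

```python
def parse_messages(source: str) -> dict[str, str]:
    """Parse '[name] text^' entries into a {name: text} table."""
    # Lines beginning with # are comments and are skipped.
    lines = [ln for ln in source.splitlines() if not ln.startswith("#")]
    # A line break is treated like a simple space.
    body = " ".join(lines)

    messages = {}
    pos = 0
    while True:
        start = body.find("[", pos)
        if start == -1:
            break  # no more entries
        close = body.find("]", start)
        if close == -1:
            raise ValueError("string name is missing its closing ]")
        name = body[start + 1:close]
        if not name or " " in name:
            raise ValueError(f"bad string name: '{name}'")
        end = body.find("^", close)
        if end == -1:
            raise ValueError(f"string '{name}' is missing its closing ^")
        # Assumption: collapse runs of whitespace and trim both ends.
        messages[name] = " ".join(body[close + 1:end].split())
        pos = end + 1
    return messages
```

A string split over two lines in the file comes out as one line of text with a single space at the break, and an entry without its ^ is reported as an error.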

 

The following characters should not be used in strings because they are reserved for engine use and possible advanced functions:

 

[ ] { } < > $ ^ |
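
A translated string can be checked against this list before it goes into a file. The Python sketch below shows one way to do that; the names RESERVED and reserved_characters_in are hypothetical, and no such check is known to exist in the engine.

```python
# The reserved characters listed above.
RESERVED = set("[]{}<>$^|")


def reserved_characters_in(text: str) -> list[str]:
    """Return the reserved characters found in text, in order of first use."""
    found = []
    for ch in text:
        if ch in RESERVED and ch not in found:
            found.append(ch)
    return found
```

An empty result means the string is safe to use; anything else lists the characters a translator would need to remove.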

 

Possible Issues and Conflicts

Unfortunately, "true" Unicode is out of the question because Windows 95/98 don't support it (and neither do a few non-Windows operating systems). Therefore, the UTF-8 encoding of Unicode is likely our best bet, although the text display routines may need to be tweaked to handle it.
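
The kind of tweak the display routines may need can be seen in a short Python sketch. In UTF-8 a character may take more than one byte, so a routine that counts or steps through bytes will miscount characters. On the other hand, every byte of a multi-byte character is 0x80 or above, so scanning the raw bytes for an ASCII marker such as ^ stays safe. The function names here are hypothetical.

```python
def utf8_lengths(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


def multibyte_bytes_are_high(text: str) -> bool:
    """True if every byte of every non-ASCII character is 0x80 or above."""
    return all(
        b >= 0x80
        for ch in text
        if ord(ch) > 0x7F
        for b in ch.encode("utf-8")
    )
```

Plain English text has the same count either way, which is why existing routines may appear to work until the first accented character shows up.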

 

Asian languages will be a real sticking point, and will have to be handled at a later date once there is enough interest in implementing them (if ever).