[maemo-developers] Maemo localization to officially non-supported languages

From: Mohammed Hassan mohammed.2.hassan at nokia.com
Date: Fri Oct 26 17:19:09 EEST 2007
On Fri, 2007-10-26 at 16:06 +0300, Marius Vollmer wrote:
> Mohammed Hassan <mohammed.2.hassan at nokia.com> writes:
> 
> > On Wed, 2007-10-24 at 12:12 +0300, Marius Vollmer wrote:
> >> > The problem is the logical IDs are needed to maintain the smooth
> >> > process.  It's not easy to simply drop them. I did not say it's
> >> > impossible. I didn't say it'll be done or it'll not. I'm just saying
> >> > that they are needed ;-)
> >> 
> >> This might be so, but I honestly doubt it.  Can you elaborate?
> >
> > They are used by l10n testing to easily check for untranslated strings
> > (By me as well), to easily determine the originating UI spec, to design
> > test cases and I'm sure they have more uses.
> 
> Let's get some precision here:
> 
>  A) a UI string is an abstract entity that is used in the UI of a program
> 
>  B) code is supposed to use the right UI string for the right purpose,
>     as defined by the UI spec
> 
>  C) code is supposed to get the final string to display by passing the
>     UI strings to gettext
> 
>  D) there should be translations of all UI strings in all supported
>     languages (so that gettext has something to return in all cases it
>     is used)

E) I guess it should also be visually distinguishable.


> Right now, a UI string is identified by a symbolic identifier (the
> 'logical ids').  Right now, things are specified so that there is a
> separate UI string for (almost) each single mention of a certain UI
> element in the spec.[1]
> 
> When talking about getting rid of logical IDs, we mean that we will
> identify UI strings not by an (ultimately meaningless) symbolic
> identifier, but by some other form of identifier that already includes
> the translation of the UI string for one specific language
> (Engineering English).  For example, instead of using
> 
>     ai_bd_confirm_ok
>     ai_bd_confirm_cancel
> 
> we could identify these two particular UI strings with
> 
>     ai_bd_confirm_ok|OK
>     ai_bd_confirm_cancel|Cancel
> 
> [ The '|' is my way of notating context for pgettext. ]
> 
> This gets rid of some of the problems of logical IDs.  We can now
> tolerate violations of requirement D, for example.  This is important
> (IMHO) since D is violated frequently during the development of code,
> and trying to avoid that is nigh impossible.[2] Also, we can now use
> the gettext suite of tools directly without having to somehow splice
> the Engineering English back into it so that translators know what is
> going on even if they don't have the UI spec.  Also also, developers can
> know what is going on directly without having to consult the UI spec.
> 
> Thus, we could get the benefits of using Engineering English in the
> code, and keep the benefits of specifying a 'logical' context for each
> separate UI string.
> 
> The problem is that the amount of context specification seems
> excessive.  As a second step (or at the same time as step one), I
> think we should try to reduce the context specifications to a level
> where it is not excessive but still useful.[3] This requires work and
> the benefits can be debated.  I hope that making the context
> specification more meaningful will help improve quality since you have
> less noise to deal with.  Luckily, it doesn't need to be done
> globally; each UI spec / program combo could do it to their own level
> and on their own schedule.  Library specs can be changed in a
> compatible way.
> 
> We could use
> 
>     confirm|OK
>     confirm|Cancel
> 
> for the two UI strings from the example.
> 
> This has the same level of context as the original (together with the
> textdomain), but it is already much nicer to have in your code.
> 
> In other words, we should reduce the context specifications by
> reducing the number of individual UI strings in the UI specifications.
> 
> Thus, even when using Engineering English in the code (plus the
> occasional (or maybe frequent) context specification), each UI string
> can be reliably identified.
> 
> We do not have the option anymore of changing the Engineering English
> without changing the identifier of a UI string.  I.e., if we decide to
> use "Edit" instead of "Details" for a certain button, we can't do that
> without actually changing all occurrences of this identifier.  This
> might be considered serious by some.
> 
> I'd say it is not a bad thing.  We need to be able to push changes
> (not only changes to UI strings) to all the relevant places anyway,
> and being able to get away without reviewing the impact of a change at
> the other end is not essential.[4] Gettext tools can help with this in
> any case.
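
To make the "context|text" notation concrete, here is a minimal sketch of
how a lookup could tolerate a missing translation (requirement D) by
falling back to the Engineering English part of the identifier. The
lookup() function and the dict-based catalog are hypothetical, written in
Python for illustration only; they are not the maemo or gettext API, and
real pgettext keeps the context in a separate msgctxt field rather than
behind a '|'.

```python
def lookup(catalog, msgid):
    """Return the translation for msgid, or its Engineering English part.

    catalog is a plain dict standing in for a loaded message catalog
    (an assumption made for this sketch).
    """
    translated = catalog.get(msgid)
    if translated:
        return translated
    # No translation available: drop everything up to the first '|'
    # so the user sees "OK" instead of "confirm|OK".
    _context, sep, text = msgid.partition("|")
    return text if sep else msgid


german = {"confirm|Cancel": "Abbrechen"}   # "confirm|OK" is not translated yet
ok_label = lookup(german, "confirm|OK")
cancel_label = lookup(german, "confirm|Cancel")
```

With a bare logical ID such as ai_bd_confirm_ok there is no text after a
'|' to fall back to, so the identifier itself would end up on screen.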

I can only speak for myself. I think the above approach is not bad but
it has a few drawbacks (embedded reply).

You (we) might also like to speak to all the parties involved ;-)


> [ Now we get to the reply to your message.  You will see a pattern
>   emerge... :-]
> 
> > [context specifications] are used by l10n testing to easily check
> > for untranslated strings
> 
> This might refer to either the C or the D requirement (i.e., either
> the code doesn't use gettext for a particular UI element, or a
> language doesn't have a translation for a particular UI string.)
> 
> Violations of C can be found by comparing the set of UI strings used
> by the code with the set specified in the UI spec.  Any mismatch needs
> to be dealt with, either by changing the code or changing the spec.
> There is no change here compared to our current setup since UI strings
> are still precisely identifiable both in code and in the spec.

Not really. How do you get the list of strings inside the code?
xgettext can do that, but only for strings passed to {n,p,d}gettext.
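
A rough sketch of the comparison, and of the limitation I mean. The
regular expression below is a simplified stand-in for what xgettext does
(it only recognises _("...") and gettext("...") calls), and the function
names and sample strings are made up for illustration.

```python
import re

# Matches _("...") and gettext("...") only; a simplified, hypothetical
# stand-in for a real extractor such as xgettext.
GETTEXT_CALL = re.compile(r'\b(?:_|gettext)\(\s*"((?:[^"\\]|\\.)*)"')


def msgids_in_source(source):
    """Collect the string literals passed to the recognised gettext calls."""
    return set(GETTEXT_CALL.findall(source))


def compare_with_spec(spec_ids, code_ids):
    """Report UI strings that appear on only one side."""
    return {
        "missing_in_code": spec_ids - code_ids,
        "not_in_spec": code_ids - spec_ids,
    }


source = (
    'ok = _("confirm|OK");\n'
    'cancel = gettext("confirm|Cancel");\n'
    'set_title("Hello");\n'          # never passed to gettext
)
spec = {"confirm|OK", "confirm|Cancel", "confirm|Details"}

found = msgids_in_source(source)
report = compare_with_spec(spec, found)
```

The "Hello" string never reaches gettext, so it does not show up in either
set. That is exactly the kind of violation of C that a comparison like
this cannot see.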


> Violations of D can be found by comparing the set of UI strings that
> are specified in the UI spec with the set of UI strings actually
> translated.  There is no change here compared to our current setup
> since UI strings are still precisely identifiable both in the spec and
> in the translations.
> 
> > to easily determine the originating UI spec
> 
> There is no change here compared to our current setup since UI strings
> are still precisely identifiable both in the code and in the spec.

Because we are prepending a context?

 
> > to design test cases
> 
> There is no change here compared to our current setup since UI strings
> are still precisely identifiable both in the test cases and in the
> spec.

Visually by the testers too, I guess?


> If you are interested in finding violations of B, you might want to force
> the program to show the untranslated UI string identifiers that are
> used in the code.  You can still do this by providing the 'identity'
> translation from "context|text" to "context|text" that Tollef has
> mentioned.

No. I don't want to force the application. I want to use the application
as a normal user does and spot them.
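
For reference, the 'identity' translation mentioned above could be
generated along these lines. This is only a sketch: identity_po() and
po_escape() are hypothetical helpers, the '|' form is kept inside msgid
as in the examples in this thread, and a real catalog would normally
also carry a header entry.

```python
def po_escape(text):
    """Escape a string for use inside a double-quoted PO string."""
    return (
        text.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
    )


def identity_po(msgids):
    """Build PO entries whose msgstr repeats the msgid unchanged."""
    lines = []
    for msgid in sorted(msgids):
        quoted = po_escape(msgid)
        lines.append(f'msgid "{quoted}"')
        lines.append(f'msgstr "{quoted}"')
        lines.append("")              # blank line between entries
    return "\n".join(lines)


po_text = identity_po(["confirm|OK", "confirm|Cancel"])
```

Running an application against such a catalog shows "confirm|OK" instead
of "OK", which makes the identifiers visible but is not how a normal
user would see the UI.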


> Some potential actions:
> 
> - Extend the script that extracts .mpo/.po/.pot/whatever information
>   from the UI specs so that it can optionally produce msgids of the
>   form "logical_id|Engineering English" (or whatever is appropriate
>   for pgettext).

Can you do this if needed? ;-)


> - Make a script that replaces "logical_id" with
>   "logical_id|Engineering English" (or whatever is appropriate for
>   pgettext) in source code.
> 
> - Tell developers and UI speccers that they can use this approach and
>   that then the logical_ids don't need to be unique anymore and don't
>   need to include the Engineering English as a hint.

They need to be unique.
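
The second item (replacing "logical_id" in source code) might look
roughly like this. It is a sketch under assumptions: the mapping from
logical IDs to Engineering English is a plain dict that would in
practice come from the UI spec, the function name is made up, and the
Engineering English is assumed to contain no quotes or backslashes.

```python
import re

# Quoted tokens that look like logical IDs (lowercase, digits, underscores).
LOGICAL_ID_LITERAL = re.compile(r'"([a-z0-9_]+)"')


def add_engineering_english(source, english_by_id):
    """Rewrite "logical_id" literals to "logical_id|Engineering English"."""
    def replace(match):
        logical_id = match.group(1)
        text = english_by_id.get(logical_id)
        if text is None:
            return match.group(0)     # not a known logical ID, leave as is
        return f'"{logical_id}|{text}"'

    return LOGICAL_ID_LITERAL.sub(replace, source)


english = {"ai_bd_confirm_ok": "OK", "ai_bd_confirm_cancel": "Cancel"}
before = 'b1 = _("ai_bd_confirm_ok"); b2 = _("ai_bd_confirm_cancel"); f("other_key");'
after = add_engineering_english(before, english)
```

Literals that are not in the mapping are left alone, so the result still
has to be reviewed by hand.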


> [1] For example, we have a generic information note dialog that can be
>     parameterized with the text to display and with the label for its
>     single button.  Each use of that information note specifies both
>     the text and the button label as new UI strings.  Instead, we
>     could specify a specific UI string as the default label of the
>     button (and allow uses to overwrite it.)

Put it in a library then so all the applications can use it.


> [2] For example, I am right now coding a new statusbar plugin, the UI
>     spec is done, but I have no translations yet.  Do I use logical
>     IDs in the code?  Of course not, because they are ugly in the UI
>     and they mess up the layout too badly.  (And they might not have
>     the right formatting codes.)  So I have to go back later and fix
>     that, but my motivation is low because I shouldn't have had to
>     make this mess in the first place.

'logical_id|Engineering English' will also mess up the layout.
If you are having this problem then you can generate your own PO files
from the UI spec. We can work something out. You are the second one to
complain about this, and this is the second time I have had to repeat
the same info. :-)


> [3] For example, we should have (and use!) the default button label
>     for the information note dialog.

The default Gtk ones? I thought that's not always possible with all
languages? ;-)
I don't really know.


> [4] For example, we now can change the actual text of a UI string
>     without changing its identifier, and the new text will
>     automatically pop up in the help texts.  When making Engineering
>     English a part of the identifier, we would need to adapt the help
>     text.  Not a bad thing, I would say, since you get to review
>     whether it all still makes sense.

I don't get this one.


-- 
Localization Engineer
OSSO - Nokia Multimedia

