How to Localize an E-commerce With Open-Source Tools

OmegaT “Team Project” Feature: a Case Study

Summary

The free and open-source computer-assisted translation (CAT) tool OmegaT allows a team of translators to work simultaneously on the same project, sharing translation memories, glossaries and other reference material thanks to its "team project" feature.

In this article you will learn how to set up and manage the external Subversion or Git server required for collaborative translation. I will also present a case study on how this powerful feature is helping me and my company with the localization (that is, translation and adaptation) of a major e-commerce website and blog.

What's a CAT tool

A so-called "translation memory" is nothing more than a database of translation units. A "translation unit" is a pair formed by a source sentence and its translation.
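To make the idea concrete, here is a minimal Python sketch of a translation memory as a list of translation units with a fuzzy lookup. The class names, the sample sentences and the 0.7 threshold are my own assumptions for illustration; real CAT tools, OmegaT included, use their own storage formats and matching algorithms.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class TranslationUnit:
    """A source sentence paired with its translation."""
    source: str
    target: str


class TranslationMemory:
    """A toy in-memory TM: a list of translation units with fuzzy lookup."""

    def __init__(self):
        self.units = []

    def add(self, source, target):
        self.units.append(TranslationUnit(source, target))

    def lookup(self, sentence, threshold=0.7):
        """Return (score, unit) pairs at or above the threshold, best match first."""
        matches = []
        for unit in self.units:
            score = SequenceMatcher(None, sentence.lower(), unit.source.lower()).ratio()
            if score >= threshold:
                matches.append((score, unit))
        return sorted(matches, key=lambda pair: pair[0], reverse=True)


tm = TranslationMemory()
tm.add("Scarpe da ginnastica bianche", "White sneakers")
tm.add("Scarpe da ginnastica nere", "Black sneakers")

# An identical source sentence scores 1.0 and comes back first.
best_score, best_unit = tm.lookup("Scarpe da ginnastica bianche")[0]
print(best_unit.target, best_score)
```

The similarity score here comes from Python's standard difflib module, which is only a stand-in for the fuzzy matching a CAT tool performs.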

Presentación1.png

The above image shows a simplified structure for a CAT tool, which is usually made up of several modules:

  • TM: Translation Memory
  • TB: TermBase (terminology database)
  • Editor: the actual module where you input your translation
  • Concordance: fuzzy search feature/module
  • Dictionary: you can link digital dictionaries to your Editor
  • Spell checker
  • Reference Material: you can browse and search external files
  • Alignment tool: used to create a TM from existing translations
  • QA: Quality Assurance tools
  • MT: you can link a Machine Translation engine to your Editor
  • TM maintenance: tools to edit the content of the TM

OmegaT as the perfect tool for us

As a professional technical translator, occasional graphic designer, and localizer, I've been using OmegaT on a daily basis for almost ten years. Together with my business partner Sergio Alasia, I run Qabiria, a small translation and localization company based in Badalona, near Barcelona, in Spain.

When we started our company we had a mission: improve the productivity of language service providers by making creative use of technology.

And really this is what it's all about: using technology in a smart way. When I say technology I don't mean huge mainframes or corporate systems. I'm thinking of open source software, freeware, shareware, small pieces that - when put together - could satisfy your needs in a way you hadn't thought of.

Localizing product info sheets and blog posts for the fashion industry

The story I'm going to tell you begins in Barcelona a few years ago, at a networking event for developers.

At that event I met a guy who was running a software house in Italy. We shook hands, exchanged our business cards and that was all. I didn't hear from this person for the next year and a half.

Actually, I've never met him again, at least in person. So we can say that this story really ends in... the cloud, as you will see.

One day, all of a sudden, I received a Skype request from a new contact. It was him. He told me that one of his customers was launching a new international website and they needed translations into English.

This customer was AW LAB.

AW LAB is one of the most important Italian retailers for lifestyle sneakers and sportswear, a competitor to Foot Locker. As part of a wide project to expand abroad and consolidate the online sales channel, they wanted to translate their own e-shop and the blog from Italian into English.

AWLAB.jpg

Project scope

The project involved translating product info sheets and blog posts for a total amount of 250,000 words.

This needed to be published within 5 weeks.

So, quite a challenging project, although not really in the "ohmygod" category. But of course we needed a reliable team in order to meet the deadline.

The first analysis, carried out alongside the AW LAB IT provider, enabled us to understand how to treat and process the original file format, so as to extract all translatable content.

In so doing, the numerous repetitions were excluded and the initial word count updated.
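As a rough illustration of how excluding repetitions changes a word count, here is a small Python sketch. It assumes that a "repetition" is a segment whose text is identical to an earlier one (ignoring case and extra spaces); the sample segments are invented, and the statistics of a real CAT tool are more refined than this.

```python
def count_words(segments):
    """Return (total_words, unique_words).

    total_words counts every segment; unique_words counts only the first
    occurrence of each segment, so repeated segments are excluded.
    """
    seen = set()
    total = 0
    unique = 0
    for segment in segments:
        words = len(segment.split())
        total += words
        key = " ".join(segment.split()).lower()  # normalise spacing and case
        if key not in seen:
            seen.add(key)
            unique += words
    return total, unique


segments = ["Free shipping", "Leather upper", "Free shipping", "Rubber sole"]
total, unique = count_words(segments)
print(total, unique)  # 8 words in total, 6 once the repetition is excluded
```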

Afterwards, the translation was assigned to a small group of native translators, with expertise in the fashion industry. The translation was then proofread and edited.

In little more than one month and within the agreed deadlines, the customer received the localized version of their website. Both product files and blog posts were correctly imported in their Content Management System without any noticeable problem.

By the way, after many years, the collaboration is still going on satisfactorily, with the translation of all the updates from both the product catalogue and the blog. Moreover, they later turned to us for the localization of the whole e-shop into Spanish.

Project workflow

So, how did we accomplish this?

We're a small company, a micro-company, I would say. Qabiria is just Sergio Alasia, me, an outsourced project manager and a network of freelancers. So we don't really have the resources other larger companies might have.

At that time, we had already replaced SDL Trados, the de facto industry standard CAT tool, with OmegaT in our daily operations. We were so happy with it that we even wrote a book about it.

guida-omegat.png

So we decided to give the team project feature a try.

For those of you who don't know OmegaT, I will only say that the program is completely open source and is written in Java, so it is cross-platform: it works the same way on Windows, Mac and Linux.

On top of that, it is extendable through plug-ins and scripts. This feature is quite important because we will deal with scripts later on.

So far I've told you that our customer wanted both blog posts and product info sheets translated. Let's take a look at their workflow, from the content creation to the publishing stage.

workflow.png

Blog posts are outsourced and written manually in Microsoft Word, while the product sheets are composed in Magento CMS.

Those product files are later exported from Magento, translated by us in OmegaT, imported back and then published through Typo3 and Magento, on their international website.

workflow2.png

Product info sheets are exported as Excel XML files, a structured format, while as I said, blog posts are simple Word files, later imported into Typo3, the CMS they used for their blog.

If we take a look at the workflow divided into roles, we can see that everything starts with a customer request, then the project manager prepares, uploads and analyses the file.

workflow-3.png

The results of the analysis or a word count are then sent to the customer.

After that, the project manager assigns the translation jobs, alerting the translators. This notification is done through our translation management system.

The translator translates, the editor edits and the project manager later reviews the jobs and delivers them to the customer.

File preparation before the translation

The first issue we faced was the absence of a specific file filter to translate the Excel XML files in OmegaT.

At first we simply converted these XML files into Excel XLSX files. However this meant extra work by our project manager and an extra step in the whole process.

So eventually we decided to sponsor the development of a new file filter, which was then included in the standard version of OmegaT.
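To give an idea of what such a filter has to do, here is a minimal Python sketch that pulls the translatable text out of an Excel XML file. It assumes the files follow the Excel 2003 "XML Spreadsheet" layout (Workbook, Worksheet, Table, Row, Cell, Data); the sample content is invented, and this is in no way the code of the actual OmegaT filter.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Excel 2003 XML Spreadsheet format.
SS = "urn:schemas-microsoft-com:office:spreadsheet"

# Invented sample: one text cell and one numeric cell.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Products">
    <Table>
      <Row>
        <Cell><Data ss:Type="String">Scarpe da corsa</Data></Cell>
        <Cell><Data ss:Type="Number">59.90</Data></Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>"""


def extract_strings(xml_text):
    """Return the text of every non-empty cell whose type is String."""
    root = ET.fromstring(xml_text)
    return [
        data.text
        for data in root.iter(f"{{{SS}}}Data")
        if data.get(f"{{{SS}}}Type") == "String" and data.text
    ]


print(extract_strings(SAMPLE))
```

Numeric cells such as prices are skipped, so only the text a translator should see is extracted.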

A second issue we faced during this phase was the so-called "tag soup". Once the blog posts were loaded into OmegaT, some of them had apparently superfluous tags interspersed between words.

Actually there isn't much you can do if you want to keep the original formatting. The best option is to rely on third-party tools such as "Translator tools" by Stanislav Okhvat and "CodeZapper" by David Turner. Both are sets of macros designed to help translators in their job, and both feature a specific macro to clean junk tags out of Word files. More recently, OmegaT introduced this feature out of the box, so you can also use the native OmegaT feature to "clean up" Word files.
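The principle behind this kind of clean-up can be sketched in a few lines of Python. I'm assuming a very simplified model in which a sentence is a list of (text, formatting) runs and every boundary between two runs shows up as a tag in the CAT tool; the function name and the sample data are hypothetical, and the macros mentioned above do considerably more than this.

```python
def merge_runs(runs):
    """Merge neighbouring (text, formatting) runs that share the same formatting."""
    merged = []
    for text, formatting in runs:
        if merged and merged[-1][1] == formatting:
            # Same formatting as the previous run: join the text, no tag needed.
            merged[-1] = (merged[-1][0] + text, formatting)
        else:
            merged.append((text, formatting))
    return merged


# A sentence needlessly split into four runs, three of them identically formatted.
runs = [("New ", "normal"), ("col", "normal"), ("lection", "normal"), (" out now", "bold")]
print(merge_runs(runs))
```

After merging, only the boundary between the normal and the bold text remains, which is the one tag the translator actually needs.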

After cleaning up the files, it is time to upload them to a server. This is where the "team project" feature comes in.

The "team project" feature was introduced with OmegaT version 2.6.0. It allows several translators to work together on the same project, sharing glossaries and translation memories almost in real time.

The collaborative translation offered by OmegaT is based on version control, which should be well known among software developers. If you're not familiar with it, version control systems are used to keep track of the changes made to the code of a program.

Several programs implement version or revision control. The two main ones are Subversion and Git, not surprisingly the ones supported by OmegaT. We will not cover Git today, because for this specific project we used Subversion.

The Apache Subversion software is installed on a server and allows you to host the source code of a program under development. The programmers who participate in the project can connect to that server to add their code changes. Subversion then "merges" the various modifications made by the contributors into a single version. Similarly, by loading an OmegaT translation project onto an SVN server, you can assign it to various translators who can work on it at the same time.

The tools used are:

  • OmegaT, of course, acting as an SVN client
  • Subversion, the version control software
  • TortoiseSVN, an SVN client
  • a server where you can install Subversion

First of all, you need to run an SVN server. There are two options: either you install SVN on your own server or you use a hosted service. You can find several sites that offer this service for free, especially to non-profit or educational organizations or projects.

Of course, while using an external service you must be aware of the possible implications in terms of confidentiality, since you are loading the original document on a server you can't directly control.

You can avoid any issues if you set up a private SVN server, for example if you already have an Apache server that includes this piece of software. This setup is performed by you, your IT department or the project manager, and you need to do it only once.

Once you have an SVN server available, you must also install an SVN client locally, in order to manage the folders on your computer. The average user on Windows can use TortoiseSVN. Similar tools are available for Linux and Mac. If you're a developer, you can simply use the command line without a GUI program.

The procedure to share a complete OmegaT project among several translators consists of three logical steps:

  1. Creating the repository on the server
  2. Importing the OmegaT project into the repository
  3. Checking out the OmegaT project

The first step is the same for all operating systems and is usually done by the project manager, or by the translator in charge of the management. They create a root folder on the SVN server that will contain all the OmegaT projects to share. This step is only performed once.

Step number 2 depends on the operating system and is used to import each OmegaT project into the main repository. This operation is also done by the project manager only. You can use TortoiseSVN or the command line.

Finally, the project checkout by each translator is handled by OmegaT itself and requires only one click after you open the program. This is the only step that needs to be performed by translators, if they're not in charge of project management.
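For those who prefer the command line, the three steps map roughly onto three standard Subversion commands: svnadmin create, svn import and svn checkout. The Python sketch below only builds and prints them as a dry run. All paths and the URL are hypothetical placeholders, and the third command is merely the command-line counterpart of the download that OmegaT performs for the translator.

```python
import shlex
import subprocess


def team_project_commands(repo_path, project_dir, project_url, checkout_dir):
    """Build the three commands: create the repo, import the project, check it out."""
    return [
        # Step 1: run once, on the machine that hosts the repository.
        ["svnadmin", "create", repo_path],
        # Step 2: upload the local OmegaT project folder to the repository.
        ["svn", "import", project_dir, project_url, "-m", "Import OmegaT project"],
        # Step 3: command-line counterpart of the download done from OmegaT.
        ["svn", "checkout", project_url, checkout_dir],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Every value below is a hypothetical placeholder.
commands = team_project_commands(
    repo_path="/volume1/svn/omegat",
    project_dir="awlab-it-en",
    project_url="https://svn.example.com/omegat/awlab-it-en",
    checkout_dir="awlab-it-en-wc",
)
run_all(commands)  # dry run: only prints the commands
```

Keeping the dry run as the default means nothing is created or uploaded until you have checked the printed commands against your own server.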

Creating the repo

For the project I described before, I created the repository and assigned users to it on our internal server. We used a Synology NAS, which offers a few apps, Subversion included. So we simply clicked on that app to run an instance, created the repo and added the needed users.

Then it's time to import the project into the repo.

  1. Open OmegaT and create a new project in a temporary folder.
  2. Select the newly created folder, right-click and choose Import... from the context menu added by TortoiseSVN. This command adds the files to the repository folder you created earlier.
  3. You can delete the temporary folder, as the project will then be downloaded and managed using OmegaT.

Accessing the project from OmegaT

All people working on the project (including the project manager) are required to download the project:

  1. You need to start OmegaT and select Download team project... from the Project menu.
  2. In the popup window just type the URL of the repository and specify on which local folder you want to save the project.
  3. If authentication is required, enter the user name and password for the SVN server.

During the first connection to the SVN server all files created in the repository are downloaded locally.

It is important to remark that if the project is not completed in a single session, during the next session it is not necessary to download the project again. You simply open the project which has already been downloaded. OmegaT detects that it is a team project and will automatically connect to the server. In the event of a connection error, close the project and reopen it.

After having imported the project, it's time to analyse it. This is done through Menu > Tools > Match Statistics per File. It gives you the usual statistics, like in all CAT tools.

Checking out the project

During translation, each time you save the project, the project_save.tmx file containing all your translations is synchronized with the server, and your changes are merged with those of the other translators.

By setting the auto-save option every 2 or 3 minutes you will make sure to frequently synchronize your work with that of all others.
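The merging that happens at each synchronization can be pictured as a three-way merge. The Python sketch below is only a conceptual model: each translation memory is a plain dictionary mapping source to target, and the rule "on conflict, keep my version" is an assumption I made for the example, not a description of how OmegaT resolves conflicts internally.

```python
def merge_memories(base, mine, theirs):
    """Three-way merge of source -> target dictionaries.

    base is the last synchronized state, mine and theirs are the two edited
    copies. Returns (merged, conflicts), where conflicts lists the sources
    that both translators changed in different ways.
    """
    merged = {}
    conflicts = []
    for source in set(base) | set(mine) | set(theirs):
        b, m, t = base.get(source), mine.get(source), theirs.get(source)
        if m == t or t == b:
            value = m          # both agree, or only I changed it
        elif m == b:
            value = t          # only the other translator changed it
        else:
            value = m          # assumption: on conflict, keep my version
            conflicts.append(source)
        if value is not None:  # None means the entry was deleted
            merged[source] = value
    return merged, conflicts


base = {"Ciao": "Hello"}
mine = {"Ciao": "Hello", "Scarpe": "Shoes"}
theirs = {"Ciao": "Hi", "Borsa": "Bag"}
merged, conflicts = merge_memories(base, mine, theirs)
print(merged, conflicts)
```

In this example the two translators touched different segments, so everything merges cleanly and the list of conflicts stays empty.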

Once the translation is completed, the editor creates the translated files and the PM does a final review on the target files. Although OmegaT has its own Quality Assurance features, before delivering them, we also verify that the XML files are valid.
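A basic version of that last check can be scripted. The Python sketch below only verifies that a file is well-formed XML, meaning that it parses without errors; it does not validate against a schema, and the function name and sample strings are my own.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<Row><Cell>Running shoes</Cell></Row>"))  # True
print(is_well_formed("<Row><Cell>Running shoes</Row>"))         # False: Cell is never closed
```

A check like this catches the most common damage, such as a tag accidentally deleted during translation, before the file reaches the customer's import routine.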

Key Advantages

Using the "team project" feature has some clear advantages. Despite a little complexity in the initial configuration, with the "team project" feature you can organize working groups consisting of more than two translators.

When I say "complexity", I'm thinking of the average translator, who usually doesn't have advanced IT skills.

Moreover, in the spirit of OmegaT as a project, you only use open source tools (both Apache Subversion and Git are distributed under open source licenses) without relying on commercial software.

Leveraging the characteristics of the SVN server, you can also give your clients read-only access to the folder containing the target files, thus facilitating the delivery of the project (we haven't implemented this yet).

It goes without saying that all these benefits are available to users completely for free. A quick comparison with the prices of the server-based solutions offered by commercial competitors should be sufficient to understand the scope and importance of this feature in OmegaT.

Lessons learned

  • We should have requested the filter development earlier.
  • Team project is more reliable on a self-hosted server than on free hosted services. We experienced frequent connection errors when we used free cloud-based servers.
  • At the beginning, translators need some technical support. These support hours must be taken into account.
  • Editing in OmegaT is not straightforward, because it lacks a status field for the translation units.

Future improvements

In the future, we would like to introduce a few improvements to the whole workflow.

We would like to automate all communication and uploads, avoiding email altogether. We also look forward to synchronizing team projects with our Translation Management System.

Another key point is to keep translators up to date on OmegaT with specific online training sessions, so that we all use the latest version and can take advantage of the latest features.

If you happen to need help in localizing your software or devising the right strategy for your internationalization efforts, please get in touch with me. I'll be happy to help.