Data Clean-Up

Our data clean-up tools address a wide range of problems in legacy language data. They apply analysis procedures that are based on linguistic knowledge. Step by step, your corrupted data is converted into high-quality texts that comply with linguistic, terminological, and editorial quality standards. Your texts are then ready to be processed by modern language technologies such as Translation Memories and Authoring Memories.

Do you have a stock of legacy language data that you would like to convert into high-quality texts in an up-to-date format?

Is your legacy language data written in capital letters only, or is it missing German umlauts?

Do the texts contain many orthographic errors, with spaces inserted or omitted more or less arbitrarily between words and punctuation marks?

Do the texts contain many arbitrarily abbreviated items because of input field limitations?

Are standards for measurements, item numbers, or screw thread designations ignored, let alone terminological consistency or quality standards for technical documents?

Then it is time to get your language data cleaned up!

Our Procedures Include

  1. normalization of whitespace
  2. standardization of special data types
     • measurements
     • screw designations
     • part numbers, etc.
  3. rule-based string replacement
     • spelling variants
     • abbreviations, etc.
  4. correction of orthographic issues
     • all-caps text
     • outdated (pre-reform) spelling
     • misspellings, etc.
  5. harmonization of terminology
     • deprecated terms
     • term variants
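
To illustrate how the procedures listed above can build on one another, here is a minimal sketch of a clean-up pipeline. It does not show our actual tools: the rule tables (REPLACEMENTS, LEXICON, TERM_VARIANTS), the function names, and the target notations (for example "M8x40" and "40 mm") are hypothetical and chosen for illustration only.

```python
import re

# Hypothetical rule tables; a real project would derive them from its own data.
REPLACEMENTS = {"verz.": "verzinkt", "Laenge": "Länge"}       # abbreviations, spelling variants
LEXICON = {"SECHSKANT": "Sechskant", "SCHRAUBE": "Schraube"}  # all-caps form -> regular spelling
TERM_VARIANTS = {"Sechskant-Schraube": "Sechskantschraube"}   # variant -> preferred term


def normalize_whitespace(text):
    """Collapse whitespace runs and tidy the spacing around punctuation."""
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    # Add a space after a comma only before a letter, so decimal commas survive.
    return re.sub(r",(?=[A-Za-zÄÖÜäöüß])", ", ", text)


def standardize_data_types(text):
    """Bring measurements and thread designations into one assumed notation."""
    text = re.sub(r"(\d)\s*(mm|cm|m|kg|g)\b", r"\1 \2", text)
    return re.sub(r"\bM\s*(\d+)\s*[xX]\s*(\d+)\b", r"M\1x\2", text)


def replace_all(text, table):
    """Replace every whole-item occurrence of a table key with its value."""
    for source, target in table.items():
        pattern = r"(?<!\w)" + re.escape(source) + r"(?!\w)"
        text = re.sub(pattern, lambda match: target, text)
    return text


def fix_all_caps(text):
    """Restore regular spelling for all-caps words found in the lexicon."""
    return re.sub(r"\b[A-ZÄÖÜ]{2,}\b",
                  lambda match: LEXICON.get(match.group(0), match.group(0)),
                  text)


def clean(text):
    """Run the five steps in the order of the list above."""
    text = normalize_whitespace(text)
    text = standardize_data_types(text)
    text = replace_all(text, REPLACEMENTS)
    text = fix_all_caps(text)
    return replace_all(text, TERM_VARIANTS)


print(clean("SECHSKANT-SCHRAUBE  M 8 x 40 ,verz.,Laenge 40mm"))
# prints: Sechskantschraube M8x40, verzinkt, Länge 40 mm
```

The order of the steps matters in this sketch: the terminology table can only match "Sechskant-Schraube" after the all-caps correction has run.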

By eliminating text repetitions, you reduce the volume of translation-relevant texts step by step. This prepares you for the use of a Translation Memory System, and your consolidated text stock is well suited for the initial filling of an Authoring Memory.
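
The effect of consolidation on text volume can be sketched as follows. This is an assumption-laden illustration, not our production logic: the comparison key (whitespace-normalized, case-insensitive) and the names dedup_key and consolidate are hypothetical.

```python
import re


def dedup_key(segment):
    """Hypothetical comparison key: ignore case and whitespace differences."""
    return re.sub(r"\s+", " ", segment).strip().casefold()


def consolidate(segments):
    """Group segments that share a key, keeping first-seen order of the keys."""
    groups = {}
    for segment in segments:
        groups.setdefault(dedup_key(segment), []).append(segment)
    return groups


segments = ["Scheibe 8,4 mm", "Scheibe  8,4 mm ", "SCHEIBE 8,4 MM", "Mutter M8"]
groups = consolidate(segments)
print(len(segments), "segments reduced to", len(groups))
# prints: 4 segments reduced to 2
```

Each group then needs to be translated or stored only once, for example by keeping its cleaned-up form as the single entry.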

We Clean Up Your Data Systematically

Contact us with your request to clean up and consolidate your legacy linguistic data such as product master data, parts lists, parts catalogs and many others.

We can meet your special requirements for linguistic data clean-up as well.

Your Contact Person

Our data preparation specialists are happy to provide assistance with all questions regarding language data clean-up.

Phone: +49 (0) 721 6677570

Axel Theofilidis
Language Technology Developer
