Say goodbye to tedious alignments with horrendous tagging and formatting, and use Mistral AI to generate clean, working TMX code to import into a translation memory!
Here is an AI-positive post. You guys know my approach to this: AI is my copilot, but I’m the captain on board! Still, while it is fairly meh for actual translation (though I find GenAI rather superior to MT for pre-translation, if you really must), some AIs, like ChatGPT, Claude, Gemini or Mistral AI, are described as “assistant AIs”, meaning they can automate and help you with mundane, manual tasks.
I have been swamped with project management these past weeks and have been looking for ways to further optimise my project management processes. I asked myself which tedious tasks (particularly terminology, alignment, etc.) I would be eager to cut down in terms of time and manual labour, and whether AI could help me automate some of them.
💡 Here’s a concrete use case: GenAI can generate TMX to import into a TM – no more tedious alignments!
Claude AI can do it, but even with a paid subscription it has usage limits, so it gets frustrating fast. ChatGPT struggles: none of the TMX code it generated for me worked or could be imported successfully into MemoQ. But Mistral AI, the French AI, will generate the complete code for a working TMX file, which you just need to copy-paste into Notepad and save with the .tmx file extension. I had been eager to test Mistral and finally got around to it. It has a free and a paid version, the latter costing more or less the same as ChatGPT.
Now, obviously, this is only for non-confidential content! Say you want to align the entire Swiss law on medical devices, GER>FRA: no confidentiality issue here, as it is an official text and available online. Mistral AI is French, ergo the servers and data centres are in the EU, so it is GDPR-compliant. Nonetheless, use your common sense and do not input any sensitive or confidential content from clients, let alone medical data!
Make sure the texts to align (in both languages) are clean and free of formatting, links, etc. You can ask the AI tool to do this as a first step, either by pasting the text directly or by uploading a .txt or Word file. Mistral AI cannot modify files directly, but you can copy-paste both the GER and the FRA texts into it and ask it to clean them up. Claude AI can actually modify file contents, so you can, for instance, copy-paste the Swiss law text in both languages into Word, upload it to Claude, and ask it to remove all formatting, links, etc. and turn the whole thing into a block of text with proper punctuation. Then you copy-paste the cleaned texts into Mistral and prompt it to generate a TMX. All you need to do is copy-paste the output into Notepad or Notepad++ and save it as .tmx – et voilà.
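If you would rather do a rough first cleaning pass locally before handing the texts to an AI tool, here is a minimal Python sketch. It is only an illustration, not the AI-based clean-up described above, and it rests on simple assumptions: the text is already plain text (for example pasted from Word into a .txt file), and "cleaning" just means dropping URLs and collapsing line breaks and runs of spaces into single spaces. The function name `clean_text` and the sample string are made up for the example.

```python
import re

# Assumption: a "link" is anything starting with http://, https:// or www.
# The final character class keeps sentence punctuation that directly follows a URL.
URL_RE = re.compile(r"\s*(?:https?://|www\.)\S*[^\s.,;:!?)]")

def clean_text(raw):
    """Drop URLs and collapse all whitespace so the text becomes one plain block."""
    without_urls = URL_RE.sub("", raw)
    # \s also matches tabs, line breaks and non-breaking spaces in Python 3.
    collapsed = re.sub(r"\s+", " ", without_urls)
    return collapsed.strip()

# Hypothetical sample, not taken from any real legal text.
sample = "Art. 1\n\nSiehe  https://example.com/a.\tWeiter."
print(clean_text(sample))
```

Run it once per language and paste the two results into the AI tool. Anything this rough pass misses (footnote markers, numbering, odd punctuation) can still be cleaned up by the AI in the next step.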
Everything you don’t specify in the prompt is “creative margin” for the AI, so be explicit: ask, for example, for a TMX with a proper DOCTYPE declaration, full language codes (de-DE, fr-FR), complete header metadata, UTF-8 encoding, unique TUIDs, etc. Also state that every single sentence must be properly segmented and aligned, otherwise the AI may “skip” segments.
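To make those prompt requirements more tangible, here is a minimal Python sketch that builds the kind of file the prompt is asking for: an XML declaration with UTF-8, a DOCTYPE line, a header with metadata, and one translation unit per sentence pair with a unique `tuid` and full language codes. It assumes TMX 1.4; the header values (`example-script`, `plain-text`, `en-US`, etc.), the function name `build_tmx` and the sample sentence pair are placeholders chosen for illustration, and which of these details a given CAT tool actually insists on is not something this sketch verifies.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this attribute name out as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def build_tmx(pairs, src_lang="de-DE", tgt_lang="fr-FR"):
    """Return a TMX 1.4 document (as a str) for a list of (source, target) pairs."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    # Header metadata: all values below are illustrative placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "1.0",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for i, (src, tgt) in enumerate(pairs, start=1):
        # One translation unit per sentence pair, numbered so every tuid is unique.
        tu = ET.SubElement(body, "tu", {"tuid": str(i)})
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    xml_body = ET.tostring(tmx, encoding="unicode")
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<!DOCTYPE tmx SYSTEM "tmx14.dtd">\n' + xml_body)

# Hypothetical sentence pair, not taken from any real legal text.
pairs = [("Dies ist ein Beispielsatz.", "Ceci est une phrase d'exemple.")]
print(build_tmx(pairs))
```

Comparing the AI's output against a skeleton like this makes it easier to spot what the prompt forgot to ask for, such as a missing header attribute or a bare `de` instead of `de-DE`.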
And now I have a working TM populated with a Swiss law, with minimal effort. Obviously, when translating in your CAT tool with this AI-generated TM, use your common sense and be careful, as some sentences might have been skipped or misaligned, but so far this has happened only once on my end 🤞
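A quick sanity check before importing can catch some of these problems. The Python sketch below is one possible way to do it, under stated assumptions: the file is well-formed TMX with `tu`, `tuv` and `seg` elements, and the language codes are de-DE and fr-FR. The function name `check_tmx` and the sample data are invented for the example. It counts the translation units and reports duplicate or missing TUIDs and units where one language is missing or empty. It cannot tell whether two sentences actually match in meaning, so it does not replace a careful eye while translating.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def check_tmx(tmx_bytes, expected_langs=("de-DE", "fr-FR")):
    """Return (number of translation units, list of problems found)."""
    root = ET.fromstring(tmx_bytes)
    units = root.findall("./body/tu")
    problems = []

    # Every unit should carry its own tuid.
    for tuid, n in Counter(tu.get("tuid") for tu in units).items():
        if tuid is None:
            problems.append(f"{n} unit(s) without a tuid")
        elif n > 1:
            problems.append(f"tuid {tuid!r} is used {n} times")

    # Every unit should have a non-empty segment in each expected language.
    for pos, tu in enumerate(units, start=1):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            text = "".join(seg.itertext()).strip() if seg is not None else ""
            segs[(tuv.get(XML_LANG) or "").lower()] = text
        for lang in expected_langs:
            if not segs.get(lang.lower()):
                problems.append(f"unit {pos}: missing or empty {lang} segment")
    return len(units), problems

# Hypothetical sample with two deliberate flaws: a reused tuid and a missing target.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header/><body>
<tu tuid="1"><tuv xml:lang="de-DE"><seg>Erster Satz.</seg></tuv>
<tuv xml:lang="fr-FR"><seg>Phrase un.</seg></tuv></tu>
<tu tuid="1"><tuv xml:lang="de-DE"><seg>Zweiter Satz.</seg></tuv></tu>
</body></tmx>"""

count, problems = check_tmx(sample.encode("utf-8"))
print(count, "units")
for problem in problems:
    print("-", problem)
```

For a real file you would read the saved .tmx as bytes and pass it in, then compare the unit count with the number of sentences you expected from the source text. A noticeably lower count is a hint that segments were skipped.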
Check out Le Chat by Mistral and see for yourself. After testing it, I personally decided to cancel my ChatGPT subscription and switch to Mistral. Plus, it is EU-based, so GDPR-compliant.