Translating large sets of legacy documentation is always a challenge. Professional human translation ensures quality, but it is time-consuming and costly—especially when dealing with dozens or even hundreds of Markdown files, diagrams, and embedded metadata.
For my Redaction Technique legacy website, I set up an AI-based iterative workflow that automates much of the heavy lifting while still leaving space for human refinement.
Like most efficient processes, it relies on iteration: a loop that combines different AI tools to enable steady, incremental publishing. Human intervention remains critical at every stage—guiding the process, correcting errors, and keeping the results on track. The final step is a thorough human correction, which you can sometimes defer for legacy or non-critical content until analytics confirm that the material is worth the extra investment.
The first step is to generate raw English translations of all French Markdown files. To do this, write a simple Python script that walks the French source directory and sends each Markdown file to the DeepL API, as sketched below.
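This sketch assumes the official `deepl` Python package, a `DEEPL_API_KEY` environment variable, and placeholder `content/fr/` and `content/en/` directories rather than the site's actual layout:

```python
import os
from pathlib import Path

import deepl  # pip install deepl

translator = deepl.Translator(os.environ["DEEPL_API_KEY"])

SRC = Path("content/fr")  # French sources (placeholder path)
DST = Path("content/en")  # English output (placeholder path)

for src_file in sorted(SRC.rglob("*.md")):
    dst_file = DST / src_file.relative_to(SRC)
    if dst_file.exists():
        continue  # already translated, skip on re-runs
    result = translator.translate_text(
        src_file.read_text(encoding="utf-8"),
        source_lang="FR",
        target_lang="EN-US",
        preserve_formatting=True,  # keep line breaks and Markdown markers intact
    )
    dst_file.parent.mkdir(parents=True, exist_ok=True)
    dst_file.write_text(result.text, encoding="utf-8")
    print(f"Translated {src_file} -> {dst_file}")
```

Skipping files that already exist keeps the script idempotent, so you can re-run it safely as you work through the corpus in batches.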
This produces usable English content quickly, but the results often contain broken Markdown tables, mismatched frontmatter, or literal translations that sound clumsy.
Before moving on, fix structural issues in the translated Markdown, such as broken tables and mismatched frontmatter.
This ensures the files build into the website without errors.
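Mismatched frontmatter is easy to catch mechanically by comparing the YAML keys of each translated file against its French source. A small sketch, using the same placeholder directory layout as above and assuming frontmatter delimited by `---`:

```python
from pathlib import Path

import yaml  # pip install pyyaml

SRC = Path("content/fr")  # placeholder paths, adjust to your site layout
DST = Path("content/en")

def frontmatter_keys(path: Path) -> set:
    """Return the top-level YAML frontmatter keys of a Markdown file."""
    text = path.read_text(encoding="utf-8")
    parts = text.split("---", 2)
    if not text.startswith("---") or len(parts) < 3:
        return set()  # no frontmatter block found
    return set((yaml.safe_load(parts[1]) or {}).keys())

for src_file in sorted(SRC.rglob("*.md")):
    dst_file = DST / src_file.relative_to(SRC)
    if not dst_file.exists():
        continue
    missing = frontmatter_keys(src_file) - frontmatter_keys(dst_file)
    if missing:
        print(f"{dst_file}: missing frontmatter keys {sorted(missing)}")
```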
DeepL produces decent raw translations, but the style often needs polishing. To automate this, create another Python script that sends the translated file to GPT-4o with a strict editing prompt:
You are an expert technical writing editor. The text is about technical writing, DITA, and structured authoring. Fix inconsistencies, unprofessional style, and poor French-to-English translations. Keep Markdown formatting intact. Return only the corrected text, without explanations.
To stay in control, have the script process one file at a time and then stop. This makes it easier to review and cherry-pick changes.
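A minimal sketch of that polishing pass, assuming the `openai` Python package (v1 client) and an `OPENAI_API_KEY` environment variable; the one-file-per-run behaviour is handled by passing a single path on the command line:

```python
import sys
from pathlib import Path

from openai import OpenAI  # pip install openai

EDITING_PROMPT = (
    "You are an expert technical writing editor. The text is about technical "
    "writing, DITA, and structured authoring. Fix inconsistencies, unprofessional "
    "style, and poor French-to-English translations. Keep Markdown formatting "
    "intact. Return only the corrected text, without explanations."
)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def polish(path: Path) -> None:
    """Send one translated file to GPT-4o and overwrite it with the edited text."""
    original = path.read_text(encoding="utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # favour consistency over creativity
        messages=[
            {"role": "system", "content": EDITING_PROMPT},
            {"role": "user", "content": original},
        ],
    )
    path.write_text(response.choices[0].message.content, encoding="utf-8")
    print(f"Polished {path}")

if __name__ == "__main__":
    # One file per run, so each AI pass can be reviewed before the next.
    polish(Path(sys.argv[1]))
```

Because the script overwrites the file in place, Git becomes the safety net for deciding which of the AI's edits actually survive.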
Review edits selectively with:
$ git add -p
Git shows each change hunk by hunk and lets you decide whether to stage it (y = yes, n = no). You can split hunks with s, edit them manually with e, or quit anytime with q.
This workflow is ideal when a file contains unrelated edits—such as AI-generated suggestions—because it gives you full control over what goes into your commit history.
diff --git a/communication-technique.md b/communication-technique.md
index d5b0c9b8..7a5632af 100644
--- a/communication-technique.md
+++ b/communication-technique.md
@@ -1,31 +1,30 @@
-The goal of technical communication is to turn prospects into
+The goal of technical communication is to convert prospects into
 satisfied customers. The technical writer provides the market with
(1/1) Stage this hunk [y,n,q,a,d,s,e,p,?]?
Stage the changes you want, then commit them with:
$ git commit -m "commit message"
Then discard the rest with:
$ git reset --hard
When the text is ready, build the site locally to catch any remaining errors. Common fixes at this stage are the same kinds of issues flagged earlier: broken tables, mismatched frontmatter, and literal phrasings that slipped through.
The process is iterative: translate a batch of files with DeepL, fix the structure, polish the style with GPT-4o, review and stage the changes with git add -p, then build and publish.
Repeat until the full documentation set is translated.
This workflow isn’t a replacement for professional translation, but it’s “good enough” for legacy content where perfect nuance is less critical. For high-visibility or customer-facing material, plan on a final round of human proofreading.
With this pipeline, you can translate and modernize a large corpus of technical documentation efficiently—combining the strengths of DeepL, GPT, and a bit of manual oversight.