Yes, I’ve tackled this too. LLMs like o4-mini give great quality, but latency is a real issue for larger pages. One trick is converting HTML to structured JSON (just the visible text), translating that, then rebuilding the HTML. You can also parallelize translation and use faster models like Claude Haiku or Gemini Flash. For bulk tasks, traditional models like MarianMT via Hugging Face are much faster. Caching also helps a lot. Curious how your JSON approach works out!
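The extract-translate-rebuild idea can be sketched with just the standard library's `html.parser`, no extra dependencies. This is a minimal illustration, not a production implementation: `translate` here is any callable (the real version would batch text nodes and call your model or API; `str.upper` stands in for it below), and script/style contents are skipped since they aren't visible text.

```python
from html.parser import HTMLParser

class TranslatingRebuilder(HTMLParser):
    """Re-emit HTML as-is, but run each visible text node through translate()."""
    def __init__(self, translate):
        super().__init__(convert_charrefs=True)
        self.translate = translate
        self.out = []
        self.skip = 0  # depth inside <script>/<style>, whose text is not visible

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
        attr_str = "".join(
            f' {k}="{v}"' if v is not None else f" {k}" for k, v in attrs
        )
        self.out.append(f"<{tag}{attr_str}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Translate only non-blank, visible text; pass whitespace through untouched.
        if self.skip or not data.strip():
            self.out.append(data)
        else:
            self.out.append(self.translate(data))

def translate_html(html, translate):
    parser = TranslatingRebuilder(translate)
    parser.feed(html)
    return "".join(parser.out)

print(translate_html("<div>text<span> is <b>nested</b></span></div>", str.upper))
# -> <div>TEXT<span> IS <b>NESTED</b></span></div>
```

One caveat: because `html.parser` is event-based, it tolerates malformed input (unclosed tags just never produce an end-tag event), but it also won't repair it, so the output mirrors whatever structure the input had.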
Did you use any library to go from HTML to JSON? And did you hit any inconsistencies with stuff like "<div>text<span> is <span> really</span><b> difficult to convert to json</b><span></div>"?
I think using a third-party translation API might help here, since dedicated services won't have the same latency problem.