New technologies almost always debut as luxuries. Think early smartphones, air conditioners, smartwatches. At first they compete on everything but price. Once there’s scale and operational know-how, prices fall. Machine translation (MT) followed the same arc: once a premium capability, now a ubiquitous utility embedded in websites, apps, and workflows.
TranslateX was built for this next phase. We’ve launched a new Translation API designed to dramatically cut costs while keeping quality on par with the top engines, so teams can translate at product scale without the CFO side-eye.
Why cost matters now
Google Translate helped set expectations: it was free (circa 2006-2011) and later normalized API pricing at around $20 per 1M characters. DeepL followed at €20 per 1M characters and invested heavily in positioning itself as the quality-first alternative. When the neural era began (transformer-based systems arrived in mainstream MT around 2017), quality leapt forward, and it continues to benefit from LLM-driven features such as more adaptive, context-aware translation. Today, price, not just accuracy, is increasingly the deciding factor at scale.
Our view: the luxury phase for automatic translation is over; the commodity race has begun. Your stack (and your budget) should reflect that.
TranslateX’s bet: a “volume unlock”
Instead of metering every character, TranslateX offers unlimited plans (starting at ~$19.99/month) that make budgeting dull in the best way possible. For SaaS apps, games, marketplaces, and user-generated platforms that routinely cross the billion-character mark, predictable pricing is a superpower. We don’t claim to “beat” everyone on quality; we claim to match them closely while being far cheaper.
How we measured quality (EN → ES)
We ran a head-to-head evaluation of Google Translate, DeepL, and TranslateX:
- Test set: FLORES-Plus DEVTEST
- Automatic metrics: BLEU, SacreBLEU, COMET (Unbabel/wmt22-cometkiwi-da)
- LLM evaluation: model-as-a-judge cross-check using OpenAI GPT-4.1 and Claude 4 Sonnet
- Human evaluation: blind review by a specialist (see bio below)
Evaluator bio
Marine Ovesyan holds a joint MA from the University of Wolverhampton (UK) and the University of Málaga (Spain), with a specialisation in AI linguistic evaluation. But when she manages to sneak a glance away from the world of algorithms, she dives into literary translation - just to remind herself that poetry still exists beyond the code.
Results at a glance (EN → ES)
Provider | BLEU | SacreBLEU | COMET | GPT-4.1 | Claude 4 | Human |
---|---|---|---|---|---|---|
TranslateX Free | 28.02 | 28.26 | 0.8485 | 92 | 85 | 93.87 |
TranslateX Paid | 28.72 | 28.93 | 0.8670 | 97 | 94 | 96.33 |
DeepL | 27.28 | 27.28 | 0.8700 | 98 | 94 | 96.5 |
Google Translate | 27.66 | 27.92 | 0.8745 | 97 | 95 | 97.27 |
Takeaway: TranslateX (Paid) lands within ~1 point of Google and DeepL in human judgment (96.33 vs. 97.27 and 96.5), i.e., publish-ready quality for most EN → ES web and app use cases, while enabling unlimited usage at a flat monthly price.
What this means for teams
- Quality is effectively on par. Across COMET and blind human scores, TranslateX (Paid) trails Google/DeepL by a hair, well within the margin that post-editing or simple QA can close.
- Cost profile flips the decision. With Google and DeepL, each additional million characters adds ~$20/€20 to your bill. With TranslateX, each additional million adds nothing, because the flat monthly fee already covers it.
- Operational simplicity scales better. Unlimited plans remove metering logic, cost alarms, and hard tradeoffs between growth and localization. That’s the volume unlock.
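To make the cost argument above concrete, here is a minimal sketch of the arithmetic. It assumes metered billing is strictly linear at the $20 per 1M characters quoted in this post (no free tier, volume discounts, or currency conversion) and that the flat plan is the $19.99/month entry price; the function names are illustrative and not part of any provider's API.

```python
# Prices are kept in integer cents so the break-even point is exact.
FLAT_MONTHLY_CENTS = 1999          # $19.99/month flat plan quoted in this post
METERED_CENTS_PER_MILLION = 2000   # $20 per 1M characters, pay-as-you-go


def metered_cost_usd(chars: int) -> float:
    """Monthly bill in USD when every character is billed pro rata (assumed linear)."""
    return chars * METERED_CENTS_PER_MILLION / 1_000_000 / 100


def flat_cost_usd() -> float:
    """Monthly bill in USD on the flat plan, independent of volume."""
    return FLAT_MONTHLY_CENTS / 100


def break_even_chars() -> int:
    """Smallest monthly volume at which the flat plan costs no more than metering."""
    # Ceiling division on integers: -(-a // b)
    return -(-FLAT_MONTHLY_CENTS * 1_000_000 // METERED_CENTS_PER_MILLION)


def monthly_savings_usd(chars: int) -> float:
    """Metered bill minus flat bill; negative below the break-even volume."""
    return metered_cost_usd(chars) - flat_cost_usd()


if __name__ == "__main__":
    for volume in (500_000, 10_000_000, 1_000_000_000):
        print(f"{volume:>13,} chars: metered ${metered_cost_usd(volume):,.2f}, "
              f"flat ${flat_cost_usd():,.2f}")
    print(f"Break-even volume: {break_even_chars():,} characters/month")
```

Under these assumptions the two models cross just below one million characters per month; past that point the gap grows linearly with volume, which is why the difference matters most for billion-character workloads.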
The fine print (so you know we did our homework)
- Scoring bands: We follow a 4-tier rubric (Publish-Ready 91-100; Acceptable 70-90; Fair 50-69; Unusable 1-49). TranslateX (Paid), Google, and DeepL produced overwhelmingly Publish-Ready outputs in the blind review.
- Engines & settings: We compared out-of-the-box engines (no custom glossaries or domain adaptation) to reflect realistic website/app localization.
- Tools: Automatic scoring used BLEU, SacreBLEU, and COMET; we also cross-checked with LLM judges (GPT-4.1, Claude 4 Sonnet) via the Alconost.MT Evaluation Tool.
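The 4-tier rubric above can be sketched as a small lookup. This is an illustration, not the evaluation tool's actual code: the rubric lists whole-number bands, so treating each band's lower bound as a threshold for fractional scores (for example, 90.5 counts as Acceptable) is an assumption, as is applying the bands to averaged scores. The function name is hypothetical.

```python
def quality_band(score: float) -> str:
    """Map a 1-100 quality score to the rubric tier named in this post.

    Assumption: lower bounds act as thresholds, so fractional scores
    between two whole-number bands fall into the lower tier.
    """
    if not 1 <= score <= 100:
        raise ValueError(f"score must be between 1 and 100, got {score}")
    if score >= 91:
        return "Publish-Ready"
    if score >= 70:
        return "Acceptable"
    if score >= 50:
        return "Fair"
    return "Unusable"


if __name__ == "__main__":
    # Average human scores from the results table (EN -> ES).
    human_scores = {
        "TranslateX Free": 93.87,
        "TranslateX Paid": 96.33,
        "DeepL": 96.5,
        "Google Translate": 97.27,
    }
    for provider, score in human_scores.items():
        print(f"{provider}: {score} -> {quality_band(score)}")
```

Applied to the average human scores in the results table, all four systems fall in the top band; the per-segment distribution is what the workbook linked in this post records.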
Pricing snapshot
Provider | Price | Notes |
---|---|---|
TranslateX | $19.99/mo | Flat, unlimited usage |
Google Translate | $20 per 1M characters | Pay-as-you-go |
DeepL | €20 (~$21.50) per 1M characters | Pay-as-you-go |
Ready to test?
- Download the human-evaluation workbook (with comments and corrections): Blind Human Evaluation of TranslateX Paid, TranslateX Free, Google Translate, DeepL (English → Spanish, July 2025)
- Explore the evaluation scripts in our GitHub repository.
- Spin up the API and run your own side-by-side tests.
- Try the quality now: translatex.com
- See all supported languages.
Bottom line: The quality race is now a price/performance race. TranslateX delivers publish-ready translations at commodity pricing, so you can localize more, sooner, for less.
Disclosure: This post summarizes measurements from our July 2025 evaluation runs and a blind human review. Methods and raw results (automatic + human) are linked above for transparency.