If your store's catalog is in Ukrainian, Russian, Polish, or German, standard search solutions (including popular US SaaS products such as Klevu and Searchanise) often deliver lower quality than they do on English. The reason is simple: their models were trained mostly on English-language e-commerce data. This article explains why that happens and how multilingual-e5-large handles it out of the box.

Problem: why US-trained models fail on CIS/EU

Most smart-search SaaS platforms were built for the US market: Shopify stores with English catalogs in fashion, electronics, and beauty. When these products later "expand to international markets", they add basic translations, but the core model stays English-trained.

📊
On our benchmark (a Ukrainian houseware store with 30k SKUs), top-3 accuracy was: Klevu ~88% on Ukrainian and ~92% on English; Searchanise 78% on Ukrainian and 89% on English; AI Search 92.7% on Ukrainian and 91% on English. Multilingual-first models give more consistent quality across languages.
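For reference, a top-3 figure like the ones quoted here is the share of test queries whose expected product appears among the first three results. The sketch below shows that calculation under stated assumptions: the function name, the SKU identifiers, and the sample cases are invented for illustration and are not taken from the benchmark.

```python
from typing import Sequence


def top_k_accuracy(cases: Sequence[tuple[str, Sequence[str]]], k: int = 3) -> float:
    """Share of queries whose expected product is among the first k results."""
    if not cases:
        return 0.0
    hits = sum(1 for expected, ranked in cases if expected in ranked[:k])
    return hits / len(cases)


# Hypothetical evaluation data: (expected SKU, SKUs returned by search, best first).
sample_cases = [
    ("sku-101", ["sku-101", "sku-007", "sku-300"]),             # hit at rank 1
    ("sku-202", ["sku-010", "sku-011", "sku-202"]),             # hit at rank 3
    ("sku-303", ["sku-001", "sku-002", "sku-003", "sku-303"]),  # rank 4: a miss for k=3
    ("sku-404", []),                                            # zero results: a miss
]

print(f"top-3 accuracy: {top_k_accuracy(sample_cases):.1%}")
```

Two of the four sample queries count as hits, so the sketch reports 50%. A query that returns nothing is scored as a miss, which is how a zero-result LIKE search would show up in such a measurement.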

Standard OpenCart LIKE search and its limits

OpenCart's standard search uses SQL LIKE '%query%', that is, a literal substring match. For UA/RU/PL catalogs it fails on five levels:

1. Morphology

Slavic languages are inflected: one word can have 6-12 forms. For example:

Чашка → Чашки, Чашці, Чашку, Чашкою, Чашок, Чашкам...

2. Latin/Cyrillic

A user can type "steklokeramika" or "склокераміка"; standard search won't understand that they mean the same thing.

3. Synonyms

"refrigerator" / "fridge", "trousers" / "pants": different words for the same products. LIKE knows nothing about synonyms.

4. Transliteration

"Айфон" / "iPhone", "Панасонік" / "Panasonic": Ukrainian users sometimes type brand names in transliteration.

5. Cross-language

A buyer using the Russian UI types "сковорода", but the product is indexed only in Ukrainian as "пательня", so the search fails.
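The failure modes listed here are easy to reproduce with a literal substring match. The sketch below uses an in-memory SQLite table as a stand-in; the single-column `product` table is a simplification for illustration and is not OpenCart's actual schema or query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT)")
conn.executemany(
    "INSERT INTO product (name) VALUES (?)",
    [
        ("Чашка кавова з блюдцем 250мл",),
        ("Пательня з антипригарним покриттям",),
        ("iPhone 15 Pro Max",),
    ],
)


def like_search(query: str) -> list[str]:
    """Literal substring match, the way a LIKE '%query%' search behaves."""
    rows = conn.execute(
        "SELECT name FROM product WHERE name LIKE ?", (f"%{query}%",)
    ).fetchall()
    return [name for (name,) in rows]


# Exact form, inflected form, cross-language synonym, transliterated brand.
for query in ["Чашка", "Чашки", "сковорода", "Айфон"]:
    print(query, "->", like_search(query))
```

Only the exact form "Чашка" finds its product. The inflected form, the Russian word for a product indexed in Ukrainian, and the Cyrillic spelling of a Latin brand name all come back empty, because none of them is a literal substring of any product name.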

Models compared: e5 vs US-models

| Model | Training data | UA/RU/PL quality |
|---|---|---|
| OpenAI text-embedding-ada-002 | ~93% English | mediocre |
| Klevu (proprietary) | US e-commerce | mediocre |
| Searchanise (proprietary) | US/UK e-commerce | limited |
| BGE-M3 (BAAI) | multilingual | good |
| multilingual-e5-large | about 100 languages in parallel | excellent |

Why multilingual-e5 is better for UA/RU

multilingual-e5-large-instruct is an open-source model from Microsoft Research. It was trained on about 100 languages in parallel (not "English + translations"). This means:

  • Morphology: the model treats "чашки" and "чашка" as close concepts without a dictionary
  • Cross-language: "сковорода" (RU) and "пательня" (UA) end up close in vector space
  • Synonyms: the model learns from training context that "холодильник" and "фрідж" are related
  • Transliteration: "iPhone" and "Айфон" have a cosine similarity of ~0.9
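"Close in vector space" is usually measured with cosine similarity, and search then amounts to ranking products by that score. The sketch below illustrates the idea with toy 3-dimensional vectors standing in for the model's 1024-dimensional embeddings; the numbers are invented for illustration and are not real model output.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration (NOT produced by multilingual-e5).
toy_embeddings = {
    "Пательня з антипригарним покриттям": [0.9, 0.1, 0.2],
    "Чайник електричний 1.7л": [0.1, 0.9, 0.3],
    "Холодильник Samsung": [0.2, 0.3, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # stand-in for the embedding of "сковорода"

# Rank products by similarity to the query, best match first.
ranked = sorted(
    toy_embeddings,
    key=lambda name: cosine(query_vector, toy_embeddings[name]),
    reverse=True,
)
print(ranked[0])
```

With these toy values the frying pan ranks first, even though the query and the product name share no characters. That is the property a multilingual embedding model is expected to provide with real vectors.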

Cross-language matching in practice

Real examples from a Ukrainian store with a trilingual catalog (UA/RU/EN):

| Buyer query | UI language | Found product | Product language |
|---|---|---|---|
| steklokeramika | EN | Склокерамічна тарілка | UA |
| сковорода | RU | Пательня з антипригарним покриттям | UA |
| kettle | EN | Чайник електричний 1.7л | UA |
| чайнік | UA | Чайник Bosch (description in EN) | EN |
| iPhne | UA | iPhone 15 Pro Max | UA |
| фріж | UA | Холодильник Samsung | UA |

All these queries return relevant products in AI Search v1.0.5. With standard OpenCart LIKE search, every one of them returns 0 results.

Real examples from UA stores

Houseware store (~30k SKU, isklad.com.ua)

  • Query "чашка з блюдцем" — finds "Чашка кавова з блюдцем 250мл"
  • Query "тарілка для пасти" — finds "Глибока тарілка для першого 23см"
  • Query "білий керамічний горщик" — finds "Кашпо керамічне біле для квітів"

Apparel store (5k SKU)

  • Query "сорочка з довгим рукавом" — finds blouses + casual shirts
  • Query "trousers black" (EN) — finds "Штани чорні класичні" (UA catalog)
  • Query "плаття літне" (typo) — finds "Сукня літня"

Statistics: how much search improves

  • +30% more successful searches
  • -26 p.p. bounce from search
  • +15% conversion (mobile)
  • -90% "I can't find" messages in chat

Data from 5 OpenCart stores that switched from LIKE to AI Search during 2025-2026. Details in "isklad.com.ua case study".

How to enable multilingual mode

In AI Search, multilingual mode is the default. There is nothing to configure:

  1. Install module (5 minutes — guide here)
  2. Reindex — module auto-indexes products in all active languages
  3. Done — search works multilingually out of the box
🌍
Important: in Klevu, Searchanise, and Doofinder, multilingual search often requires a separate Pro+ tier or an add-on. In AI Search it is included in every plan. The free plan with 200 SKUs also gets all languages.

FAQ

How many languages does AI Search support?

About 100, via multilingual-e5-large. It works best on UA, RU, PL, DE, ES, IT, FR, and EN. For Chinese, Korean, and Japanese, quality is slightly lower, but search still works.

What if my catalog is English only?

The multilingual model stays close to US models on English: our benchmark shows 91-93% top-3 accuracy on EN catalogs vs ~94% for Klevu.

Do I need separate indexing for each language?

No. AI Search auto-indexes all active store languages on reindex. For example, 3 languages × 30k SKUs = 90k embeddings, which takes 30-90 minutes.
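The arithmetic behind that answer can be worked through in a few lines. The sketch below combines the figures from this FAQ (3 languages, 30k SKUs, 1024-dimensional embeddings, a 30-90 minute reindex) with one assumption that is not from the article: vectors stored as 4-byte float32 values, which gives a rough lower bound on raw vector storage.

```python
languages = 3
skus = 30_000
dim = 1024            # embedding size stated in this FAQ
bytes_per_float = 4   # ASSUMPTION: vectors stored as float32

embeddings = languages * skus
raw_size_mb = embeddings * dim * bytes_per_float / 1_000_000

# Indexing throughput implied by the quoted 30-90 minute reindex window.
fastest_per_sec = embeddings / (30 * 60)
slowest_per_sec = embeddings / (90 * 60)

print(embeddings)                                            # 90000
print(round(raw_size_mb, 2))                                 # 368.64
print(round(fastest_per_sec, 1), round(slowest_per_sec, 1))  # 50.0 16.7
```

Under that assumption, the 90k vectors take about 369 MB before any index overhead or compression, and the quoted reindex time corresponds to roughly 17-50 embeddings per second.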

What if some products don't have translation?

The fallback works: a product indexed only in UA is still findable via cross-language matching.

Can I disable cross-language matching?

Yes. Set AI Search → Settings → Strict Language Mode to Enabled. A query in language X then returns only results in the same language.
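Conceptually, a strict language mode is a filter applied to the ranked results. The sketch below shows one way such a filter could look. It is not the module's actual implementation: the `Hit` type, the `apply_strict_language` function, and the second sample product are hypothetical, and the query language is assumed to be known already (for example, from the UI language).

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    language: str  # language of the indexed text, e.g. "UA", "RU", "EN"
    score: float


def apply_strict_language(hits: list[Hit], query_language: str, strict: bool) -> list[Hit]:
    """With strict mode on, keep only hits indexed in the query's language."""
    if not strict:
        return hits
    return [hit for hit in hits if hit.language == query_language]


# Hypothetical ranked results for the query "сковорода" typed in the RU UI.
hits = [
    Hit("Пательня з антипригарним покриттям", "UA", 0.91),
    Hit("Сковорода чугунная 26 см", "RU", 0.89),
]

print([h.name for h in apply_strict_language(hits, "RU", strict=True)])
print([h.name for h in apply_strict_language(hits, "RU", strict=False)])
```

With strict mode on, only the Russian-language product remains. With it off, the Ukrainian product stays in the list, which is the cross-language behaviour described earlier in the article.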

Does multilingual affect search speed?

No. The embedding size does not depend on the number of languages: each vector is 1024-dimensional. Search time stays at ~200ms on 30k SKUs regardless of how many locales are active.