Seoxpert.io
Low · Best Practices

URL Paths Contain Non-ASCII Characters

Some URLs on your site contain non-ASCII characters in their paths, which can cause encoding issues and inconsistent links.

By Seoxpert Editorial · Published

Why it matters

Browsers percent-encode non-ASCII characters in URL paths as UTF-8 bytes (for example, 'é' becomes '%C3%A9'), but not every system handles this encoding the same way. The result can be broken share links, canonical URLs that disagree with each other, and indexing issues, especially when links are copied, shared, or processed by third-party tools.

Impact

Left unresolved, this can cause broken links, duplicate content (the same page reachable under both its raw and percent-encoded URL), and inconsistent canonical signals.

How it's detected

Automated crawlers scan URL paths for characters outside the ASCII range (0x00-0x7F) and flag any matches.
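
The exact check a crawler runs is not specified here, so the following is only a minimal sketch of one way to do it. The function name `pathHasNonAscii` is hypothetical. It relies on the standard `URL` parser, which percent-encodes raw non-ASCII characters in the path, so any non-ASCII byte ends up as an escape between %80 and %FF.

```javascript
// Hypothetical check: does the path of a URL contain non-ASCII characters,
// written raw (café) or already percent-encoded (caf%C3%A9)?
function pathHasNonAscii(rawUrl) {
  // The URL parser percent-encodes raw non-ASCII characters in the path.
  const { pathname } = new URL(rawUrl);
  // Bytes outside 0x00-0x7F appear as %80..%FF (either hex case).
  return /%[89a-f][0-9a-f]/i.test(pathname);
}

console.log(pathHasNonAscii('https://example.com/café'));      // true
console.log(pathHasNonAscii('https://example.com/caf%C3%A9')); // true
console.log(pathHasNonAscii('https://example.com/cafe'));      // false
```

Only the path is inspected, so a non-ASCII character that appears solely in the query string is not flagged by this sketch.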

Common causes

  • Using accented or non-Latin characters in URL slugs
  • CMS or plugins that generate URLs from non-ASCII titles
  • Manual creation of URLs with special characters
  • Lack of transliteration or slugification in URL generation

How to fix it

For new URLs, use ASCII-only slugs by transliterating accented or non-Latin characters (e.g., 'München' to 'muenchen', or to 'munchen' if you simply strip diacritics). For existing URLs, make sure your sitemap and canonical tags use the percent-encoded form consistently, and update internal links and other references to the same encoded version so they do not mismatch.
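
One way to get a single, consistent encoded form for sitemaps, canonical tags, and internal links is to pass every URL through the same normalizing step. The helper below is a sketch under assumptions, not a required implementation: the name `toEncodedUrl` is hypothetical, and the NFC normalization and uppercase hex digits are choices made here so that visually identical URLs produce identical output.

```javascript
// Hypothetical helper: returns one consistent percent-encoded form of a URL.
function toEncodedUrl(rawUrl) {
  // NFC first, so 'e' + combining accent and precomposed 'é' encode to the same bytes.
  const parsed = new URL(rawUrl.normalize('NFC'));
  // The parser keeps existing escapes as written, so normalize their hex case too.
  return parsed.href.replace(/%[0-9a-f]{2}/gi, (m) => m.toUpperCase());
}

console.log(toEncodedUrl('https://example.com/café'));      // https://example.com/caf%C3%A9
console.log(toEncodedUrl('https://example.com/caf%c3%a9')); // https://example.com/caf%C3%A9
```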

Code examples

Problem: Non-ASCII URL in canonical tag

<link rel="canonical" href="https://example.com/café" />

Fix: Percent-encoded ASCII URL in canonical tag

<link rel="canonical" href="https://example.com/caf%C3%A9" />

Transliterate to ASCII for new slugs (example)

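// Split accents off base letters (NFD), drop the combining marks, then hyphenate anything outside a-z and 0-9.
const slug = 'München'.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, ''); // 'munchen'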
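
The snippet above only strips diacritics, which gives 'munchen'. If you prefer language-specific transliteration such as 'muenchen', a small replacement table can run first. The following is a rough sketch: `slugify` and `CUSTOM_MAP` are hypothetical names, and the table covers only a few German characters as an illustration.

```javascript
// Hypothetical mapping; extend it for the languages your site uses.
const CUSTOM_MAP = { 'ä': 'ae', 'ö': 'oe', 'ü': 'ue', 'ß': 'ss' };

function slugify(title) {
  return title
    .normalize('NFC')                               // precomposed form so the map can match
    .toLowerCase()
    .replace(/[äöüß]/g, (ch) => CUSTOM_MAP[ch])     // language-specific replacements
    .normalize('NFD')                               // split remaining accents off base letters
    .replace(/[\u0300-\u036f]/g, '')                // drop the combining marks
    .replace(/[^a-z0-9]+/g, '-')                    // everything else becomes a hyphen
    .replace(/^-+|-+$/g, '');                       // trim leading and trailing hyphens
}

console.log(slugify('München'));      // muenchen
console.log(slugify('Crème brûlée')); // creme-brulee
console.log(slugify('Straße 12'));    // strasse-12
```

Titles written entirely in a non-Latin script come out empty with this approach, so those need a dedicated transliteration step or a manually chosen slug.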

FAQ

Why are non-ASCII characters in URLs a problem for SEO?

They can cause inconsistent encoding, leading to broken links and duplicate content in search engines.

Should I change existing non-ASCII URLs?

Only if you can implement proper redirects and update all references. Otherwise, ensure consistent encoding in sitemaps and canonicals.
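
If you do rename URLs, a redirect lookup keyed on the percent-encoded path is one way to catch requests for the old addresses. This is an illustrative sketch only: the `renamed` entries, the `resolveRedirect` name, and the choice of a 301 status are assumptions, and your server or CMS may offer its own redirect mechanism.

```javascript
// Hypothetical mapping of old non-ASCII paths to new ASCII paths.
const renamed = new Map([
  ['/café', '/cafe'],
  ['/münchen', '/muenchen'],
]);

// Servers usually receive the percent-encoded path, so key the lookup on that form.
const redirects = new Map(
  [...renamed].map(([oldPath, newPath]) => [encodeURI(oldPath.normalize('NFC')), newPath])
);

function resolveRedirect(requestPath) {
  // Normalize hex case so /caf%c3%a9 and /caf%C3%A9 hit the same entry.
  const key = requestPath.replace(/%[0-9a-f]{2}/gi, (m) => m.toUpperCase());
  const target = redirects.get(key);
  return target ? { status: 301, location: target } : null;
}

console.log(resolveRedirect('/caf%C3%A9')); // { status: 301, location: '/cafe' }
console.log(resolveRedirect('/about'));     // null
```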

How do I make sure my canonical tags use the right encoding?

Use the percent-encoded version of the URL in canonical tags to match how browsers and crawlers interpret the path.

Can I use non-ASCII characters in query strings or fragments?

The same encoding issues apply, but the main SEO concern is with the path portion of the URL.
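
To see which part of a URL carries the non-ASCII characters, the components can be inspected separately. The function below is a small sketch with a hypothetical name, `nonAsciiComponents`. It assumes the standard `URL` parser, which percent-encodes non-ASCII characters in the path, query, and fragment alike.

```javascript
// Hypothetical helper: lists the URL components that contain non-ASCII bytes.
function nonAsciiComponents(rawUrl) {
  const url = new URL(rawUrl);
  const highByte = /%[89a-f][0-9a-f]/i; // escapes for bytes 0x80-0xFF
  const parts = { path: url.pathname, query: url.search, fragment: url.hash };
  return Object.keys(parts).filter((name) => highByte.test(parts[name]));
}

console.log(nonAsciiComponents('https://example.com/cafe?q=café#résumé')); // [ 'query', 'fragment' ]
console.log(nonAsciiComponents('https://example.com/café'));               // [ 'path' ]
```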
