MathML markup for equations on STEM and technical pages

Encode formulas as structured MathML rather than rasterised images or raw LaTeX strings.

What this signal tests

We detect pages that contain equations (using cues like LaTeX patterns, image filenames containing formula or equation, and known math-rendering classes) and check whether the equations are rendered as MathML (a structured XML format with tags like <math>, <mfrac>, <msup>) or as something less machine-readable like rendered PNG images or unrendered LaTeX source.

Why it matters for your visibility in AI

Equations are dense and semantically rich, and they are hard for AI systems to ingest unless they are properly structured. A rasterised image of a formula gives a text crawler nothing to read beyond whatever its alt attribute says. Raw LaTeX source is better but still unreliable: extractors that strip HTML can mangle backslash-heavy syntax. MathML gives every variable, operator, and superscript its own element, so a model can read the formula as structured data. For STEM publishers, research labs, technical documentation, and educational sites, this can decide whether an AI system is able to discuss your work at all. An AI assistant asked to explain a theorem is more likely to quote sites that publish MathML; sites that ship equations as PNGs may not even be detected as containing the relevant content.
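To make "every variable, operator, and superscript has its own element" concrete, here is a minimal Python sketch that walks a small MathML fragment for E = mc^2 using only the standard library. The sample markup and the helper name `list_tokens` are illustrative assumptions, not part of any scanner or crawler.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# E = m c^2 written as presentation MathML (illustrative sample).
SAMPLE = f"""
<math xmlns="{MATHML_NS}">
  <mi>E</mi>
  <mo>=</mo>
  <mi>m</mi>
  <msup>
    <mi>c</mi>
    <mn>2</mn>
  </msup>
</math>
"""


def list_tokens(mathml: str) -> list[tuple[str, str]]:
    """Return (tag, text) pairs for every leaf token element, in document order."""
    root = ET.fromstring(mathml.strip())
    tokens = []
    for element in root.iter():
        text = (element.text or "").strip()
        # Leaf elements (<mi>, <mo>, <mn>) hold the actual symbols.
        if len(element) == 0 and text:
            tag = element.tag.split("}", 1)[-1]  # drop the "{namespace}" prefix
            tokens.append((tag, text))
    return tokens


if __name__ == "__main__":
    for tag, text in list_tokens(SAMPLE):
        print(f"{tag}: {text}")
```

Each identifier, operator, and number comes back as a separate, typed token, which is the property a PNG or a mangled LaTeX string cannot offer.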

Pass criteria at a glance

Passes when: on STEM pages, at least 50% of equations are MathML.

How we test it

We scan your page for STEM signals (LaTeX delimiters like $$, MathJax script tags, image alt text or filenames matching equation/formula). On flagged pages, we count inline and block <math xmlns="http://www.w3.org/1998/Math/MathML"> elements and compare that count against the number of equations detected from other sources. A page with several equations but no MathML fails.

Technical detection method: detect STEM pages (LaTeX-like patterns, image alt text or filenames matching equation/formula), then score MathML presence against equations shipped as images.
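The check described above can be approximated in a few lines of Python. This is a rough sketch for self-auditing, not the scanner's actual implementation: the regular expressions, the function names `mathml_share` and `passes`, and the choice to treat a page with no detected equations as passing are all assumptions made for illustration.

```python
import re
from typing import Optional

MATH_TAG = re.compile(r"<math[\s>]", re.IGNORECASE)        # opening <math> tags only
DISPLAY_TEX = re.compile(r"\$\$.+?\$\$", re.DOTALL)        # unrendered $$...$$ blocks
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
EQUATION_HINT = re.compile(r"equation|formula", re.IGNORECASE)


def mathml_share(html: str) -> Optional[float]:
    """Fraction of detected equations that are MathML, or None if none are found."""
    mathml = len(MATH_TAG.findall(html))
    raw_tex = len(DISPLAY_TEX.findall(html))
    # An image counts as an equation if its src or alt mentions equation/formula.
    images = sum(1 for tag in IMG_TAG.findall(html) if EQUATION_HINT.search(tag))
    total = mathml + raw_tex + images
    return mathml / total if total else None


def passes(html: str, threshold: float = 0.5) -> bool:
    """Apply the 50% criterion; pages without equations are not penalised (assumption)."""
    share = mathml_share(html)
    return share is None or share >= threshold
```

A regex pass like this is deliberately crude. A production check would more likely parse the DOM and also recognise inline delimiters and renderer-specific classes.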

If your site fails: how to fix it

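  1. If you write in LaTeX and render with MathJax (the most common renderer), make sure MathML ends up in the published page rather than only HTML+CSS output. MathJax 2 can do this with its NativeMML output processor. MathJax 3 has no MathML output renderer, so convert each formula with `MathJax.tex2mml()`, ideally at build time, and publish the result.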
  2. If you use Pandoc to convert source documents, add the `--mathml` flag. Pandoc converts LaTeX math directly to MathML in the output HTML.
  3. Keep your LaTeX source in an <annotation encoding="application/x-tex"> element, placed inside a <semantics> wrapper within the <math> element, after the presentation markup. This gives consumers both MathML (for structure) and LaTeX (for round-tripping) in one block.
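  4. For Jupyter notebooks exported to HTML and for Sphinx documentation (whose built-in `sphinx.ext.mathjax` extension loads MathJax in the browser), equations are typeset client-side by default, so the served HTML still contains LaTeX source. Convert that math to MathML when you build or post-process the HTML.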
  5. Stop generating equation images via LaTeX-to-PNG services or screenshots, including screenshots of PowerPoint slides. Render with a MathML-capable engine instead.
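Step 3 can be sketched with Python's standard library. In MathML, <annotation> elements belong inside a <semantics> wrapper, after the presentation markup. The function name `annotate_with_tex` is hypothetical, and the sketch assumes the presentation MathML arrives as a single well-formed element (for example one <mrow> or <msup>).

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Serialise MathML as the default namespace instead of an "ns0:" prefix.
ET.register_namespace("", MATHML_NS)


def annotate_with_tex(presentation_mathml: str, tex: str) -> str:
    """Wrap one presentation element in <math><semantics> and attach its TeX source."""
    fragment = ET.fromstring(presentation_mathml)
    # Put unqualified tags into the MathML namespace so the output is consistent.
    for element in fragment.iter():
        if not element.tag.startswith("{"):
            element.tag = f"{{{MATHML_NS}}}{element.tag}"

    math = ET.Element(f"{{{MATHML_NS}}}math")
    semantics = ET.SubElement(math, f"{{{MATHML_NS}}}semantics")
    semantics.append(fragment)  # presentation markup comes first
    annotation = ET.SubElement(
        semantics,
        f"{{{MATHML_NS}}}annotation",
        {"encoding": "application/x-tex"},
    )
    annotation.text = tex  # special characters are escaped on serialisation
    return ET.tostring(math, encoding="unicode")


if __name__ == "__main__":
    print(annotate_with_tex("<msup><mi>x</mi><mn>2</mn></msup>", r"x^{2}"))
```

Keeping the wrapper logic in one helper means the LaTeX source and the MathML cannot drift apart, because both are emitted from the same call.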

Quick facts

Maturity: Emerging
Weight: Low
Category: Multimodal

Frequently asked questions

Do modern browsers actually render MathML well?

Yes, finally. Chrome and Edge added native MathML Core support in version 109 (January 2023), joining Firefox and Safari, which already rendered MathML. You no longer need MathJax just for rendering. If you want a fallback for older browsers or extra typography control, pairing a renderer such as MathJax with MathML in the page is a good middle ground.

Is LaTeX in plain text good enough for AI?

Better than image-only, but not great. AI models can usually parse LaTeX strings, but the error rate on complex equations is significant. Stripping HTML on the way into a model can also mangle the backslashes. MathML is more reliable because every operator has its own element.

My equation editor only outputs images. What now?

Migrate to MathJax, KaTeX (with its `output: 'mathml'` option), Pandoc, or a CMS plugin that supports MathML. As an interim measure, put the LaTeX source string in the image's alt attribute so that the source is at least recoverable by any crawler that reads alt text.
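For the interim alt-text approach, the main pitfall is escaping: LaTeX often contains <, &, or quotes, which break the attribute if written raw. A minimal Python sketch follows; the helper name `equation_img` is hypothetical.

```python
from html import escape


def equation_img(src: str, tex: str) -> str:
    """Build an <img> tag whose alt attribute carries the LaTeX source."""
    # escape(..., quote=True) handles &, <, > and both quote characters;
    # backslashes need no escaping in HTML attributes.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(tex, quote=True)}">'


if __name__ == "__main__":
    print(equation_img("eq1.png", r"\frac{a}{b} < 1"))
```

This keeps the source recoverable, but it is still a stopgap: the formula remains an image and carries no MathML structure.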
