AudioObject markup with transcript on every audio-bearing page

Give AI systems a structured pointer to your audio plus the spoken text.

What this signal tests

We check that any page with playable audio carries an AudioObject block in your structured data. The block names the audio file, gives a duration, declares the file format (such as audio/mpeg), and ideally includes a transcript. The transcript turns audio that AI cannot listen to into text it can read directly.

Why it matters for your visibility in AI

AI ingestion pipelines rarely run speech-to-text against the audio on your page. Even where a multimodal model could transcribe it, doing so across millions of pages is too expensive at crawl scale, so the step is usually skipped. If your audio carries no transcript and no AudioObject markup, an AI summary of your page is likely to ignore everything that was spoken. A podcast episode that answers exactly the question a user just asked an AI assistant is then unlikely to surface as a citation, and the answer engine may quote a competitor's blog post instead, because the competitor published the same content as text. Publishing a transcript turns your audio back into indexable text.

Pass criteria at a glance

Passes when: 100% of audio-bearing pages have AudioObject; >=50% include transcript.
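
The two thresholds combine into one site-level verdict. The sketch below is one way to express that rule, not the scanner's actual code. `PageResult` and its field names are hypothetical, and treating a site with no audio-bearing pages as a pass is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PageResult:
    """Outcome for one audio-bearing page (hypothetical field names)."""
    url: str
    has_audio_object: bool
    has_transcript: bool


def site_passes(pages: list[PageResult]) -> bool:
    """Apply both thresholds to the site's audio-bearing pages."""
    if not pages:
        # Assumption: with no audio-bearing pages there is nothing to fail.
        return True
    # Every audio-bearing page must carry AudioObject markup.
    all_marked_up = all(p.has_audio_object for p in pages)
    # At least half of those pages must also include a transcript.
    transcript_share = sum(p.has_transcript for p in pages) / len(pages)
    return all_marked_up and transcript_share >= 0.5
```

Note that a single audio-bearing page without AudioObject fails the whole site under this rule, however many transcripts the other pages carry.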

How we test it

We parse JSON-LD blocks on the page and look for @type AudioObject. We verify it has contentUrl, duration in ISO 8601 format, encodingFormat (an audio MIME type), and uploadDate. We then check whether a transcript field is present and at least 200 characters long. Pages that host audio but omit AudioObject entirely fail the check.

Technical detection method: @type AudioObject with required fields; transcript >=200 chars on >=50% of pages.
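
To make the per-page check concrete, here is a minimal sketch using only the Python standard library. It illustrates the steps described above and is not the scanner's actual implementation. The function names, the shape of the returned dictionary, and the simplified ISO 8601 duration pattern are all assumptions.

```python
import json
import re
from html.parser import HTMLParser

REQUIRED = ("contentUrl", "duration", "encodingFormat", "uploadDate")
MIN_TRANSCRIPT_CHARS = 200
# Simplified ISO 8601 duration pattern (days, hours, minutes, seconds only).
ISO_DURATION = re.compile(r"^P(?!$)(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$")


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buf))
            self._in_jsonld = False


def iter_nodes(data):
    """Yield every dict in parsed JSON-LD, including @graph and nested nodes."""
    if isinstance(data, list):
        for item in data:
            yield from iter_nodes(item)
    elif isinstance(data, dict):
        yield data
        for value in data.values():
            yield from iter_nodes(value)


def is_audio_object(node):
    node_type = node.get("@type")
    return node_type == "AudioObject" or (
        isinstance(node_type, list) and "AudioObject" in node_type
    )


def check_page(html):
    """Inspect the first AudioObject on the page and report what is wrong with it."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    audio_nodes = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD blocks
        audio_nodes.extend(n for n in iter_nodes(data) if is_audio_object(n))
    if not audio_nodes:
        return {"has_audio_object": False, "problems": list(REQUIRED), "has_transcript": False}
    node = audio_nodes[0]
    # Start with fields that are absent or empty.
    problems = [field for field in REQUIRED if not node.get(field)]
    duration = node.get("duration")
    if duration and not (isinstance(duration, str) and ISO_DURATION.match(duration)):
        problems.append("duration")  # present but not an ISO 8601 duration
    fmt = node.get("encodingFormat")
    if isinstance(fmt, str) and fmt and not fmt.startswith("audio/"):
        problems.append("encodingFormat")  # present but not an audio MIME type
    transcript = node.get("transcript")
    has_transcript = (
        isinstance(transcript, str) and len(transcript.strip()) >= MIN_TRANSCRIPT_CHARS
    )
    return {"has_audio_object": True, "problems": problems, "has_transcript": has_transcript}
```

A page passes the markup part of the check when `has_audio_object` is true and `problems` is empty. `has_transcript` feeds the separate 50% transcript threshold.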

If your site fails: how to fix it

  1. Add an AudioObject JSON-LD block to every page that plays or links to audio. Required fields: contentUrl, duration (ISO 8601 like PT12M30S), encodingFormat (audio/mpeg, audio/wav, audio/ogg), uploadDate.
  2. Generate a transcript with Whisper (free, runs locally), Otter, Rev, Descript, or your podcast host's built-in transcription, then paste the cleaned text into the transcript field of the AudioObject.
  3. Human-edit the transcript for proper nouns, brand names, and technical terms. Auto-generated transcripts mangle names, and names are exactly what AI systems use to attribute citations.
  4. Also publish the transcript as visible HTML text on the page, ideally inside an <article> element. This gives non-JSON-LD-aware crawlers (and humans) the same content.
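  5. Validate the JSON-LD before shipping, for example with the Schema Markup Validator at validator.schema.org. Google's Rich Results Test also parses JSON-LD, but it reports only on types eligible for Google rich results, so a standalone AudioObject may not appear in its output.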
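
The steps above can be sketched as a small generator that emits both the JSON-LD block and the visible transcript. This is an illustration under stated assumptions, not a required implementation. The function names, the optional `name` field, and the example.com URL are hypothetical, and the transcript is assumed to use blank lines between paragraphs.

```python
import json
from html import escape


def iso8601_duration(total_seconds: int) -> str:
    """Format whole seconds as an ISO 8601 duration, e.g. 750 -> PT12M30S."""
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    parts = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    return "PT" + (parts or "0S")


def audio_object_jsonld(name, content_url, seconds, encoding_format, upload_date, transcript):
    """Build a <script> element holding an AudioObject with the fields from step 1."""
    data = {
        "@context": "https://schema.org",
        "@type": "AudioObject",
        "name": name,
        "contentUrl": content_url,
        "duration": iso8601_duration(seconds),
        "encodingFormat": encoding_format,
        "uploadDate": upload_date,
        "transcript": transcript,
    }
    # Escape "</" so transcript text cannot close the <script> element early.
    body = json.dumps(data, ensure_ascii=False, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


def transcript_html(transcript):
    """Render the transcript as visible, escaped paragraphs inside <article> (step 4)."""
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    inner = "\n".join(f"  <p>{escape(p)}</p>" for p in paragraphs)
    return f"<article>\n{inner}\n</article>"


if __name__ == "__main__":
    text = "Welcome to the show.\n\nToday we talk about transcripts."
    print(audio_object_jsonld(
        "Episode 1",                      # hypothetical episode title
        "https://example.com/ep1.mp3",    # hypothetical audio URL
        750, "audio/mpeg", "2024-01-15", text,
    ))
    print(transcript_html(text))
```

Generating both outputs from the same cleaned transcript keeps the JSON-LD and the visible text from drifting apart when the transcript is later corrected.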

Quick facts

Maturity: ESTABLISHED
Weight: medium
Category: Multimodal

Frequently asked questions

Does this matter if my audio is hosted on Spotify or Apple Podcasts?

Yes. Those platforms have their own catalog metadata, but a crawler landing on your own website needs AudioObject in your HTML to know that audio exists. Otherwise your page looks like an empty player to a text-only crawler.

Is a transcript really worth the effort if I already provide show notes?

Show notes summarise. A transcript captures every quotable phrase, every question, every guest answer. AI systems answering specific questions cite specific quotes, so the more verbatim text you publish, the more chances your content gets pulled into answers.

What if my transcript is very long? Will it bloat the JSON-LD?

JSON-LD blocks can be large; there is no hard size limit. If you are worried about page weight, publish the full transcript as visible HTML on the page and put a shorter excerpt in the JSON-LD transcript field, keeping it at least 200 characters long so it still counts toward this check. Both approaches are valid; the visible HTML is what most AI tools will actually parse.
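
If you go the excerpt route, one way to trim a long transcript is to cut at a sentence boundary. The sketch below is an illustration only. `transcript_excerpt` is a hypothetical helper, the 1500-character default is an arbitrary assumption, and the 200-character floor mirrors the transcript length this check looks for.

```python
import re

MIN_CHARS = 200  # transcript length the check looks for


def transcript_excerpt(transcript: str, max_chars: int = 1500) -> str:
    """Return whole leading sentences, aiming for at most max_chars characters.

    If the opening sentences are very long, the result may exceed max_chars
    so that it does not fall below MIN_CHARS.
    """
    text = " ".join(transcript.split())  # collapse runs of whitespace
    if len(text) <= max_chars:
        return text
    sentences = re.split(r"(?<=[.!?])\s+", text)
    excerpt = ""
    for sentence in sentences:
        candidate = f"{excerpt} {sentence}".strip()
        # Stop once the next sentence would overshoot and the floor is met.
        if len(candidate) > max_chars and len(excerpt) >= MIN_CHARS:
            break
        excerpt = candidate
    return excerpt
```

Cutting on sentence boundaries keeps every quoted phrase in the excerpt intact, which matters if an AI system lifts a sentence verbatim.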

Do I need this for short audio like UI sound effects or notification chimes?

No. AudioObject is intended for content-bearing audio: podcasts, interviews, lectures, narrations. Decorative or functional audio (button clicks, alarms) does not need structured data and including it would just clutter the page.

Run your own scan

Run a free scan and see how your site grades across all 155 AI-readiness signals.
