Image Content-Type header matches the file's actual format

Make sure your server tells the truth about what kind of image it is serving.

What this signal tests

We fetch a sample of your images, read the Content-Type response header, and inspect the first 16 bytes of the file (the magic bytes that uniquely identify the format). For example, a PNG starts with 89 50 4E 47 and a JPEG with FF D8 FF. We then compare the Content-Type and file extension to the actual format detected from the magic bytes. They must all agree.

Why it matters for your visibility in AI

AI ingestion pipelines often branch on Content-Type to decide how to handle a response. If your server claims to send image/jpeg but the bytes are actually a PNG (or worse, an HTML error page), the consumer may parse the bytes incorrectly, reject the response, or in some cases silently corrupt its own pipeline. The same problem affects multimodal LLMs: they cannot pick the right decoder if the declared format lies.

Mismatched Content-Types are also a security signal. They sometimes indicate a misconfigured CDN, an open redirect, or a poorly maintained server. Trust scoring systems used by AI vendors notice these patterns. A site with consistent, honest MIME types looks operationally mature; a site with mismatches looks suspect.

Pass criteria at a glance

Criterion	Passes when
Content-Type matches magic bytes	100% of sampled images

How we test it

For each sampled image, we request the URL, read the Content-Type response header, and download the first 16 bytes of the body. A HEAD request is enough for the header alone, but it returns no body, so reading the bytes requires a GET. We then check whether those bytes match the declared format: PNG (89 50 4E 47), JPEG (FF D8 FF), WebP (RIFF at offset 0 followed by WEBP at offset 8), AVIF (ftypavif at offset 4), GIF (47 49 46 38). We also compare to the file extension. All three (extension, Content-Type, magic bytes) must agree.
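The three-way comparison described above can be sketched in Python. This is a minimal illustration rather than the scanner's actual implementation: the function names `detect_format` and `all_agree` and the extension table are assumptions made for the example, and only the five signatures listed above are covered.

```python
from __future__ import annotations

import os
from urllib.parse import urlparse

# Extension -> expected MIME type (hypothetical table for this sketch).
EXTENSION_TO_MIME = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
    ".avif": "image/avif",
    ".gif": "image/gif",
}


def detect_format(head: bytes) -> str | None:
    """Return the MIME type implied by the leading bytes, or None if unrecognized."""
    if head.startswith(b"\x89PNG"):        # 89 50 4E 47
        return "image/png"
    if head.startswith(b"\xff\xd8\xff"):   # FF D8 FF
        return "image/jpeg"
    if head[0:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "image/webp"
    if head[4:12] == b"ftypavif":          # brand follows the 4-byte box size
        return "image/avif"
    if head.startswith(b"GIF8"):           # 47 49 46 38
        return "image/gif"
    return None


def all_agree(url: str, content_type: str, head: bytes) -> bool:
    """True only if extension, Content-Type, and magic bytes name the same format."""
    declared = content_type.split(";", 1)[0].strip().lower()  # drop parameters
    ext = os.path.splitext(urlparse(url).path)[1].lower()     # ignore query string
    detected = detect_format(head)
    return detected is not None and detected == declared == EXTENSION_TO_MIME.get(ext)
```

For instance, `all_agree("https://example.com/photo.jpg", "image/jpeg", head)` is false when `head` holds WebP bytes, which is the mislabeled-upload case described in the fix list.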

HEAD/GET each image; compare Content-Type and extension to first 16 bytes.

If your site fails: how to fix it

Audit your server's MIME-type configuration. In nginx the default mime.types file usually gets this right; in Apache the AddType directive controls it. Check your CDN's content negotiation rules too.
Hunt down mislabeled files. Common case: a WebP file uploaded with a .jpg extension because the CMS reused the original filename. Either re-encode the file to the format its extension claims, or rename it and serve it with the Content-Type that matches its real format.
For dynamic image proxies and on-the-fly converters, ensure the output Content-Type reflects the converted format, not the source. Cloudinary, ImageKit, and Cloudflare Images handle this; custom proxies often do not.
Block your CDN from setting application/octet-stream as a fallback Content-Type. That generic header tells crawlers nothing and many will skip the asset.
Add an automated check to your CI: download a sample of public images from your domain, verify magic bytes match Content-Type, fail the build on mismatches.
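The CI check in the last step could look something like the following Python sketch. It is one possible shape under stated assumptions, not a prescribed tool: the names `matches_declared`, `fetch_head`, `find_mismatches`, and `main` are hypothetical, the URL list is expected to come from your own build, and HTTP errors are left unhandled. It asks for only the first 16 bytes with a Range header and treats any undeclared or generic type, such as application/octet-stream, as a mismatch.

```python
import urllib.request

# Declared MIME type -> (offset, expected bytes) pairs, using the
# signatures listed in "How we test it".
SIGNATURES = {
    "image/png": [(0, b"\x89PNG")],
    "image/jpeg": [(0, b"\xff\xd8\xff")],
    "image/webp": [(0, b"RIFF"), (8, b"WEBP")],
    "image/avif": [(4, b"ftypavif")],
    "image/gif": [(0, b"GIF8")],
}


def matches_declared(content_type, head):
    """True if the leading bytes carry the signature of the declared type."""
    declared = content_type.split(";", 1)[0].strip().lower()
    expected = SIGNATURES.get(declared)
    if expected is None:  # e.g. application/octet-stream or text/html
        return False
    return all(head[off:off + len(sig)] == sig for off, sig in expected)


def fetch_head(url):
    """GET the image, asking for bytes 0-15; read at most 16 bytes either way."""
    req = urllib.request.Request(url, headers={"Range": "bytes=0-15"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("Content-Type", ""), resp.read(16)


def find_mismatches(urls, fetch=fetch_head):
    """Return (url, declared Content-Type) for every image that fails the check."""
    bad = []
    for url in urls:
        content_type, head = fetch(url)
        if not matches_declared(content_type, head):
            bad.append((url, content_type))
    return bad


def main(argv):
    """Print mismatches and return a process exit code (non-zero fails the build)."""
    failures = find_mismatches(argv)
    for url, content_type in failures:
        print(f"MISMATCH {url} declared {content_type!r}")
    return 1 if failures else 0
```

A CI job would call `sys.exit(main(sys.argv[1:]))` with the sampled image URLs as arguments. The `fetch` parameter is injectable so the comparison logic can be unit-tested without network access.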

Quick facts

Maturity	ESTABLISHED
Weight	low
Category	Multimodal

Primary sources

https://mimesniff.spec.whatwg.org/

Related signals

modern-image-formats

Frequently asked questions

What if my CDN strips or rewrites Content-Type headers?

This is the most common cause of mismatches. Check your CDN's image transformation rules and the origin response. Cloudflare's Polish, Cloudinary's auto-format, and Shopify's image filters all rewrite formats; verify they also rewrite the Content-Type to match.

Does this matter for SVG?

Yes. SVG should be served as image/svg+xml. Some servers send application/xml or text/xml, which causes browsers and crawlers to refuse to render it as an image. SVG is XML text rather than a binary format, so it has no fixed magic bytes, but a file normally begins with <?xml or <svg, which gives the detector something to verify.
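Because SVG is text, a check for it has to look past an optional XML prolog to the root element. The Python sketch below shows one way to do that. It is an assumption-laden illustration, not the detector described on this page: the names `looks_like_svg` and `svg_served_correctly` are hypothetical, the caller is assumed to pass a larger prefix (for example the first kilobyte) since 16 bytes may not reach the root element, and DOCTYPE declarations with an internal subset are not handled.

```python
import re

# Optional UTF-8 BOM, whitespace, XML declaration, comments, and a simple
# DOCTYPE may all precede the root element.
_PROLOG = re.compile(
    rb"^(?:\xef\xbb\xbf)?\s*(?:<\?xml[^>]*\?>\s*)?(?:<!--.*?-->\s*|<!DOCTYPE[^>]*>\s*)*",
    re.DOTALL,
)
_SVG_ROOT = re.compile(rb"<svg[\s>/]")


def looks_like_svg(prefix: bytes) -> bool:
    """True if the first element after the prolog is <svg>."""
    rest = prefix[_PROLOG.match(prefix).end():]  # the prolog pattern always matches
    return _SVG_ROOT.match(rest) is not None


def svg_served_correctly(content_type: str, prefix: bytes) -> bool:
    """True if the body looks like SVG and is declared as image/svg+xml."""
    declared = content_type.split(";", 1)[0].strip().lower()
    return declared == "image/svg+xml" and looks_like_svg(prefix)
```

With this sketch, an SVG body served as text/xml fails `svg_served_correctly` even though `looks_like_svg` accepts it, which matches the failure mode described above.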

How can I scan my whole site for mismatches?

Crawl your sitemap, extract image URLs, and run them through a script that compares Content-Type to magic bytes. The `file` command on Unix can identify the format from the bytes; combine it with `curl -I` to read the header. Tools like Sitebulb and Screaming Frog also flag this in their audits.
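As a starting point for the extraction step, the following Python sketch pulls image URLs out of a sitemap. It assumes the sitemap uses the image sitemap extension (`image:loc` entries in the http://www.google.com/schemas/sitemap-image/1.1 namespace); sites that do not list images there would need to crawl the pages instead. The function name `image_urls_from_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the image sitemap extension (assumed to be in use).
NS = {"image": "http://www.google.com/schemas/sitemap-image/1.1"}


def image_urls_from_sitemap(xml_text):
    """Return the unique image URLs in a sitemap, in first-seen order."""
    root = ET.fromstring(xml_text)
    urls = [el.text.strip() for el in root.iterfind(".//image:loc", NS) if el.text]
    return list(dict.fromkeys(urls))  # de-duplicate while keeping order
```

The resulting list can then be fed to whatever script performs the Content-Type and magic-byte comparison.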

Run your own scan

Run a free scan and see how your site grades across all 155 AI-readiness signals.
