Schema.org markup LLMs actually parse — a tested guide.
Half of schema.org markup is treated as decoration by the LLMs. The other half visibly lifts citation rate. Here's the working subset, with a test for each.
The working subset.
Across our first-cohort A/B testing — schema-marked vs. schema-bare versions of the same content, audited weekly across Claude, ChatGPT, and Gemini — the markup types that consistently move the needle are:
- FAQPage: large lift on Claude and ChatGPT, moderate on Gemini. Perplexity sample size is still small in our data; we treat the lift as unproven until the integration matures.
- Article: moderate, consistent lift on the three engines we audit in production.
- HowTo: large lift on Claude when the page is a true step-by-step; smaller on ChatGPT; inconsistent on Gemini.
- Organization + LocalBusiness: small but stable lift on geo-anchored queries; mostly a backstop rather than a primary signal.
- BreadcrumbList: small lift that compounds with the above.
What doesn't move the needle, in our testing:
- Product on a service-business service page. The model treats it as a category cue, not as a quotable signal.
- VideoObject or ImageObject without surrounding context. The model uses these to discover media, not to quote text.
- PostalAddress alone. Pair it with LocalBusiness.
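To illustrate the PostalAddress point, here is a minimal sketch of an address nested inside a LocalBusiness block, built in Python and serialised as JSON-LD. The business details and the helper name local_business_jsonld are hypothetical, chosen only for illustration; the property names are standard schema.org terms.

```python
import json


def local_business_jsonld(name, street, locality, postal_code, country):
    """Build a LocalBusiness block with its PostalAddress nested inside it."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": locality,
            "postalCode": postal_code,
            "addressCountry": country,
        },
    }


# Hypothetical business, for illustration only.
block = local_business_jsonld(
    "Example Plumbing Ltd", "1 Example Street", "Leeds", "LS1 1AA", "GB"
)
print(json.dumps(block, indent=2))
```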
How to test it yourself.
The cheapest test is to take a page that already gets some citation traffic, fork it onto a different URL, add (or remove) one schema block, and audit both URLs over two weeks. The signal shows up in the cell-level data within a week if the lift is real.
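The comparison can be sketched as a small calculation. This assumes each audit run is recorded as a (prompt, was_cited) pair per URL; that record shape and the function names citation_rate and lift are our illustration, not a fixed format.

```python
def citation_rate(results):
    """Share of audited prompts on which the page was cited.

    `results` is a list of (prompt, was_cited) pairs; this shape is an assumption.
    """
    if not results:
        raise ValueError("no audit results")
    cited = sum(1 for _, was_cited in results if was_cited)
    return cited / len(results)


def lift(control_rate, variant_rate):
    """Ratio of variant to control citation rate (1.0 means no change)."""
    if control_rate == 0:
        raise ValueError("control rate is zero; report the absolute difference instead")
    return variant_rate / control_rate


# Hypothetical audit data: 8 prompts per URL.
control = [("q%d" % i, i < 1) for i in range(8)]  # 1 of 8 prompts cited
variant = [("q%d" % i, i < 3) for i in range(8)]  # 3 of 8 prompts cited
print(citation_rate(control), citation_rate(variant))
print(lift(citation_rate(control), citation_rate(variant)))
```

With only a handful of prompts per page, one extra citation moves the rate a long way, which is one reason to read the result over the full two weeks rather than from a single run.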
Example: FAQPage on a real customer page.
The customer's existing page had eight Q&A pairs in standard HTML (h3 + p). We added the FAQPage schema. Citation rate on the eight questions, week-over-week:
- Week 1 (no markup): 12% of the eight prompts.
- Week 2 (markup added): 24%.
- Week 3: 31%.
- Week 4: 28%.
The rate rose for two weeks after the markup went live, then settled at 28% in week 4, about 2.3× the 12% baseline. Other variables were held constant: same URL, same content, same word count, same publication date.
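For reference, a FAQPage block for a page like this can be generated from the visible Q&A pairs. The sketch below is ours, not the customer's actual markup: the helper name faq_page_jsonld and the sample question are hypothetical, while FAQPage, Question, Answer, mainEntity and acceptedAnswer are standard schema.org terms.

```python
import json


def faq_page_jsonld(pairs):
    """Build a FAQPage block from (question, answer) text pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical Q&A pair; a real page would pass all of its visible pairs.
faq = faq_page_jsonld([
    ("Do you offer emergency call-outs?", "Yes, 24 hours a day."),
])
# The output goes inside a <script type="application/ld+json"> tag on the page.
print(json.dumps(faq, indent=2))
```

The question and answer text should be the same text a reader sees in the h3 and p elements, so the markup describes the page rather than adding to it.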
Example: HowTo on a step-by-step.
A plumber's "how to bleed a radiator" page. We added a HowTo schema with five named steps. Claude's citation rate on the single most relevant prompt ("how do I bleed a radiator") doubled within ten days. ChatGPT lifted more modestly; Gemini didn't move in this sample. We don't yet have enough Perplexity data on this page type to draw a conclusion either way.
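A HowTo block with named steps can be sketched the same way. The five steps below are placeholder wording we wrote for illustration, not the plumber's actual copy, and how_to_jsonld is a hypothetical helper; HowTo, HowToStep, step, position, name and text are standard schema.org terms.

```python
import json


def how_to_jsonld(name, steps):
    """Build a HowTo block from (step_name, step_text) pairs, in order."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {
                "@type": "HowToStep",
                "position": index,
                "name": step_name,
                "text": step_text,
            }
            for index, (step_name, step_text) in enumerate(steps, start=1)
        ],
    }


# Placeholder steps, for illustration only.
how_to = how_to_jsonld(
    "How to bleed a radiator",
    [
        ("Turn off the heating", "Switch the heating off and let the radiator cool."),
        ("Find the bleed valve", "Locate the valve at the top corner of the radiator."),
        ("Open the valve", "Turn the radiator key slowly until air starts to hiss out."),
        ("Close the valve", "Close the valve as soon as water appears."),
        ("Check the pressure", "Check the boiler pressure and turn the heating back on."),
    ],
)
print(json.dumps(how_to, indent=2))
```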
Example: Article on long-form.
This is the most consistent lift we see anywhere. Article schema on a 1,500-word piece moves citation rate by 5 to 15 percentage points on average, on every engine we test. The fields that matter most are headline (which must match the visible h1) and the datePublished and dateModified pair.
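Those field rules are easy to check mechanically. The following is a minimal sketch using only the Python standard library; the function names, the sample page, and the sample dates are hypothetical, and the date comparison assumes both values are ISO 8601 strings in the same format.

```python
from html.parser import HTMLParser


class _H1Collector(HTMLParser):
    """Collect the text of the first <h1> in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._done = True

    def handle_data(self, data):
        if self._in_h1:
            self.parts.append(data)


def first_h1(html):
    """Return the first h1's text with whitespace collapsed ('' if none)."""
    parser = _H1Collector()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


def check_article(article, html):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if article.get("headline") != first_h1(html):
        problems.append("headline does not match the visible h1")
    for field in ("datePublished", "dateModified"):
        if not article.get(field):
            problems.append("missing " + field)
    published = article.get("datePublished")
    modified = article.get("dateModified")
    # String comparison is only valid for same-format ISO 8601 values (assumed).
    if published and modified and modified < published:
        problems.append("dateModified is earlier than datePublished")
    return problems


# Hypothetical page and Article block, for illustration only.
html = "<html><body><h1>How to bleed a radiator</h1><p>Body text.</p></body></html>"
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How to bleed a radiator",
    "datePublished": "2024-01-10",
    "dateModified": "2024-02-01",
}
print(check_article(article, html))
```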
What we ship by default.
Every approved Sourced draft includes:
- Article schema with headline, datePublished, dateModified, author (Person), publisher (Organization), mainEntityOfPage.
- BreadcrumbList from root to leaf.
- Conditionally, FAQPage on any draft that contains three or more Q-and-A pairs in the body.
- Conditionally, HowTo on any draft that contains a numbered step-by-step.
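The conditional rules in this list can be sketched as a small selector. This is an illustration, not the production generator: it assumes a plain-text draft where a question is a line ending in "?" and a step is a line starting with a number, and it treats two or more numbered lines as a step-by-step. Both detection rules and the name schema_types_for are hypothetical.

```python
import re

# Assumed draft conventions (hypothetical): a question is a line ending in "?",
# and a numbered step is a line starting with "1.", "2)", and so on.
QUESTION_LINE = re.compile(r"^.+\?\s*$")
STEP_LINE = re.compile(r"^\s*\d+[.)]\s+\S")


def schema_types_for(draft_text):
    """Return the schema types to attach to a draft, defaults first."""
    lines = draft_text.splitlines()
    types = ["Article", "BreadcrumbList"]
    # Question lines stand in for Q-and-A pairs here; three or more adds FAQPage.
    if sum(1 for line in lines if QUESTION_LINE.match(line)) >= 3:
        types.append("FAQPage")
    # Two or more numbered lines is our assumed threshold for a step-by-step.
    if sum(1 for line in lines if STEP_LINE.match(line)) >= 2:
        types.append("HowTo")
    return types


# Hypothetical draft, for illustration only.
draft = "Intro\n1. Turn off the heating\n2. Open the valve\n"
print(schema_types_for(draft))
```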
The schema is generated inline by the draft model and validated before publish, both as well-formed JSON-LD and against the schema.org vocabulary.
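A pre-publish check of this kind might look like the sketch below. It is not our production validator: the REQUIRED table is an assumed minimum per type that we chose for illustration (schema.org itself does not mark properties as required), and validate_block is a hypothetical name.

```python
import json

# Assumed minimum fields per type; an illustrative table, not a schema.org rule.
REQUIRED = {
    "Article": ("headline", "datePublished", "dateModified", "author", "publisher"),
    "FAQPage": ("mainEntity",),
    "HowTo": ("name", "step"),
    "BreadcrumbList": ("itemListElement",),
}
SCHEMA_CONTEXTS = ("https://schema.org", "https://schema.org/", "http://schema.org")


def validate_block(raw):
    """Parse a JSON-LD string and return a list of problems (empty if none)."""
    try:
        block = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: " + exc.msg]
    if not isinstance(block, dict):
        return ["top level must be a JSON object"]
    problems = []
    if block.get("@context") not in SCHEMA_CONTEXTS:
        problems.append("@context is not schema.org")
    block_type = block.get("@type")
    if block_type not in REQUIRED:
        problems.append("unexpected @type: %r" % (block_type,))
        return problems
    for field in REQUIRED[block_type]:
        if field not in block:
            problems.append("missing " + field)
    return problems


print(validate_block('{"@context": "https://schema.org", "@type": "HowTo", "name": "x"}'))
```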
Footnote on the "do LLMs read schema" question.
The honest answer is "yes, but not as much as the schema.org evangelists claim, and not for every page type". The right model for how schema affects citation rate is "it shifts the probability distribution a known amount per type, with diminishing returns once the page is otherwise good". It is not a magic switch.
The reason it still matters is that the lift is reliable and effectively free: adding a schema block to a draft costs nothing once the generator does it for you. Customers who treat it as optional are cited measurably less often.