How to get cited by ChatGPT — the five technical moves that actually move the needle
Most of what you read about ranking on AI engines is wrong. The mechanics are different from SEO and the five moves below are ordered by ROI, not by what sounds clever.
Half of the buying journey now happens inside ChatGPT, Claude, Perplexity, and Google's AI Overviews. If your brand is not cited there, a meaningful chunk of intent is invisible to you. The good news: the technical mechanics that move the needle are concrete, finite, and cheap to ship.
Most of the writing on "ranking in ChatGPT" mistakes the mechanics. It treats GEO like a re-skin of SEO. It is not. The engines pull from four distinct lanes — live web index, declared entity surface, training data, and partner feeds — and each move you make hits a different lane. Knowing which lane matters changes what you build.
This piece is the technical checklist. Five moves, ordered by return on engineering effort, with the anti-pattern at the end so you do not chase it.
Move 1: deploy llms.txt at the root of your domain. The proposal at llmstxt.org [1] specifies a plain-text Markdown file, served at /llms.txt, that AI engines can read to understand your entity surface: who you are, what you offer, what content matters. We have seen citation surface respond within four to eight weeks of deployment in production. Cost: one file. Hosting: same as robots.txt. A domain that wants to be cited has no reason to skip it.
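To make Move 1 concrete, here is a minimal sketch of a generator for the file. It follows the layout described in the llmstxt.org proposal (an H1 name, a blockquote summary, then H2 sections of Markdown links). The company name, URLs, and section titles are hypothetical placeholders, and build_llms_txt is an illustrative helper rather than part of any official tooling.

```python
def build_llms_txt(name, summary, sections):
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            # Each entry is a Markdown link followed by a short note.
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical entity and URLs, for illustration only.
content = build_llms_txt(
    name="Example Co",
    summary="Example Co ships structured-data audits for B2B sites.",
    sections={
        "Services": [
            ("Schema audit", "https://example.com/services/schema-audit",
             "What the audit covers and how long it takes"),
            ("Pricing", "https://example.com/pricing",
             "Current plans and what each includes"),
        ],
        "Articles": [
            ("Getting cited by AI engines", "https://example.com/blog/cited",
             "The five-move technical checklist"),
        ],
    },
)

print(content)
```

Save the output as llms.txt next to robots.txt so it resolves at https://your-domain/llms.txt, and regenerate it whenever the listed pages change.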
Move 2: ship Schema.org coverage on every route. Organization on the root, Service on each capability page, Article on every post, FAQPage on the routes that answer questions, Review and AggregateRating when you have verified data. The vocabulary [2] is well documented, and the Schema Markup Validator (the successor to Google's retired Structured Data Testing Tool) and Google's Rich Results Test catch errors in minutes. Engines parse this before they parse your prose; clean schema is the foundation that everything else compounds on.
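As a rough illustration of Move 2, the sketch below builds two of the types named above (Organization and FAQPage) as JSON-LD and wraps each one in the script tag that goes in the page head. The organization details and the FAQ text are made-up placeholders, and json_ld_script is a hypothetical helper, not a library function.

```python
import json


def json_ld_script(data):
    """Wrap a dict as a JSON-LD <script> tag for the page <head>."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script tag early.
    # "\/" is a valid JSON escape for "/", so parsers read it back unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical organization, for illustration only.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com/",
    "sameAs": ["https://www.linkedin.com/company/example-co"],
}

# Hypothetical FAQ entry for a route that answers a question.
faq_page = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What does Example Co do?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Example Co ships structured-data audits for B2B sites.",
            },
        }
    ],
}

print(json_ld_script(organization))
print(json_ld_script(faq_page))
```

Generating the markup from data, rather than hand-editing it per template, makes it easier to keep every route covered and to run the output through a validator before deploy.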
Move 3: allow the bots in robots.txt. GPTBot [3], ClaudeBot [4], PerplexityBot [5], and Google-Extended [6], which is a robots.txt control token rather than a separate crawler. Each engine publishes its user-agent token and how to allow or block it. Most domains we audit are silently blocking one or more of these, usually GPTBot because the default robots.txt template treated it like a generic scraper. Allow them explicitly. If you have proprietary content you do not want cited, gate it behind auth, not behind robots.txt.
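A quick way to check for the silent block described in Move 3 is to run your robots.txt through a parser and ask it about each token. The sketch below uses Python's standard urllib.robotparser on two in-memory examples. The robots.txt contents and the example.com URL are illustrative assumptions, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in Move 3.
AI_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def audit_robots(robots_txt, url="https://example.com/"):
    """Return {token: allowed?} for each AI token against a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in AI_TOKENS}


# A template that blocks GPTBot while allowing everyone else.
BLOCKING = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# An explicit allow group for the AI tokens, with a private path kept
# away from generic crawlers.
EXPLICIT_ALLOW = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /

User-agent: *
Disallow: /private/
"""

for label, body in [("blocking", BLOCKING), ("explicit allow", EXPLICIT_ALLOW)]:
    print(label, audit_robots(body))
```

To audit a live site, fetch its robots.txt and pass the text to audit_robots. Any token that comes back False is being turned away at the door.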
Move 4: name the expert and date the content. Add a Person schema [7] author with a verifiable role, publish the datePublished and dateModified fields, link to the author's LinkedIn or another sameAs profile. The engines weight named expert authorship heavily because it gives them a citation anchor that survives quoting. Anonymous brand-voice content rarely surfaces in answers; named operator content does.
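Move 4 maps onto a handful of Schema.org properties. Here is a minimal sketch of an Article object with a Person author, a role, a sameAs profile, and both date fields. The author, headline, dates, and profile URL are invented for illustration, and article_json_ld is a hypothetical helper.

```python
import json
from datetime import date


def article_json_ld(headline, author_name, job_title, same_as, published, modified):
    """Build Article JSON-LD with a named Person author and ISO 8601 dates."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
        "author": {
            "@type": "Person",
            "name": author_name,
            "jobTitle": job_title,
            # Profiles that let a reader or engine verify who the author is.
            "sameAs": same_as,
        },
    }


# Hypothetical author and dates, for illustration only.
article = article_json_ld(
    headline="How to get cited by AI engines",
    author_name="Jane Doe",
    job_title="Head of Search",
    same_as=["https://www.linkedin.com/in/jane-doe-example"],
    published=date(2026, 1, 15),
    modified=date(2026, 2, 1),
)

print(json.dumps(article, indent=2))
```

Bump dateModified only when the content actually changes, so the date stays a trustworthy signal.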
Move 5: write with specific numbers and primary sources. Engines prefer to quote sentences that have specific quantities, dated facts, or named cites. "Most teams" is invisible; "17 agents in three weeks" gets quoted verbatim. The piece you are reading is structured this way intentionally — every paragraph aims for a quotable line, every claim points at a primary source.
The anti-pattern at the bottom of the chart: trying to "game training data" by publishing thousands of low-quality pages or buying citation links. Training data is updated on the model providers' cadence, not yours, and providers filter low-quality and duplicate pages out of the open-web crawl before each cutoff. Money spent here is mostly wasted; the five moves above compound faster.
Position A: citation lives downstream of indexing. Clean Schema.org and authoritative content rank well on Google and get cited by Claude. Build the same technical foundation and the rest follows.
Position B: some practitioners argue live retrieval is a small fraction of citations and the real money is influencing what shows up in the next training cutoff. Ship anchor content that ranks well in 2026 so it is in the 2027 training set.
Position C: citation mechanics are still volatile. Google's AI Overviews launched, contracted, expanded. ChatGPT search rolled out, paused, re-rolled. Some argue: ship Schema and llms.txt now, defer the content investment until 2027.
A and B are not in conflict — moves 1 through 5 above hit both lanes. C is the most expensive position to hold. Citation share is being decided right now; the operators investing in 2026 will sit at the top of the citation graph for the next decade.
- 01 Does llms.txt actually influence citation selection in production, or is it a polite hint the engines mostly ignore today?
- 02 How do you measure citation share when no major engine publishes reliable citation telemetry?
- 03 What is the right opt-out posture for high-value proprietary content: block GPTBot, or accept citation in exchange for visibility?
- 04 Does named-expert authorship matter equally for B2B and consumer queries, or is it weighted differently per vertical?
- 05 When citation goes mainstream and every domain ships these five moves, what is the next moat: schema density, recency, or authority?
- [1] llms.txt: proposed plain-text entity declaration for LLM-friendly sites.
- [2] Schema.org: official structured-data vocabulary used by every major search and answer engine.
- [3] OpenAI: GPTBot crawler documentation and opt-in instructions.
- [4] Anthropic: ClaudeBot and Claude-User crawler documentation.
- [5] Perplexity: PerplexityBot crawler documentation.
- [6] Google: Google-Extended crawler controls for generative AI training opt-out.
- [7] Schema.org: Person type for named-expert authorship markup.