The old playbook was “rank number one and collect the clicks.” That world is shrinking fast.
Here’s how to get cited by ChatGPT, Perplexity, and AI search. You do three things: make sure each engine’s crawler can reach your page, structure your content answer-first so the model can lift a clean passage, then stack authority signals like a named author, current dates, and cited sources. That’s the whole game of generative engine optimization (GEO), and the good news is a high Google ranking is no longer required to win it.
AI answer engines now read the web, synthesize one answer, and cite a handful of sources. If you’re not one of those sources, you don’t exist, even if you rank.
I’ve been optimizing pages for this across ChatGPT, Perplexity, and Claude. The mechanics are different for each engine, and most guides blur them into one “AI search” blob. This one breaks them apart and gives you a repeatable process you can run today, mostly with AI doing the heavy lifting.
This is part of building content sites that actually earn traffic in 2026.
How to Get Cited by ChatGPT, Perplexity, and AI Search (Short Answer)
To get cited by ChatGPT and other AI search engines, follow five steps:
- Let the AI crawlers in. Allow OAI-SearchBot, GPTBot, PerplexityBot, ClaudeBot, and Google-Extended in your robots.txt.
- Write answer-first. Put a direct 40 to 100 word answer under every question-format heading.
- Add authority signals. Cite sources, add stats and quotes, name your author, and show dates.
- Add schema markup. Use FAQPage, HowTo, and Article JSON-LD so machines understand the page.
- Audit with AI. Run your target queries through each engine, find citation gaps, and rewrite weak pages.
This matters more than ever because the audience is now massive. ChatGPT reached 900 million weekly active users as of February 2026, up from 800 million in October 2025 (TechCrunch). That’s not a niche channel. That’s mass distribution.
Now let me define the thing properly, then walk through each step.
What Is Generative Engine Optimization (GEO)?
Generative engine optimization (GEO) is the practice of structuring your content so AI answer engines like ChatGPT, Perplexity, Claude, and Google AI Overviews quote and cite it inside their generated answers.
The term comes from a 2024 Princeton-led research paper, “GEO: Generative Engine Optimization” (Aggarwal et al., presented at KDD 2024) (arXiv). That paper is the origin point, and it’s worth knowing because vague competitor guides cite “studies” without ever naming one.
GEO vs Answer Engine Optimization (AEO)
You’ll also hear answer engine optimization (AEO). People use the two terms loosely. Here’s the practical split I use:
- AEO is about winning the answer: featured snippets, People Also Ask boxes, voice answers, and the extractable quote.
- GEO is the broader practice of being the source a generative engine grounds its answer on.
In plain English, both come down to the same thing. Be the cleanest, most trustworthy, most quotable source on the page.
What Are LLM Citations? (and the entities behind them)
A few entities you need to understand:
- Retrieval. When you ask ChatGPT or Perplexity a question, it often searches the live web, pulls back candidate pages, and reads them. That search-and-fetch step is retrieval.
- Grounding. The model writes its answer based on those retrieved pages instead of pure memory. This is also called retrieval-augmented generation (RAG).
- LLM citations. The little numbered source links the engine attaches to its answer. That’s what you’re competing for.
Get cited, and you get visibility, traffic, and trust inside an answer the user already believes.
How Is GEO Different From Traditional SEO?
Traditional SEO competes to rank in Google’s list of ten blue links, where the clicks get split across results. GEO competes to be quoted inside one synthesized answer, where being cited is basically everything.
Here’s the table that reframes the whole opportunity:
| | Traditional SEO | Generative Engine Optimization (GEO) |
|---|---|---|
| Goal | Rank in the list of links | Be quoted in the AI answer |
| Surface | 10 blue links | One synthesized answer with citations |
| Winner takes | Clicks split across page one | The cited source wins the mindshare |
| Ranking required? | Yes, position 1 to 3 matters most | No, ChatGPT cites position 21+ ~90% of the time |
| Key lever | Backlinks + keywords | Structure, extractability, authority signals |
| Format that wins | Comprehensive page | Self-contained answer blocks |
That fourth row is the unlock. Semrush clickstream analysis found ChatGPT Search cites webpages ranking in traditional Google positions 21 or lower about 90% of the time (Semrush). Ranking and citation are decoupled.
Read that again. You don’t need to outrank the incumbents to get cited. You need to be more extractable and more trustworthy on the specific question.
And the clicks are draining from the old surface anyway. In a Pew Research study, Google users clicked a traditional search result in only 8% of searches that showed an AI summary, versus 15% on searches without one, and clicked a link inside the AI summary just 1% of the time (Pew Research Center).
So the question isn’t “rank or get cited.” You do both. GEO is the bolt-on to your AI SEO strategy, not a replacement for it. The two playbooks reinforce each other.
How ChatGPT, Perplexity, and Claude Actually Find Sources
This is where every other guide gets lazy. They say “optimize for AI search” as if ChatGPT, Perplexity, Claude, and Google AI Overviews work the same way. They don’t. Each one has its own crawler and its own retrieval behavior.
Here is the actual breakdown:
| Engine | Crawler / bot | How it retrieves | What you must do |
|---|---|---|---|
| ChatGPT Search | OAI-SearchBot, GPTBot, ChatGPT-User | Runs on the Bing index; searches on ~34.5% of queries | Allow OAI-SearchBot, submit to Bing Webmaster Tools |
| Perplexity | PerplexityBot | Live retrieval on nearly every query, recency-weighted | Allow PerplexityBot, keep pages fresh |
| Claude | ClaudeBot, Claude-User | Real-time web browsing when search is enabled | Allow ClaudeBot, clean extractable answers |
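| Google AI Overviews | Googlebot (Google-Extended is a robots.txt control token for Gemini, not a separate crawler) | Built on Google’s index and Gemini | Standard SEO with Googlebot allowed; leave Google-Extended unblocked for Gemini |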
Two things matter most here.
First, ChatGPT search runs on the Bing index. That’s why submitting your sitemap to Bing Webmaster Tools isn’t optional. If Bing hasn’t crawled you well, ChatGPT struggles to find you. ChatGPT also triggered web search on about 34.5% of queries as of February 2026, and outbound referral traffic from ChatGPT to the wider web grew 206% in 2025 (Semrush). The pipe is real and growing.
Second, Perplexity is live and recency-hungry. It retrieves on almost every query and weights fresh content heavily. If you update a page today, Perplexity can pick it up fast.
For the Google AI Overviews side specifically, I’m not going to repeat myself here. I wrote a full deep-dive on optimizing for Google AI Overviews, so go there for the Google-only mechanics. This page owns the multi-engine ChatGPT, Perplexity, and Claude cluster.
If you want to understand why the two models surface and word answers differently, my ChatGPT vs Claude breakdown covers the behavioral differences.
Here’s the deal: five steps, and AI runs most of them.
Step 1: Make Sure AI Crawlers Can Reach You
You can’t get cited by a bot you blocked. Step one is a thirty-minute technical check that most sites get wrong.
Open your robots.txt and confirm you’re allowing the AI search crawlers. OpenAI documents that sites which disallow its OAI-SearchBot crawler in robots.txt won’t be shown in ChatGPT search answers (OpenAI Developer Documentation). That’s a hard prerequisite, not a suggestion.
Here is a baseline allowlist to confirm in your robots.txt:
```text
User-agent: OAI-SearchBot
Allow: /
Disallow:

User-agent: GPTBot
Allow: /
Disallow:

User-agent: ChatGPT-User
Allow: /
Disallow:

User-agent: PerplexityBot
Allow: /
Disallow:

User-agent: ClaudeBot
Allow: /
Disallow:

User-agent: Google-Extended
Allow: /
Disallow:

Sitemap: https://yoursite.com/sitemap.xml
```
The canonical “allow everything” directive in the robots.txt spec is an empty Disallow: line, which is why each group above pairs Allow: / with Disallow:. The safest rule of all is simply never listing a bot under any Disallow block. Absence of a Disallow rule is what actually grants access, so if you’re unsure, the move is to remove the bot from any blocking group, not add a clever new line.
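If you’d rather not eyeball the file, here is a minimal sketch of the same check in Python, using the standard library’s `urllib.robotparser`. The `blocked_bots` helper and the sample robots.txt are hypothetical names made up for illustration. Treat the result as a first pass: Python’s parser applies the first matching rule in a group, which may not mirror exactly how each engine’s crawler resolves conflicting rules.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article; adjust to the bots you care about.
AI_BOTS = [
    "OAI-SearchBot",
    "GPTBot",
    "ChatGPT-User",
    "PerplexityBot",
    "ClaudeBot",
    "Google-Extended",
]


def blocked_bots(robots_txt: str, url: str, bots=AI_BOTS) -> list[str]:
    """Return the bots that this robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: GPTBot is blocked, everyone else is allowed.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow:
"""

print(blocked_bots(SAMPLE_ROBOTS, "https://yoursite.com/some-page"))
```

To run it against a live site, fetch your own robots.txt text first and pass it in. Any bot the function returns is one you are currently turning away.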
A few notes from experience:
- OAI-SearchBot powers ChatGPT search citations. GPTBot collects training data. You generally want both allowed, but OAI-SearchBot is the one that affects whether you get cited.
- Submit your sitemap to Bing Webmaster Tools, not just Google Search Console. ChatGPT search runs on the Bing index, so if you’ve never touched Bing, that single step can get you found.
- Optional: add an llms.txt file at your site root listing your 20 to 50 best pages. It’s an emerging convention, not a guarantee, but it’s cheap to add and points engines at your strongest content.
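For the optional llms.txt, a rough sketch of a generator is below. The layout (an H1 title, a blockquote summary, then a markdown link list) follows my reading of the llms.txt proposal, which is still an emerging convention, so treat the format as an assumption. The `build_llms_txt` function, the section name “Key pages”, and the sample page are all hypothetical.

```python
def build_llms_txt(site_name: str, summary: str,
                   pages: list[tuple[str, str, str]]) -> str:
    """Render a minimal llms.txt: H1 title, blockquote summary, one link list.

    pages is a list of (title, url, short note) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", "", "## Key pages", ""]
    for title, url, note in pages:
        lines.append(f"- [{title}]({url}): {note}")
    return "\n".join(lines) + "\n"


# Placeholder content; swap in your own 20 to 50 strongest pages.
llms_txt = build_llms_txt(
    "Example Site",
    "Guides on getting cited by AI search.",
    [("GEO guide", "https://yoursite.com/geo", "Multi-engine playbook")],
)
print(llms_txt)
```

Save the output as `llms.txt` at your site root.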
Step 2: Structure Content So AI Can Extract and Quote It
Crawlable is table stakes. Now you make the page easy to quote.
The winning pattern is answer-first, also called the inverted pyramid. Put a direct, self-contained answer of 40 to 100 words immediately under each heading, then expand below it. The model needs a clean block it can lift without rewriting your whole page.
Do this on every key page:
- Write H2 headings as the exact questions users ask AI. “How is GEO different from traditional SEO?” beats “GEO vs SEO.” Question headings match the query.
- Lead each section with a standalone answer. No “as we discussed above.” It has to make sense quoted in isolation.
- Add a short TL;DR near the top. Both readers and models love a tight summary.
- Use tables and numbered lists. Structured formats get pulled into AI browsing answers far more often than walls of prose, because the data is already parsed for the model.
Think of it like writing for a busy assistant who’s skimming. If your answer is buried in paragraph four, it loses to the competitor who put it in sentence one.
This maps directly to the Princeton GEO findings on structure and fluency, and it’s the same answer-first discipline I cover in depth in structuring content with AI. Go there for the on-page templates.
Here’s the difference in practice. Say you run a dog training site and you want to get cited for “how long does it take to potty train a puppy.” Don’t open with a 300-word story about your golden retriever. Open with: “Most puppies are reliably potty trained in 4 to 6 months, though some take up to a year. Consistency, crate training, and feeding on a schedule speed it up.” That block can be lifted verbatim. That’s what gets cited.
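The answer-first rules above are easy to lint. Here is a minimal sketch that checks two of them: question-format headings, and an opening paragraph inside the 40 to 100 word target. It assumes you have already split a page into heading and body pairs, and that paragraphs are separated by blank lines. The `check_answer_first` name is hypothetical, and word count is only a crude proxy for “extractable”.

```python
def first_paragraph_word_count(section_body: str) -> int:
    """Count whitespace-separated words in the first paragraph of a section."""
    first_paragraph = section_body.strip().split("\n\n", 1)[0]
    return len(first_paragraph.split())


def check_answer_first(sections: dict[str, str],
                       low: int = 40, high: int = 100) -> list[str]:
    """Return a list of problems found; an empty list means every section passes."""
    problems = []
    for heading, body in sections.items():
        if not heading.rstrip().endswith("?"):
            problems.append(f"{heading!r}: heading is not phrased as a question")
        count = first_paragraph_word_count(body)
        if not low <= count <= high:
            problems.append(
                f"{heading!r}: opening answer is {count} words, target {low}-{high}"
            )
    return problems
```

Run it over your key pages and fix whatever it flags, starting with the pages you most want cited.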
Step 3: Add the Authority Signals That Make AI Trust You
But structure alone won’t save you. Structure gets you extractable. Authority gets you trusted. These are the signals that tip a model toward citing you over the other guy.
The Princeton-led GEO paper found that generative engine optimization methods can boost a source’s visibility in AI answers by up to 40%, with citing sources, adding quotations, and adding statistics among the most effective tactics (Aggarwal et al., KDD 2024).
So apply these five tactics from that study:
- Cite your sources with real outbound links. (Notice how many citations are in this article. That’s on purpose.)
- Add direct quotations from credible sources.
- Add concrete statistics with attribution. “Boosts visibility by up to 40%” beats “works well.”
- Improve fluency. Clean, confident, readable prose.
- Write in an authoritative voice. Take a position. Hedge-everything content doesn’t get quoted.
Then layer on E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness):
- A named author with a real bio. Not “admin.” Not “the team.”
- Visible publish and last-updated dates. Recency is a ranking and citation signal, especially for Perplexity.
- Inline references throughout, like the ones in this piece.
Authority is also topical. One orphan page gets less trust than a tight cluster of pages that cover the full topic. That’s where AI-assisted internal linking earns its keep. It signals to engines that you own the topic, not just one keyword.
Step 4: Add Schema Markup So Machines Understand Your Page
Schema markup (structured data in JSON-LD) doesn’t force a citation. Anyone who tells you it’s a magic ranking trick is selling something. But it absolutely helps AI engines parse what your page is about, who wrote it, and when it was updated.
Treat schema as a comprehension aid that reinforces everything in Steps 2 and 3.
The schema types that matter for GEO:
| Schema type | What it tells the engine | Use it on |
|---|---|---|
| FAQPage | These are questions and direct answers | Pages with a Q&A section |
| HowTo | This is a step-by-step process | Tutorials and guides like this one |
| Article | This is editorial content, by this author, on this date | Blog posts and guides |
| Organization | This is the brand behind the content | Sitewide |
| Person | This is the named author and their credentials | Author bylines |
The point of FAQPage schema is that it maps your visible Q&A to a machine-readable structure the engine can lift cleanly. Article and Person schema spell out authorship and freshness, which feeds the E-E-A-T signals from Step 3.
I’m not going to rewrite the implementation here. For the full copy-paste JSON-LD and validation walkthrough, see the schema and JSON-LD markup guide.
The honest take? Schema is necessary hygiene, not a silver bullet. A page with perfect schema but buried answers still loses. Schema reinforces good structure. It doesn’t replace it.
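To make the table above concrete without replacing the full guide, here is a rough sketch that builds FAQPage and Article JSON-LD from Python. The property names are standard schema.org vocabulary. The function names and all sample values are placeholders I made up, and you should validate real markup with a structured data testing tool before shipping it.

```python
import json


def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from (question, answer) pairs visible on the page."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    return json.dumps(data, indent=2)


def article_jsonld(headline: str, author_name: str,
                   date_published: str, date_modified: str) -> str:
    """Build Article JSON-LD with a named author and ISO 8601 dates."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "dateModified": date_modified,
    }
    return json.dumps(data, indent=2)


# Placeholder values for illustration only.
print(faq_jsonld([("What is GEO?", "The practice of structuring content so AI engines cite it.")]))
print(article_jsonld("How to Get Cited by AI Search", "Jane Author", "2026-01-10", "2026-02-20"))
```

Each string goes inside a `<script type="application/ld+json">` tag in the page head. Keep the questions and answers identical to what readers see on the page.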
Step 5: Audit Your AI Visibility With AI (the MarketUnlock Way)
Here is the part no SEO blog covers, because it’s the whole MarketUnlock thesis. You do the audit WITH AI. You let ChatGPT, Perplexity, and Claude grade your own homework.
The workflow:
- List your 20 to 30 most important target queries. The exact questions a buyer would ask.
- Run each one through ChatGPT, Perplexity, Claude, and Google AI Overviews. Log who gets cited.
- Find the gaps. Where are competitors cited and you’re not? That’s your hit list.
- Have Claude rewrite your weak pages answer-first against the page that’s actually winning the citation.
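The logging and gap-finding steps above can be sketched in a few lines of Python. It assumes you record one row per query and engine, listing the domains each answer cited. The row format, the `citation_gaps` function, and every query and domain in the sample are hypothetical.

```python
from collections import defaultdict

# One row per (query, engine) check, logged by hand. All values are made up.
SAMPLE_LOG = [
    {"query": "how long does it take to potty train a puppy",
     "engine": "ChatGPT", "cited": ["rival.com"]},
    {"query": "how long does it take to potty train a puppy",
     "engine": "Perplexity", "cited": ["yoursite.com", "rival.com"]},
    {"query": "best crate size for a puppy",
     "engine": "Claude", "cited": ["rival.com", "other.org"]},
    {"query": "best crate size for a puppy",
     "engine": "ChatGPT", "cited": []},  # answer showed no citations
]


def citation_gaps(log: list[dict], my_domain: str) -> dict[str, list[str]]:
    """Map each query to the engines that cite someone else but not my_domain."""
    gaps = defaultdict(list)
    for row in log:
        if row["cited"] and my_domain not in row["cited"]:
            gaps[row["query"]].append(row["engine"])
    return dict(gaps)


for query, engines in citation_gaps(SAMPLE_LOG, "yoursite.com").items():
    print(f"{query}: missing on {', '.join(engines)}")
```

Queries missing on the most engines go to the top of the hit list.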
Paste this into Claude or ChatGPT to run the gap analysis on a single query:
```text
You are a generative engine optimization (GEO) analyst.

My page: [PASTE YOUR PAGE URL OR FULL TEXT]
Target query: "[THE EXACT QUESTION USERS ASK AI]"
Currently cited source for this query: [PASTE THE COMPETITOR PAGE TEXT OR URL]

Do the following:
1. Compare my page to the cited competitor. Why is the competitor being quoted and not me?
2. Identify the single best 40-100 word passage on the page that an AI engine could lift to answer this query. If none exists, say so.
3. Rewrite the opening of my relevant section as a self-contained, answer-first block that directly answers the query in the first sentence.
4. List 3 authority signals I am missing (cited stats, quotes, author, dates, schema) that the competitor has.

Be specific and blunt. No filler.
```
Run that across your hit list and you’ll have a prioritized rewrite queue in an afternoon. To produce the answer-first content at scale once you know the gaps, plug it into your AI content workflow.
Track it like a KPI. Citation frequency is your new ranking. Tools that monitor where AI engines cite you include Semrush AI Toolkit, Otterly.AI, and Profound. Pick one, log a baseline, and watch the trend. Reddit and Quora also get cited heavily by these engines, so it’s worth noting where the conversation about your topic is happening, not just your own pages.
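If you want a baseline before paying for a tool, citation frequency is simple to compute from your own checks. A minimal sketch, assuming you log each check as an engine name plus the list of domains it cited; the `citation_rate` function and the sample data are hypothetical.

```python
from collections import Counter

# (engine, domains cited in that answer) for each manual check. Made-up data.
SAMPLE_CHECKS = [
    ("ChatGPT", ["rival.com"]),
    ("ChatGPT", ["yoursite.com", "rival.com"]),
    ("Perplexity", ["yoursite.com"]),
    ("Perplexity", ["yoursite.com", "other.org"]),
    ("Claude", []),
]


def citation_rate(checks: list[tuple[str, list[str]]],
                  my_domain: str) -> dict[str, float]:
    """Share of checks per engine in which my_domain appeared among the citations."""
    checked = Counter()
    cited = Counter()
    for engine, domains in checks:
        checked[engine] += 1
        if my_domain in domains:
            cited[engine] += 1
    return {engine: cited[engine] / checked[engine] for engine in checked}


print(citation_rate(SAMPLE_CHECKS, "yoursite.com"))
```

Rerun the same query list on a schedule and compare the numbers. The trend matters more than any single reading.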
Why ChatGPT Cites Some Sites and Not Others
ChatGPT cites a site when five things line up: it can crawl the page (OAI-SearchBot allowed), the page answers the query in a clean extractable passage, the page shows authority through author, dates, and cited sources, the content is reasonably fresh, and Bing’s index already trusts the domain.
Miss any one of those and you fall out of the running.
The catch? The engines barely share a source pool. Averi’s analysis of roughly 680 million AI citations found only about 11% of domains are cited by both ChatGPT and Perplexity (Averi). A page that crushes on Perplexity can be invisible on ChatGPT, because one runs live retrieval and the other leans on the Bing index.
So you don’t optimize for “AI search.” You optimize per engine:
- Perplexity rewards freshness and live structure. Update dates and republish your strongest pages, and it can surface you within days to a few weeks.
- ChatGPT depends on Bing’s index and OAI-SearchBot re-crawls. Expect a four to eight week lag. Submit to Bing Webmaster Tools and be patient.
- Claude rewards clean, well-structured, authoritative pages it can browse and quote directly.
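You can measure that source-pool overlap for your own niche. The sketch below takes the set of domains each engine cited across your target queries and returns the share cited by both. I don’t know how Averi defined its 11% figure, so the formula here (shared domains divided by all domains cited by either engine) is my assumption, and the domains are placeholders.

```python
def shared_domain_share(domains_a: set[str], domains_b: set[str]) -> float:
    """Fraction of all cited domains (the union) that both engines cite."""
    union = domains_a | domains_b
    if not union:
        return 0.0
    return len(domains_a & domains_b) / len(union)


# Placeholder domain sets collected from your own audit log.
chatgpt_domains = {"rival.com", "yoursite.com", "bigbrand.com"}
perplexity_domains = {"bigbrand.com", "forum.example"}

share = shared_domain_share(chatgpt_domains, perplexity_domains)
print(f"{share:.0%} of cited domains appear on both engines")
```

A low number here backs up the per-engine approach: winning one engine tells you little about the other.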
Frequently Asked Questions
How do you get cited by ChatGPT and Perplexity?
Make your pages crawlable by each engine’s bot (allow OAI-SearchBot, PerplexityBot, and ClaudeBot in robots.txt), structure content answer-first with question-format headings and tables, and add authority signals: named author, current dates, cited statistics, and schema markup. The Princeton GEO study found that tactics like citing sources and adding statistics and quotations can boost visibility in AI answers by up to 40%.
What is generative engine optimization (GEO)?
Generative engine optimization (GEO) is the practice of structuring your content so AI answer engines like ChatGPT, Perplexity, Claude, and Google AI Overviews quote and cite it inside their generated answers. The term comes from a 2024 Princeton-led paper. Unlike SEO, the goal is being the source AI trusts, not ranking in ten blue links.
How is GEO different from traditional SEO?
SEO competes for a spot in Google’s list of links, where ten results split the clicks. GEO competes to be the one source quoted inside an AI answer. The two are decoupled: Semrush found ChatGPT cites pages ranking position 21 or lower about 90% of the time, so clean structure and authority beat raw ranking.
What content format gets quoted in AI answers?
Self-contained answer blocks of 40 to 100 words placed directly under question-format H2 headings get quoted most. Back them with at least one comparison table, a numbered list, and a short TL;DR. AI browsing pulls structured, pre-parsed formats far more readily than walls of prose because the answer is easy to lift.
Does schema markup help you get cited by AI search engines?
Schema markup (FAQPage, HowTo, Article, Organization JSON-LD) won’t guarantee a citation, but it helps AI engines parse what your page covers, who wrote it, and when it changed. Think of it as a comprehension aid that reinforces your answer-first structure and E-E-A-T signals, not a standalone ranking trick.
Why does ChatGPT cite some websites and not others?
ChatGPT cites sites it can crawl (OAI-SearchBot allowed), that answer the query in a clean extractable passage, that show authority through author, dates, and cited sources, and that Bing’s index already trusts, since ChatGPT search runs on Bing. Averi’s analysis of 680 million citations found only about 11% of domains are cited by both ChatGPT and Perplexity (Averi), so you have to optimize per engine.
Next Steps
You now have the full multi-engine GEO playbook. Here is where to go from here:
- Optimize for Google AI Overviews specifically -> AI Overview optimization
- Implement the schema -> Schema and JSON-LD markup guide
- Nail the on-page structure -> Structuring content with AI
- Build topical authority -> AI-assisted internal linking
- Connect it to your broader plan -> AI SEO strategy
- Produce answer-first content at scale -> AI content workflow
Want the bigger picture? Start from the content sites hub and build the whole asset, not just one page.
Run the audit prompt above on your three most important pages this week, log the gaps, and let Claude rewrite the weak ones. That single afternoon is the highest-leverage GEO work you can do.
Now go get cited.