The Ultimate Guide · 2026 Edition

The complete guide to GEO & AEO.

How to make your website citable by ChatGPT, Claude, Perplexity, Gemini, and Google AI Overviews — built on the peer-reviewed Princeton KDD 2024 paper. Eleven chapters. Eight thousand five hundred words. One source of truth.

Word count: ~8,500 · Reading time: ~32 min · Updated: 2026-05-01 · Engines covered: 5
▎ TL;DR — read this first

AI search is the new search. 25.11% of Google queries trigger AI Overviews; 87% of all AI referral traffic flows through ChatGPT alone; AI traffic converts at 5× the rate of Google organic. If your site isn't cited by AI engines, you're invisible to a large and growing share of high-intent users.

The good news: it's measurable and fixable. The peer-reviewed Princeton KDD 2024 paper tested 9 optimization tactics across 10,000 generative engine queries. Three dominated: emphasize sources (+115%), add expert quotes (+41%), add statistics (+40%). Together they more than double citation likelihood.

This guide teaches the rest. Eleven chapters covering technical foundations (robots.txt, AI bots, schema), content tactics, entity signals, per-engine optimization (ChatGPT, Claude, Perplexity, Gemini, AI Overviews), measurement, common mistakes, and a week-by-week 30-day action plan. Run a free GEO audit on your site to see where you stand before you start.

Chapter 1 · The Stakes — Why AI search matters now

For the first thirty years of the web, search meant Google. You typed a query, saw a list of ten blue links, picked one, clicked through. The site that ranked higher got more traffic. Optimize for keywords, build backlinks, ship a sitemap — that was SEO.

That model is breaking. The break happened slowly between ChatGPT's launch in 2022 and 2025, then accelerated through 2026. Four numbers tell the story:

  • 25.11% · Google queries triggering AI Overviews (Q1 2026)
  • 900M · ChatGPT weekly active users
  • 5× · AI traffic conversion vs Google organic
  • $33.7B · GEO market by 2034 (CAGR 50.5%)

Together these say: AI search is no longer experimental. It's where a large and growing share of high-intent traffic comes from, with 5× better conversion than the source it's replacing. The companies that get cited by AI engines win the next decade.

What "being cited" actually means

When a user types a question into ChatGPT or Perplexity or Google's AI Overviews, the engine doesn't return ten links. It returns one synthesized answer, drawn from a set of cited sources. Citation looks like this:

"GEO market is projected to reach $33.7B by 2034, growing at a CAGR of 50.5% [1]. The largest single contributor is enterprise content marketing, where AI Overview citation has become the primary source of brand visibility [2]."
Hypothetical Perplexity response · sources [1] and [2] embedded

Those [1] and [2] are clickable. They link back to the source. Being a source means being the link.

Across all five major engines (ChatGPT, Claude, Perplexity, Gemini, Google AI Overviews), a citation in the answer is worth roughly what a #1 organic search result was in 2015 — the highest-trust, highest-conversion form of inbound visibility available. Yet most websites are not optimized to be cited. They're optimized to rank in lists of links, which is a different problem.

The shift from ranking to citation

Traditional SEO optimizes for ranking. The question is: among ten possible links, will yours appear higher? GEO optimizes for citation. The question is: among the entire indexable web, will the AI engine choose your content as a source for its synthesized answer?

These are different problems with different mechanics. Ranking rewards backlinks, domain authority, keyword targeting. Citation rewards citability — content that is structurally easy to lift, attribute, and quote. The Princeton paper formalized this distinction. The rest of this guide teaches it.

Chapter 2 · Definitions — GEO, AEO, and how they relate to SEO

Three acronyms. Easy to confuse. Worth getting right.

SEO — Search Engine Optimization

SEO is what we've done since the late 1990s. It's optimization for traditional, list-of-links search engines: Google, Bing, DuckDuckGo, Yandex, Baidu. The goal is to rank — to appear higher in the list of ten or twenty results returned for a query. Tactics include keyword targeting, backlink building, technical site health, page speed, mobile-friendliness, and structured data.

AEO — Answer Engine Optimization

AEO targets answer surfaces — places where the engine extracts and presents a direct answer to the user's question, not a list of links. Featured snippets in Google. Voice assistants like Alexa and Siri. Google's AI Overviews. Bing's instant answers. AEO emphasizes direct-answer formatting: clear questions and answers, FAQPage schema, concise factual statements, and structured data that makes content extractable.

GEO — Generative Engine Optimization

GEO targets generative AI search — engines that synthesize a response rather than extract a single answer. ChatGPT. Claude. Perplexity. Google Gemini. Generative engines combine information from multiple sources into a fluent, multi-paragraph response, with citations linking back to the sources. GEO emphasizes citability — content that is quotable, attributable, and likely to be selected as one of those sources.

How they relate

Imagine a Venn diagram with three overlapping circles. The overlap is large.

  • Most technical tactics overlap all three: HTTPS, mobile-friendliness, fast page load, valid HTML, accessible content. SEO requires these. AEO requires these. GEO requires these.
  • Schema markup matters for all three but with different priorities. SEO benefits from any valid markup. AEO especially rewards FAQPage and HowTo. GEO especially rewards Article (with author markup) and Speakable.
  • Backlinks matter for SEO heavily, AEO moderately, GEO surprisingly little compared to content quality signals.
  • Content tactics diverge most. SEO rewards keyword density and topic depth. AEO rewards direct-answer formatting. GEO rewards citability — emphasized sources, expert quotes, statistics density.

For most sites, all three matter. The Princeton paper showed that some GEO tactics (source emphasis, expert quotes) are also good SEO tactics — they signal authority and trust regardless of which engine type is consuming the page. That's lucky: doing GEO well typically improves SEO and AEO too.

For more, see our GEO vs AEO vs SEO disambiguation page.

Chapter 3 · How AI engines decide what to cite

For most queries, generative engines don't browse the web in real time. They've already crawled, indexed, and partially summarized the web. When you ask a question, the engine performs three steps:

  1. Retrieval. Find candidate sources from the index that are relevant to the query. This is similar to traditional search ranking — relevance + authority signals decide which 10-50 sources are candidates.
  2. Synthesis. Read the candidate sources and write a single coherent answer. This is where the LLM does the work. It draws information from multiple sources, paraphrases, combines, and produces a response.
  3. Attribution. Decide which sources to cite. The engine surfaces a subset of the candidates as the citations shown to the user.

GEO is mostly about steps 2 and 3. Step 1 is largely the same as SEO: be indexable, be relevant, be authoritative. Steps 2 and 3 are where citability comes in.

Why some sources get cited and others don't

The Princeton 2024 paper investigated this question empirically. The team built GEO-bench, a benchmark of 10,000 queries, and tested 9 candidate "optimization tactics" — content modifications applied to source material before re-running each query. They measured which tactics caused the modified source to receive more visibility in the synthesized response.

The findings were sometimes intuitive (citing your sources helps) and sometimes counterintuitive (keyword stuffing actively hurts):

 #   Tactic               Visibility impact
 1   Source emphasis      +115%
 2   Expert quotes        +41%
 3   Statistics           +40%
 4   Inline citations     +30%
 5   Authority signaling  +25%
 6   Improved fluency     +15%
 7   Easy-to-read         +12%
 8   Topic relevance      +10%
 9   Keyword stuffing     −22%

The headline finding: source emphasis — the simple act of explicitly bolding, citing, and attributing your sources — increases citation likelihood by +115%. This is the strongest single tactic. It costs nothing beyond formatting.

For a deeper read on the methodology, see our research foundation page.

Chapter 4 · Technical foundations — can AI bots reach your content?

The cheapest, most foundational fix is also the most overlooked: are you allowing AI bots to crawl your site?

The Princeton paper makes this explicit:

"If crawlers can't find and parse your content, prose optimization doesn't matter." (Aggarwal et al., 2024)

The 27 AI bots you should explicitly allow

As of 2026, there are 27 distinct AI crawler user-agents you should know about. Most are operated by major LLM providers; some are operated by emerging engines. We track them in best-aeo-skill:

User-agent: GPTBot              # OpenAI training + ChatGPT browse
User-agent: ChatGPT-User        # ChatGPT real-time fetch
User-agent: OAI-SearchBot       # SearchGPT
User-agent: ClaudeBot           # Anthropic search/browse
User-agent: anthropic-ai        # Anthropic web crawler
User-agent: Claude-Web          # Claude.ai user fetches
User-agent: Claude-User         # Claude tool use
User-agent: Claude-SearchBot    # Claude search index
User-agent: PerplexityBot       # Perplexity index
User-agent: Perplexity-User     # Perplexity user fetches
User-agent: Google-Extended     # Bard/Gemini training
User-agent: GoogleOther         # AI Overviews + SGE
User-agent: Applebot            # Spotlight + Siri
User-agent: Applebot-Extended   # Apple Intelligence
User-agent: FacebookBot         # Meta AI
User-agent: Meta-ExternalAgent  # Meta tools
User-agent: YouBot              # You.com
User-agent: cohere-ai           # Cohere
User-agent: MistralAI-User      # Mistral fetch
User-agent: CCBot               # Common Crawl (LLM training)
User-agent: Bytespider          # ByteDance / Doubao
User-agent: Diffbot             # Diffbot Knowledge Graph
User-agent: Amazonbot           # Amazon AI
User-agent: DuckDuckBot         # DuckDuckGo
User-agent: YandexBot           # Yandex
User-agent: Bingbot             # Bing index (Copilot)
User-agent: Googlebot           # Standard Google search
Allow: /

Block any of these and you cut yourself off from the corresponding engine. Some sites do this thinking they're "protecting their content." In practice, they're invisible to a major share of high-intent users who now ask AI engines instead of typing keywords.

Verify, don't trust

Listing a User-agent in your robots.txt is necessary but not sufficient. Two common failure modes:

  • CDN bot management. Cloudflare, Akamai, and other CDNs run their own bot-detection rules independently of robots.txt. As of Q3 2024, Cloudflare added a dedicated "AI Bots" management category that some site owners enable by default — blocking even bots their robots.txt explicitly allows. Check Cloudflare → Security → Bots → AI bots.
  • JavaScript rendering. AI bots have inconsistent JavaScript execution. A pure SPA (single-page app) that renders content client-side may appear empty to many bots. Use server-side rendering (SSR) or static generation for content-bearing pages.

The fix: test fetch-as-bot for the engines you care about. Curl with the User-agent header, render the response with a basic HTML parser, confirm your content is in the markup. best-aeo-skill's ai_bot_access evidence collector does this automatically across all 27 agents.
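The fetch-as-bot test above can be sketched with the Python standard library alone. The User-Agent values below are shortened placeholders, not the vendors' full published strings; substitute the real ones before using this:

```python
import urllib.request

# Shortened placeholder tokens -- substitute each vendor's full published string.
BOT_AGENTS = {
    "GPTBot": "GPTBot/1.0",
    "ClaudeBot": "ClaudeBot/1.0",
    "PerplexityBot": "PerplexityBot/1.0",
}

def fetch_as_bot(url: str, agent: str, timeout: float = 10.0) -> str:
    """Fetch a page while presenting an AI-crawler User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

def content_visible(html: str, marker: str) -> bool:
    """True if a known phrase from the page body appears in the raw markup,
    i.e. the content does not depend on client-side rendering."""
    return marker.lower() in html.lower()

# Usage sketch (network call, so not executed here):
#   for name, ua in BOT_AGENTS.items():
#       html = fetch_as_bot("https://example.com/", ua)
#       print(name, content_visible(html, "a phrase from your page body"))
```

If `content_visible` returns False for a bot that your robots.txt allows, suspect CDN bot management or client-side rendering.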

Sitemap, hreflang, and other classics

Standard SEO foundations remain important: a current XML sitemap, valid hreflang tags for multilingual sites, clean 200/301/404 status codes, mobile-friendliness, fast page load. AI engines reuse most of the indexing pipeline traditional search engines built. None of these tactics are GEO-specific, but they're prerequisites.

Chapter 5 · Content citability — the tactics that matter most

Once your site is technically accessible, content quality determines whether AI engines actually cite you. This is where the Princeton paper's strongest findings live.

Source emphasis (+115%)

The single highest-leverage tactic. Princeton found that pages which explicitly emphasize their sources — through inline citations, links to primary references, or visible attribution — are 2.15× more likely to be cited by generative engines than equivalent pages without emphasis.

Practical implementation:

  • Every numeric claim should link to a primary source at the point of claim, not just at the bottom.
  • Use bold or italics on cited entities ("according to Princeton's KDD 2024 paper...").
  • Maintain a visible reference list at the bottom of long articles.
  • For research-style content, use footnote-style numbered references like [1], [2].
  • Cross-link related articles internally — this signals topical authority.
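In markup, the pattern from the list above might look like this (the statistic is this guide's own claim; the footnote anchor `#ref-1` is a placeholder for your reference list):

```html
<p>
  According to the <strong>Princeton KDD 2024 paper</strong>
  (Aggarwal et al.), source emphasis lifts generative-engine
  visibility by <strong>+115%</strong> <sup><a href="#ref-1">[1]</a></sup>.
</p>
```

The bolded entity, named authors, and numbered reference at the point of claim are the three emphasis signals in one sentence.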

Expert quotes (+41%)

Adding 2-4 attributed quotations per ~1000 words raises citation likelihood by 41%. AI engines treat quoted passages as anchor evidence when synthesizing responses — they're easier to lift verbatim into a synthesized answer.

Two requirements for quotes to count:

  1. Speaker name and credential. "According to Dr. Jane Smith, professor of computer science at Stanford, ..." — not "according to one expert" or "anonymous source." Anonymous attribution patterns reduce citation rate.
  2. Quotation marks. Use proper quotation marks or HTML <blockquote> elements. Engines look for these signals.
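In HTML, a quote satisfying both requirements might look like this (the quote text is invented for illustration; the speaker reuses the example above; the `cite` URL is a placeholder):

```html
<blockquote cite="https://example.com/source-interview">
  <p>"Source emphasis is the cheapest visibility win most sites still ignore."</p>
  <footer><cite>Dr. Jane Smith</cite>, professor of computer science, Stanford</footer>
</blockquote>
```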

Statistics density (+40%)

Pages with at least one numeric claim per 200 words receive 40% more citations than thin-statistic pages. The reason: AI engines often need a number to anchor a synthesized answer. Pages that supply the number with a source become natural citations.

Practical guideline: aim for one statistic per 200 words. For a 1500-word article, that's 7-8 numeric claims. For a 600-word post, three. Pair each statistic with a citation to its primary source — then you double-dip on tactic 1 (source emphasis) and tactic 3 (statistics).
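As a rough self-check, statistic density can be approximated by counting digit-bearing tokens. This is a crude proxy (it ignores spelled-out numbers and counts dates), but it is enough to flag thin-statistic drafts:

```python
import re

def statistic_density(text: str) -> float:
    """Approximate numeric claims per 200 words, using digit-bearing
    tokens as a proxy for statistics."""
    words = text.split()
    if not words:
        return 0.0
    numeric = sum(1 for w in words if re.search(r"\d", w))
    return numeric / len(words) * 200
```

A value near 1.0 matches the one-statistic-per-200-words guideline.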

Freshness

Empirical data from large-scale citation analysis shows content under 30 days old receives 3.2× more citations than equivalent older content. Three implications:

  • Add a machine-readable dateModified field to your JSON-LD on every content page.
  • Update major articles every 30-90 days — actual updates, not just changing the date.
  • For evergreen topics, schedule quarterly refreshes to keep the freshness signal alive.

Readability

Princeton found a Flesch-Kincaid grade level around 8-10 performs best for general AI search visibility. Higher (academic-grade text) reduces citations from general engines. Lower (oversimplified, listicle-style) reduces authority signaling.

Mix sentence lengths. Use short sentences (under 15 words) and medium sentences (15-25 words) in roughly equal proportion. Monotonic sentence length is a signal of AI-generated content, which engines increasingly de-cite.
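The sentence-length mix can be checked mechanically; a sketch using naive punctuation splitting (real prose needs abbreviation handling this ignores):

```python
import re

def sentence_length_mix(text: str) -> dict:
    """Share of short (<15 words), medium (15-25 words), and long
    sentences. Roughly equal short/medium shares is the target profile."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    total = len(lengths) or 1
    short = sum(1 for n in lengths if n < 15)
    medium = sum(1 for n in lengths if 15 <= n <= 25)
    return {
        "short": short / total,
        "medium": medium / total,
        "long": (total - short - medium) / total,
    }
```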

Avoid AI-rewrite tells

Modern detection systems flag patterns common in machine-generated content. Avoid:

  • "It's important to note" — overused by LLMs as a transition.
  • "In conclusion" — formulaic; rewrite the closing paragraph.
  • "Furthermore" — almost never needed in human writing.
  • "Delve into" — heavily flagged.
  • Every paragraph starting with the same connector ("Moreover... Additionally... Finally...").

Chapter 6 · Structured data — making your content parseable

Structured data (JSON-LD via Schema.org) is how you tell engines what your content is. AI search engines lean on it heavily — much more so than traditional Google ever did.

FAQPage — the highest-leverage single signal

Across all schema types, FAQPage produces the highest single-signal AI citation rate. Why: AI engines often surface answers to user questions by extracting Q&A pairs from FAQPage schema directly. If your page has FAQ markup, you become a candidate citation for every related question.

Best practices:

  • 5-10 question-answer pairs per page is ideal. Below 3 reduces signal weight; above 15 dilutes.
  • Each question must be a real question — phrased with a question mark.
  • Each answer should be 30-200 words. Too short looks unauthoritative; too long becomes unextractable.
  • Use real user questions (from search console, support tickets, sales calls) — not synthetic Q&A. AI engines detect mass-generated FAQ content and de-cite it.
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is generative engine optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Generative engine optimization (GEO) is the practice of structuring web content to maximize..."
      }
    }
  ]
}
</script>

Article + author markup

Every content page should have Article (or BlogPosting or NewsArticle) markup with these required fields:

  • headline — page title
  • datePublished + dateModified — ISO-8601 timestamps
  • author — Person markup with name, jobTitle, sameAs links
  • publisher — Organization markup with logo

Anonymous authorship reduces citation rate by ~60%. If your articles don't have a named author with credentials, fix this first. The Person markup should include sameAs URLs that resolve — LinkedIn, Twitter/X, GitHub, ORCID for academics, Wikipedia if applicable.
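A minimal Article block covering the four required fields might look like this (all names, dates, and URLs are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What Is Generative Engine Optimization?",
  "datePublished": "2026-01-15T09:00:00Z",
  "dateModified": "2026-05-01T09:00:00Z",
  "author": {
    "@type": "Person",
    "name": "Jane Smith",
    "jobTitle": "Head of Search",
    "sameAs": [
      "https://www.linkedin.com/in/jane-smith-example",
      "https://github.com/jane-smith-example"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example Co",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.png"
    }
  }
}
</script>
```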

Organization with sameAs

Your site's Organization markup should include sameAs URLs that link to authoritative entities for the same organization. Wikidata is the most achievable; Wikipedia is the most prestigious. Crunchbase, LinkedIn, and major industry registries also count. The more sameAs links, the stronger the Knowledge Graph trust signal.
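A sketch of an Organization block with sameAs links (the Q-number and URLs are placeholders to replace with your real profiles):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com",
  "sameAs": [
    "https://www.wikidata.org/wiki/Q00000000",
    "https://www.linkedin.com/company/example-co",
    "https://www.crunchbase.com/organization/example-co"
  ]
}
</script>
```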

HowTo, Speakable, Product, BreadcrumbList

Other schema types worth adding where applicable:

  • HowTo — for tutorials and step-by-step guides. Required fields: name, step list with HowToStep entities, optional images per step.
  • Speakable — Speakable Specification markup tells voice-AI surfaces (Google Assistant, Alexa) which sentences/headings to read aloud. Critical for AI Overview voice variants.
  • Product + AggregateRating + offers — for e-commerce. Without these, you can't be cited in AI buying-guide responses.
  • BreadcrumbList — on every non-home page. Improves contextual citation by clarifying page hierarchy.

Always validate before deploy. Schema.org's validator and Google's Rich Results test catch most errors. best-aeo-skill's schema_validate evidence collector runs both.

Chapter 7 · Entity and brand signals — do you exist as an authority?

Sustained citation requires entity presence. A page can have perfect technical setup, perfect schema, perfect content — and still get cited rarely if the underlying organization or author is unknown to the AI engines' Knowledge Graph.

Author bios with credentials

Author bios should include:

  • Full name
  • Role and organization
  • Years of experience
  • 2+ credentials (PhD, MD, certifications, named publications, awards)
  • Links to verified profiles (LinkedIn, ORCID, Twitter)
  • A photo (real, not stock)

Author profile pages with full Person schema increase article citation rate by ~20%.

Wikidata before Wikipedia

A Wikipedia entry is hard to get and easy to lose. A Wikidata entity (a Q-number) is far more achievable and provides similar trust-signal value. Wikidata is open to community editing with relaxed notability requirements; if you have any verifiable third-party coverage, you can be in Wikidata.

Once your organization has a Q-number, link to it from your Organization schema's sameAs field. This creates a verifiable Knowledge Graph trust path that AI engines can follow.

NAP consistency for local

For local businesses, Name / Address / Phone (NAP) consistency across surfaces is non-negotiable:

  • Website footer
  • Google Business Profile (GBP)
  • Apple Maps profile
  • Yelp, Tripadvisor, industry directories
  • LocalBusiness JSON-LD on the website

Inconsistent NAP reduces local AI Overview citation by ~40%. The fix is mechanical: write down the canonical NAP once, audit every surface, fix mismatches.
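The mechanical audit above can be scripted; a sketch that normalizes each NAP triple before comparing (the data shapes are assumptions, not any directory's real API):

```python
import re

def normalize_nap(name, address, phone):
    """Canonicalize a Name/Address/Phone triple: lowercase, collapse
    whitespace, and keep only the digits of the phone number."""
    squash = lambda s: re.sub(r"\s+", " ", s.strip().lower())
    return (squash(name), squash(address), re.sub(r"\D", "", phone))

def audit_nap(canonical, surfaces):
    """Return surfaces whose normalized NAP differs from the canonical one.
    `surfaces` maps a surface name to its (name, address, phone) triple."""
    want = normalize_nap(*canonical)
    return [s for s, nap in surfaces.items() if normalize_nap(*nap) != want]
```

Formatting differences (case, spacing, phone punctuation) normalize away; only substantive mismatches get flagged.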

Brand mentions and external signals

AI engines can detect when a brand is widely mentioned across the web — through citations, social, news coverage, podcast guest appearances, conference speaker activity. These are slow-moving signals; you can't fake them. They build over years through consistent activity.

Two practical accelerators:

  • Original research / annual reports. A recurring data study (e.g., "State of X 2026") that other publications cite is the highest-leverage single move for entity authority.
  • Podcast appearances. Each verified guest spot adds an entity signal. Document them with sameAs links from your Person schema to the episode pages.

Chapter 8 · Multi-engine optimization — each AI search surface is different

Different engines weight signals differently. Optimizing only for ChatGPT is leaving Perplexity, Claude, and AI Overviews on the table. Here's how each major engine prefers to be served:

ChatGPT

  • ~87% of all AI referral traffic, by far the largest. Prioritize this.
  • Favors authoritative long-form (1500+ words), consensus-based content, with explicit attribution.
  • Prefers content with clear "this is the answer" framing. Hedged content ("it depends," "may," "might") gets cited less.
  • Surfaces source links in the response when browsing is enabled (most queries in 2026).

Claude (Anthropic)

  • Rewards precise attribution and high factual density.
  • Most likely of the major engines to cite multiple sources for a single claim — making your page useful even if you're one of three sources, not the only one.
  • Tends to prefer sources with active dateModified within 90 days.
  • Officially supports llms.txt for ClaudeBot — though only a small share of crawls use it.

Perplexity

  • Favors academic and news sources, heavy citation density, fresh content.
  • Citation extraction prefers explicit numbered references ([1] [2] [3]) or footnote-style attribution.
  • Best engine to optimize for if you write data-rich, citation-heavy content.

Google AI Overviews

  • Triggers on ~25.11% of all Google searches in 2026 (up from 13.14% in March 2025). For local queries, the rate is ~38%.
  • Leans on traditional SEO best practices PLUS direct-answer formatting.
  • Rewards FAQPage schema heavily. If your AI Overview presence is low, add FAQ schema first.
  • Speakable schema increases voice-variant inclusion.

Gemini

  • Blends traditional Google ranking signals with AI-specific signals.
  • Optimizing for both Google search AND AI Overviews tends to lift Gemini citation rates as a side effect.
  • Less standalone optimization needed than ChatGPT/Perplexity.

The practical takeaway: optimize broadly. Most signals overlap. The specific differences between engines matter most for sites doing serious volume — for early-stage optimization, get the basics right across the board, then differentiate.

Chapter 9 · Measurement — tracking GEO performance over time

You can't improve what you can't measure. Three layers of measurement, ordered by sophistication:

Level 1: Composite GEO Score (weekly)

Run an audit (we recommend our free tool) on your top 10-20 pages weekly. Track the composite score over time. The goal isn't to hit 100 — it's to detect regressions.

A score that drifts down 5+ points in a week is a signal something broke: your CDN started blocking GPTBot, a deploy stripped your JSON-LD, a content update removed expert quotes. Catch it within a week instead of finding out months later when traffic is gone.
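Detecting that weekly drift is a one-function job; a sketch assuming you keep a list of weekly composite scores per page:

```python
def score_regressions(history, threshold=5.0):
    """Flag pages whose most recent weekly composite score dropped by
    `threshold` or more points. `history` maps page -> scores, oldest first."""
    flagged = []
    for page, scores in history.items():
        if len(scores) >= 2:
            drop = scores[-2] - scores[-1]
            if drop >= threshold:
                flagged.append((page, drop))
    return flagged
```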

Level 2: AI traffic attribution (monthly)

In your analytics platform, segment traffic by AI referrer:

  • chat.openai.com
  • perplexity.ai
  • claude.ai
  • gemini.google.com
  • Google with parameter udm=14 (AI mode)

Track sessions, conversions, and conversion rate per AI source. Compare against organic search and direct traffic. If your AI traffic is converting at significantly better rates (it usually does — typically 3-5×), invest more.
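Referrer classification takes a few lines; the hostnames below mirror the list above, and the udm=14 query check covers Google's AI mode (extend the map as engines move domains):

```python
from typing import Optional
from urllib.parse import urlparse, parse_qs

# Hostnames mirror the referrer list above; extend as engines move domains.
AI_REFERRERS = {
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}

def classify_referrer(url: str) -> Optional[str]:
    """Map a referrer URL to an AI source; None means non-AI traffic."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host in AI_REFERRERS:
        return AI_REFERRERS[host]
    # Google sessions carrying udm=14 came from Google's AI mode.
    if host.endswith("google.com") and parse_qs(parsed.query).get("udm") == ["14"]:
        return "Google AI mode"
    return None
```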

Level 3: Brand mention tracking (quarterly)

Tools like OtterlyAI and Brand24 monitor mentions of your brand in AI search responses. They track which queries trigger your brand to appear, in which engines, with what context. This is the most expensive layer but the most strategic — it tells you where you're winning or losing share-of-voice in AI search.

Content decay detection

Articles older than 90 days with declining citation rates are decaying. Don't ignore them. Refresh signals:

  • Add new statistics from the past 90 days.
  • Update dateModified in JSON-LD.
  • Add new expert quotes.
  • Re-link to current primary sources.
  • Don't just change the date — actually update the content.

best-aeo-skill's monitor sub-skill flags articles showing decay automatically.

Chapter 10 · Pitfalls — 10 anti-patterns that destroy citation rates

  1. Keyword stuffing. The Princeton paper measured a -22% effect. AI engines penalize stuffing more aggressively than Google does.
  2. Synthetic FAQ schema. Mass-generated Q&A that doesn't match real user questions is detected and de-cited.
  3. AI-generated boilerplate. Detection is increasingly accurate. Once flagged, content gets de-cited even if facts are correct.
  4. Blocking AI bots while expecting AI citations. Some sites block GPTBot then complain about no ChatGPT visibility. You can't have both.
  5. Anonymous authorship. Reduces citation rate by ~60%. Always name authors with credentials.
  6. Removing URLs without 301s. Citation links break, citation history is lost. Always 301 redirect.
  7. Hiding content behind cookie banners or modals at first paint. AI bots see what's rendered initially. If your modal blocks the content, it's invisible.
  8. Pure SPAs without SSR. JavaScript-only rendering loses many AI bots. Use SSR or static generation for content pages.
  9. Stale content with no dateModified. 3.2× citation drop after 30 days for unfreshened content.
  10. Optimizing for one engine in isolation. Don't lose Perplexity to win ChatGPT. Most optimizations work across engines.

Chapter 11 · The 30-day action plan

Theory only matters if you ship. Here's a concrete week-by-week plan to take a site from "no GEO" to "actively cited":

Week 1: Technical foundations

  • Run an audit on your homepage and top 10 pages. Establish baseline GEO score.
  • Patch robots.txt to explicitly Allow all 27 AI bots listed in Chapter 4.
  • Verify CDN bot management isn't overriding your robots.txt rules.
  • Submit your XML sitemap to Google Search Console and Bing Webmaster.
  • Generate /llms.txt at root.
  • Generate /.well-known/ai.txt.
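A minimal /llms.txt following the llmstxt.org layout (titles and URLs are placeholders for your own pages):

```markdown
# Example Co

> Example Co publishes guides on AI search optimization. The files below
> are clean Markdown versions of our key pages.

## Guides

- [Ultimate Guide to GEO & AEO](https://example.com/guide.md): 11-chapter guide
- [Research foundation](https://example.com/research.md): summary of the Princeton KDD 2024 paper
```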

Week 2: Schema deployment

  • Add FAQPage schema to top 5 pages. Use real user questions.
  • Add Article + Person author markup to all blog posts.
  • Add Organization + sameAs links to Wikidata and LinkedIn.
  • Validate every schema block in Schema.org and Google Rich Results.

Week 3: Content citability

  • Pick 5 highest-traffic pages.
  • Add 2-4 expert quotes per 1000 words. Use real attributions, not anonymous.
  • Add inline citations to every numeric claim.
  • Bold or emphasize source mentions where they appear.
  • Update dateModified to today's date.

Week 4: Entity and brand

  • Create or update author bios. Add 2+ credentials. Link to LinkedIn/ORCID.
  • If your organization isn't on Wikidata, create a Q-number entry.
  • Audit NAP consistency across all directories (for local businesses).
  • Start tracking AI referral traffic in your analytics.
  • Re-run the audit. Compare to Week 1 baseline.

Realistic results from this 30-day plan: composite GEO score lift of 15-30 points (e.g., 60 → 80), AI referral traffic up 2-4× within 60-90 days as the engines re-crawl. The best-case studies (HubSpot's documented 6× AI-trial lift) come from sustained 6-12 month investment, not 30 days. But 30 days gets you visible.

Reference · Glossary

Twenty-five terms used throughout this guide.

AEO
Answer Engine Optimization. Targeting answer surfaces (featured snippets, voice, AI Overviews) where the engine returns a direct answer.
AI Overviews
Google's AI-generated answer summaries that appear above traditional search results. Replaced "Search Generative Experience" (SGE) in 2024.
BreadcrumbList
Schema.org type for representing page hierarchy. Improves contextual citation.
ClaudeBot
Anthropic's web crawler user-agent for Claude search and browse functionality.
CCBot
Common Crawl's user-agent. Many LLMs train on Common Crawl, so blocking CCBot reduces training-data presence.
Composite GEO Score
A 0-100 score that weights Technical, Citability, Schema, and Entity signals into a single number.
Confidence label
One of Confirmed, Likely, Hypothesis — anti-hallucination markers we attach to every audit finding.
FAQPage
Schema.org type for question-and-answer content. Highest single-signal AI citation rate.
GEO
Generative Engine Optimization. Targeting generative AI engines (ChatGPT, Claude, Perplexity, Gemini).
GEO-bench
The 10,000-query benchmark Princeton built to measure GEO tactics.
GPTBot
OpenAI's web crawler user-agent for training and ChatGPT browse.
HowTo
Schema.org type for step-by-step tutorials.
JSON-LD
JSON for Linked Data. The recommended format for Schema.org structured data.
llms.txt
An emerging standard at llmstxt.org for site-level AI catalogs. Officially honored by Anthropic for ClaudeBot.
NAP
Name / Address / Phone. Consistency across surfaces is critical for local AI search.
PAWC
Position-Adjusted Word Count. Princeton's primary visibility metric in their KDD 2024 paper.
Perplexity
An AI search engine that emphasizes citation-heavy, academic-style responses.
PerplexityBot
Perplexity's index crawler. Distinct from Perplexity-User which handles real-time user fetches.
Schema.org
The shared vocabulary for structured data on the web. Used via JSON-LD.
sameAs
A Schema.org property linking an entity to its representations elsewhere (Wikidata, Wikipedia, LinkedIn).
SARIF
Static Analysis Results Interchange Format. Used for CI/CD integration of code-scanning results.
SEO
Search Engine Optimization. Targeting traditional list-of-links search engines.
SGE
Search Generative Experience. Google's experimental AI search, renamed to AI Overviews in 2024.
Speakable
Schema.org property marking sentences that should be read aloud by voice AI.
Wikidata
Open knowledge base of entities. Q-numbers are unique identifiers. Often achievable when Wikipedia is not.
▎ Next steps

If this guide was useful — start with our free tool. Audit your site in 60 seconds, see your composite GEO score and ranked findings, then install best-aeo-skill to apply fixes with one command.

  • Run free audit
  • Install bestaeo
  • Research foundation
  • Read SKILL.md