ChatGPT Search and SEO. How the system finds sources, which signals drive citations, and what you can actually do about it.

ChatGPT's web search pulls live pages, synthesises an answer, and cites the sources it used. There is no public algorithm. What we observe across client sites is that ChatGPT leans heavily on Bing's index, favours sites with clear entity identity, and prefers content that addresses the query directly. Here is our working understanding of how the system behaves, and the practical moves that help your site become the source it picks.

What ChatGPT Search actually is

ChatGPT Search is the live-web search feature inside ChatGPT. When a user asks a question that benefits from current information, ChatGPT runs a web search, reads a small set of pages, and produces a synthesised answer with citation links. The citations appear as small numbered links inside the response and as a source list at the bottom.

The product is built on top of the underlying ChatGPT LLM. The LLM does the synthesis; a separate retrieval layer does the web search and pulls the candidate pages. The retrieval layer evolved over several iterations and now uses a combination of OpenAI's own crawlers and the Bing search index.

From an SEO perspective, ChatGPT Search is the second-largest AI search surface after Google's AI Overviews. Reach is smaller than Google's because ChatGPT has fewer users, but the user intent is unusually high. People who reach ChatGPT for a specific question are deep in research mode and tend to convert well when they click through. See the tracking AI referrals chapter for the measurement view.

How the system finds sources

The full retrieval pipeline is not public. What OpenAI has confirmed plus what we observe across client sites suggests something like the following. None of this is algorithmic certainty.

  1. Query rewrite. The user's natural-language question gets rewritten into one or more web search queries that are closer to how a search engine indexes content.
  2. Web search retrieval. The rewritten queries are fired against Bing's index plus OpenAI's own crawl. A candidate set of pages comes back.
  3. Page fetch and read. ChatGPT's retrieval layer fetches a small number of candidate pages (typically three to ten), reads the content, and extracts the passages most relevant to the original question.
  4. Synthesis. The LLM synthesises an answer from the extracted passages, with inline citations to the source pages.
  5. Render. The user sees the synthesised answer plus a citation list.

The implication. Two layers matter for citation pickup. First, the retrieval layer has to pick your page as a candidate, which depends mostly on your Bing visibility and your entity signals. Second, the LLM synthesis layer has to choose your page's passages over a competitor's, which depends on how well your content directly addresses the rewritten query.

Why Bing visibility matters

The least-glamorous but most-load-bearing observation. ChatGPT Search has historically drawn on Bing's index for live web retrieval. Pages with strong Bing visibility show up more often as ChatGPT citations than equivalent pages without it.

For Australian businesses, this is a small but real shift in priorities. Most Australian SEO retainers focus almost exclusively on Google because Google has 90-plus percent market share in Australia. Bing is barely thought about. But for ChatGPT visibility, Bing matters disproportionately. Three practical moves:

  • Set up Bing Webmaster Tools. Free. Submit the XML sitemap. The five-minute setup is enough to confirm the site is being crawled and indexed by Bing.
  • Check the site's Bing rankings on priority queries. A quick manual check on Bing for the same queries you monitor on Google. If the site is ranking poorly on Bing despite good Google rankings, that is a flag for ChatGPT Search visibility.
  • Treat Bing-specific optimisation as a small extra step. Most Google SEO work also helps with Bing. The exceptions are niche; do not build a separate Bing strategy unless the site is genuinely underperforming on Bing.

For the wider context on indexation across search engines, see the XML sitemaps chapter and the Technical SEO pillar.

The OpenAI crawlers explained

Two OpenAI crawlers matter for this chapter. Knowing which is which lets you control them properly in robots.txt.

GPTBot

The training-data crawler. Fetches pages to build OpenAI's training corpus for future model versions. Allow or disallow via the User-agent: GPTBot directive in robots.txt. Allowing GPTBot means your content may be used in training future ChatGPT models; disallowing means it will not be. This is a content-rights decision, not an SEO decision. For most marketing-focused sites we recommend allowing GPTBot because the content is already public; for content where exclusivity is the asset (paid courses, premium reports), disallowing is reasonable.

OAI-SearchBot

The live-search crawler. Fetches pages during ChatGPT Search responses to ground the answer. Allow or disallow via the User-agent: OAI-SearchBot directive. Allowing OAI-SearchBot means your pages can be cited in ChatGPT Search responses; disallowing means they cannot. For almost all marketing-focused sites we recommend allowing this crawler; disallowing it removes the site from ChatGPT Search results.

Recommended default for most Australian businesses

Allow both crawlers. The cost is zero (the content is already public and indexable by Google), and the benefit is being available to be cited by ChatGPT. The exceptions are: paid content sites where the AI summarisation directly cannibalises the product, and sites with specific legal or contractual restrictions on AI training data. Outside those, allow both.
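
A minimal sketch of that default, assuming Python's standard urllib.robotparser as a stand-in for how a crawler reads robots.txt. The example.com URLs are placeholders, and the /members/ path is a hypothetical paid-content area included only to show the opt-out case for GPTBot. Real crawlers may resolve conflicting rules differently from this parser, so treat it as a sanity check before deploying rather than proof of crawler behaviour.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: both OpenAI crawlers are allowed site-wide, except that
# GPTBot (training) is kept out of a hypothetical paid-content area.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /members/

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def crawler_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Return, per user agent, whether robots_txt permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Check a public page and the hypothetical paid area for both crawlers.
for page in ("https://example.com/blog/post", "https://example.com/members/report"):
    print(page, crawler_access(ROBOTS_TXT, page, ["GPTBot", "OAI-SearchBot"]))
```

With these rules the public page is open to both crawlers, while the paid area stays available to OAI-SearchBot for citation but closed to GPTBot for training. That split is one way to express the paid-content exception; a site with no such content can drop the Disallow line entirely.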

Signals we observe driving citations

Five patterns from monitoring ChatGPT Search citation behaviour across client sites through 2025 and 2026. Observation, not algorithm.

Signal 1: Bing visibility on the target query

The single best predictor. Pages cited by ChatGPT Search are almost always ranking decently on Bing for the rewritten query. The exact Bing position matters less than being on page one, with top three being noticeably stronger.

Signal 2: Clear entity identity

Sites with locked-down Organization and Person schema get cited more often than sites with ambiguous identity. The pattern is the same as for AI Overviews and Perplexity: AI clients prefer sources they can clearly identify and attribute. See entity SEO.
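
As a rough illustration of what "locked-down" means in practice, the sketch below builds Organization and Person JSON-LD with Python's json module. Every value (the business name, URLs, author and job title) is a hypothetical placeholder; the property names (name, url, sameAs, jobTitle, worksFor) are standard schema.org vocabulary. It shows the shape of the markup, not a complete schema for any particular site.

```python
import json


def organization_jsonld(name, url, same_as):
    """Organization markup with sameAs links to the profiles that confirm identity."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


def person_jsonld(name, job_title, bio_url, org):
    """Person markup for a named author, tied back to the organisation."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "url": bio_url,
        "worksFor": {"@type": "Organization", "name": org["name"], "url": org["url"]},
    }


def as_script_tag(data):
    """Wrap the markup in the script tag that goes in the page HTML."""
    return '<script type="application/ld+json">' + json.dumps(data, indent=2) + "</script>"


# Hypothetical example values.
org = organization_jsonld(
    "Example Legal Pty Ltd",
    "https://example.com",
    ["https://www.linkedin.com/company/example-legal"],
)
author = person_jsonld("Jane Citizen", "Principal Solicitor", "https://example.com/team/jane-citizen", org)
print(as_script_tag(org))
print(as_script_tag(author))
```

The point of the worksFor link is that the author and the organisation resolve to each other, which is the kind of unambiguous attribution this signal describes.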

Signal 3: Named expert authors on YMYL topics

For health, finance, legal and similar high-stakes topics, ChatGPT Search visibly skews toward sources with named credentialled authors. Articles published anonymously or under generic "Admin" bylines get cited far less than equivalent articles with named experts and real Person schema. See E-E-A-T explained.

Signal 4: Direct-answer formatting

Pages where the answer is stated clearly in declarative passages get extracted more cleanly than pages where the answer is buried under preamble. The pattern is identical to the featured snippets pattern: lead with the answer, layer the context underneath.

Signal 5: External authority

ChatGPT Search visibly favours sources with established external authority: Wikipedia, major publishers, large authority sites in the topic, and the established top of the Bing SERP. Below that, the long tail of sources is thinner than the equivalent SERP would be. The digital PR chapter covers the work that builds this authority layer, and the brand mentions chapter covers the unlinked-mentions layer that AI systems also pick up.

Practical optimisation moves

Five concrete steps for a site that wants to improve ChatGPT Search visibility. None of these are speculative; all are work you should be doing anyway for traditional SEO, sharpened where it matters.

  1. Set up Bing Webmaster Tools and submit the sitemap. Five-minute job. Confirms the site is being crawled and indexed by Bing.
  2. Allow GPTBot and OAI-SearchBot in robots.txt. Unless you have a specific reason to opt out. Confirm via a manual robots.txt review.
  3. Tighten the entity layer. Valid Organization schema with sameAs links. Named authors with Person schema and real bio pages. LocalBusiness if relevant.
  4. Lead each page with a direct answer. First 50 to 100 words should state the answer to the underlying query in a single declarative passage the AI extractor can lift.
  5. Track citation pickup monthly. Pick ten priority queries. Run each one through ChatGPT Search once a month. Record which sources are cited. Adjust based on what wins.
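
The monthly tracking in step 5 can be kept in a spreadsheet, but a small script makes the month-on-month comparison easier. The sketch below assumes you have manually recorded the cited URLs for each priority query; the queries, URLs and the own-domain value are hypothetical. It reports the share of queries where your domain was cited and which other domains are winning.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domains(cited_urls):
    """Normalise cited URLs to bare hostnames (lowercase, no leading 'www.')."""
    domains = []
    for url in cited_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.append(host)
    return domains


def citation_summary(observations, own_domain):
    """observations maps each query to the URLs cited for it this month.

    Returns (share of queries citing own_domain, Counter of queries per cited domain).
    """
    counts = Counter()
    wins = 0
    for urls in observations.values():
        domains = set(cited_domains(urls))  # count each domain once per query
        counts.update(domains)
        if own_domain in domains:
            wins += 1
    share = wins / len(observations) if observations else 0.0
    return share, counts


# Hypothetical month of manual checks.
this_month = {
    "how does X regulation work in WA": [
        "https://www.example.com/guides/x-regulation",
        "https://en.wikipedia.org/wiki/Example",
    ],
    "what is the process for Y": ["https://competitor.example/process-for-y"],
}
share, counts = citation_summary(this_month, "example.com")
print(share)                  # 0.5
print(counts.most_common(3))
```

Domains that keep appearing in the counter for queries you are not winning are the pages worth reading closely before adjusting your own.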

Common mistakes

What works
  • Setting up Bing Webmaster Tools alongside Google Search Console.
  • Allowing both OpenAI crawlers in robots.txt unless there is a specific opt-out reason.
  • Treating ChatGPT optimisation as a small bolt-on to existing SEO, not as a separate discipline.
  • Tracking citation pickup manually on a monthly cadence.
  • Naming authors with Person schema, especially on YMYL topics.

What kills momentum
  • Blocking GPTBot reflexively without thinking through the trade-off.
  • Ignoring Bing because Google has 90-plus percent market share. Bing matters for ChatGPT.
  • Building a "ChatGPT SEO" plan separate from the existing SEO retainer.
  • Skipping the entity layer because "AI works it out". It does not.
  • Trusting vendor claims of "guaranteed ChatGPT citations". The algorithm is not public, so nobody can guarantee a citation.

Perth and WA context

Two patterns we see for Perth and WA businesses specifically.

Local trade businesses get very little ChatGPT Search exposure. "Plumber Fremantle" style queries are rarely asked of ChatGPT, and when they are, the system tends to suggest Google Maps rather than synthesise a list of plumbers. The local pack and Maps still dominate; ChatGPT Search is barely in the picture. For these businesses, the AI-search work that actually matters is the entity layer and the Google Business Profile work, not ChatGPT optimisation. See Local SEO Perth.

Professional services and B2B see real ChatGPT exposure. A Perth law firm, accounting practice or B2B services business is the kind of source ChatGPT Search reaches for when a user asks a research-style question ("how does X regulation work in WA", "what is the process for Y"). For these businesses, ChatGPT optimisation work pays off: clean entity identity, named credentialled authors, citable claims, strong Bing visibility on educational queries. See legal SEO and the Content Strategy pillar.

For the wider context, the AI Search pillar covers the full visibility stack. The Perplexity citations chapter covers the citation-first AI client whose behaviour is closely related. The schema for AI chapter covers the structured data layer that affects every AI client, ChatGPT included. For an entry-level diagnostic, the free SEO audit pulls the entity and structured-data signals that drive ChatGPT pickup, and the full website audit service goes deeper.

Frequently asked

How does ChatGPT Search choose which sites to cite?
OpenAI has not published a ranking algorithm. What we observe across client sites is that ChatGPT's web search pulls a small set of candidate sources for each query, drawing heavily on sites Bing already ranks well plus a layer of authority sources OpenAI's systems trust. From that candidate set, ChatGPT cites the sources whose content most directly addresses the query. The pattern is broadly similar to traditional SEO with a stronger emphasis on direct-answer formatting.
Does ChatGPT crawl my site directly?
Yes. OpenAI crawls sites with its own user agents, GPTBot and OAI-SearchBot, and ChatGPT Search also draws on the Bing search index. GPTBot is for training data; OAI-SearchBot is the user agent used during live search responses. Both can be allowed or blocked via robots.txt. Most sites should allow both unless they have a specific reason to opt out, because blocking OAI-SearchBot effectively removes the site from ChatGPT Search results.
Does Bing visibility affect ChatGPT citations?
Yes, observably. ChatGPT's web search layer historically drew heavily on Bing's index, and pages with strong Bing visibility tend to show up more often as ChatGPT citations than pages without it. Bing Webmaster Tools is worth setting up alongside Google Search Console for this reason. Bing-specific optimisation is mostly redundant with good general SEO, but submitting the XML sitemap to Bing is a cheap five-minute job that helps.
How do I track ChatGPT referral traffic in GA4?
ChatGPT referral traffic shows up in GA4 with chat.openai.com or chatgpt.com as the referrer source. Some hits get stripped of referrer information and land as direct traffic, so the picture is incomplete. The fix is to build a custom channel group in GA4 that catches the known ChatGPT referrers plus the other AI clients, and to cross-check against server logs for completeness. See tracking AI referrals.
Should I block GPTBot to protect my content?
That is a content-rights decision, not an SEO decision. If you want OpenAI to use your content as training data, allow GPTBot. If you do not, disallow it via robots.txt. The trade-off: blocking GPTBot does not stop ChatGPT Search from using your content via the Bing index, but it does stop OpenAI from training future models on your content. For most marketing-focused sites the right answer is to allow both crawlers.
Does ChatGPT use the same ranking signals as Google?
Overlapping but not identical. ChatGPT draws on Bing's index more than Google's, weights authority sources slightly differently, and appears to favour content with clear direct-answer formatting. The signals that overlap (entity identity, named expert authors, structured data, depth of topical coverage) are the ones most worth optimising for because they pay off across all AI search clients plus traditional Google SEO.
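
For the server-log cross-check mentioned in the GA4 answer above, a sketch along these lines can label each logged referrer. The two ChatGPT hostnames are the ones named in that answer; the function name, the labels and the handling of "-" (a common log placeholder for a missing referrer) are our own assumptions. Extend the mapping with other AI clients only once you have confirmed their referrer hostnames in your own logs.

```python
from urllib.parse import urlparse

# Known ChatGPT referrer hostnames. Add other AI clients here once verified.
AI_REFERRER_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
}


def classify_referrer(referrer):
    """Return the AI client name for a referrer URL, 'direct' if missing, else 'other'."""
    if not referrer or referrer == "-":
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return AI_REFERRER_HOSTS.get(host, "other")


# Example referrer values as they might appear in an access log.
for ref in ("https://chatgpt.com/", "https://www.google.com/search?q=seo", "-"):
    print(ref, "->", classify_referrer(ref))
```

Hits labelled "direct" cannot be attributed either way, which is why the GA4 figure and the log figure should both be read as a floor rather than a full count.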