Where Does ChatGPT Get Its Information? Inside 143,010 Citations From UK Banking Conversations

Post Date: May 28, 2026

Last Modified: May 28, 2026

Reading Time: 3 min read

Across 25,417 simulated ChatGPT conversations about UK retail and commercial banking in 2026, the model cited 143,010 sources, drawn from 4,125 distinct domains. Only 2.57% of those citations pointed to a bank's own website. The other 97.43% came from places most banks have never optimized for, and in many cases have never even audited.

When a customer asks ChatGPT "which UK bank should I open an account with," "which app is best for mobile banking," or "is NatWest better than Lloyds," the model is not reading the bank's homepage. It is reading Finder, Forbes, Which?, MoneySavingExpert, Wikipedia, and a long tail of comparison sites and aggregators. That is where ChatGPT gets its information about UK banking. The answer has direct consequences for any brand that still treats AI search as a content strategy problem rather than an external authority problem.

The data below is drawn from the Genezio platform (specifically from their industry leaderboards), which simulates full multi-turn ChatGPT conversations as configured customer personas and tracks every source the model cites. The 2026 UK banking dataset is the basis for everything in this piece.

What "ChatGPT's sources" actually means

Most marketing teams still picture ChatGPT as a kind of fixed encyclopedia trained on the open web up to a cutoff date. That mental model is two years out of date. In current production, ChatGPT routes most live queries through a retrieval layer: a web index it consults in real time before generating an answer. The model decides what to retrieve. The retrieval layer decides which pages get surfaced. The answer is then composed from those retrieved pages, with citations that the user may or may not click.

So when we ask "where does ChatGPT get its information," we are really asking three separate questions:

Which domains does ChatGPT pull from when answering a given question?
How are those domains weighted against each other?
How does the answer change based on which domains rank for the underlying retrieval?

For UK banking in 2026, the answers are surprisingly stable across thousands of conversations. The same handful of domains dominate. The same comparison-site listicles get cited hundreds of times. And the same handful of editorial titles set the tone of how the model talks about specific banks.
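The first of those questions, which domains the model pulls from, reduces to a tally over cited URLs. The following is a minimal sketch of that tally, not the Genezio platform's method; the function names and the sample URLs are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Return the lowercased hostname of a URL, without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def tally_domains(cited_urls):
    """Count how often each domain appears in a list of cited URLs."""
    return Counter(domain_of(u) for u in cited_urls)


# Illustrative citations only; these URLs and paths are invented for the example.
sample_citations = [
    "https://www.finder.com/uk/current-accounts",
    "https://www.finder.com/uk/business-bank-accounts",
    "https://en.wikipedia.org/wiki/NatWest",
    "https://www.which.co.uk/money/banking",
]

counts = tally_domains(sample_citations)
for domain, n in counts.most_common():
    print(domain, n)
```

Run over every citation in a set of conversations, the same counter yields a ranked domain list of the kind shown in the next section.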

Where ChatGPT gets its information: the top 16 domains for UK banking

Across 143,010 citations from January through May 2026, here is what ChatGPT actually drew from when answering banking questions:

Rank	Domain	Citations	Ownership type
1	finder.com	6,647	Industry / trade publication
2	forbes.com	5,595	Editorial
3	en.wikipedia.org	5,355	Reference encyclopedia
4	which.co.uk	4,772	Industry / trade publication
5	comparebanks.co.uk	4,228	Aggregator / marketplace
6	moneysavingexpert.com	3,130	Industry / trade publication
7	moneytothemasses.com	3,053	Industry / trade publication
8	moneyweek.com	2,314	Industry / trade publication
9	moneysupermarket.com	2,054	Aggregator / marketplace
10	moneyfactscompare.co.uk	2,016	Industry / trade publication
11	moneyzine.com	2,013	Industry / trade publication
12	gov.uk	1,863	Government / regulatory
13	theguardian.com	1,524	Editorial
14	monito.com	1,382	Industry / trade publication
15	nerdwallet.com	1,277	Aggregator / marketplace
16	ft.com	1,213	Editorial

The top 16 domains alone account for 48,436 citations, roughly 34% of all sourcing for the entire UK banking category. The remaining 66% is spread across 4,109 long-tail domains, most of which appear in fewer than 100 conversations.
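The head-versus-tail split can be checked directly from the table. This short sketch uses only the figures published in this article; the variable names are mine.

```python
# Citation counts copied from the top-16 table in this article.
TOP_16 = {
    "finder.com": 6647,
    "forbes.com": 5595,
    "en.wikipedia.org": 5355,
    "which.co.uk": 4772,
    "comparebanks.co.uk": 4228,
    "moneysavingexpert.com": 3130,
    "moneytothemasses.com": 3053,
    "moneyweek.com": 2314,
    "moneysupermarket.com": 2054,
    "moneyfactscompare.co.uk": 2016,
    "moneyzine.com": 2013,
    "gov.uk": 1863,
    "theguardian.com": 1524,
    "monito.com": 1382,
    "nerdwallet.com": 1277,
    "ft.com": 1213,
}
TOTAL_CITATIONS = 143_010  # all citations in the dataset
TOTAL_DOMAINS = 4_125      # all distinct cited domains

top16_total = sum(TOP_16.values())
top16_share = top16_total / TOTAL_CITATIONS
long_tail_citations = TOTAL_CITATIONS - top16_total
long_tail_domains = TOTAL_DOMAINS - len(TOP_16)

print(f"top-16 citations: {top16_total:,} ({top16_share:.1%})")
print(f"long tail: {long_tail_citations:,} citations across {long_tail_domains:,} domains")
```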

Pattern one: comparison sites dominate by category, not by brand

Finder, Which?, MoneySavingExpert, MoneyToTheMasses, MoneyWeek, MoneyFactsCompare, Moneyzine, Monito, and NerdWallet together account for 26,604 of the 48,436 top-16 citations, a little over half of all major sourcing. Add aggregators like comparebanks.co.uk and MoneySuperMarket, and the share rises to roughly 68% of the top-tier source pool.
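For a breakdown by category rather than by named site, the table's own "Ownership type" column can be summed. This sketch follows the table's labels as printed, so NerdWallet counts as an aggregator here, which differs slightly from the grouping in the paragraph above.

```python
from collections import defaultdict

# (domain, citations, ownership type) rows copied from the top-16 table.
TRADE, AGG, EDIT = "Industry / trade publication", "Aggregator / marketplace", "Editorial"
ROWS = [
    ("finder.com", 6647, TRADE),
    ("forbes.com", 5595, EDIT),
    ("en.wikipedia.org", 5355, "Reference encyclopedia"),
    ("which.co.uk", 4772, TRADE),
    ("comparebanks.co.uk", 4228, AGG),
    ("moneysavingexpert.com", 3130, TRADE),
    ("moneytothemasses.com", 3053, TRADE),
    ("moneyweek.com", 2314, TRADE),
    ("moneysupermarket.com", 2054, AGG),
    ("moneyfactscompare.co.uk", 2016, TRADE),
    ("moneyzine.com", 2013, TRADE),
    ("gov.uk", 1863, "Government / regulatory"),
    ("theguardian.com", 1524, EDIT),
    ("monito.com", 1382, TRADE),
    ("nerdwallet.com", 1277, AGG),
    ("ft.com", 1213, EDIT),
]

# Sum citations per ownership type.
by_type = defaultdict(int)
for _domain, citations, ownership in ROWS:
    by_type[ownership] += citations

top16_total = sum(by_type.values())
for ownership, citations in sorted(by_type.items(), key=lambda kv: -kv[1]):
    print(f"{ownership:32s} {citations:>6,}  {citations / top16_total:.1%}")
```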

What this means in practice: when ChatGPT answers "best current account UK," it is not picking from the banks themselves. It is reading a comparison-site ranking that was written by a journalist or product reviewer at Finder, Which?, or NerdWallet, and reproducing that ranking, often with the same banks at the top, often with the same descriptive language ("rewards loyalty schemes," "strong customer satisfaction scores"), often with the same omissions.

This is the single most important fact for any UK bank's GEO strategy. Your visibility in ChatGPT is downstream of your presence in fifteen comparison-site rankings. You can rewrite your homepage every week and the model will barely notice. Get on a Finder shortlist, and the model will quote you 600+ times in a single quarter.

Pattern two: first-party sources are 2.57% of citations

Of 143,010 total citations, only 3,670 (about 2.57%) pointed to a bank's own domain. Even when you include large bank-owned corporate sites that are not strictly customer-facing, the first-party share stays in the low single digits.
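A first-party share like this can be reproduced from any per-domain citation tally, given a list of domains the bank owns. The sketch below is one way to do it under stated assumptions: the bank domain list and the sample counts are hypothetical, and subdomains are treated as first-party.

```python
def is_first_party(domain: str, bank_domains: set[str]) -> bool:
    """True if domain equals a bank-owned domain or is a subdomain of one."""
    domain = domain.lower().rstrip(".")
    return any(domain == b or domain.endswith("." + b) for b in bank_domains)


def first_party_share(citations_by_domain: dict[str, int], bank_domains: set[str]) -> float:
    """Fraction of all citations that point to bank-owned domains."""
    total = sum(citations_by_domain.values())
    if total == 0:
        return 0.0
    owned = sum(
        n for d, n in citations_by_domain.items() if is_first_party(d, bank_domains)
    )
    return owned / total


# Hypothetical inputs for illustration; not taken from the dataset.
BANK_DOMAINS = {"natwest.com", "lloydsbank.com"}
sample = {
    "finder.com": 60,
    "personal.natwest.com": 3,
    "lloydsbank.com": 2,
    "en.wikipedia.org": 35,
}
share = first_party_share(sample, BANK_DOMAINS)
print(f"sample first-party share: {share:.1%}")

# The article's own figure: 3,670 first-party citations out of 143,010.
article_share = 3_670 / 143_010
print(f"article first-party share: {article_share:.2%}")
```

The suffix check requires a dot before the bank domain, so a lookalike such as notnatwest.com is not counted as first-party.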

This is the inverse of how most banks allocate their content investment. The marketing team writes hundreds of pages on the brand's own website. The model cites four of them. Meanwhile, a single comparison-site review can pull 800+ citations in five months.

Two implications follow directly:

The first is that traditional SEO performance does not predict AI sourcing. A page can rank #1 on Google for "best UK current account" and still be ignored by ChatGPT in favour of a Finder listicle. ChatGPT's retrieval is not Google's index.
The second is that the brands quoted most often inside AI answers are not necessarily the brands ranked highest in the cited list. ChatGPT will happily quote a Finder list that ranks NatWest fourth, and then in the natural-language response describe NatWest in detail because the surrounding paragraph has more on it. Position in the list is not what drives the model's narrative. The way a bank is described inside the prose is what gets recycled into the answer.

Pattern three: Wikipedia and gov.uk carry institutional weight

Wikipedia alone accounts for 5,355 citations. Gov.uk adds another 1,863. Together, these reference and regulatory sources make up about 15% of top-16 citations.

The role of these two is different from comparison sites. Comparison sites shape the ranking of banks. Wikipedia and gov.uk shape the facts: founding year, ownership structure, headquarters, regulatory status, deposit guarantee scheme membership, parent group. When ChatGPT explains who a bank is, it pulls structural facts from these sources almost exclusively. When the Wikipedia entry for a bank contains an outdated CEO, a missing acquisition, or an unclear ownership chain, that error propagates directly into thousands of customer-facing AI answers.

Most UK banks have not audited their own Wikipedia entries in years. This is the cheapest, highest-leverage correction available in the entire GEO playbook.

Why GA4 will not show you any of this

A reasonable counter-argument at this point is: "If this matters, why isn't it showing up in our analytics?"

It is not showing up because GA4 only records sessions that actually land on your site, and a ChatGPT citation produces one only when the user clicks it. Most citations are never clicked. The model has already synthesized the answer for the user. The user reads the answer, makes a decision, and either goes directly to the bank's app, walks into a branch, or moves on. The Finder citation that shaped their decision never converts into a referral session in your analytics.

What does this look like in practice? A typical UK retail bank's GA4 dashboard might show a few hundred sessions per month attributed to AI sources. The actual number of ChatGPT conversations in which the bank was discussed will run into the tens of thousands, and the brand has no GA4-side visibility into any of them.

This is the GA4 illusion: a measurement system that quietly hides 99% of a category's AI activity because that activity does not produce clicks. Banks that rely on GA4 to size the AI opportunity will systematically under-invest, because the data they are looking at is showing them the wrong number.
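The size of that gap is simple arithmetic once both numbers are on the table. The sketch below uses illustrative figures in the ranges described above (a few hundred sessions, tens of thousands of conversations); they are not measurements, and the function assumes each referral session corresponds to at most one conversation.

```python
def hidden_share(ai_referral_sessions: int, ai_conversations: int) -> float:
    """Fraction of AI conversations that never appear as a referral session.

    Assumes each referral session maps to at most one conversation.
    """
    if ai_conversations <= 0:
        raise ValueError("ai_conversations must be positive")
    visible = min(ai_referral_sessions, ai_conversations) / ai_conversations
    return 1.0 - visible


# Illustrative inputs only: 300 GA4 sessions against 30,000 conversations.
gap = hidden_share(ai_referral_sessions=300, ai_conversations=30_000)
print(f"hidden from GA4: {gap:.0%}")
```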

What a bank should actually do with this data

Knowing what sources ChatGPT cites is only useful if it changes what the brand invests in. Three concrete shifts follow from the 2026 UK banking data:

Audit the top 16 domains your category cites, not your own site. For UK banking, that list is now public, and your brand either appears in those rankings or it does not. If you appear, audit the language used to describe you. ChatGPT will repeat it. If you do not appear, the question is whether you can earn placement by demonstrating measurable advantages on the criteria those reviewers care about: customer satisfaction scores, app ratings, and account features.

Treat Wikipedia and gov.uk as Tier-1 GEO assets, not afterthoughts. Your Wikipedia entry is more visible to ChatGPT than most of your owned content. Verify every fact, update structural information, and make sure the page is sourced from authoritative third-party citations.

Stop measuring AI exposure with GA4. If your AI strategy is being sized off referral sessions, you are sizing off the small fraction of AI activity that produces a click. Server log analysis and conversation-level tracking are what surface the rest. For a full review of tools that track these metrics, see our breakdown of the best LLM monitoring tools for brand visibility, or read our guide on how to run an LLM brand visibility audit to design your own pipeline.
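As a starting point for the server-log side, the sketch below counts requests from OpenAI's documented user-agent tokens (GPTBot, OAI-SearchBot, ChatGPT-User) in an access log. It assumes the log uses the common "combined" format; the sample lines, IP addresses, and paths are invented. Note the limit of this approach: logs only show fetches of your own pages, so they complement conversation-level tracking rather than replace it.

```python
import re
from collections import Counter

# User-agent tokens OpenAI documents for its crawlers and user-triggered fetches.
AI_AGENT_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")

# Combined Log Format: ... "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def ai_fetches(log_lines):
    """Count (agent token, path) pairs for requests made by known AI agents."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        for token in AI_AGENT_TOKENS:
            if token in match["agent"]:
                hits[(token, match["path"])] += 1
                break
    return hits


# Invented sample lines for illustration.
SAMPLE_LOG = [
    '203.0.113.7 - - [12/May/2026:10:01:22 +0000] "GET /current-accounts HTTP/1.1" '
    '200 5123 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; '
    'ChatGPT-User/1.0; +https://openai.com/bot"',
    '203.0.113.9 - - [12/May/2026:10:02:05 +0000] "GET /mortgages HTTP/1.1" '
    '200 8020 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; '
    'GPTBot/1.1; +https://openai.com/gptbot"',
    '198.51.100.4 - - [12/May/2026:10:03:40 +0000] "GET /current-accounts HTTP/1.1" '
    '200 5123 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36"',
]

hits = ai_fetches(SAMPLE_LOG)
for (token, path), n in hits.most_common():
    print(token, path, n)
```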

The question every CMO in UK banking should be asking next month is not "are we visible on AI." It is: "When ChatGPT answers a customer question about us, which fifteen URLs is it reading, and what do those URLs say?" That answer is now knowable, and it is the only one that maps to how the model actually behaves.

Data in this piece is drawn from the Genezio GEO platform, which simulates multi-turn ChatGPT conversations as configured customer personas across UK retail and commercial banking. The 2026 dataset covers 25,417 conversations, 143,010 citations, and 4,125 distinct sourced domains.

Frequently Asked Questions (FAQ)

Where does ChatGPT get its information about UK banking?

ChatGPT retrieves information in real time through a retrieval layer that queries search indexes. For UK banking, well over half of the citations to the 16 most-cited domains point to comparison sites (like Finder, Which?, and MoneySavingExpert) and aggregators, while only 2.57% of all citations point to the banks' own websites. Editorial sites, Wikipedia, and government pages (like gov.uk) make up much of the rest.

Why are bank homepages rarely cited by ChatGPT?

When users ask ChatGPT for banking recommendations, comparisons, or reviews, the retrieval layer is optimized to pull from third-party authority sites, comparison tables, and review platforms rather than promotional brand homepages. This makes external mentions and reviews much more influential for AI visibility than a brand's own content.

Can traditional SEO strategies improve AI engine visibility?

Not necessarily. Traditional SEO focuses on ranking pages on Google's search result pages. AI engines (like ChatGPT's search feature) prioritize synthesized answers pulled from a curated set of authoritative hubs. To optimize for AI search (GEO/Generative Engine Optimization), brands must focus on being featured in these third-party comparison sites, aggregators, and reference sources like Wikipedia.

Why doesn't Google Analytics (GA4) show the traffic from these AI queries?

Most users read the synthesized answers directly within ChatGPT and do not click on the citation links. Because GA4 only records visits that actually reach your site, such as a referral from a clicked citation, it misses the massive volume of conversations where the brand is recommended or discussed without generating a click.

How can Wikipedia and gov.uk affect ChatGPT responses?

Wikipedia and gov.uk carry immense authority and shape the factual structural details (e.g., leadership, regulatory status, parent groups) that the model reproduces. Errors or outdated details on these pages propagate directly into ChatGPT's customer-facing answers.
