AI Visibility

How to Track Your Brand's AI Visibility Across ChatGPT, Perplexity, and Gemini

The Appearly Team Apr 23, 2026 14 min read
ai visibility llm mentions chatgpt perplexity gemini tracking geo aeo


Your customers are asking ChatGPT about your category right now. Some of them are hearing your name. Most of them are not, and you have no idea which group is bigger.

That is the awkward truth every marketer is waking up to in 2026. Organic traffic from Google is still real, but a growing share of high-intent research never reaches a search results page. It ends in a generated answer from ChatGPT, Perplexity, Gemini, Claude, Grok, or Google's AI Overviews. If your brand is not cited in those answers, you do not exist in that conversation. You cannot optimize what you cannot see, so the first job is to track AI visibility the same way you once tracked keyword rankings.

This guide walks through how to track AI visibility end to end. We start with the free manual method (so you can audit yourself in 5 minutes without signing up for anything), then show why the manual approach breaks at scale, then cover what to actually measure and how to automate the work.

What "AI visibility" actually means

Before you can track AI visibility, you need a shared vocabulary for what you are tracking. We see three distinct events every time an AI engine answers a question about your category:

1. Mention. Your brand name appears somewhere in the response. It could be a throwaway reference in a list of five tools, or a source link at the bottom. You showed up, but you were not picked.

2. Recommendation. The engine actively suggests your product to solve the user's problem. This is the AI equivalent of a featured snippet. It is the difference between "some options include Brand X, Y, and Z" and "for SaaS founders, Brand X is usually the best fit because..."

3. Sentiment. The tone of how your brand is described. Even a recommendation can come wrapped in hedged, lukewarm language. A mention can be glowing. Sentiment is the thing most people forget to measure until they see a competitor weaponize it.

LLM mentions are the floor. Recommendations are the goal. Sentiment is the multiplier. When you track AI visibility seriously, you track all three.
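If you plan to log these three events per query, it helps to fix the shape of the record before you start. A minimal sketch in Python (the field names are ours, not a standard):

from dataclasses import dataclass, field

@dataclass
class ScanResult:
    """One engine's answer to one query, scored for the three events above."""
    engine: str        # e.g. "chatgpt", "perplexity", "gemini"
    query: str         # the question a real customer would ask
    mentioned: bool    # brand name appears anywhere in the answer
    recommended: bool  # engine actively suggests the brand as the solution
    sentiment: int     # -1 negative, 0 neutral, +1 positive (ignore if not mentioned)
    sources: list[str] = field(default_factory=list)  # URLs the engine cited

The same shape works whether the scoring is done by hand in a spreadsheet or by a tool.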

The manual method: a 5-minute free audit to track AI visibility

You do not need a tool to get started. Open six browser tabs. Log into free accounts for ChatGPT, Gemini, Claude, Grok, and Perplexity. Open Google in the sixth tab. Pick three queries a real customer would ask in your category. Run each query in every engine and write down what happens.

Here are the prompts we use when we audit a brand for the first time:

1. "What are the best [category] tools in 2026?"
2. "I need [category] for [use case]. What do you recommend?"
3. "Compare [your brand] vs [top competitor]. Which is better for [use case]?"

Rules for a useful manual audit:

  • Use a fresh chat or incognito window so prior history does not bias the answer.
  • Turn on web search where the engine offers it (ChatGPT, Claude, Gemini, and Grok all have real-time web search now). Without it, models answer from training data that may be months old.
  • Record the exact phrasing of any mention of your brand, not just "yes they mentioned us."
  • Note the sources the engine cites. Those sources are the path an SEO needs to optimize toward.
  • Run each query twice. Variance is real. If you only ask once, you have a data point, not a trend.

Spreadsheet the results: engine, query, mention yes/no, recommendation yes/no, sentiment positive/neutral/negative, sources cited. Do this for three queries and six engines and you have 18 data points. That is enough to tell you whether you have an AI visibility problem, which is the only question a first audit needs to answer.
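If you would rather keep that log in a CSV file than a spreadsheet app, a few lines of Python cover the same columns. A sketch (the column set simply mirrors the list above):

import csv
from pathlib import Path

COLUMNS = ["engine", "query", "mention", "recommendation", "sentiment", "sources"]

def log_result(path, engine, query, mention, recommendation, sentiment, sources):
    # Append one observation; write the header the first time the file is created
    is_new = not Path(path).exists()
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(COLUMNS)
        writer.writerow([engine, query, mention, recommendation, sentiment, "; ".join(sources)])

# One of the 18 data points from a first audit (values are illustrative)
log_result("audit.csv", "perplexity", "What are the best [category] tools in 2026?",
           "yes", "no", "neutral", ["example.com/category-roundup"])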

Why the manual method breaks the week after

The manual audit is useful exactly once, as a wake-up call. Then the math catches up to you.

A real keyword set is not three queries. It is 10 to 30 queries that cover the ways your customers describe their problem. Multiply:

  • 10 keywords
  • 6 engines
  • 2 runs per query (to handle variance)
  • Weekly cadence (because AI answers shift as new content gets indexed and as models retrain)

That is 480 queries per month. Each one takes roughly two minutes to run, transcribe, and score, so you are looking at about 4 hours per week of pure data collection, before you do anything with the results. Now add competitor tracking (are you gaining or losing share of voice against three rivals?) and the number doubles. Add sentiment scoring done well (reading the full response, not just scanning for your name) and the time doubles again.
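The arithmetic is worth writing out once, because it is the thing that kills the manual approach. A quick sketch, with the per-query time as an estimate rather than a measurement:

keywords = 10
engines = 6
runs_per_query = 2        # to handle variance
weeks_per_month = 4
minutes_per_query = 2     # run, read, transcribe, and score one answer (estimate)

queries_per_week = keywords * engines * runs_per_query       # 120
queries_per_month = queries_per_week * weeks_per_month       # 480
hours_per_week = queries_per_week * minutes_per_query / 60   # 4.0

# Competitor tracking roughly doubles the query volume; careful sentiment
# scoring doubles the time again, which is how 4 hours becomes 16.
print(queries_per_month, hours_per_week)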

Nobody does this work manually for more than a month. They either stop tracking, which is worse than never starting because they now know they are invisible and are choosing to ignore it, or they automate. There is no middle ground.

Even if you have the discipline, the manual method misses two things automation catches for free: variance (the same prompt answered twice in the same hour can produce different brand lists) and timing (you will never notice that a competitor's mention rate tripled in week three unless you have a baseline).

The metrics that actually matter

When you move from "am I there?" to "am I winning?", four metrics carry the load. Each has a formula simple enough to compute in a spreadsheet if you have the raw data.

Mention rate

mentions / total queries = mention rate

If you show up in 12 of 60 weekly queries, your mention rate is 20%. This is your floor. It tells you whether the engines know you exist at all in the context of a given keyword.

Recommendation rate

recommendations / total queries = recommendation rate

Of the same 60 queries, how many actively suggested your product as the answer? If mention rate is 20% but recommendation rate is 3%, you have a positioning problem, not a visibility problem. You are a footnote, not an answer.

Share of voice

your mentions / (your mentions + all competitor mentions)

This is the metric executives understand fastest because it looks like the market share they already track. Pick your three most important competitors. For every query where your brand or any of them appears, divide your mentions by the total. 40% share of voice against three rivals is strong. 10% is a warning.

Sentiment score

sum of sentiment scores / total mentions = sentiment score

Most teams start with a simple scale: -1 negative, 0 neutral, +1 positive, scored per mention. Average across all mentions in a week. A weighted version multiplies sentiment by recommendation (a neutral recommendation is worth more than a glowing mention that nobody acts on).

Track these four per keyword, per engine, per week. The cross-tab is where the useful decisions live. "Our mention rate on Perplexity for 'AI visibility tracking tools' is 60%, but recommendation rate is 5%, and sentiment is neutral. Perplexity knows we exist and does not think we are the right answer. What do the sources it cites say about our category?" That is a question you can act on. "Are we in ChatGPT?" is not.
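If you keep the raw rows from each scan (one row per brand per engine answer, scored like the audit log above), the four metrics reduce to a few lines. A minimal sketch under that assumption:

def visibility_metrics(rows, brand, competitors):
    # rows: dicts with "brand", "mentioned", "recommended", "sentiment" (-1/0/+1)
    ours = [r for r in rows if r["brand"] == brand]
    rivals = [r for r in rows if r["brand"] in competitors]

    total = len(ours)                                # one row per query run
    our_mentions = sum(r["mentioned"] for r in ours)
    rival_mentions = sum(r["mentioned"] for r in rivals)
    mentioned = [r for r in ours if r["mentioned"]]

    return {
        "mention_rate": our_mentions / total if total else 0.0,
        "recommendation_rate": sum(r["recommended"] for r in ours) / total if total else 0.0,
        "share_of_voice": our_mentions / (our_mentions + rival_mentions)
                          if our_mentions + rival_mentions else 0.0,
        "sentiment": sum(r["sentiment"] for r in mentioned) / len(mentioned)
                     if mentioned else 0.0,
    }

Run it once per keyword, per engine, per week and you have the cross-tab described above.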

Setting up automation to track AI visibility at scale

Once you accept that manual tracking does not scale, the question is which tool to use. A few options exist in 2026, each with tradeoffs:

  • Generic SEO suites with an AI visibility module bolted on. Cheap if you already pay for the suite. The downside: the AI module is usually the last feature built, engines covered are uneven, and scan cadence is often tied to keyword plans built for Google rankings.
  • Enterprise AI search analytics platforms. Deep features, custom dashboards, sales calls to see pricing. Usually overkill for a team with fewer than 100 keywords.
  • Purpose-built AI visibility trackers. Tools built specifically for this problem from day one. Usually faster to set up and priced per workspace.

We built Appearly because we were in that third camp ourselves, first as customers, and could not find a tool that did what we needed at a price a founder could justify. So this section covers how we set up automated tracking inside Appearly. The same workflow applies to any purpose-built tool. The mechanics are what matter.

Step 1: Define your keyword set. 10 to 30 queries that cover your category from different angles. Include branded queries (people who already know you and are asking for a comparison), unbranded queries (people describing their problem without knowing tools exist), and competitor queries (people comparing specific vendors). If you skip unbranded queries you will miss the top of your funnel, which is exactly where AI visibility matters most.
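Tagging each query with its angle when you write the set makes that funnel gap visible immediately. A sketch of what a tagged set might look like (the queries themselves are placeholders):

from collections import Counter

keyword_set = [
    # Unbranded: people describing the problem, not asking for a vendor
    {"type": "unbranded",  "query": "What are the best [category] tools in 2026?"},
    {"type": "unbranded",  "query": "I need [category] for [use case]. What do you recommend?"},
    # Branded: people who already know you
    {"type": "branded",    "query": "Is [your brand] good for [use case]?"},
    # Competitor: people comparing specific vendors
    {"type": "competitor", "query": "Compare [your brand] vs [top competitor] for [use case]."},
]

coverage = Counter(k["type"] for k in keyword_set)
# No unbranded queries means no view of the top of the funnel
assert coverage["unbranded"] > 0, "add unbranded queries before anything else"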

Step 2: Add competitors. Pick the three to five brands a customer would realistically consider alongside yours. More than five and the share of voice signal gets noisy.

Step 3: Connect the engines. Appearly monitors six: ChatGPT, Gemini, Claude, Grok, Perplexity, and Google AI Overviews. Web search is enabled on the four that support it, so answers reflect the current web, not a model's training cutoff. Perplexity and AIO have always been real-time by design.

Step 4: Schedule. Automated scans run every Sunday at 22:00 UTC in Appearly, which means fresh data is waiting on Monday morning. You also get one manual scan per workspace per week, which is what you reach for after you ship a new landing page or publish a comparison post and want to see whether the engines picked it up. Combined, that is roughly 500 to 1,000 queries run for you every week depending on keyword and competitor count. You run zero of them.

Step 5: Baseline, then watch trends. The first week of data is a snapshot, not a verdict. You need three to four weeks before patterns separate from variance. Appearly keeps 3 to 12 months of history depending on plan, which is the horizon you need for quarterly reporting anyway.
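Once a few weeks of numbers exist, a short rolling average is enough to separate a trend from weekly variance. A sketch using a four-week window, the same horizon as the baseline advice above; the numbers are illustrative:

def rolling_mean(weekly_values, window=4):
    # Smooth weekly rates so one noisy scan does not read as a trend
    smoothed = []
    for i in range(len(weekly_values)):
        chunk = weekly_values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

weekly_mention_rate = [0.10, 0.25, 0.15, 0.20, 0.30, 0.35]  # one keyword, one engine
print(rolling_mean(weekly_mention_rate))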

Reading your results: three patterns you will see

Raw numbers do not interpret themselves. Here are the three scenarios that show up in almost every dashboard in the first month of tracking, and what each means.

Scenario 1: "We are nowhere." Mention rate below 10% across every engine. Share of voice in the single digits. This is the most common starting point and it is not actually bad news. It means you have a content gap, not a reputation problem. AI engines cite what they can find. If your site does not rank for the queries you care about and no third-party source talks about you in that context, the engines have nothing to pull from. Fix: build the comparison pages, product roundups, and use-case deep dives that third parties and your own SEO will feed into the models.

Scenario 2: "We are mentioned but not recommended." Mention rate 30% or higher, recommendation rate under 10%. This is the frustrating middle. The engines know you exist but place you as an also-ran. Usually means two things: your positioning on your own site is fuzzy (the AI cannot figure out who you are for), or the comparison content about your category frames you as a secondary option. Fix: sharpen the positioning statement on your homepage, audit the third-party reviews the engines are citing, and consider reaching out to the highest-traffic ones with product updates or corrections.

Scenario 3: "We dominate one engine and are invisible on another." This happens more than you would expect. A brand that lives in Perplexity's answers might be missing entirely from ChatGPT for the same query. The engines index and retrieve differently. Grok leans on X content heavily. Google AIO leans on Google's own index. Perplexity cites a broad set of web sources with visible attribution. Fix: figure out which engine your target customers actually use, and prioritize the gap there first. Then broaden.

The data will give you other patterns too, but those three are the starting set every team we have worked with has to solve before anything else is worth looking at.
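The three patterns map onto thresholds cleanly enough that a dashboard can label them for you. A sketch using the cut-offs from the scenarios above; the 40-point gap between engines is our own rough assumption, not a benchmark:

def classify(per_engine):
    # per_engine: {"chatgpt": {"mention_rate": 0.05, "recommendation_rate": 0.0}, ...}
    mention_rates = [m["mention_rate"] for m in per_engine.values()]
    avg_mention = sum(mention_rates) / len(mention_rates)
    avg_rec = sum(m["recommendation_rate"] for m in per_engine.values()) / len(per_engine)

    if all(r < 0.10 for r in mention_rates):
        return "nowhere: content gap, build pages the engines can cite"
    if avg_mention >= 0.30 and avg_rec < 0.10:
        return "mentioned but not recommended: positioning problem"
    if max(mention_rates) - min(mention_rates) > 0.40:
        return "strong on one engine, invisible on another: close the gap first"
    return "no dominant pattern yet: keep collecting"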

FAQ

What does it mean to track AI visibility? To track AI visibility means to systematically monitor whether your brand appears, gets recommended, and is described positively across the major AI engines (ChatGPT, Gemini, Claude, Grok, Perplexity, Google AI Overviews) when users ask questions about your category.

How often should I track AI visibility? Weekly is the right cadence for most brands. AI answers shift as models refresh and as new content gets indexed. Daily tracking is noisy (too much variance). Monthly is too slow (you miss shifts that matter). A Sunday automated scan with Monday-morning review is a practical rhythm.

Can I track LLM mentions for free? Yes, manually. Run three queries per engine across six engines and spreadsheet the results. This works as a one-time audit. It does not scale past week one because the query volume needed to cover a real keyword set (10 keywords, 6 engines, weekly, plus competitors) runs into the hundreds per month.

Is AI visibility the same as SEO? No. SEO optimizes for ranking on a search results page. AI visibility optimizes for being cited, recommended, and described well inside a generated answer. The signals overlap (third-party sources, structured content, authority), but the output surface is different, and so is the measurement.

Which AI engines should I monitor? At minimum the six with meaningful consumer and professional use: ChatGPT, Gemini, Claude, Grok, Perplexity, and Google AI Overviews. If your audience is international, add region-specific engines (for example, Baidu's ERNIE in China). If your audience is highly technical, weight Perplexity more.

How long before AI visibility tracking shows results? The first week gives you a baseline. Three to four weeks let you separate signal from variance. Meaningful improvement from content and positioning changes usually shows up in 6 to 12 weeks because engines need time to index new sources.

Do I need a different tool for every engine? No. Purpose-built AI visibility trackers handle all the major engines in one workspace. What differs is how up-to-date each engine is. A tracker that uses web search on ChatGPT and Gemini gives you current-web answers. One that only hits the base model gives you training-cutoff answers, which will lag weeks or months.

Start tracking, stop guessing

If you took nothing else from this guide: the single worst state for a brand in 2026 is to suspect you are invisible in AI answers and never look. Running the manual audit in this post takes 5 minutes and will tell you more than every instinct call you have made this quarter.

If the audit surfaces a problem, automate. If it surfaces that you are doing well, automate anyway, because the benchmark you build now is the only way you will notice the week a competitor starts winning.

You can start tracking your AI visibility with Appearly free with a 10-day trial, no card required. Connect your keywords, add your competitors, and the first automated scan runs the following Sunday.

Track your AI visibility

See how your brand appears across ChatGPT, Perplexity, Gemini, and more.

Get Started