AI Technical SEO Audit: Find and Fix Site Issues Fast

By Brent Dunn · Jan 25, 2026 · 23 min read


You built your site. You published content. Traffic should be rolling in.

But Google isn’t indexing half your pages. Or it’s indexing garbage pages you didn’t want. Your site loads like it’s 2008. And you have no idea where to start fixing it.

Technical SEO is the problem most new site owners ignore until it’s too late. And a proper audit is tedious: crawling the site, exporting CSVs, cross-referencing Search Console data, checking robots.txt, validating schema, analyzing Core Web Vitals. It can easily burn an entire day.

Here’s how I run a complete technical SEO audit in under 2 hours using AI. This is the exact process I use on my own sites and client projects.

Plus: most guides skip AI crawler optimization. ChatGPT, Perplexity, and Google’s AI Overviews now drive real traffic. If your site blocks these crawlers, you’re invisible to a growing traffic source.


What a Technical SEO Audit Actually Does

A technical SEO audit finds the infrastructure problems that prevent search engines from crawling, indexing, and ranking your site.

This isn’t about content. It’s about whether Google can even access what you’ve built.

You can write the best content on the internet. If Google can’t crawl it, can’t index it, or punishes you for slow load times, none of it matters.

Your content is the product. Technical SEO is the store. If customers can’t find the entrance, can’t navigate the aisles, or the lights are off, they leave.

What You’re Actually Checking

| Category | What You’re Checking | Why It Matters |
|----------|----------------------|----------------|
| Crawlability | Can bots access all pages? | No crawl = no index = no traffic |
| Indexation | Are the right pages indexed? | Wrong pages indexed = diluted rankings |
| Speed | How fast do pages load? | Slow = high bounce + ranking penalty |
| Mobile | Does the site work on phones? | Mobile-first indexing is the standard |
| Security | Is the site secure? | HTTPS is a ranking factor |
| Structure | Is information organized? | Helps bots and users navigate |
| AI Visibility | Can AI crawlers access your content? | Growing traffic source in 2026 |

When to Run an Audit

Run a full audit:

  • Before launching a new site (don’t launch blind)
  • After major site changes or migrations
  • Quarterly on established sites
  • Immediately when traffic drops unexpectedly

Run quick monthly checks:

  • 404 errors in Search Console
  • Core Web Vitals scores
  • New indexing issues
  • Top keyword ranking movements

Screaming Frog SEO Spider is the go-to crawling tool. The free version handles sites up to 500 URLs, more than enough when you’re starting out. The paid license runs £199/year if you scale up.


The Complete Audit Workflow

A technical SEO audit has a lot of moving parts. Miss one, and you might spend months wondering why your traffic isn’t growing.

That’s why I follow the same steps, in the same order, every time.

The 7-Step Process

| Step | Category | Time (with AI) | Tools |
|------|----------|----------------|-------|
| 1 | Crawlability | 20 min | Screaming Frog + AI |
| 2 | Indexation | 15 min | Search Console + AI |
| 3 | Site Speed | 15 min | PageSpeed Insights + AI |
| 4 | Mobile Experience | 10 min | Lighthouse + AI |
| 5 | Security | 5 min | SSL Labs + AI |
| 6 | Structured Data | 15 min | Rich Results Test + AI |
| 7 | AI Crawler Optimization | 10 min | robots.txt + manual checks |

Total time: Under 2 hours.

Compare that to 8-10 hours doing this manually, or worse, paying an agency $500+ for what you can now do yourself.

The Complete Technical SEO Checklist

Print this out or save it. Run through it every time.

CRAWLABILITY

  • Robots.txt accessible and correct
  • XML sitemap exists and submitted
  • No critical pages blocked
  • Crawl errors identified and logged
  • Redirect chains under 3 hops
  • Internal linking structure reviewed

INDEXATION

  • Index coverage matches expectations
  • Canonical tags properly implemented
  • Noindex tags only on correct pages
  • Duplicate content issues resolved
  • Pagination handled correctly

SITE SPEED

  • LCP under 2.5 seconds
  • INP under 200ms
  • CLS under 0.1
  • Images optimized (WebP, lazy loading)
  • Render-blocking resources minimized

MOBILE

  • Viewport meta tag present
  • Touch targets 48x48px minimum
  • No horizontal scrolling
  • Content parity with desktop

SECURITY

  • Valid SSL certificate
  • All pages serve HTTPS
  • No mixed content warnings
  • HTTP redirects to HTTPS

STRUCTURED DATA

  • Schema validates without errors
  • Appropriate schema types used
  • Rich result eligibility confirmed

AI VISIBILITY

  • AI crawlers not blocked
  • Content accessible without JavaScript
  • Clear, structured content format
  • FAQ and how-to content optimized

AI Prompt: Complete Technical SEO Audit Report

Export your Screaming Frog crawl as CSV, then feed it to Claude. This prompt generates a full audit report from your data.

You are a technical SEO specialist conducting a comprehensive site audit.

SITE: [DOMAIN]
CRAWL DATA: [paste CSV export or key findings]

Analyze and create a prioritized audit report:

## 1. EXECUTIVE SUMMARY
- Overall site health score (1-10)
- Critical issues count
- High-priority fixes

## 2. CRAWLABILITY ANALYSIS
Review robots.txt directives, sitemap validity, blocked resources, and crawl budget concerns.

Flag any:
- Important pages blocked by robots.txt
- Sitemap URLs returning non-200 status
- Excessive crawl depth (>3 clicks from homepage)

## 3. INDEXATION STATUS
Compare indexed pages vs actual pages.

Identify:
- Index bloat (pages that shouldn't be indexed)
- Index gaps (important pages not indexed)
- Canonical tag issues
- Duplicate content signals

## 4. PERFORMANCE METRICS
Analyze Core Web Vitals data.

For any failing metrics, provide:
- Specific issue
- Pages affected
- Technical fix with code example

## 5. MOBILE + SECURITY
Note any mobile usability issues and HTTPS problems.

## 6. STRUCTURED DATA
List schema types found and validation errors.

## 7. PRIORITIZED ACTION PLAN

Format as:

| Priority | Issue | Category | Pages Affected | Fix | Estimated Impact |
|----------|-------|----------|----------------|-----|------------------|

CRITICAL (fix this week):
[List issues]

HIGH (fix within 2 weeks):
[List issues]

MEDIUM (fix within 30 days):
[List issues]

LOW (fix when convenient):
[List issues]

Be specific with fixes. Include code snippets where helpful.

Step 1: Crawlability Audit

If search engines can’t crawl your site, nothing else matters. No indexing. No rankings. No traffic.

Using Screaming Frog for Your Crawl

Here’s the exact workflow:

1. Configure your crawl settings

Open Screaming Frog and go to Configuration > Spider. Enable:

  • Crawl JavaScript
  • Crawl images
  • Check external links
  • Respect robots.txt (so you see what bots see)

2. Start the crawl

Enter your URL in the top bar and hit Start. A 500-page site takes 5-10 minutes.

3. Export the data

Once complete, go to Reports > Crawl Overview for a quick summary. Then export specific reports:

  • Bulk Export > Response Codes > Client Error (4xx) - your broken links
  • Bulk Export > Response Codes > Redirection (3xx) - your redirects
  • Reports > Redirects > Redirect Chains - redirect chain problems

This gives you the raw data to feed into Claude.
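
If you want a quick sanity check before pasting anything into a prompt, a short script can summarize the export first. This is a minimal sketch in Python, assuming a CSV with “Address” and “Status Code” columns; Screaming Frog’s column names vary slightly between versions and export types, and the filename here is a placeholder.

```python
# Minimal sketch: summarize a Screaming Frog response-codes export before
# feeding highlights to an AI prompt. Adjust the column names ("Address",
# "Status Code") and the filename to match your actual export.
import csv
from collections import Counter

def summarize_crawl(csv_path: str) -> None:
    status_counts = Counter()
    broken = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            status = (row.get("Status Code") or "").strip()
            status_counts[status] += 1
            if status.startswith(("4", "5")):
                broken.append((row.get("Address", ""), status))

    print("Status code breakdown:", dict(status_counts))
    print(f"{len(broken)} URLs returning 4xx/5xx:")
    for url, status in broken[:20]:  # show the first 20
        print(f"  {status}  {url}")

summarize_crawl("response_codes_client_error_4xx.csv")  # placeholder filename
```

The counts go straight into the audit prompt; the list of broken URLs becomes your redirect/cleanup worklist.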

Robots.txt Analysis

Your robots.txt file controls what search engines can access. One wrong line blocks your entire site from indexing.

Check it at: yourdomain.com/robots.txt

What to look for:

| Directive | What It Does | Common Mistakes |
|-----------|--------------|-----------------|
| User-agent: * | Applies to all bots | Forgetting to specify |
| Disallow: /admin/ | Blocks a directory | Accidentally blocking important pages |
| Disallow: / | Blocks entire site | Leftover from development |
| Sitemap: | Points to sitemap | Missing entirely |
| Allow: | Overrides disallow | Conflicting rules |
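
You can also test specific URLs against your live robots.txt before asking AI to interpret it. Here’s a minimal sketch using Python’s standard-library parser; the domain and page paths are placeholders to swap for your own.

```python
# Minimal sketch: test whether specific URLs are crawlable under your live
# robots.txt, for Googlebot and for the wildcard agent. Standard library only.
from urllib.robotparser import RobotFileParser

SITE = "https://www.example.com"          # your domain (placeholder)
PAGES = ["/", "/blog/", "/admin/", "/wp-content/uploads/logo.png"]

rp = RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()  # fetches and parses the live file

for agent in ("Googlebot", "*"):
    for path in PAGES:
        allowed = rp.can_fetch(agent, f"{SITE}{path}")
        flag = "OK     " if allowed else "BLOCKED"
        print(f"{flag} {agent:<10} {SITE}{path}")
```

If an important page comes back BLOCKED here, that’s a critical-priority fix regardless of what else the audit finds.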

AI Prompt: Robots.txt Audit

Analyze this robots.txt file for SEO issues:

[PASTE YOUR ROBOTS.TXT CONTENT]

Check for:

1. BLOCKING ISSUES
   - Are any important page types blocked?
   - Is CSS/JS blocked (Google needs these to render)?
   - Are any crawlers unnecessarily blocked?

2. SITEMAP DECLARATION
   - Is the sitemap URL included?
   - Is the URL correct and accessible?

3. DIRECTIVE CONFLICTS
   - Any rules that contradict each other?
   - Overly broad blocking patterns?

4. AI CRAWLER RULES (2026 important)
   - Is GPTBot blocked or allowed?
   - Is OAI-SearchBot blocked or allowed?
   - Is ClaudeBot blocked or allowed?
   - Is PerplexityBot blocked or allowed?

Provide:
- List of issues found
- Risk level for each (Critical/High/Medium/Low)
- Recommended robots.txt with fixes applied

XML Sitemap Audit

Your sitemap tells search engines what pages exist and when they were updated.

Find it at: yourdomain.com/sitemap.xml or yourdomain.com/sitemap_index.xml

The rules:

  • Only include pages you want indexed
  • Only include pages returning 200 status
  • Keep each sitemap under 50MB / 50,000 URLs
  • Update automatically when content changes
  • Declare in robots.txt

Common sitemap problems:

| Problem | Impact | Fix |
|---------|--------|-----|
| 404 URLs in sitemap | Wasted crawl budget | Remove dead URLs |
| Redirect URLs in sitemap | Confuses bots | Use final destination URLs |
| Noindexed pages in sitemap | Conflicting signals | Remove from sitemap or drop the noindex |
| Missing from robots.txt | May not be discovered | Add Sitemap: directive |
| Not submitted to Search Console | Slower discovery | Submit manually |
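
A short script can catch the first two problems automatically. This is a minimal sketch assuming a single sitemap file (not a sitemap index) at a placeholder URL; for large sitemaps you’d want concurrency and a proper crawl delay.

```python
# Minimal sketch: pull every <loc> URL from the sitemap and flag anything that
# doesn't resolve cleanly to a 200 at the same address. Standard library only.
import time
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP = "https://www.example.com/sitemap.xml"   # placeholder URL
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(SITEMAP) as resp:
    tree = ET.parse(resp)

urls = [loc.text.strip() for loc in tree.findall(".//sm:loc", NS)]
print(f"{len(urls)} URLs in sitemap")

for url in urls:
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req) as r:
            status, final = r.status, r.url
    except urllib.error.HTTPError as e:
        status, final = e.code, url
    # urlopen follows redirects, so a changed final URL means this sitemap
    # entry points at a redirect instead of the destination page
    if status != 200 or final != url:
        print(f"{status}  {url} -> {final}")
    time.sleep(0.5)  # be polite to your own server
```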

AI Prompt: Sitemap Analysis

I'm going to paste my XML sitemap content. Analyze it for SEO issues.

[PASTE SITEMAP XML OR LIST OF URLs]

Check:

1. URL STATUS
   - Any URLs that look like they might 404?
   - Any obvious redirect URLs (HTTP vs HTTPS, www vs non-www)?
   - Any parameter URLs that shouldn't be indexed?

2. COMPLETENESS
   - Any obvious content types missing?
   - Estimate if count seems right for the site type

3. FORMAT ISSUES
   - Date format correct (W3C datetime)?
   - Proper XML structure?
   - Priority values between 0.0-1.0?

4. ORGANIZATION
   - Should this be split into multiple sitemaps?
   - Any category-specific sitemaps needed?

List issues and fixes in a table.

Crawl Errors from Search Console

Google Search Console shows you exactly what problems Googlebot encounters.

Go to Indexing > Pages to see:

  • Not indexed - Pages Google found but chose not to index
  • Crawled - currently not indexed - Found but not good enough
  • Discovered - currently not indexed - In queue but not crawled
  • Excluded by noindex tag - You told Google not to index
  • Blocked by robots.txt - You blocked it

Fix these first:

| Error | Urgency | Fix |
|-------|---------|-----|
| Server error (5xx) | Critical | Fix server/hosting issue |
| Redirect error | Critical | Fix broken redirect chain |
| Blocked by robots.txt | High | Update robots.txt if page should be indexed |
| Soft 404 | High | Add content or return real 404 |
| Not found (404) | Medium | Redirect or remove internal links |
| Duplicate without canonical | Medium | Add canonical tags |

Step 2: Indexation Audit

Crawlability gets Google to your pages. Indexation determines if they show up in search results.

Two different things. I’ve seen sites with perfect crawlability and terrible indexation because of canonical tag issues, thin content, or index bloat.

The Index Math

Get your numbers:

Method 1: Site Search

site:yourdomain.com

This gives you a rough count of indexed pages. Not perfectly accurate, but useful for ballpark comparisons.

Method 2: Google Search Console

Go to Indexing > Pages. This shows exact numbers and breaks down why pages aren’t indexed.

Method 3: Screaming Frog

After your crawl, check the Indexability column. Filter for “Non-Indexable” to see what’s being excluded.

Interpreting the Numbers

| Your Numbers | What It Means | Action |
|--------------|---------------|--------|
| Indexed > sitemap pages | Index bloat: unwanted pages indexed | Find and noindex the extras |
| Indexed < sitemap pages | Index gaps: pages not being indexed | Fix crawl/indexation issues |
| Indexed roughly = sitemap | Healthy state | Monitor monthly |

Index bloat is a real problem.

I’ve seen ecommerce sites with 50,000 pages indexed when they only have 5,000 products. The rest? Filter pages, search results pages, parameter variations. All that junk dilutes your site’s authority.

Common Index Bloat Sources

| Bloat Type | Example URL | Fix |
|------------|-------------|-----|
| Parameter URLs | /products?color=red&size=xl | Canonical to base URL or noindex |
| Pagination | /blog/page/47/ | rel="prev/next" or noindex beyond page 1-2 |
| Tag/category pages | /tag/blue-widgets/ | Noindex if thin content |
| Internal search | /search?q=widget | Block in robots.txt |
| Calendar/date archives | /2024/03/15/ | Noindex date-based archives |
| Faceted navigation | /products/category/price-under-50/ | Canonical to category page |
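
To see where your own bloat is coming from, you can bucket a URL list by these patterns before handing it to AI. A minimal sketch: the filename is a placeholder for any one-URL-per-line export (Screaming Frog or Search Console), and the patterns are examples to adapt to your URL structure.

```python
# Minimal sketch: group crawled/indexed URLs into common bloat buckets so you
# can see which page types are inflating the index.
from collections import Counter
from urllib.parse import urlparse

def classify(url: str) -> str:
    parsed = urlparse(url)
    path = parsed.path.lower()
    if parsed.query:
        return "parameter URL"
    if "/page/" in path:
        return "pagination"
    if "/tag/" in path or "/category/" in path:
        return "tag/category archive"
    if "/search" in path:
        return "internal search"
    return "normal page"

with open("crawled_urls.txt", encoding="utf-8") as f:   # placeholder filename
    urls = [line.strip() for line in f if line.strip()]

for bucket, count in Counter(classify(u) for u in urls).most_common():
    print(f"{count:>6}  {bucket}")
```

If “normal page” isn’t the biggest bucket by a wide margin, you’ve found your index bloat.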

Canonical Tag Audit

Canonical tags tell Google which version of a page is the “real” one.

In Screaming Frog: Go to the Canonicals tab. Filter for issues:

  • Missing
  • Self-referencing (usually good)
  • Canonicalized to another URL
  • Non-indexable canonical

The rules:

  • Every indexable page needs a canonical tag
  • Unique pages should self-reference
  • Duplicate/variant pages should point to the canonical version
  • Canonical URL must return 200 status
  • Don’t mix signals (noindex + canonical to different page = confusing)
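
Those rules are easy to spot-check in code. Here’s a minimal sketch that fetches one page, extracts its rel="canonical" link, and verifies the target returns 200 and self-references; the URL is a placeholder, and a real audit would loop this over your full URL list.

```python
# Minimal sketch: extract a page's rel="canonical" and check the basics.
import urllib.error
import urllib.request
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href")

def check_canonical(url: str) -> None:
    with urllib.request.urlopen(url) as r:
        html = r.read().decode("utf-8", errors="replace")
    parser = CanonicalParser()
    parser.feed(html)

    if not parser.canonical:
        print(f"MISSING canonical: {url}")
        return
    try:
        status = urllib.request.urlopen(parser.canonical).status
    except urllib.error.HTTPError as e:
        status = e.code
    self_referencing = parser.canonical.rstrip("/") == url.rstrip("/")
    print(f"{url}\n  canonical: {parser.canonical} "
          f"(HTTP {status}, self-referencing: {self_referencing})")

check_canonical("https://www.example.com/blog/sample-post/")  # placeholder URL
```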

AI Prompt: Indexation Analysis

Analyze this indexation data and identify issues:

SITE: [DOMAIN]

DATA:
- Pages in sitemap: [NUMBER]
- Google indexed (site: search): [NUMBER]
- Search Console indexed: [NUMBER]
- Screaming Frog crawled: [NUMBER]

SEARCH CONSOLE PAGE REPORT:
[Paste the status breakdown - how many pages in each category]

SCREAMING FROG CANONICAL REPORT:
[Paste or summarize canonical tag findings]

Analyze:

1. INDEX BLOAT CHECK
   - Is indexed count higher than it should be?
   - What page types might be causing bloat?
   - Prioritized list of pages to noindex

2. INDEX GAP CHECK
   - Important pages not being indexed?
   - Why might they be excluded?
   - Specific fixes for each gap

3. CANONICAL ISSUES
   - Any problematic canonical patterns?
   - Chain canonicals?
   - Mixed signals?

4. DUPLICATE CONTENT SIGNALS
   - HTTP/HTTPS versions both indexed?
   - www/non-www both indexed?
   - Trailing slash variations?
   - Parameter variations?

Output a prioritized fix list:

| Priority | Issue | Pages Affected | Specific Fix |

Step 3: Site Speed Audit

Speed is a ranking factor. But more importantly for your business, speed is a conversion factor.

A 1-second delay in page load drops conversions by about 7%. On mobile, users expect pages in under 3 seconds. Miss that, and they’re gone before they ever see your offer.

Core Web Vitals Explained

Google measures speed with three Core Web Vitals. Know these numbers:

| Metric | What It Measures | Good | Needs Work | Poor |
|--------|------------------|------|------------|------|
| LCP (Largest Contentful Paint) | How fast main content loads | < 2.5s | 2.5-4s | > 4s |
| INP (Interaction to Next Paint) | Response time when user clicks | < 200ms | 200-500ms | > 500ms |
| CLS (Cumulative Layout Shift) | Visual stability (stuff jumping around) | < 0.1 | 0.1-0.25 | > 0.25 |

Note: INP replaced FID (First Input Delay) in March 2024. If you see FID in guides, they’re outdated.

Testing Your Speed

Use these tools in this order:

1. Google PageSpeed Insights - pagespeed.web.dev

  • Tests both mobile and desktop
  • Shows field data (real user metrics) AND lab data (simulated)
  • Field data matters more for rankings

2. Chrome DevTools

  • Open DevTools > Lighthouse tab
  • Run a mobile performance audit
  • Good for debugging specific issues

3. WebPageTest - webpagetest.org

  • Test from different locations
  • See waterfall charts of resource loading
  • Identify specific bottlenecks

4. Search Console

  • Experience > Core Web Vitals
  • Shows which URLs pass/fail
  • Groups URLs by similar template
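
If you’d rather pull field data for several pages in one go, PageSpeed Insights has a public v5 API. This is a minimal sketch; the metric key names are my assumption about the v5 response format, so print the raw JSON once and adjust if yours differs, and the page URL is a placeholder.

```python
# Minimal sketch: fetch real-user Core Web Vitals (field data) from the
# PageSpeed Insights v5 API. No API key needed for occasional use.
import json
import urllib.parse
import urllib.request

def field_cwv(page_url: str, strategy: str = "mobile") -> None:
    api = ("https://www.googleapis.com/pagespeedonline/v5/runPagespeed?"
           + urllib.parse.urlencode({"url": page_url, "strategy": strategy}))
    with urllib.request.urlopen(api) as r:
        data = json.load(r)

    # Key names below are assumptions about the v5 response; verify once.
    metrics = data.get("loadingExperience", {}).get("metrics", {})
    for key in ("LARGEST_CONTENTFUL_PAINT_MS",
                "INTERACTION_TO_NEXT_PAINT",
                "CUMULATIVE_LAYOUT_SHIFT_SCORE"):
        m = metrics.get(key)
        if m:
            print(f"{key}: p75={m['percentile']} category={m['category']}")
        else:
            print(f"{key}: no field data")

field_cwv("https://www.example.com/")  # placeholder URL
```

Field data (what real visitors experience) is what feeds rankings; lab data from Lighthouse is for debugging.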

The Usual Suspects

I see the same speed issues on 80% of sites I audit:

| Issue | Impact | Quick Fix | Proper Fix |
|-------|--------|-----------|------------|
| Unoptimized images | High LCP | Compress existing images | Serve WebP, lazy load below-fold |
| Too much JavaScript | High LCP, INP | Remove unused scripts | Code split, defer non-critical |
| Render-blocking CSS | High LCP | Inline critical CSS | Optimize CSS delivery |
| No caching | Repeat visits slow | Add cache headers | CDN + browser caching strategy |
| Slow server (TTFB) | Everything slow | Better hosting | Optimize backend, add caching layer |
| Layout shifts | High CLS | Add width/height to images | Reserve space for dynamic content |
| Third-party scripts | All metrics | Audit and remove unnecessary | Load async, self-host critical ones |

AI Prompt: Speed Optimization Analysis

Run PageSpeed Insights on your key pages, then use this prompt:

Analyze this PageSpeed Insights data and create an optimization plan:

PAGE: [URL]
DEVICE: [Mobile/Desktop]

CORE WEB VITALS (Field Data):
- LCP: [VALUE]
- INP: [VALUE]
- CLS: [VALUE]

OPPORTUNITIES LISTED:
[Paste the opportunities section from PageSpeed]

DIAGNOSTICS LISTED:
[Paste the diagnostics section from PageSpeed]

Create an optimization plan:

## CRITICAL FIXES (Do immediately)
For each issue blocking good CWV scores:
| Issue | Current Impact | Fix | Expected Improvement |

## HIGH IMPACT OPPORTUNITIES
For each opportunity with >1s potential savings:
| Opportunity | Potential Savings | How To Implement |

## QUICK WINS
Fixes that take <30 minutes and improve metrics:
1. [Fix with specific implementation steps]
2. [Fix with specific implementation steps]
3. [Fix with specific implementation steps]

## RESOURCE AUDIT
- Largest resources by size (potential to optimize)
- Render-blocking resources (need to defer/async)
- Unused JavaScript/CSS (can remove)

## IMPLEMENTATION ORDER
Prioritized list of what to fix first based on:
1. Impact on Core Web Vitals
2. Difficulty to implement
3. Dependencies between fixes

Include specific code snippets where helpful.

Speed Optimization by Platform

Different platforms have different solutions:

WordPress:

  • Use a caching plugin (WP Rocket, W3 Total Cache)
  • Optimize images with ShortPixel or Imagify
  • Consider a lightweight theme or headless approach

Shopify:

  • Use Shopify’s built-in image optimization
  • Minimize apps (each one adds JavaScript)
  • Use Shopify CDN, don’t override it

Hugo/Static Sites:

  • Already fast by default
  • Focus on image optimization
  • Minify CSS/JS in build process

Step 4: Mobile Audit

Mobile-first indexing isn’t new anymore. It’s the standard.

Google primarily uses the mobile version of your site for indexing and ranking. If your mobile experience is broken, your rankings will suffer, even for desktop searches.

Quick Mobile Test

Google retired its standalone Mobile-Friendly Test tool, so the quickest automated check now is Lighthouse in Chrome DevTools: open DevTools, go to the Lighthouse tab, and run a mobile audit.

But automated checks are basic. They won’t catch everything. For a real mobile audit, check manually.

Mobile Audit Checklist

Viewport Configuration:

  • <meta name="viewport" content="width=device-width, initial-scale=1"> present
  • Content scales properly to different screen sizes
  • No horizontal scrolling required

Touch Targets:

  • Buttons/links minimum 48x48 pixels
  • Adequate spacing between clickable elements (8px minimum)
  • No tiny links that are hard to tap

Content Parity:

  • Same content on mobile as desktop (no hidden content)
  • Same structured data
  • Same internal links

Usability:

  • Text readable without zooming (16px minimum font)
  • No intrusive interstitials blocking content
  • Forms easy to complete on mobile
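
Two of those checks are easy to automate with a rough HTML scan. A minimal sketch follows; the URL is a placeholder, and this is no substitute for loading the page on a real phone.

```python
# Minimal sketch: flag a missing viewport meta tag and inline font sizes
# below 16px. Rough regex scan of the raw HTML only.
import re
import urllib.request

def quick_mobile_check(url: str) -> None:
    with urllib.request.urlopen(url) as r:
        html = r.read().decode("utf-8", errors="replace")

    has_viewport = re.search(r'<meta[^>]+name=["\']viewport["\']', html, re.I)
    print("viewport meta tag:", "present" if has_viewport else "MISSING")

    sizes = [int(s) for s in re.findall(r"font-size:\s*(\d+)px", html)]
    small = sorted({s for s in sizes if s < 16})
    if small:
        print(f"inline font sizes under 16px found: {small}")

quick_mobile_check("https://www.example.com/")  # placeholder URL
```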

The Interstitial Problem

Google explicitly penalizes intrusive interstitials on mobile. These kill your rankings:

  • Full-screen popups that cover main content
  • Standalone interstitials users must dismiss before accessing content
  • Above-the-fold layouts where content is pushed below a popup

What’s okay:

  • Cookie consent banners (legally required)
  • Age verification (legally required)
  • Small banners that use reasonable screen space

AI Prompt: Mobile Experience Audit

Analyze the mobile experience for this page:

URL: [URL]

Based on the page content and structure, identify:

1. VIEWPORT ISSUES
   - Is viewport meta tag correctly configured?
   - Any content width problems?

2. TOUCH TARGET PROBLEMS
   - Links or buttons that might be too small?
   - Elements too close together?

3. TEXT READABILITY
   - Font sizes that might be too small?
   - Line spacing issues?

4. CONTENT PARITY
   - Any content hidden on mobile?
   - Missing navigation elements?

5. USABILITY ISSUES
   - Intrusive popups/interstitials?
   - Form field problems?
   - Horizontal scrolling?

Output:
| Issue | Location | Severity | Fix |

Step 5: Security Audit

This one’s straightforward. HTTPS is required.

If your site isn’t fully on HTTPS, you have a problem. Google has used HTTPS as a ranking signal since 2014. More importantly, browsers now show “Not Secure” warnings for HTTP sites.

Security Checklist

  • Valid SSL certificate installed
  • Certificate not expiring soon
  • ALL pages serve over HTTPS
  • HTTP requests 301 redirect to HTTPS
  • No mixed content warnings
  • HSTS header implemented (optional but recommended)

Checking Your SSL

Quick check: Visit your site. Look for the padlock icon in the browser address bar. Click it to see certificate details.

Detailed check: Use SSL Labs Server Test

This grades your SSL configuration from A to F. Aim for A or A+.
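
Expiry is the most common certificate failure, and it’s trivial to monitor. Here’s a minimal standard-library sketch; the domain is a placeholder.

```python
# Minimal sketch: read the live SSL certificate and report days until expiry.
import socket
import ssl
from datetime import datetime, timezone

def cert_days_remaining(host: str, port: int = 443) -> int:
    ctx = ssl.create_default_context()
    with ctx.wrap_socket(socket.create_connection((host, port), timeout=10),
                         server_hostname=host) as s:
        cert = s.getpeercert()
    # notAfter looks like "Jun  1 12:00:00 2026 GMT"
    expires = datetime.strptime(cert["notAfter"], "%b %d %H:%M:%S %Y %Z")
    delta = expires.replace(tzinfo=timezone.utc) - datetime.now(timezone.utc)
    return delta.days

host = "www.example.com"  # placeholder domain
print(f"{host}: certificate expires in {cert_days_remaining(host)} days")
```

Drop this in a cron job and you’ll never ship an expired certificate by accident.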

Mixed Content Issues

Mixed content = HTTPS page loading HTTP resources (images, scripts, stylesheets).

This triggers browser warnings and can break functionality.

Find mixed content in Screaming Frog:

  • Go to Security tab
  • Filter for “Mixed Content”
  • Export the list of insecure resources

Common culprits:

  • Hardcoded HTTP image URLs
  • Third-party scripts loaded over HTTP
  • Embedded content (iframes, videos)

Fix: Update all resource URLs to HTTPS. If the resource doesn’t support HTTPS, find an alternative or host it yourself.
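
You can also verify the redirect and sweep a page for hard-coded http:// references in one pass. A minimal sketch, with a placeholder domain; note it’s a rough scan that will also pick up ordinary links to external HTTP sites, which are not mixed content.

```python
# Minimal sketch: confirm HTTP redirects to HTTPS, then list http:// resources
# referenced from the homepage HTML (mixed content candidates).
import re
import urllib.request

def check_https(domain: str) -> None:
    # 1. Does plain HTTP end up on HTTPS? urlopen follows redirects for us.
    with urllib.request.urlopen(f"http://{domain}/") as r:
        final = r.url
    ok = final.startswith("https://")
    print("HTTP redirects to HTTPS:", "yes" if ok else f"NO (lands on {final})")

    # 2. Any http:// URLs referenced from the HTTPS page?
    with urllib.request.urlopen(f"https://{domain}/") as r:
        html = r.read().decode("utf-8", errors="replace")
    insecure = re.findall(r'(?:src|href)=["\'](http://[^"\']+)', html)
    for res in sorted(set(insecure)):
        print("mixed content candidate:", res)

check_https("www.example.com")  # placeholder domain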

Security Issues Table

| Issue | Impact | Fix |
|-------|--------|-----|
| No SSL certificate | “Not Secure” warning, lost ranking signal | Install certificate (free with Let’s Encrypt) |
| Expired certificate | Site shows error | Renew immediately |
| HTTP still accessible | Duplicate content, insecure | 301 redirect HTTP to HTTPS |
| Mixed content | Browser warnings | Update all resources to HTTPS |
| Weak SSL configuration | Security vulnerabilities | Update server config, use modern ciphers |

Step 6: Structured Data Audit

Structured data tells search engines what your content is about.

Not a direct ranking factor, but it unlocks rich results: the search listings with stars, images, and FAQs. Rich results get higher click-through rates, which means more traffic from the same rankings.

For a complete guide on implementing schema, see our Schema & JSON-LD Guide.

Schema Types You Should Have

| Page Type | Required Schema | Optional/Enhanced |
|-----------|-----------------|-------------------|
| Homepage | WebSite, Organization | SiteNavigationElement |
| Blog posts | Article or BlogPosting | FAQ, HowTo, Author |
| Product pages | Product | Review, AggregateRating, Offer |
| Service pages | Service | FAQ, Review |
| Local business | LocalBusiness | OpeningHours, Review |
| All pages | BreadcrumbList | - |
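
To make this concrete, here’s a minimal sketch that builds a BlogPosting block programmatically. The property set is a reasonable starting point rather than an exhaustive one, and the URLs are placeholders; validate whatever you generate with the Rich Results Test below before shipping it.

```python
# Minimal sketch: generate a basic BlogPosting JSON-LD block from page details.
import json

def blog_posting_schema(title, url, author, published, modified, image):
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": title,
        "mainEntityOfPage": {"@type": "WebPage", "@id": url},
        "author": {"@type": "Person", "name": author},
        "datePublished": published,
        "dateModified": modified,
        "image": image,
    }

schema = blog_posting_schema(
    title="AI Technical SEO Audit: Find and Fix Site Issues Fast",
    url="https://www.example.com/ai-technical-seo-audit/",    # placeholder URL
    author="Brent Dunn",
    published="2026-01-25",
    modified="2026-01-25",
    image="https://www.example.com/images/audit-cover.png",   # placeholder URL
)
print('<script type="application/ld+json">')
print(json.dumps(schema, indent=2))
print("</script>")
```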

Testing Your Structured Data

Google Rich Results Test: search.google.com/test/rich-results

Paste your URL. See what schema Google finds and if you’re eligible for rich results.

Schema Markup Validator: validator.schema.org

More detailed validation against Schema.org specs.

In Screaming Frog:

  • Configuration > Spider > Extraction > JSON-LD
  • After crawl, go to Structured Data tab
  • See all schema found across the site

Common Schema Mistakes

| Mistake | Impact | Fix |
|---------|--------|-----|
| Missing required properties | Schema invalid, no rich results | Add all required fields |
| Wrong schema type | Misleading to Google | Use correct type for content |
| Mismatched data | Schema doesn’t match page content | Keep schema and content aligned |
| Duplicate schema | Confusing signals | One schema block per type per page |
| HTTP URLs in schema | Inconsistent with HTTPS site | Use HTTPS URLs |
| Missing breadcrumbs | No breadcrumb rich results | Add BreadcrumbList to all pages |

AI Prompt: Schema Audit and Generation

Audit the structured data on this page and suggest improvements:

URL: [URL]
PAGE TYPE: [blog post/product page/service page/etc.]

CURRENT SCHEMA (if any):
[Paste existing JSON-LD]

PAGE CONTENT SUMMARY:
[Brief description of what the page is about]

Tasks:

1. VALIDATE CURRENT SCHEMA
   - Any syntax errors?
   - Missing required properties?
   - Values match page content?

2. IDENTIFY MISSING SCHEMA
   - What schema types should this page have?
   - What rich results could this page be eligible for?

3. GENERATE IMPROVED SCHEMA
   Provide complete JSON-LD that:
   - Includes all required properties
   - Includes recommended properties
   - Is properly nested
   - Uses correct data types
   - Matches the actual page content

Output format:

VALIDATION ISSUES:
| Issue | Severity | Fix |

MISSING OPPORTUNITIES:
- [Schema type that should be added]

RECOMMENDED JSON-LD:
```json
[Complete schema code]
```

RICH RESULT ELIGIBILITY:
- [List which rich results this schema enables]

---

## Step 7: AI Crawler Optimization

Most technical SEO guides skip this entirely.

AI platforms like ChatGPT, Perplexity, and Google's AI Overviews drive real traffic now. These systems use their own crawlers to find and understand content.

**If you're blocking AI crawlers or your content isn't AI-friendly, you're invisible to a growing traffic source.**

### The AI Crawlers You Need to Know

| Crawler | Platform | Purpose |
|---------|----------|---------|
| GPTBot | OpenAI | Training data (optional to allow) |
| OAI-SearchBot | ChatGPT | Answers in ChatGPT search (important) |
| ChatGPT-User | ChatGPT | When users ask ChatGPT to read a URL (important) |
| ClaudeBot | Anthropic Claude | Training and retrieval |
| PerplexityBot | Perplexity | Answer generation |
| Google-Extended | Google | AI training (separate from main Googlebot) |

### Robots.txt for AI Crawlers

Check your robots.txt. Are you accidentally blocking AI crawlers?

**Default (block AI training, allow AI search):**

```
# Block AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Allow AI search crawlers
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /
```


**Allow everything (maximize AI visibility):**

```
# AI crawlers allowed
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
```
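
Whichever policy you choose, verify it against the live file. This reuses the same standard-library parser as the earlier robots.txt check; the domain is a placeholder.

```python
# Minimal sketch: report which AI crawlers your live robots.txt allows on the
# homepage. Unknown agents fall back to the wildcard (*) rules.
from urllib.robotparser import RobotFileParser

SITE = "https://www.example.com"   # placeholder domain
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User",
             "ClaudeBot", "PerplexityBot", "Google-Extended"]

rp = RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()

for agent in AI_AGENTS:
    allowed = rp.can_fetch(agent, f"{SITE}/")
    print(f"{'allowed' if allowed else 'BLOCKED':<8} {agent}")
```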


### Optimizing Content for AI Retrieval

AI systems parse your content differently than traditional search crawlers. They want clear, direct answers.

**What helps AI understand your content:**

1. **Clear structure** - Proper heading hierarchy (H1 > H2 > H3)
2. **Direct answers** - Put the answer early, then explain
3. **Lists and tables** - Easy to parse and cite
4. **FAQ sections** - Question-answer format works well
5. **No JavaScript-only content** - AI crawlers often can't execute JS

**What hurts:**

1. **Content behind logins** - Can't be crawled
2. **Heavy JavaScript rendering** - May not be accessible
3. **Thin content** - Nothing valuable to cite
4. **Duplicate content everywhere** - Confuses which page to cite

### AI Prompt: AI Visibility Audit

Audit this site’s visibility to AI crawlers and answer engines:

DOMAIN: [DOMAIN]

ROBOTS.TXT: [Paste robots.txt content]

SAMPLE CONTENT PAGE: [Paste content from one representative page]

Analyze:

1. CRAWLER ACCESS
   - Which AI crawlers are blocked?
   - Which are allowed?
   - Recommendations for changes?

2. CONTENT ACCESSIBILITY
   - Is key content in HTML or JavaScript-rendered?
   - Any content behind login walls?
   - Clear heading structure?

3. AI-FRIENDLY FORMATTING
   - Has direct answers (not just buildup)?
   - Uses lists and tables?
   - Has FAQ content?
   - Clear, citable statements?

4. CITATION POTENTIAL
   - What topics could this site be cited for?
   - What content gaps would help AI citations?

Recommendations:

| Area | Current State | Recommendation | Priority |


---

## Creating the Final Audit Report

You've gathered data from every section. Now compile it into an actionable report.

This is where AI helps most. Feed it all your findings and it synthesizes everything into a prioritized action plan.

### AI Prompt: Complete Audit Report Generation

Compile these technical SEO audit findings into an executive report:

SITE: [DOMAIN]
AUDIT DATE: [DATE]

CRAWLABILITY FINDINGS: [Paste findings from Step 1]

INDEXATION FINDINGS: [Paste findings from Step 2]

SPEED FINDINGS: [Paste findings from Step 3]

MOBILE FINDINGS: [Paste findings from Step 4]

SECURITY FINDINGS: [Paste findings from Step 5]

STRUCTURED DATA FINDINGS: [Paste findings from Step 6]

AI VISIBILITY FINDINGS: [Paste findings from Step 7]

Generate a professional audit report:

EXECUTIVE SUMMARY

Overall Site Health: [Score 1-10]
Critical Issues: [count]
High Priority Issues: [count]
Total Issues Found: [count]

Key Findings:

  1. [Most important finding]
  2. [Second most important]
  3. [Third most important]

FINDINGS BY CATEGORY

For each category, provide:

  • Status: Good / Needs Work / Critical
  • Key issues found
  • Quick wins available

PRIORITIZED ACTION PLAN

CRITICAL - Fix This Week:
| # | Issue | Category | Impact | Effort | Specific Fix |

HIGH - Fix Within 2 Weeks:
| # | Issue | Category | Impact | Effort | Specific Fix |

MEDIUM - Fix Within 30 Days:
| # | Issue | Category | Impact | Effort | Specific Fix |

LOW - Fix When Convenient:
| # | Issue | Category | Impact | Effort | Specific Fix |

QUICK WINS

List 5 fixes that take <30 minutes and have immediate impact.

RE-AUDIT RECOMMENDATION

  • Date: [30/60/90 days based on issues found]
  • Focus areas: [based on what was most problematic]

Be specific with fixes. Include code snippets where helpful.


---

## Tools Summary

Here's everything mentioned in this guide:

**Crawling & Analysis:**
- [Screaming Frog SEO Spider](https://www.screamingfrog.co.uk/seo-spider/) - The go-to technical SEO crawler (free up to 500 URLs)
- [Google Search Console](https://search.google.com/search-console) - Free, essential for indexation data

**Speed Testing:**
- [PageSpeed Insights](https://pagespeed.web.dev) - Google's official speed test
- [WebPageTest](https://www.webpagetest.org) - Detailed waterfall analysis
- [GTmetrix](https://gtmetrix.com) - Alternative speed testing

**Validation:**
- [Google Rich Results Test](https://search.google.com/test/rich-results) - Schema validation
- [SSL Labs](https://www.ssllabs.com/ssltest/) - SSL certificate grading
- Lighthouse (built into Chrome DevTools) - Quick mobile usability and performance check

**AI Tools:**
- Claude for analysis prompts
- Export data as CSV and feed to AI for bulk analysis

---

## What to Do Next

Technical SEO isn't glamorous. It's the foundation that makes everything else work.

Your content strategy means nothing if Google can't crawl your site. Your link building is wasted if pages aren't being indexed. Your traffic potential is capped if your site is slow.

**Here's your action plan:**

1. Download Screaming Frog (free version handles 500 URLs)
2. Run a crawl of your site
3. Pull your indexation data from Search Console
4. Run PageSpeed Insights on your 3 most important pages
5. Feed everything to Claude using the prompts above
6. Get your prioritized fix list
7. Execute by priority
8. Re-audit quarterly

What used to take days now takes hours. AI handles the analysis. You handle the fixes.

**Don't skip AI crawler optimization.** ChatGPT, Perplexity, and AI Overviews are driving real traffic now. Make sure your site is visible to them.

For your next step, check out our [AI Site Architecture](/build/websites/content-sites/ai-site-architecture/) guide to build a crawlable site structure from the ground up, or our [Schema & JSON-LD Guide](/build/websites/content-sites/schema-json-ld-guide/) if structured data was your weak point in this audit.

---

## Recommended Reading

**On-Page SEO:**
- [AI Keyword Optimization](/acquire/seo/on-page/ai-keyword-optimization/) - Optimize content for target keywords
- [AI Content Structure](/acquire/seo/on-page/ai-content-structure/) - Structure content for rankings

**Building:**
- [Schema & JSON-LD Guide](/build/websites/content-sites/schema-json-ld-guide/) - Complete structured data implementation
- [AI Site Architecture](/build/websites/content-sites/ai-site-architecture/) - Build a crawlable site structure
- [AI Silo Structure](/build/websites/content-sites/ai-silo-structure/) - Organize content for topical authority