You built your site. You published content. Traffic should be rolling in.
But Google isn’t indexing half your pages. Or it’s indexing garbage pages you didn’t want. Your site loads like it’s 2008. And you have no idea where to start fixing it.
A technical SEO audit is the job most new site owners skip until it's too late, partly because it's a grind: crawling the site, exporting CSVs, cross-referencing Search Console data, checking robots.txt, validating schema, analyzing Core Web Vitals. Done manually, it burns entire days.
Here’s how I run a complete technical SEO audit in under 2 hours using AI. This is the exact process I use on my own sites and client projects.
Plus: most guides skip AI crawler optimization. ChatGPT, Perplexity, and Google’s AI Overviews now drive real traffic. If your site blocks these crawlers, you’re invisible to a growing traffic source.
What a Technical SEO Audit Actually Does
A technical SEO audit finds the infrastructure problems that prevent search engines from crawling, indexing, and ranking your site.
This isn’t about content. It’s about whether Google can even access what you’ve built.
You can write the best content on the internet. If Google can’t crawl it, can’t index it, or punishes you for slow load times, none of it matters.
Your content is the product. Technical SEO is the store. If customers can’t find the entrance, can’t navigate the aisles, or the lights are off, they leave.
What You’re Actually Checking
| Category | What You’re Checking | Why It Matters |
|---|---|---|
| Crawlability | Can bots access all pages? | No crawl = no index = no traffic |
| Indexation | Are the right pages indexed? | Wrong pages indexed = diluted rankings |
| Speed | How fast do pages load? | Slow = high bounce + ranking penalty |
| Mobile | Does the site work on phones? | Mobile-first indexing is the standard |
| Security | Is the site secure? | HTTPS is a ranking factor |
| Structure | Is information organized? | Helps bots and users navigate |
| AI Visibility | Can AI crawlers access your content? | Growing traffic source in 2026 |
When to Run an Audit
Run a full audit:
- Before launching a new site (don’t launch blind)
- After major site changes or migrations
- Quarterly on established sites
- Immediately when traffic drops unexpectedly
Run quick monthly checks:
- 404 errors in Search Console
- Core Web Vitals scores
- New indexing issues
- Top keyword ranking movements
Screaming Frog SEO Spider is the go-to crawling tool. The free version handles sites up to 500 URLs, more than enough when you’re starting out. The paid license runs £199/year if you scale up.
The Complete Audit Workflow
A technical SEO audit has a lot of moving parts. Miss one, and you might spend months wondering why your traffic isn’t growing.
The answer is a repeatable process: same steps, same order, every time.
The 7-Step Process
| Step | Category | Time (with AI) | Tools |
|---|---|---|---|
| 1 | Crawlability | 20 min | Screaming Frog + AI |
| 2 | Indexation | 15 min | Search Console + AI |
| 3 | Site Speed | 15 min | PageSpeed Insights + AI |
| 4 | Mobile Experience | 10 min | Chrome DevTools (Lighthouse) + AI |
| 5 | Security | 5 min | SSL Labs + AI |
| 6 | Structured Data | 15 min | Rich Results Test + AI |
| 7 | AI Crawler Optimization | 10 min | robots.txt + manual checks |
Total time: Under 2 hours.
Compare that to 8-10 hours doing this manually, or worse, paying an agency $500+ for what you can now do yourself.
The Complete Technical SEO Checklist
Print this out or save it. Run through it every time.
CRAWLABILITY
- Robots.txt accessible and correct
- XML sitemap exists and submitted
- No critical pages blocked
- Crawl errors identified and logged
- Redirect chains under 3 hops
- Internal linking structure reviewed
INDEXATION
- Index coverage matches expectations
- Canonical tags properly implemented
- Noindex tags only on correct pages
- Duplicate content issues resolved
- Pagination handled correctly
SITE SPEED
- LCP under 2.5 seconds
- INP under 200ms
- CLS under 0.1
- Images optimized (WebP, lazy loading)
- Render-blocking resources minimized
MOBILE
- Viewport meta tag present
- Touch targets 48x48px minimum
- No horizontal scrolling
- Content parity with desktop
SECURITY
- Valid SSL certificate
- All pages serve HTTPS
- No mixed content warnings
- HTTP redirects to HTTPS
STRUCTURED DATA
- Schema validates without errors
- Appropriate schema types used
- Rich result eligibility confirmed
AI VISIBILITY
- AI crawlers not blocked
- Content accessible without JavaScript
- Clear, structured content format
- FAQ and how-to content optimized
AI Prompt: Complete Technical SEO Audit Report
Export your Screaming Frog crawl as CSV, then feed it to Claude. This prompt generates a full audit report from your data.
You are a technical SEO specialist conducting a comprehensive site audit.
SITE: [DOMAIN]
CRAWL DATA: [paste CSV export or key findings]
Analyze and create a prioritized audit report:
## 1. EXECUTIVE SUMMARY
- Overall site health score (1-10)
- Critical issues count
- High-priority fixes
## 2. CRAWLABILITY ANALYSIS
Review robots.txt directives, sitemap validity, blocked resources, and crawl budget concerns.
Flag any:
- Important pages blocked by robots.txt
- Sitemap URLs returning non-200 status
- Excessive crawl depth (>3 clicks from homepage)
## 3. INDEXATION STATUS
Compare indexed pages vs actual pages.
Identify:
- Index bloat (pages that shouldn't be indexed)
- Index gaps (important pages not indexed)
- Canonical tag issues
- Duplicate content signals
## 4. PERFORMANCE METRICS
Analyze Core Web Vitals data.
For any failing metrics, provide:
- Specific issue
- Pages affected
- Technical fix with code example
## 5. MOBILE + SECURITY
Note any mobile usability issues and HTTPS problems.
## 6. STRUCTURED DATA
List schema types found and validation errors.
## 7. PRIORITIZED ACTION PLAN
Format as:
| Priority | Issue | Category | Pages Affected | Fix | Estimated Impact |
|----------|-------|----------|----------------|-----|------------------|
CRITICAL (fix this week):
[List issues]
HIGH (fix within 2 weeks):
[List issues]
MEDIUM (fix within 30 days):
[List issues]
LOW (fix when convenient):
[List issues]
Be specific with fixes. Include code snippets where helpful.
Step 1: Crawlability Audit
If search engines can’t crawl your site, nothing else matters. No indexing. No rankings. No traffic.
Using Screaming Frog for Your Crawl
Here’s the exact workflow:
1. Configure your crawl settings
Open Screaming Frog and go to Configuration > Spider. Enable:
- Crawl JavaScript
- Crawl images
- Check external links
- Respect robots.txt (so you see what bots see)
2. Start the crawl
Enter your URL in the top bar and hit Start. A 500-page site takes 5-10 minutes.
3. Export the data
Once complete, go to Reports > Crawl Overview for a quick summary. Then export specific reports:
- Bulk Export > Response Codes > Client Error (4xx) - your broken links
- Bulk Export > Response Codes > Redirection (3xx) - your redirects
- Reports > Redirects > Redirect Chains - redirect chain problems
This gives you the raw data to feed into Claude.
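If an export is too big to paste wholesale, condense it first. Here's a minimal Python sketch that assumes an "Internal: All" style export with Address, Status Code, and Indexability columns; the filename and headers are assumptions and can vary by Screaming Frog version:

```python
# Minimal sketch: condense a Screaming Frog "Internal" CSV export before
# pasting the highlights into Claude. Filename and column names ("Address",
# "Status Code", "Indexability") are assumptions; check your own export.
# If your export has a title line above the header row, delete it first.
import csv
from collections import Counter

def summarize_crawl(path: str) -> None:
    status_counts = Counter()
    non_indexable = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            status_counts[row.get("Status Code", "unknown")] += 1
            if row.get("Indexability", "").lower() == "non-indexable":
                non_indexable.append(row.get("Address", ""))

    print("Responses by status code:")
    for code, count in sorted(status_counts.items()):
        print(f"  {code}: {count}")
    print(f"\nNon-indexable URLs ({len(non_indexable)}), first 20:")
    for url in non_indexable[:20]:
        print(f"  {url}")

if __name__ == "__main__":
    summarize_crawl("internal_all.csv")  # hypothetical export filename
```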
Robots.txt Analysis
Your robots.txt file controls what search engines can access. One wrong line blocks your entire site from indexing.
Check it at: yourdomain.com/robots.txt
What to look for:
| Directive | What It Does | Common Mistakes |
|---|---|---|
| User-agent: * | Applies to all bots | Forgetting to specify |
| Disallow: /admin/ | Blocks a directory | Accidentally blocking important pages |
| Disallow: / | Blocks entire site | Leftover from development |
| Sitemap: | Points to sitemap | Missing entirely |
| Allow: | Overrides disallow | Conflicting rules |
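For reference, here's roughly what a clean robots.txt looks like for a typical content site. The blocked paths are placeholders, not recommendations for every site:

```
# Illustrative example only; adapt the paths to your own site
User-agent: *
# Keep bots out of admin and internal search result pages
Disallow: /wp-admin/
Disallow: /search/
# But don't block resources Google needs to render pages
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/sitemap.xml
```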
AI Prompt: Robots.txt Audit
Analyze this robots.txt file for SEO issues:
[PASTE YOUR ROBOTS.TXT CONTENT]
Check for:
1. BLOCKING ISSUES
- Are any important page types blocked?
- Is CSS/JS blocked (Google needs these to render)?
- Are any crawlers unnecessarily blocked?
2. SITEMAP DECLARATION
- Is the sitemap URL included?
- Is the URL correct and accessible?
3. DIRECTIVE CONFLICTS
- Any rules that contradict each other?
- Overly broad blocking patterns?
4. AI CRAWLER RULES (2026 important)
- Is GPTBot blocked or allowed?
- Is OAI-SearchBot blocked or allowed?
- Is ClaudeBot blocked or allowed?
- Is PerplexityBot blocked or allowed?
Provide:
- List of issues found
- Risk level for each (Critical/High/Medium/Low)
- Recommended robots.txt with fixes applied
XML Sitemap Audit
Your sitemap tells search engines what pages exist and when they were updated.
Find it at: yourdomain.com/sitemap.xml or yourdomain.com/sitemap_index.xml
The rules:
- Only include pages you want indexed
- Only include pages returning 200 status
- Keep each sitemap under 50MB / 50,000 URLs
- Update automatically when content changes
- Declare in robots.txt
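If you're unsure what a valid sitemap even looks like, here's a minimal example. URLs and dates are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal illustrative sitemap: only live, indexable, 200-status URLs -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
    <lastmod>2025-01-15</lastmod>
  </url>
  <url>
    <loc>https://yourdomain.com/blog/example-post/</loc>
    <lastmod>2025-01-10</lastmod>
  </url>
</urlset>
```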
Common sitemap problems:
| Problem | Impact | Fix |
|---|---|---|
| 404 URLs in sitemap | Wasted crawl budget | Remove dead URLs |
| Redirect URLs in sitemap | Confuses bots | Use final destination URLs |
| Noindexed pages in sitemap | Conflicting signals | Remove from sitemap or remove the noindex tag |
| Missing from robots.txt | May not be discovered | Add Sitemap: directive |
| Not submitted to Search Console | Slower discovery | Submit manually |
AI Prompt: Sitemap Analysis
I'm going to paste my XML sitemap content. Analyze it for SEO issues.
[PASTE SITEMAP XML OR LIST OF URLs]
Check:
1. URL STATUS
- Any URLs that look like they might 404?
- Any obvious redirect URLs (HTTP vs HTTPS, www vs non-www)?
- Any parameter URLs that shouldn't be indexed?
2. COMPLETENESS
- Any obvious content types missing?
- Estimate if count seems right for the site type
3. FORMAT ISSUES
- Date format correct (W3C datetime)?
- Proper XML structure?
- Priority values between 0.0-1.0?
4. ORGANIZATION
- Should this be split into multiple sitemaps?
- Any category-specific sitemaps needed?
List issues and fixes in a table.
Crawl Errors from Search Console
Google Search Console shows you exactly what problems Googlebot encounters.
Go to Indexing > Pages to see:
- Not indexed - Pages Google found but chose not to index
- Crawled - currently not indexed - Found but not good enough
- Discovered - currently not indexed - In queue but not crawled
- Excluded by noindex tag - You told Google not to index
- Blocked by robots.txt - You blocked it
Fix these first:
| Error | Urgency | Fix |
|---|---|---|
| Server error (5xx) | Critical | Fix server/hosting issue |
| Redirect error | Critical | Fix broken redirect chain |
| Blocked by robots.txt | High | Update robots.txt if page should be indexed |
| Soft 404 | High | Add content or return real 404 |
| Not found (404) | Medium | Redirect or remove internal links |
| Duplicate without canonical | Medium | Add canonical tags |
Step 2: Indexation Audit
Crawlability gets Google to your pages. Indexation determines if they show up in search results.
Two different things. I’ve seen sites with perfect crawlability and terrible indexation because of canonical tag issues, thin content, or index bloat.
The Index Math
Get your numbers:
Method 1: Site Search
site:yourdomain.com
This gives you a rough count of indexed pages. Not perfectly accurate, but useful for ballpark comparisons.
Method 2: Google Search Console
Go to Indexing > Pages. This shows exact numbers and breaks down why pages aren’t indexed.
Method 3: Screaming Frog
After your crawl, check the Indexability column. Filter for “Non-Indexable” to see what’s being excluded.
Interpreting the Numbers
| Your Numbers | What It Means | Action |
|---|---|---|
| Indexed > Sitemap pages | Index bloat - unwanted pages indexed | Find and noindex the extras |
| Indexed < Sitemap pages | Index gaps - pages not being indexed | Fix crawl/indexation issues |
| Indexed roughly = Sitemap | Healthy state | Monitor monthly |
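If you want to go beyond the raw counts, a quick script can diff your sitemap against your crawl. This sketch assumes a local single-urlset sitemap.xml (not a sitemap index) and a Screaming Frog CSV export with Address and Indexability columns; adjust filenames and headers to match yours:

```python
# Minimal sketch: compare sitemap URLs against crawled indexable URLs to
# surface likely index gaps and bloat candidates. Filenames and column
# names are assumptions; a sitemap index would need its child sitemaps.
import csv
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(path: str) -> set[str]:
    tree = ET.parse(path)
    return {loc.text.strip() for loc in tree.getroot().findall(".//sm:loc", NS) if loc.text}

def crawled_indexable_urls(path: str) -> set[str]:
    urls = set()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row.get("Indexability", "").lower() == "indexable":
                urls.add(row.get("Address", "").strip())
    return urls

if __name__ == "__main__":
    in_sitemap = sitemap_urls("sitemap.xml")
    crawled = crawled_indexable_urls("internal_all.csv")  # hypothetical filenames
    print("In sitemap but not crawled (possible orphans / index gaps):")
    print("\n".join(sorted(in_sitemap - crawled)[:20]))
    print("\nCrawled and indexable but missing from sitemap (possible bloat or omissions):")
    print("\n".join(sorted(crawled - in_sitemap)[:20]))
```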
Index bloat is a real problem.
I’ve seen ecommerce sites with 50,000 pages indexed when they only have 5,000 products. The rest? Filter pages, search results pages, parameter variations. All that junk dilutes your site’s authority.
Common Index Bloat Sources
| Bloat Type | Example URL | Fix |
|---|---|---|
| Parameter URLs | /products?color=red&size=xl | Canonical to base URL or noindex |
| Pagination | /blog/page/47/ | Self-canonicalize each page; noindex deep pages if thin (Google no longer uses rel="prev/next") |
| Tag/Category pages | /tag/blue-widgets/ | Noindex if thin content |
| Internal search | /search?q=widget | Block in robots.txt |
| Calendar/date archives | /2024/03/15/ | Noindex date-based archives |
| Faceted navigation | /products/category/price-under-50/ | Canonical to category page |
Canonical Tag Audit
Canonical tags tell Google which version of a page is the “real” one.
In Screaming Frog: Go to the Canonicals tab. Filter for issues:
- Missing
- Self-referencing (usually good)
- Canonicalized to another URL
- Non-indexable canonical
The rules:
- Every indexable page needs a canonical tag
- Unique pages should self-reference
- Duplicate/variant pages should point to the canonical version
- Canonical URL must return 200 status
- Don’t mix signals (noindex + canonical to different page = confusing)
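For reference, a canonical tag is a single line in the page's head. Something like this, with placeholder URLs:

```html
<!-- Self-referencing canonical on the primary URL -->
<link rel="canonical" href="https://yourdomain.com/blue-widgets/">

<!-- On a variant such as /blue-widgets/?color=navy, point back to the primary version -->
<link rel="canonical" href="https://yourdomain.com/blue-widgets/">
```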
AI Prompt: Indexation Analysis
Analyze this indexation data and identify issues:
SITE: [DOMAIN]
DATA:
- Pages in sitemap: [NUMBER]
- Google indexed (site: search): [NUMBER]
- Search Console indexed: [NUMBER]
- Screaming Frog crawled: [NUMBER]
SEARCH CONSOLE PAGE REPORT:
[Paste the status breakdown - how many pages in each category]
SCREAMING FROG CANONICAL REPORT:
[Paste or summarize canonical tag findings]
Analyze:
1. INDEX BLOAT CHECK
- Is indexed count higher than it should be?
- What page types might be causing bloat?
- Prioritized list of pages to noindex
2. INDEX GAP CHECK
- Important pages not being indexed?
- Why might they be excluded?
- Specific fixes for each gap
3. CANONICAL ISSUES
- Any problematic canonical patterns?
- Chain canonicals?
- Mixed signals?
4. DUPLICATE CONTENT SIGNALS
- HTTP/HTTPS versions both indexed?
- www/non-www both indexed?
- Trailing slash variations?
- Parameter variations?
Output a prioritized fix list:
| Priority | Issue | Pages Affected | Specific Fix |
Step 3: Site Speed Audit
Speed is a ranking factor. But more importantly for your business, speed is a conversion factor.
A 1-second delay in page load drops conversions by about 7%. On mobile, users expect pages in under 3 seconds. Miss that, and they’re gone before they ever see your offer.
Core Web Vitals Explained
Google measures speed with three Core Web Vitals. Know these numbers:
| Metric | What It Measures | Good | Needs Work | Poor |
|---|---|---|---|---|
| LCP (Largest Contentful Paint) | How fast main content loads | < 2.5s | 2.5-4s | > 4s |
| INP (Interaction to Next Paint) | Response time when user clicks | < 200ms | 200-500ms | > 500ms |
| CLS (Cumulative Layout Shift) | Visual stability (stuff jumping around) | < 0.1 | 0.1-0.25 | > 0.25 |
Note: INP replaced FID (First Input Delay) in March 2024. If you see FID in guides, they’re outdated.
Testing Your Speed
Use these tools in this order:
1. Google PageSpeed Insights - pagespeed.web.dev
- Tests both mobile and desktop
- Shows field data (real user metrics) AND lab data (simulated)
- Field data matters more for rankings
2. Chrome DevTools
- Open DevTools > Lighthouse tab
- Run a mobile performance audit
- Good for debugging specific issues
3. WebPageTest - webpagetest.org
- Test from different locations
- See waterfall charts of resource loading
- Identify specific bottlenecks
4. Search Console
- Experience > Core Web Vitals
- Shows which URLs pass/fail
- Groups URLs by similar template
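If you'd rather pull these numbers programmatically, PageSpeed Insights exposes a public API. A minimal sketch that reads the response defensively, since the exact field-data keys are worth verifying against a live response:

```python
# Minimal sketch: query the PageSpeed Insights v5 API for a few key pages.
# No API key is required for light use, but rate limits apply.
import json
import urllib.parse
import urllib.request

PSI = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def psi(url: str, strategy: str = "mobile") -> dict:
    query = urllib.parse.urlencode({"url": url, "strategy": strategy})
    with urllib.request.urlopen(f"{PSI}?{query}") as resp:
        return json.load(resp)

if __name__ == "__main__":
    for page in ["https://yourdomain.com/", "https://yourdomain.com/top-post/"]:
        data = psi(page)
        lab_score = (data.get("lighthouseResult", {})
                         .get("categories", {})
                         .get("performance", {})
                         .get("score"))
        field = data.get("loadingExperience", {}).get("metrics", {})
        print(page)
        print(f"  Lab performance score: {lab_score}")
        for metric, values in field.items():
            print(f"  Field {metric}: {values.get('percentile')} ({values.get('category')})")
```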
The Usual Suspects
I see the same speed issues on 80% of sites I audit:
| Issue | Impact | Quick Fix | Proper Fix |
|---|---|---|---|
| Unoptimized images | High LCP | Compress existing images | Serve WebP, lazy load below-fold |
| Too much JavaScript | High LCP, INP | Remove unused scripts | Code split, defer non-critical |
| Render-blocking CSS | High LCP | Inline critical CSS | Optimize CSS delivery |
| No caching | Repeat visits slow | Add cache headers | CDN + browser caching strategy |
| Slow server (TTFB) | Everything slow | Better hosting | Optimize backend, add caching layer |
| Layout shifts | High CLS | Add width/height to images | Reserve space for dynamic content |
| Third-party scripts | All metrics | Audit and remove unnecessary | Load async, self-host critical ones |
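Two of the most common fixes are plain HTML. An illustrative sketch with placeholder paths and dimensions; fetchpriority is a browser hint, not a guarantee:

```html
<!-- Reserve space so images don't cause layout shift, and lazy-load below the fold -->
<img src="/images/chart.webp" width="800" height="450" alt="Traffic chart" loading="lazy">

<!-- Hint the browser to prioritize the LCP image (supported in modern browsers) -->
<img src="/images/hero.webp" width="1200" height="600" alt="Hero image" fetchpriority="high">
```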
AI Prompt: Speed Optimization Analysis
Run PageSpeed Insights on your key pages, then use this prompt:
Analyze this PageSpeed Insights data and create an optimization plan:
PAGE: [URL]
DEVICE: [Mobile/Desktop]
CORE WEB VITALS (Field Data):
- LCP: [VALUE]
- INP: [VALUE]
- CLS: [VALUE]
OPPORTUNITIES LISTED:
[Paste the opportunities section from PageSpeed]
DIAGNOSTICS LISTED:
[Paste the diagnostics section from PageSpeed]
Create an optimization plan:
## CRITICAL FIXES (Do immediately)
For each issue blocking good CWV scores:
| Issue | Current Impact | Fix | Expected Improvement |
## HIGH IMPACT OPPORTUNITIES
For each opportunity with >1s potential savings:
| Opportunity | Potential Savings | How To Implement |
## QUICK WINS
Fixes that take <30 minutes and improve metrics:
1. [Fix with specific implementation steps]
2. [Fix with specific implementation steps]
3. [Fix with specific implementation steps]
## RESOURCE AUDIT
- Largest resources by size (potential to optimize)
- Render-blocking resources (need to defer/async)
- Unused JavaScript/CSS (can remove)
## IMPLEMENTATION ORDER
Prioritized list of what to fix first based on:
1. Impact on Core Web Vitals
2. Difficulty to implement
3. Dependencies between fixes
Include specific code snippets where helpful.
Speed Optimization by Platform
Different platforms have different solutions:
WordPress:
- Use a caching plugin (WP Rocket, W3 Total Cache)
- Optimize images with ShortPixel or Imagify
- Consider a lightweight theme or headless approach
Shopify:
- Use Shopify’s built-in image optimization
- Minimize apps (each one adds JavaScript)
- Use Shopify CDN, don’t override it
Hugo/Static Sites:
- Already fast by default
- Focus on image optimization
- Minify CSS/JS in build process
Step 4: Mobile Audit
Mobile-first indexing isn’t new anymore. It’s the standard.
Google primarily uses the mobile version of your site for indexing and ranking. If your mobile experience is broken, your rankings will suffer, even for desktop searches.
Quick Mobile Test
Google retired its standalone Mobile-Friendly Test tool in late 2023, so the quickest check today is a Lighthouse audit in Chrome DevTools, or simply DevTools device mode to view the page at a phone-sized viewport.
Automated checks are basic either way. They won't catch everything. For a real mobile audit, check manually.
Mobile Audit Checklist
Viewport Configuration:
- `<meta name="viewport" content="width=device-width, initial-scale=1">` present
- Content scales properly to different screen sizes
- No horizontal scrolling required
Touch Targets:
- Buttons/links minimum 48x48 pixels
- Adequate spacing between clickable elements (8px minimum)
- No tiny links that are hard to tap
Content Parity:
- Same content on mobile as desktop (no hidden content)
- Same structured data
- Same internal links
Usability:
- Text readable without zooming (16px minimum font)
- No intrusive interstitials blocking content
- Forms easy to complete on mobile
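On the CSS side, a rough baseline for the readability and touch-target items above might look like this. Selectors are placeholders for your own theme:

```css
/* Illustrative mobile baseline: readable text and tappable targets */
body {
  font-size: 16px;                  /* avoid sub-16px body text on mobile */
  -webkit-text-size-adjust: 100%;
}

a.button,
button {
  min-width: 48px;
  min-height: 48px;                 /* roughly a 48x48px touch target */
  padding: 12px 16px;
}

nav a {
  display: inline-block;
  padding: 8px;                     /* spacing so adjacent links aren't mis-tapped */
}
```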
The Interstitial Problem
Google explicitly penalizes intrusive interstitials on mobile. These kill your rankings:
- Full-screen popups that cover main content
- Standalone interstitials users must dismiss before accessing content
- Above-the-fold layouts where content is pushed below a popup
What’s okay:
- Cookie consent banners (legally required)
- Age verification (legally required)
- Small banners that use reasonable screen space
AI Prompt: Mobile Experience Audit
Analyze the mobile experience for this page:
URL: [URL]
Based on the page content and structure, identify:
1. VIEWPORT ISSUES
- Is viewport meta tag correctly configured?
- Any content width problems?
2. TOUCH TARGET PROBLEMS
- Links or buttons that might be too small?
- Elements too close together?
3. TEXT READABILITY
- Font sizes that might be too small?
- Line spacing issues?
4. CONTENT PARITY
- Any content hidden on mobile?
- Missing navigation elements?
5. USABILITY ISSUES
- Intrusive popups/interstitials?
- Form field problems?
- Horizontal scrolling?
Output:
| Issue | Location | Severity | Fix |
Step 5: Security Audit
This one’s straightforward. HTTPS is required.
If your site isn’t fully on HTTPS, you have a problem. Google has used HTTPS as a ranking signal since 2014. More importantly, browsers now show “Not Secure” warnings for HTTP sites.
Security Checklist
- Valid SSL certificate installed
- Certificate not expiring soon
- ALL pages serve over HTTPS
- HTTP requests 301 redirect to HTTPS
- No mixed content warnings
- HSTS header implemented (optional but recommended)
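How you force HTTPS depends on your stack. Here's a sketch for Nginx; Apache .htaccess rules and managed hosting panels work differently, and certificate paths are omitted:

```nginx
# Illustrative Nginx config (certificate directives omitted)
# Redirect all HTTP traffic to HTTPS with a permanent 301
server {
    listen 80;
    server_name yourdomain.com www.yourdomain.com;
    return 301 https://yourdomain.com$request_uri;
}

server {
    listen 443 ssl;
    server_name yourdomain.com;
    # ssl_certificate / ssl_certificate_key directives go here

    # Optional HSTS header: tells browsers to always use HTTPS
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
}
```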
Checking Your SSL
Quick check: Visit your site. Look for the padlock icon in the browser address bar. Click it to see certificate details.
Detailed check: Run your domain through the SSL Labs Server Test at ssllabs.com/ssltest.
This grades your SSL configuration from A to F. Aim for A or A+.
Mixed Content Issues
Mixed content = HTTPS page loading HTTP resources (images, scripts, stylesheets).
This triggers browser warnings and can break functionality.
Find mixed content in Screaming Frog:
- Go to Security tab
- Filter for “Mixed Content”
- Export the list of insecure resources
Common culprits:
- Hardcoded HTTP image URLs
- Third-party scripts loaded over HTTP
- Embedded content (iframes, videos)
Fix: Update all resource URLs to HTTPS. If the resource doesn’t support HTTPS, find an alternative or host it yourself.
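If you want a quick second opinion without re-crawling, a small script can flag http:// resource references on a single page. A rough sketch; it may over-report, and it won't catch resources injected by JavaScript:

```python
# Minimal sketch: flag plain-HTTP resource references in a page's HTML.
# A crawler like Screaming Frog is more thorough; treat this as a spot check.
import re
import urllib.request

def find_mixed_content(page_url: str) -> list[str]:
    with urllib.request.urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    patterns = [
        r'src=["\'](http://[^"\']+)',             # scripts, images, iframes
        r'<link[^>]+href=["\'](http://[^"\']+)',  # stylesheets, icons
    ]
    found = set()
    for pat in patterns:
        found.update(re.findall(pat, html, flags=re.IGNORECASE))
    return sorted(found)

if __name__ == "__main__":
    for url in find_mixed_content("https://yourdomain.com/"):
        print(url)
```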
Security Issues Table
| Issue | Impact | Fix |
|---|---|---|
| No SSL certificate | “Not Secure” warning, no ranking | Install certificate (free with Let’s Encrypt) |
| Expired certificate | Site shows error | Renew immediately |
| HTTP still accessible | Duplicate content, insecure | 301 redirect HTTP to HTTPS |
| Mixed content | Browser warnings | Update all resources to HTTPS |
| Weak SSL configuration | Security vulnerabilities | Update server config, use modern ciphers |
Step 6: Structured Data Audit
Structured data tells search engines what your content is about.
Not a direct ranking factor, but it unlocks rich results, those search listings with stars, images, FAQs. Rich results get higher click-through rates, which means more traffic from the same rankings.
For a complete guide on implementing schema, see our Schema & JSON-LD Guide.
Schema Types You Should Have
| Page Type | Required Schema | Optional/Enhanced |
|---|---|---|
| Homepage | Website, Organization | SiteNavigationElement |
| Blog posts | Article or BlogPosting | FAQ, HowTo, Author |
| Product pages | Product | Review, AggregateRating, Offer |
| Service pages | Service | FAQ, Review |
| Local business | LocalBusiness | OpeningHours, Review |
| All pages | BreadcrumbList | - |
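If you're starting from zero, here's roughly what BlogPosting plus BreadcrumbList markup looks like as JSON-LD. Every value is a placeholder; validate the real thing with the tools below:

```html
<!-- Illustrative JSON-LD; all values are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "BlogPosting",
      "headline": "Example Post Title",
      "datePublished": "2025-01-15",
      "dateModified": "2025-01-20",
      "author": { "@type": "Person", "name": "Author Name" },
      "mainEntityOfPage": "https://yourdomain.com/blog/example-post/"
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Blog", "item": "https://yourdomain.com/blog/" },
        { "@type": "ListItem", "position": 2, "name": "Example Post", "item": "https://yourdomain.com/blog/example-post/" }
      ]
    }
  ]
}
</script>
```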
Testing Your Structured Data
Google Rich Results Test: search.google.com/test/rich-results
Paste your URL. See what schema Google finds and if you’re eligible for rich results.
Schema Markup Validator: validator.schema.org
More detailed validation against Schema.org specs.
In Screaming Frog:
- Configuration > Spider > Extraction > JSON-LD
- After crawl, go to Structured Data tab
- See all schema found across the site
Common Schema Mistakes
| Mistake | Impact | Fix |
|---|---|---|
| Missing required properties | Schema invalid, no rich results | Add all required fields |
| Wrong schema type | Misleading to Google | Use correct type for content |
| Mismatched data | Schema doesn’t match page content | Keep schema and content aligned |
| Duplicate schema | Confusing signals | One schema block per type per page |
| HTTP URLs in schema | Inconsistent with HTTPS site | Use HTTPS URLs |
| Missing breadcrumbs | No breadcrumb rich results | Add BreadcrumbList to all pages |
AI Prompt: Schema Audit and Generation
Audit the structured data on this page and suggest improvements:
URL: [URL]
PAGE TYPE: [blog post/product page/service page/etc.]
CURRENT SCHEMA (if any):
[Paste existing JSON-LD]
PAGE CONTENT SUMMARY:
[Brief description of what the page is about]
Tasks:
1. VALIDATE CURRENT SCHEMA
- Any syntax errors?
- Missing required properties?
- Values match page content?
2. IDENTIFY MISSING SCHEMA
- What schema types should this page have?
- What rich results could this page be eligible for?
3. GENERATE IMPROVED SCHEMA
Provide complete JSON-LD that:
- Includes all required properties
- Includes recommended properties
- Is properly nested
- Uses correct data types
- Matches the actual page content
Output format:
VALIDATION ISSUES:
| Issue | Severity | Fix |
MISSING OPPORTUNITIES:
- [Schema type that should be added]
RECOMMENDED JSON-LD:
```json
[Complete schema code]
```
RICH RESULT ELIGIBILITY:
- [List which rich results this schema enables]
---
## Step 7: AI Crawler Optimization
Most technical SEO guides skip this entirely.
AI platforms like ChatGPT, Perplexity, and Google's AI Overviews drive real traffic now. These systems use their own crawlers to find and understand content.
**If you're blocking AI crawlers or your content isn't AI-friendly, you're invisible to a growing traffic source.**
### The AI Crawlers You Need to Know
| Crawler | Platform | Purpose |
|---------|----------|---------|
| GPTBot | OpenAI | Training data (optional to allow) |
| OAI-SearchBot | ChatGPT | Answers in ChatGPT search (important) |
| ChatGPT-User | ChatGPT | When users ask ChatGPT to read a URL (important) |
| ClaudeBot | Anthropic Claude | Training and retrieval |
| PerplexityBot | Perplexity | Answer generation |
| Google-Extended | Google | AI training (separate from main Googlebot) |
### Robots.txt for AI Crawlers
Check your robots.txt. Are you accidentally blocking AI crawlers?
**Default (block AI training, allow AI search):**
```
# Block AI training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Allow AI search crawlers
User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: PerplexityBot
Allow: /
```
**Allow everything (maximize AI visibility):**
```
# AI crawlers allowed
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /
```
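To check what your current robots.txt actually permits, Python's standard library can parse it for you. A minimal sketch; it tests the homepage only, and specific paths may have their own rules:

```python
# Minimal sketch: report which AI crawlers your robots.txt allows to fetch
# the homepage. The crawler list mirrors the table above.
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot",
               "PerplexityBot", "Google-Extended"]

def check_ai_access(domain: str) -> None:
    rp = RobotFileParser()
    rp.set_url(f"https://{domain}/robots.txt")
    rp.read()
    for bot in AI_CRAWLERS:
        allowed = rp.can_fetch(bot, f"https://{domain}/")
        print(f"{bot:20} {'allowed' if allowed else 'blocked'}")

if __name__ == "__main__":
    check_ai_access("yourdomain.com")
```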
### Optimizing Content for AI Retrieval
AI systems parse your content differently than traditional search crawlers. They want clear, direct answers.
**What helps AI understand your content:**
1. **Clear structure** - Proper heading hierarchy (H1 > H2 > H3)
2. **Direct answers** - Put the answer early, then explain
3. **Lists and tables** - Easy to parse and cite
4. **FAQ sections** - Question-answer format works well
5. **No JavaScript-only content** - AI crawlers often can't execute JS
**What hurts:**
1. **Content behind logins** - Can't be crawled
2. **Heavy JavaScript rendering** - May not be accessible
3. **Thin content** - Nothing valuable to cite
4. **Duplicate content everywhere** - Confuses which page to cite
### AI Prompt: AI Visibility Audit
Audit this site’s visibility to AI crawlers and answer engines:
DOMAIN: [DOMAIN]
ROBOTS.TXT: [Paste robots.txt content]
SAMPLE CONTENT PAGE: [Paste content from one representative page]
Analyze:
CRAWLER ACCESS
- Which AI crawlers are blocked?
- Which are allowed?
- Recommendations for changes?
CONTENT ACCESSIBILITY
- Is key content in HTML or JavaScript-rendered?
- Any content behind login walls?
- Clear heading structure?
AI-FRIENDLY FORMATTING
- Has direct answers (not just buildup)?
- Uses lists and tables?
- Has FAQ content?
- Clear, citable statements?
CITATION POTENTIAL
- What topics could this site be cited for?
- What content gaps would help AI citations?
Recommendations:

| Area | Current State | Recommendation | Priority |
---
## Creating the Final Audit Report
You've gathered data from every section. Now compile it into an actionable report.
This is where AI helps most. Feed it all your findings and it synthesizes everything into a prioritized action plan.
### AI Prompt: Complete Audit Report Generation
Compile these technical SEO audit findings into an executive report:
SITE: [DOMAIN]
AUDIT DATE: [DATE]
CRAWLABILITY FINDINGS: [Paste findings from Step 1]
INDEXATION FINDINGS: [Paste findings from Step 2]
SPEED FINDINGS: [Paste findings from Step 3]
MOBILE FINDINGS: [Paste findings from Step 4]
SECURITY FINDINGS: [Paste findings from Step 5]
STRUCTURED DATA FINDINGS: [Paste findings from Step 6]
AI VISIBILITY FINDINGS: [Paste findings from Step 7]
Generate a professional audit report:
EXECUTIVE SUMMARY
Overall Site Health: [Score 1-10]
Critical Issues: [count]
High Priority Issues: [count]
Total Issues Found: [count]
Key Findings:
- [Most important finding]
- [Second most important]
- [Third most important]
FINDINGS BY CATEGORY
For each category, provide:
- Status: Good / Needs Work / Critical
- Key issues found
- Quick wins available
PRIORITIZED ACTION PLAN
CRITICAL - Fix This Week:
| # | Issue | Category | Impact | Effort | Specific Fix |

HIGH - Fix Within 2 Weeks:
| # | Issue | Category | Impact | Effort | Specific Fix |

MEDIUM - Fix Within 30 Days:
| # | Issue | Category | Impact | Effort | Specific Fix |

LOW - Fix When Convenient:
| # | Issue | Category | Impact | Effort | Specific Fix |
QUICK WINS
List 5 fixes that take <30 minutes and have immediate impact.
RECOMMENDED NEXT AUDIT
- Date: [30/60/90 days based on issues found]
- Focus areas: [based on what was most problematic]
Be specific with fixes. Include code snippets where helpful.
---
## Tools Summary
Here's everything mentioned in this guide:
**Crawling & Analysis:**
- [Screaming Frog SEO Spider](https://www.screamingfrog.co.uk/seo-spider/) - The go-to technical SEO crawler (free up to 500 URLs)
- [Google Search Console](https://search.google.com/search-console) - Free, essential for indexation data
**Speed Testing:**
- [PageSpeed Insights](https://pagespeed.web.dev) - Google's official speed test
- [WebPageTest](https://www.webpagetest.org) - Detailed waterfall analysis
- [GTmetrix](https://gtmetrix.com) - Alternative speed testing
**Validation:**
- [Google Rich Results Test](https://search.google.com/test/rich-results) - Schema validation
- [SSL Labs](https://www.ssllabs.com/ssltest/) - SSL certificate grading
- Chrome DevTools (Lighthouse) - Quick mobile check (Google retired the standalone Mobile-Friendly Test in 2023)
**AI Tools:**
- Claude for analysis prompts
- Export data as CSV and feed to AI for bulk analysis
---
## What to Do Next
Technical SEO isn't glamorous. It's the foundation that makes everything else work.
Your content strategy means nothing if Google can't crawl your site. Your link building is wasted if pages aren't being indexed. Your traffic potential is capped if your site is slow.
**Here's your action plan:**
1. Download Screaming Frog (free version handles 500 URLs)
2. Run a crawl of your site
3. Pull your indexation data from Search Console
4. Run PageSpeed Insights on your 3 most important pages
5. Feed everything to Claude using the prompts above
6. Get your prioritized fix list
7. Execute by priority
8. Re-audit quarterly
What used to take days now takes hours. AI handles the analysis. You handle the fixes.
**Don't skip AI crawler optimization.** ChatGPT, Perplexity, and AI Overviews are driving real traffic now. Make sure your site is visible to them.
For your next step, check out our [AI Site Architecture](/build/websites/content-sites/ai-site-architecture/) guide to build a crawlable site structure from the ground up, or our [Schema & JSON-LD Guide](/build/websites/content-sites/schema-json-ld-guide/) if structured data was your weak point in this audit.
---
## Recommended Reading
**On-Page SEO:**
- [AI Keyword Optimization](/acquire/seo/on-page/ai-keyword-optimization/) - Optimize content for target keywords
- [AI Content Structure](/acquire/seo/on-page/ai-content-structure/) - Structure content for rankings
**Building:**
- [Schema & JSON-LD Guide](/build/websites/content-sites/schema-json-ld-guide/) - Complete structured data implementation
- [AI Site Architecture](/build/websites/content-sites/ai-site-architecture/) - Build a crawlable site structure
- [AI Silo Structure](/build/websites/content-sites/ai-silo-structure/) - Organize content for topical authority