Sitemap and Robots.txt Setup

Overview

A sitemap.xml and robots.txt have been generated for the Shopify AI App Builder to help Google and other search engines discover and index its public pages. The sitemap can be submitted through Google Search Console.

Files Created

1. sitemap.xml

Located at: chat/public/sitemap.xml

The sitemap includes only publicly accessible pages that search engines should index:

  • / (Home page) - Priority 1.0, daily updates
  • /features - Priority 0.9, weekly updates
  • /pricing - Priority 0.9, weekly updates
  • /affiliate - Priority 0.8, monthly updates
  • /affiliate-signup - Priority 0.7, monthly updates
  • /docs - Priority 0.8, weekly updates
  • /terms - Priority 0.3, yearly updates
  • /privacy - Priority 0.3, yearly updates
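The contents of chat/public/sitemap.xml are not reproduced in this document, so the following is only a sketch of how a file with the entries above could be generated. The `PAGES` array mirrors the list above; `buildSitemap` and `escapeXml` are hypothetical helper names, and the base URL is the placeholder used later in this guide.

```javascript
// Page list copied from the section above. Priorities are strings so "1.0" keeps its decimal.
const PAGES = [
  { path: "/", priority: "1.0", changefreq: "daily" },
  { path: "/features", priority: "0.9", changefreq: "weekly" },
  { path: "/pricing", priority: "0.9", changefreq: "weekly" },
  { path: "/affiliate", priority: "0.8", changefreq: "monthly" },
  { path: "/affiliate-signup", priority: "0.7", changefreq: "monthly" },
  { path: "/docs", priority: "0.8", changefreq: "weekly" },
  { path: "/terms", priority: "0.3", changefreq: "yearly" },
  { path: "/privacy", priority: "0.3", changefreq: "yearly" },
];

// Escape the five characters that are special in XML text.
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: render a sitemap for the given pages.
function buildSitemap(baseUrl, pages, lastmod) {
  const origin = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const entries = pages.map((page) =>
    [
      "  <url>",
      `    <loc>${escapeXml(origin + page.path)}</loc>`,
      `    <lastmod>${lastmod}</lastmod>`,
      `    <changefreq>${page.changefreq}</changefreq>`,
      `    <priority>${page.priority}</priority>`,
      "  </url>",
    ].join("\n")
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
    "",
  ].join("\n");
}

const sitemapXml = buildSitemap("https://your-domain.com", PAGES, "2025-01-08");
```

Generating the file from one list keeps the priorities and change frequencies in a single place, at the cost of a build step that the checked-in static file does not need.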

2. robots.txt

Located at: chat/public/robots.txt

The robots.txt file:

  • Allows all search engines to crawl public content
  • Disallows crawling of authenticated and admin areas:
    • /admin - Admin dashboard
    • /apps - User dashboard (requires auth)
    • /builder - App builder (requires auth)
    • /settings - User settings (requires auth)
    • /affiliate-dashboard - Affiliate dashboard (requires auth)
    • /api/ - API endpoints
  • Includes a reference to the sitemap location
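As a rough illustration of the rules listed above, the sketch below renders a robots.txt with the same Disallow entries. The exact layout of chat/public/robots.txt is not shown in this document, so the `Allow: /` line, the ordering, and the `buildRobotsTxt` helper name are assumptions.

```javascript
// Disallowed prefixes copied from the list above.
const DISALLOWED = [
  "/admin",
  "/apps",
  "/builder",
  "/settings",
  "/affiliate-dashboard",
  "/api/",
];

// Hypothetical helper: render robots.txt for one wildcard user-agent group.
function buildRobotsTxt(baseUrl, disallowed) {
  const origin = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return [
    "User-agent: *",
    "Allow: /",
    ...disallowed.map((prefix) => `Disallow: ${prefix}`),
    "",
    `Sitemap: ${origin}/sitemap.xml`,
    "",
  ].join("\n");
}

const robotsTxt = buildRobotsTxt("https://your-domain.com", DISALLOWED);
```

Disallow rules are prefix matches, so `Disallow: /admin` also covers /admin/accounts and /admin/login without listing them separately.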

3. Server Routes Updated

Modified chat/server.js to serve both files with explicit response headers:

  • Content-Type: Correct MIME types (application/xml for sitemap, text/plain for robots.txt)
  • Cache-Control: 24-hour cache (86400 seconds) to reduce server load
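The actual route code in chat/server.js is not included here. The following is a minimal sketch of a handler that sets the headers described above, written against Node's plain request and response objects. The names `SEO_FILES`, `seoHeaders`, and `handleSeoRequest`, the injected `readFile` callback, the `public` cache directive, and the `charset=utf-8` suffix are all assumptions made for illustration.

```javascript
// Hypothetical route table: URL path -> file name and MIME type.
const SEO_FILES = new Map([
  ["/sitemap.xml", { file: "sitemap.xml", type: "application/xml" }],
  ["/robots.txt", { file: "robots.txt", type: "text/plain" }],
]);

// Return the response headers for a known path, or null for any other path.
function seoHeaders(pathname) {
  const entry = SEO_FILES.get(pathname);
  if (!entry) return null;
  return {
    "Content-Type": `${entry.type}; charset=utf-8`,
    "Cache-Control": "public, max-age=86400", // 24 hours
  };
}

// Handle a request if it targets one of the two files.
// `readFile` is injected so the sketch does not depend on a directory layout.
// Returns true when the response was sent, false so the caller can continue routing.
function handleSeoRequest(req, res, readFile) {
  // The base URL is only needed to parse a relative request path.
  const pathname = new URL(req.url, "http://localhost").pathname;
  const headers = seoHeaders(pathname);
  if (!headers) return false;
  res.writeHead(200, headers);
  res.end(readFile(SEO_FILES.get(pathname).file));
  return true;
}
```

A function with this shape can be called at the top of a request listener passed to `http.createServer`, before any other routing runs.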

Configuration Required

Before deploying, you need to update the domain name in both files:

Step 1: Update sitemap.xml

Replace https://your-domain.com with your actual domain name in all URL entries.

Example:

<!-- Before -->
<loc>https://your-domain.com/</loc>

<!-- After -->
<loc>https://shopify-app-builder.example.com/</loc>

Step 2: Update robots.txt

Replace the sitemap URL reference with your actual domain.

Example:

# Before
Sitemap: https://your-domain.com/sitemap.xml

# After
Sitemap: https://shopify-app-builder.example.com/sitemap.xml

Step 3: Environment Variable (Optional)

The server uses the PUBLIC_BASE_URL environment variable to determine the base URL. If set, the domain in the sitemap should match this value.

export PUBLIC_BASE_URL=https://shopify-app-builder.example.com
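Steps 1 and 2 can also be done programmatically. The sketch below replaces the placeholder domain in a file's text with the origin of a given base URL; `applyBaseUrl` is a hypothetical helper, and reading and writing the two files is left to the caller.

```javascript
// Placeholder domain used in the generated files, as shown in Steps 1 and 2.
const PLACEHOLDER = "https://your-domain.com";

// Hypothetical helper: swap the placeholder for the real origin everywhere in `text`.
function applyBaseUrl(text, baseUrl) {
  // `new URL` throws on a malformed value; `.origin` drops any path or trailing slash.
  const origin = new URL(baseUrl).origin;
  return text.split(PLACEHOLDER).join(origin);
}

// Example with the value from Step 3 (in a real script this could come from
// process.env.PUBLIC_BASE_URL).
const updatedLine = applyBaseUrl(
  "Sitemap: https://your-domain.com/sitemap.xml",
  "https://shopify-app-builder.example.com"
);
```

Deriving both files from the same value is one way to keep the sitemap domain and PUBLIC_BASE_URL from drifting apart.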

Submitting to Google Search Console

  1. Go to Google Search Console
  2. Add your property (your domain)
  3. Verify ownership
  4. Navigate to "Sitemaps" in the left sidebar
  5. Enter sitemap.xml in the "Add a new sitemap" field
  6. Click "Submit"

Why These Pages Are Excluded

The following pages are intentionally excluded from the sitemap:

Authenticated Pages (require login)

  • /apps - User's personal app dashboard
  • /builder - Individual app building interface
  • /settings - User account settings
  • /affiliate-dashboard - Affiliate dashboard

Admin Pages

  • /admin - Admin dashboard and management pages
  • /admin/accounts - User account management
  • /admin/login - Admin login page

Functional/Technical Pages

  • /login, /signup - Authentication entry points
  • /verify-email - Email verification flow
  • /reset-password - Password reset flow
  • /api/* - API endpoints (not meant for indexing)
  • /uploads/* - User-uploaded files

These pages fall into one or more of the following categories:

  1. Behind authentication (search engines can't access them)
  2. Functional pages that don't provide value to search users
  3. Administrative interfaces
  4. Temporary/stateful pages (like verification flows)
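One way to guard against an excluded page slipping into the sitemap is to cross-check the two files. The sketch below reports any sitemap URL whose path starts with a Disallow prefix. It is deliberately simplified: it ignores user-agent groups, Allow rules, and wildcard patterns, and `findBlockedSitemapUrls` is a hypothetical name.

```javascript
// Hypothetical check: list sitemap URLs that robots.txt disallows.
// Simplification: every Disallow line is treated as a plain path prefix.
function findBlockedSitemapUrls(sitemapXml, robotsTxt) {
  const disallowed = robotsTxt
    .split(/\r?\n/)
    .map((line) => line.match(/^\s*Disallow:\s*(\S+)/i))
    .filter(Boolean)
    .map((match) => match[1]);

  // Collect the text of every <loc> element.
  const locs = [...sitemapXml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map(
    (match) => match[1]
  );

  return locs.filter((loc) => {
    const path = new URL(loc).pathname;
    return disallowed.some((prefix) => path.startsWith(prefix));
  });
}
```

An empty result means no listed URL falls under a disallowed prefix. Note that /affiliate is not caught by `Disallow: /affiliate-dashboard`, because the rule is longer than the path.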

Maintaining the Sitemap

When to Update

  • When you add new public pages (e.g., blog posts, landing pages)
  • When you change the URL structure
  • When you make significant content updates

Updating Lastmod Dates

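The current lastmod date is set to 2025-01-08. Update it when a page's content changes significantly, and only then: Google documents that it relies on lastmod only when the value is consistently accurate.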
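A lastmod update for a single page might be scripted as in the sketch below. It assumes each `<lastmod>` element directly follows its `<loc>` element, which may not match the layout of the real file; `setLastmod` and `escapeRegExp` are hypothetical helpers.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: set the lastmod of one URL to the given date (YYYY-MM-DD, UTC).
// Assumes <lastmod> immediately follows the matching <loc>.
function setLastmod(sitemapXml, loc, date) {
  const day = date.toISOString().slice(0, 10);
  const pattern = new RegExp(
    `(<loc>${escapeRegExp(loc)}</loc>\\s*<lastmod>)[^<]*(</lastmod>)`
  );
  // A replacer function avoids "$1" being misread next to the date digits.
  return sitemapXml.replace(pattern, (_, open, close) => open + day + close);
}
```

If the URL is not found, the input is returned unchanged, so a caller can compare the result with the original to detect a miss.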

Priority Guidelines

  • 1.0: Homepage and most important pages
  • 0.9: Key marketing pages (features, pricing)
  • 0.8: Secondary marketing pages (affiliate, docs)
  • 0.7: Tertiary pages (affiliate-signup)
  • 0.3: Legal pages (terms, privacy)

Change Frequency Guidelines

  • daily: Homepage (content changes frequently)
  • weekly: Features, pricing, docs (regular updates)
  • monthly: Affiliate pages (occasional updates)
  • yearly: Legal pages (rare changes)

Testing

To verify the sitemap is working correctly:

# Test sitemap endpoint
curl https://your-domain.com/sitemap.xml

# Test robots.txt endpoint
curl https://your-domain.com/robots.txt

# Check that the sitemap is well-formed XML (pretty-prints it on success)
curl -s https://your-domain.com/sitemap.xml | xmllint --format -

Security Notes

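  • The sitemap and robots.txt are publicly accessible by design
  • The sitemap lists only public pages; note that robots.txt itself reveals the paths it disallows (such as /admin)
  • Caching headers help reduce server load
  • robots.txt only asks compliant crawlers not to crawl the listed areas; it is not access control, so those routes must still enforce authentication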

Additional SEO Recommendations

  1. Add meta tags to each page for SEO
  2. Create structured data (JSON-LD) for rich snippets
  3. Optimize page titles and meta descriptions
  4. Create a blog and add blog posts to the sitemap
  5. Generate sitemap indexes if you have multiple sitemaps (e.g., separate sitemaps for different content types)
  6. Use canonical URLs to prevent duplicate content issues
  7. Implement Open Graph tags for better social media sharing
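For recommendation 5, a sitemap index is a small XML file that points at the individual sitemaps. The sketch below shows the format defined by the sitemaps.org protocol; the file names `sitemap-pages.xml` and `sitemap-blog.xml` and the `buildSitemapIndex` helper are hypothetical and do not exist in this project.

```javascript
// Hypothetical helper: render a sitemap index for the given sitemap file names.
function buildSitemapIndex(baseUrl, sitemapFiles, lastmod) {
  const origin = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const entries = sitemapFiles.map((file) =>
    [
      "  <sitemap>",
      `    <loc>${origin}/${file}</loc>`,
      `    <lastmod>${lastmod}</lastmod>`,
      "  </sitemap>",
    ].join("\n")
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</sitemapindex>",
    "",
  ].join("\n");
}

// Example file names are placeholders, not files in this repository.
const sitemapIndexXml = buildSitemapIndex(
  "https://your-domain.com",
  ["sitemap-pages.xml", "sitemap-blog.xml"],
  "2025-01-08"
);
```

With an index in place, the Sitemap line in robots.txt and the Search Console submission would point at the index file instead of sitemap.xml.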