Sitemap and Robots.txt Setup
Overview
A sitemap.xml and a robots.txt have been generated for the Shopify AI App Builder to help Google and other search engines discover and index its public pages. The sitemap can also be submitted through Google Search Console.
Files Created
1. sitemap.xml
Located at: chat/public/sitemap.xml
The sitemap includes only publicly accessible pages that search engines should index:
- / (Home page) - Priority 1.0, daily updates
- /features - Priority 0.9, weekly updates
- /pricing - Priority 0.9, weekly updates
- /affiliate - Priority 0.8, monthly updates
- /affiliate-signup - Priority 0.7, monthly updates
- /docs - Priority 0.8, weekly updates
- /terms - Priority 0.3, yearly updates
- /privacy - Priority 0.3, yearly updates
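For illustration, the page list above can be rendered into sitemap XML with a few lines of Node.js. This is a minimal sketch, not the generator that produced the actual file: the `PAGES` array and the `buildSitemap` helper are hypothetical names, and the base URL is the placeholder used elsewhere in this document.

```javascript
// Page list copied from the section above (paths, priorities, change frequencies).
const PAGES = [
  { path: '/', priority: '1.0', changefreq: 'daily' },
  { path: '/features', priority: '0.9', changefreq: 'weekly' },
  { path: '/pricing', priority: '0.9', changefreq: 'weekly' },
  { path: '/affiliate', priority: '0.8', changefreq: 'monthly' },
  { path: '/affiliate-signup', priority: '0.7', changefreq: 'monthly' },
  { path: '/docs', priority: '0.8', changefreq: 'weekly' },
  { path: '/terms', priority: '0.3', changefreq: 'yearly' },
  { path: '/privacy', priority: '0.3', changefreq: 'yearly' },
];

// Hypothetical helper: renders a sitemap for the given base URL.
// Paths are assumed to contain no characters that need XML escaping.
function buildSitemap(baseUrl, pages, lastmod) {
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const entries = pages.map((p) => [
    '  <url>',
    `    <loc>${root}${p.path}</loc>`,
    `    <lastmod>${lastmod}</lastmod>`,
    `    <changefreq>${p.changefreq}</changefreq>`,
    `    <priority>${p.priority}</priority>`,
    '  </url>',
  ].join('\n'));
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</urlset>',
  ].join('\n') + '\n';
}
```

Calling `buildSitemap('https://your-domain.com', PAGES, '2025-01-08')` returns a string with one `<url>` entry per public page.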
2. robots.txt
Located at: chat/public/robots.txt
The robots.txt file:
- Allows all search engines to crawl public content
- Disallows crawling of authenticated and admin areas:
  - /admin - Admin dashboard
  - /apps - User dashboard (requires auth)
  - /builder - App builder (requires auth)
  - /settings - User settings (requires auth)
  - /affiliate-dashboard - Affiliate dashboard (requires auth)
  - /api/ - API endpoints
- Includes a reference to the sitemap location
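The rules described above correspond roughly to the text produced by the sketch below. The `buildRobotsTxt` helper and the `DISALLOWED` array are hypothetical names used only for illustration, and the exact directives in the shipped file may differ. Note that `Disallow` rules match by path prefix, so `/apps` also covers URLs such as `/apps/123`.

```javascript
// Disallowed prefixes as listed above.
const DISALLOWED = [
  '/admin',
  '/apps',
  '/builder',
  '/settings',
  '/affiliate-dashboard',
  '/api/',
];

// Hypothetical helper: renders robots.txt for the given base URL.
function buildRobotsTxt(baseUrl, disallowed) {
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return [
    'User-agent: *',
    'Allow: /',
    ...disallowed.map((prefix) => `Disallow: ${prefix}`),
    '',
    `Sitemap: ${root}/sitemap.xml`,
  ].join('\n') + '\n';
}
```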
3. Server Routes Updated
Modified chat/server.js to serve both files with proper caching headers:
- Content-Type: Correct MIME types (application/xml for sitemap, text/plain for robots.txt)
- Cache-Control: 24-hour cache (86400 seconds) to reduce server load
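The actual route code lives in chat/server.js and is not reproduced here. As a rough sketch of the same idea using only Node's built-in modules, the handler below serves the two files with the headers listed above. The names `seoHeaders` and `createSeoHandler` are hypothetical, and the exact `Cache-Control` directive (`public, max-age=86400`) is an assumption about how the 24-hour cache is expressed.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// URL path -> file name and MIME type, as described above.
const SEO_FILES = new Map([
  ['/sitemap.xml', { file: 'sitemap.xml', type: 'application/xml' }],
  ['/robots.txt', { file: 'robots.txt', type: 'text/plain' }],
]);

// Returns the response headers for an SEO file, or null for any other path.
function seoHeaders(pathname) {
  const entry = SEO_FILES.get(pathname);
  if (!entry) return null;
  return {
    'Content-Type': `${entry.type}; charset=utf-8`,
    'Cache-Control': 'public, max-age=86400', // 24 hours
  };
}

// Builds a request handler that serves the two files from publicDir.
// The handler returns false when the request is not for an SEO file,
// so the caller can fall through to its other routes.
function createSeoHandler(publicDir) {
  return function handleSeoRequest(req, res) {
    const { pathname } = new URL(req.url, 'http://localhost');
    const headers = seoHeaders(pathname);
    if (!headers) return false;
    const filePath = path.join(publicDir, SEO_FILES.get(pathname).file);
    fs.readFile(filePath, (err, body) => {
      if (err) {
        res.writeHead(404, { 'Content-Type': 'text/plain; charset=utf-8' });
        res.end('Not found\n');
        return;
      }
      res.writeHead(200, headers);
      res.end(body);
    });
    return true;
  };
}
```

A server would call the handler first and continue with its normal routing when it returns false, for example inside the callback passed to `http.createServer`.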
Configuration Required
Before deploying, you need to update the domain name in both files:
Step 1: Update sitemap.xml
Replace https://your-domain.com with your actual domain name in all URL entries.
Example:
<!-- Before -->
<loc>https://your-domain.com/</loc>
<!-- After -->
<loc>https://shopify-app-builder.example.com/</loc>
Step 2: Update robots.txt
Replace the sitemap URL reference with your actual domain.
Example:
# Before
Sitemap: https://your-domain.com/sitemap.xml
# After
Sitemap: https://shopify-app-builder.example.com/sitemap.xml
Step 3: Environment Variable (Optional)
The server uses the PUBLIC_BASE_URL environment variable to determine the base URL. If set, the domain in the sitemap should match this value.
export PUBLIC_BASE_URL=https://shopify-app-builder.example.com
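One way to catch a mismatch before deploying is to compare every `<loc>` entry against the configured base URL. The following is a minimal sketch; `findMismatchedLocs` is a hypothetical helper, and it uses a simple regular expression instead of a full XML parser, which is assumed to be sufficient for a small hand-maintained sitemap.

```javascript
// Returns the origin of a URL string, or null if it cannot be parsed.
function originOf(url) {
  try {
    return new URL(url).origin;
  } catch {
    return null;
  }
}

// Hypothetical check: lists <loc> values whose origin differs from baseUrl.
function findMismatchedLocs(sitemapXml, baseUrl) {
  const expected = originOf(baseUrl);
  const locs = [...sitemapXml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)]
    .map((match) => match[1]);
  return locs.filter((loc) => originOf(loc) !== expected);
}

// Example usage (reads the file and environment variable named above):
//   const fs = require('node:fs');
//   const xml = fs.readFileSync('chat/public/sitemap.xml', 'utf8');
//   console.log(findMismatchedLocs(xml, process.env.PUBLIC_BASE_URL));
```

An empty array means every sitemap URL shares the origin of `PUBLIC_BASE_URL`; any leftover `https://your-domain.com` placeholder shows up in the result.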
Submitting to Google Search Console
- Go to Google Search Console
- Add your property (your domain)
- Verify ownership
- Navigate to "Sitemaps" in the left sidebar
- Enter sitemap.xml in the "Add a new sitemap" field
- Click "Submit"
Why These Pages Are Excluded
The following pages are intentionally excluded from the sitemap:
Authenticated Pages (require login)
- /apps - User's personal app dashboard
- /builder - Individual app building interface
- /settings - User account settings
- /affiliate-dashboard - Affiliate dashboard
Admin Pages
- /admin - Admin dashboard and management pages
- /admin/accounts - User account management
- /admin/login - Admin login page
Functional/Technical Pages
- /login, /signup - Authentication entry points
- /verify-email - Email verification flow
- /reset-password - Password reset flow
- /api/* - API endpoints (not meant for indexing)
- /uploads/* - User-uploaded files
These pages are either:
- Behind authentication (search engines can't access them)
- Functional pages that don't provide value to search users
- Administrative interfaces
- Temporary/stateful pages (like verification flows)
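A sitemap should never list a URL that robots.txt disallows, since that sends crawlers conflicting signals. The sketch below checks the sitemap paths against the disallowed prefixes using simple prefix matching. `findBlockedPaths` is a hypothetical helper, and both arrays are copied from the lists in this document, not read from the real files.

```javascript
// Paths listed in the sitemap.
const SITEMAP_PATHS = [
  '/', '/features', '/pricing', '/affiliate',
  '/affiliate-signup', '/docs', '/terms', '/privacy',
];

// Prefixes disallowed in robots.txt.
const DISALLOW_PREFIXES = [
  '/admin', '/apps', '/builder', '/settings',
  '/affiliate-dashboard', '/api/',
];

// Hypothetical check: returns sitemap paths that a Disallow prefix would block.
function findBlockedPaths(sitemapPaths, disallowPrefixes) {
  return sitemapPaths.filter((p) =>
    disallowPrefixes.some((prefix) => p.startsWith(prefix)));
}
```

For the lists above the result is empty. Adding a path such as `/apps/123` to the sitemap would cause it to be reported.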
Maintaining the Sitemap
When to Update
- When you add new public pages (e.g., blog posts, landing pages)
- When you change the URL structure
- When you make significant content updates
Updating Lastmod Dates
The current lastmod date is set to 2025-01-08. Update this when making significant changes to pages.
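Because the sitemap uses a single shared date, updating it can be scripted. The following sketch rewrites every `<lastmod>` value in a sitemap string. `setLastmod` is a hypothetical helper, and it deliberately updates all entries at once; changing the date for one page only would need a per-entry variant.

```javascript
// Hypothetical helper: sets every <lastmod> value to the given YYYY-MM-DD date.
function setLastmod(sitemapXml, isoDate) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(isoDate)) {
    throw new Error(`Expected a YYYY-MM-DD date, got "${isoDate}"`);
  }
  return sitemapXml.replace(
    /<lastmod>[^<]*<\/lastmod>/g,
    `<lastmod>${isoDate}</lastmod>`,
  );
}

// Example usage (today's date in UTC):
//   const today = new Date().toISOString().slice(0, 10);
//   const updated = setLastmod(originalXml, today);
```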
Priority Guidelines
- 1.0: Homepage and most important pages
- 0.9: Key marketing pages (features, pricing)
- 0.8: Secondary marketing pages (affiliate, docs)
- 0.7: Tertiary pages (affiliate-signup)
- 0.3: Legal pages (terms, privacy)
Change Frequency Guidelines
- daily: Homepage (content changes frequently)
- weekly: Features, pricing, docs (regular updates)
- monthly: Affiliate pages (occasional updates)
- yearly: Legal pages (rare changes)
Testing
To verify the sitemap is working correctly:
# Test sitemap endpoint
curl https://your-domain.com/sitemap.xml
# Test robots.txt endpoint
curl https://your-domain.com/robots.txt
# Validate sitemap XML structure
curl -s https://your-domain.com/sitemap.xml | xmllint --format -
Security Notes
- The sitemap and robots.txt are publicly accessible by design
- No sensitive information is exposed in these files
- Caching headers help reduce server load
- The robots.txt asks compliant crawlers not to crawl sensitive areas; it is not an access control, so those areas still depend on authentication
Additional SEO Recommendations
- Add meta tags to each page for SEO
- Create structured data (JSON-LD) for rich snippets
- Optimize page titles and meta descriptions
- Create a blog and add blog posts to the sitemap
- Generate sitemap indexes if you have multiple sitemaps (e.g., separate sitemaps for different content types)
- Use canonical URLs to prevent duplicate content issues
- Implement Open Graph tags for better social media sharing