# Sitemap and Robots.txt Setup

## Overview

A `sitemap.xml` and `robots.txt` have been generated for the Shopify AI App Builder to help with SEO indexing in Google Search Console and other search engines.

## Files Created

### 1. sitemap.xml

Located at: `chat/public/sitemap.xml`

The sitemap includes only publicly accessible pages that search engines should index:

- **/ (Home page)** - Priority 1.0, daily updates
- **/features** - Priority 0.9, weekly updates
- **/pricing** - Priority 0.9, weekly updates
- **/affiliate** - Priority 0.8, monthly updates
- **/affiliate-signup** - Priority 0.7, monthly updates
- **/docs** - Priority 0.8, weekly updates
- **/terms** - Priority 0.3, yearly updates
- **/privacy** - Priority 0.3, yearly updates

### 2. robots.txt

Located at: `chat/public/robots.txt`

The robots.txt file:

- Allows all search engines to crawl public content
- Disallows crawling of authenticated and admin areas:
  - `/admin` - Admin dashboard
  - `/apps` - User dashboard (requires auth)
  - `/builder` - App builder (requires auth)
  - `/settings` - User settings (requires auth)
  - `/affiliate-dashboard` - Affiliate dashboard (requires auth)
  - `/api/` - API endpoints
- Includes a reference to the sitemap location

### 3. Server Routes Updated

Modified `chat/server.js` to serve both files with proper caching headers:

- **Content-Type**: Correct MIME types (`application/xml` for the sitemap, `text/plain` for robots.txt)
- **Cache-Control**: 24-hour cache (86400 seconds) to reduce server load

## Configuration Required

Before deploying, update the domain name in both files:

### Step 1: Update sitemap.xml

Replace `https://your-domain.com` with your actual domain name in all URL entries.

Example:

```xml
<!-- Before -->
<loc>https://your-domain.com/</loc>

<!-- After -->
<loc>https://shopify-app-builder.example.com/</loc>
```

### Step 2: Update robots.txt

Replace the sitemap URL reference with your actual domain.

Example:

```txt
# Before
Sitemap: https://your-domain.com/sitemap.xml

# After
Sitemap: https://shopify-app-builder.example.com/sitemap.xml
```

### Step 3: Environment Variable (Optional)

The server uses the `PUBLIC_BASE_URL` environment variable to determine the base URL. If set, the domain in the sitemap should match this value.

```bash
export PUBLIC_BASE_URL=https://shopify-app-builder.example.com
```

## Submitting to Google Search Console

1. Go to [Google Search Console](https://search.google.com/search-console)
2. Add your property (your domain)
3. Verify ownership
4. Navigate to "Sitemaps" in the left sidebar
5. Enter `sitemap.xml` in the "Add a new sitemap" field
6. Click "Submit"

## Why These Pages Are Excluded

The following pages are intentionally excluded from the sitemap:

### Authenticated Pages (require login)

- `/apps` - User's personal app dashboard
- `/builder` - Individual app building interface
- `/settings` - User account settings
- `/affiliate-dashboard` - Affiliate dashboard

### Admin Pages

- `/admin` - Admin dashboard and management pages
- `/admin/accounts` - User account management
- `/admin/login` - Admin login page

### Functional/Technical Pages

- `/login`, `/signup` - Authentication entry points
- `/verify-email` - Email verification flow
- `/reset-password` - Password reset flow
- `/api/*` - API endpoints (not meant for indexing)
- `/uploads/*` - User-uploaded files

These pages are either:

1. Behind authentication (search engines can't access them)
2. Functional pages that don't provide value to search users
3. Administrative interfaces
4. Temporary/stateful pages (like verification flows)

## Maintaining the Sitemap

### When to Update

- When you add new public pages (e.g., blog posts, landing pages)
- When you change the URL structure
- When you make significant content updates

### Updating Lastmod Dates

The current `lastmod` date is set to 2025-01-08. Update this when making significant changes to pages.

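Instead of hand-editing every `lastmod` value, the file could be regenerated from a page list. The following is a minimal sketch, not the code used by `chat/server.js`: the `PAGES` array, `escapeXml`, and `buildSitemap` are hypothetical names, and the entries simply mirror the pages, priorities, and change frequencies documented in this file. It also assumes a single shared `lastmod` date for all pages.

```javascript
// Hypothetical helper for regenerating sitemap.xml; not part of chat/server.js.
// Page data mirrors the public pages listed in this document.
const PAGES = [
  { path: '/', priority: '1.0', changefreq: 'daily' },
  { path: '/features', priority: '0.9', changefreq: 'weekly' },
  { path: '/pricing', priority: '0.9', changefreq: 'weekly' },
  { path: '/affiliate', priority: '0.8', changefreq: 'monthly' },
  { path: '/affiliate-signup', priority: '0.7', changefreq: 'monthly' },
  { path: '/docs', priority: '0.8', changefreq: 'weekly' },
  { path: '/terms', priority: '0.3', changefreq: 'yearly' },
  { path: '/privacy', priority: '0.3', changefreq: 'yearly' },
];

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build the sitemap XML as a string.
// baseUrl: e.g. the value of PUBLIC_BASE_URL; lastmod: a YYYY-MM-DD date string.
function buildSitemap(baseUrl, lastmod, pages = PAGES) {
  // Drop trailing slashes so "origin + path" never produces "//".
  const origin = baseUrl.replace(/\/+$/, '');
  const entries = pages.map((page) => [
    '  <url>',
    `    <loc>${escapeXml(origin + page.path)}</loc>`,
    `    <lastmod>${lastmod}</lastmod>`,
    `    <changefreq>${page.changefreq}</changefreq>`,
    `    <priority>${page.priority}</priority>`,
    '  </url>',
  ].join('\n'));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</urlset>',
    '',
  ].join('\n');
}
```

Under these assumptions, calling `buildSitemap(process.env.PUBLIC_BASE_URL, '2025-01-08')` and writing the returned string to `chat/public/sitemap.xml` would keep the domain and the `lastmod` date consistent across all entries.
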
### Priority Guidelines

- **1.0**: Homepage and most important pages
- **0.9**: Key marketing pages (features, pricing)
- **0.8**: Secondary marketing pages (affiliate, docs)
- **0.7**: Tertiary pages (affiliate-signup)
- **0.3**: Legal pages (terms, privacy)

### Change Frequency Guidelines

- **daily**: Homepage (content changes frequently)
- **weekly**: Features, pricing, docs (regular updates)
- **monthly**: Affiliate pages (occasional updates)
- **yearly**: Legal pages (rare changes)

## Testing

To verify the sitemap is working correctly:

```bash
# Test sitemap endpoint
curl https://your-domain.com/sitemap.xml

# Test robots.txt endpoint
curl https://your-domain.com/robots.txt

# Validate sitemap XML structure
curl https://your-domain.com/sitemap.xml | xmllint --format -
```

## Security Notes

- The sitemap and robots.txt are publicly accessible by design
- No sensitive information is exposed in these files
- Caching headers help reduce server load
- The robots.txt asks compliant crawlers not to crawl sensitive areas; it is not an access control, so those routes still rely on authentication

## Additional SEO Recommendations

1. **Add meta tags** to each page for SEO
2. **Create structured data** (JSON-LD) for rich snippets
3. **Optimize page titles and meta descriptions**
4. **Create a blog** and add blog posts to the sitemap
5. **Generate sitemap indexes** if you have multiple sitemaps (e.g., separate sitemaps for different content types)
6. **Use canonical URLs** to prevent duplicate content issues
7. **Implement Open Graph tags** for better social media sharing
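
For recommendation 5, a sitemap index is a small XML file that lists the child sitemaps. The sketch below shows one way it could be generated. The project currently ships a single `sitemap.xml`, so the child file names (`sitemap-pages.xml`, `sitemap-blog.xml`) and the `buildSitemapIndex` function are hypothetical and would need to match whatever files are actually created.

```javascript
// Hypothetical child sitemap names; assumed to contain only URL-safe,
// XML-safe characters, so no escaping is applied below.
const CHILD_SITEMAPS = ['sitemap-pages.xml', 'sitemap-blog.xml'];

// Build a sitemap index as a string, following the sitemaps.org 0.9 schema.
// baseUrl: e.g. the value of PUBLIC_BASE_URL; lastmod: a YYYY-MM-DD date string.
function buildSitemapIndex(baseUrl, lastmod, files = CHILD_SITEMAPS) {
  // Drop trailing slashes so the joined URL never contains "//".
  const origin = baseUrl.replace(/\/+$/, '');
  const entries = files.map((file) => [
    '  <sitemap>',
    `    <loc>${origin}/${file}</loc>`,
    `    <lastmod>${lastmod}</lastmod>`,
    '  </sitemap>',
  ].join('\n'));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
    '',
  ].join('\n');
}
```

If an index like this were adopted, the `Sitemap:` line in robots.txt and the Search Console submission would point at the index file rather than at an individual sitemap.
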