# Sitemap and Robots.txt Setup
## Overview
A `sitemap.xml` and a `robots.txt` have been generated for the Shopify AI App Builder to help Google and other search engines crawl and index its public pages. The sitemap can also be submitted through Google Search Console.
## Files Created
### 1. sitemap.xml
Located at: `chat/public/sitemap.xml`
The sitemap includes only publicly accessible pages that search engines should index:
- **/ (Home page)** - Priority 1.0, daily updates
- **/features** - Priority 0.9, weekly updates
- **/pricing** - Priority 0.9, weekly updates
- **/affiliate** - Priority 0.8, monthly updates
- **/affiliate-signup** - Priority 0.7, monthly updates
- **/docs** - Priority 0.8, weekly updates
- **/terms** - Priority 0.3, yearly updates
- **/privacy** - Priority 0.3, yearly updates
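
The entries above map directly onto the `<url>` elements of the sitemap protocol. As a rough sketch of how the file could be generated rather than edited by hand, the following uses a hypothetical `buildSitemap` helper; the function names, the `pages` array, and the placeholder base URL are illustrative assumptions, not code from this project.

```javascript
// Hypothetical generator for chat/public/sitemap.xml (names are illustrative).
const BASE_URL = 'https://your-domain.com'; // placeholder domain from this document
const LASTMOD = '2025-01-08'; // lastmod date stated in "Updating Lastmod Dates"

// Page list copied from the bullet list above.
const pages = [
  { path: '/', priority: '1.0', changefreq: 'daily' },
  { path: '/features', priority: '0.9', changefreq: 'weekly' },
  { path: '/pricing', priority: '0.9', changefreq: 'weekly' },
  { path: '/affiliate', priority: '0.8', changefreq: 'monthly' },
  { path: '/affiliate-signup', priority: '0.7', changefreq: 'monthly' },
  { path: '/docs', priority: '0.8', changefreq: 'weekly' },
  { path: '/terms', priority: '0.3', changefreq: 'yearly' },
  { path: '/privacy', priority: '0.3', changefreq: 'yearly' },
];

// Entity-encode the characters that must be escaped inside XML text.
function escapeXml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build the full sitemap document as a string.
function buildSitemap(baseUrl, entries, lastmod) {
  const urls = entries.map((entry) => [
    '  <url>',
    `    <loc>${escapeXml(baseUrl + entry.path)}</loc>`,
    `    <lastmod>${lastmod}</lastmod>`,
    `    <changefreq>${entry.changefreq}</changefreq>`,
    `    <priority>${entry.priority}</priority>`,
    '  </url>',
  ].join('\n'));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
    '',
  ].join('\n');
}

const sitemapXml = buildSitemap(BASE_URL, pages, LASTMOD);
console.log(sitemapXml);
```

Keeping the page list in one array means a new public page needs only one added entry.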
### 2. robots.txt
Located at: `chat/public/robots.txt`
The robots.txt file:
- Allows all search engines to crawl public content
- Disallows crawling of authenticated and admin areas:
  - `/admin` - Admin dashboard
  - `/apps` - User dashboard (requires auth)
  - `/builder` - App builder (requires auth)
  - `/settings` - User settings (requires auth)
  - `/affiliate-dashboard` - Affiliate dashboard (requires auth)
  - `/api/` - API endpoints
- Includes a reference to the sitemap location
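
For orientation, a file with the rules described above could be produced along these lines. This is a minimal sketch: the `buildRobotsTxt` helper is hypothetical, and the explicit `Allow: /` line is an assumption about how "allows all search engines" is expressed in the actual file.

```javascript
// Disallowed prefixes copied from the list above.
const disallowedPaths = [
  '/admin',
  '/apps',
  '/builder',
  '/settings',
  '/affiliate-dashboard',
  '/api/',
];

// Hypothetical helper: assemble robots.txt text for all user agents.
function buildRobotsTxt(sitemapUrl, disallowed) {
  return [
    'User-agent: *',
    'Allow: /', // assumption: the real file may omit this line
    ...disallowed.map((path) => `Disallow: ${path}`),
    '',
    `Sitemap: ${sitemapUrl}`,
    '',
  ].join('\n');
}

const robotsTxt = buildRobotsTxt('https://your-domain.com/sitemap.xml', disallowedPaths);
console.log(robotsTxt);
```

Disallow rules are prefix matches, so `Disallow: /admin` also covers `/admin/accounts` and `/admin/login`.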
### 3. Server Routes Updated
Modified `chat/server.js` to serve both files with explicit response headers:
- **Content-Type**: Correct MIME types (application/xml for sitemap, text/plain for robots.txt)
- **Cache-Control**: 24-hour cache (86400 seconds) to reduce server load
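
The header behavior described above can be illustrated with a small framework-free handler. This is only a sketch: `createSeoHandler` is a hypothetical name, the actual code in `chat/server.js` may be structured differently (for example with a web framework), and the `public` directive and `charset` parameter are assumptions.

```javascript
const ONE_DAY_SECONDS = 86400; // 24-hour cache, as described above

// Hypothetical factory: takes the file contents and returns a request handler.
function createSeoHandler(files) {
  const routes = new Map([
    ['/sitemap.xml', { type: 'application/xml', body: files.sitemapXml }],
    ['/robots.txt', { type: 'text/plain', body: files.robotsTxt }],
  ]);

  // Returns true when the request was answered, false so the caller can continue routing.
  return function handleSeoRequest(req, res) {
    const pathname = new URL(req.url, 'http://localhost').pathname;
    const route = routes.get(pathname);
    if (!route || req.method !== 'GET') return false;

    res.writeHead(200, {
      'Content-Type': `${route.type}; charset=utf-8`,
      'Cache-Control': `public, max-age=${ONE_DAY_SECONDS}`,
    });
    res.end(route.body);
    return true;
  };
}
```

In a plain Node server, the handler could be called first inside the `http.createServer` callback, with the two files read from `chat/public/` at startup and passed to `createSeoHandler`.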
## Configuration Required
Before deploying, you need to update the domain name in both files:
### Step 1: Update sitemap.xml
Replace `https://your-domain.com` with your actual domain name in all URL entries.
Example:
```xml
<!-- Before -->
<loc>https://your-domain.com/</loc>
<!-- After -->
<loc>https://shopify-app-builder.example.com/</loc>
```
### Step 2: Update robots.txt
Replace the sitemap URL reference with your actual domain.
Example:
```txt
# Before
Sitemap: https://your-domain.com/sitemap.xml
# After
Sitemap: https://shopify-app-builder.example.com/sitemap.xml
```
### Step 3: Environment Variable (Optional)
The server uses the `PUBLIC_BASE_URL` environment variable to determine the base URL. If set, the domain in the sitemap should match this value.
```bash
export PUBLIC_BASE_URL=https://shopify-app-builder.example.com
```
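
The three steps above could be automated with a small script. The sketch below is hypothetical: `applyBaseUrl` and `findMismatchedLocs` are illustrative names, and it works on strings so that reading and writing `chat/public/sitemap.xml` and `chat/public/robots.txt` is left to the caller.

```javascript
const PLACEHOLDER = 'https://your-domain.com'; // placeholder used in both files

// Drop trailing slashes so "https://example.com/" and "https://example.com" are equivalent.
function normalizeBaseUrl(baseUrl) {
  return baseUrl.replace(/\/+$/, '');
}

// Replace every occurrence of the placeholder domain with the real base URL.
function applyBaseUrl(fileContents, baseUrl) {
  return fileContents.split(PLACEHOLDER).join(normalizeBaseUrl(baseUrl));
}

// Return every <loc> value that does not live under the expected base URL.
function findMismatchedLocs(sitemapXml, baseUrl) {
  const expected = normalizeBaseUrl(baseUrl);
  const locs = [...sitemapXml.matchAll(/<loc>([^<]+)<\/loc>/g)].map((m) => m[1].trim());
  return locs.filter((loc) => loc !== expected && !loc.startsWith(expected + '/'));
}

// Falls back to the example domain from this document when the variable is unset.
const baseUrl = process.env.PUBLIC_BASE_URL || 'https://shopify-app-builder.example.com';
const sample = [
  '<url><loc>https://your-domain.com/</loc></url>',
  '<url><loc>https://your-domain.com/pricing</loc></url>',
].join('\n');
const updated = applyBaseUrl(sample, baseUrl);
console.log(updated);
console.log('Mismatched entries:', findMismatchedLocs(updated, baseUrl));
```

Running `findMismatchedLocs` against the deployed sitemap is a quick way to catch a placeholder domain that was left behind.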
## Submitting to Google Search Console
1. Go to [Google Search Console](https://search.google.com/search-console)
2. Add your property (your domain)
3. Verify ownership
4. Navigate to "Sitemaps" in the left sidebar
5. Enter `sitemap.xml` in the "Add a new sitemap" field
6. Click "Submit"
## Why These Pages Are Excluded
The following pages are intentionally excluded from the sitemap:
### Authenticated Pages (require login)
- `/apps` - User's personal app dashboard
- `/builder` - Individual app building interface
- `/settings` - User account settings
- `/affiliate-dashboard` - Affiliate dashboard
### Admin Pages
- `/admin` - Admin dashboard and management pages
- `/admin/accounts` - User account management
- `/admin/login` - Admin login page
### Functional/Technical Pages
- `/login`, `/signup` - Authentication entry points
- `/verify-email` - Email verification flow
- `/reset-password` - Password reset flow
- `/api/*` - API endpoints (not meant for indexing)
- `/uploads/*` - User-uploaded files
Each of these pages falls into at least one of the following categories:
1. Behind authentication (search engines can't access them)
2. Functional pages that don't provide value to search users
3. Administrative interfaces
4. Temporary/stateful pages (like verification flows)
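
Because the exclusions live in two files, it can be useful to check that no URL listed in the sitemap is also disallowed in robots.txt. The following is a simplified sketch with hypothetical helper names; it treats every `Disallow` line as a plain path prefix and ignores user-agent groups, wildcards (`*`), and end anchors (`$`).

```javascript
// Collect Disallow prefixes from robots.txt text, ignoring comments and empty rules.
function parseDisallowRules(robotsTxt) {
  return robotsTxt
    .split(/\r?\n/)
    .map((line) => line.replace(/#.*$/, '').trim())
    .filter((line) => /^disallow:/i.test(line))
    .map((line) => line.slice('disallow:'.length).trim())
    .filter((rule) => rule.length > 0);
}

// Return the sitemap paths that a crawler would be told not to fetch.
function findBlockedSitemapPaths(sitemapXml, robotsTxt) {
  const rules = parseDisallowRules(robotsTxt);
  const paths = [...sitemapXml.matchAll(/<loc>([^<]+)<\/loc>/g)]
    .map((match) => new URL(match[1].trim()).pathname);
  return paths.filter((path) => rules.some((rule) => path.startsWith(rule)));
}

const exampleRobots = 'User-agent: *\nDisallow: /admin\nDisallow: /api/ # API endpoints\n';
const exampleSitemap = [
  '<loc>https://your-domain.com/pricing</loc>',
  '<loc>https://your-domain.com/admin</loc>', // deliberately wrong entry for the demo
].join('\n');
console.log(findBlockedSitemapPaths(exampleSitemap, exampleRobots));
```

An empty result means the two files are consistent under these simplified matching rules.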
## Maintaining the Sitemap
### When to Update
- When you add new public pages (e.g., blog posts, landing pages)
- When you change the URL structure
- When you make significant content updates
### Updating Lastmod Dates
The current `lastmod` date is set to 2025-01-08. Update this when making significant changes to pages.
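
Updating the date by hand is easy to forget, so a script can do it for a single page. The sketch below uses a hypothetical `setLastmod` helper and simple string matching instead of a real XML parser, which is an assumption that only holds while the sitemap keeps its current simple structure.

```javascript
// Format a Date as YYYY-MM-DD in UTC, the date form accepted by <lastmod>.
function toLastmod(date) {
  return date.toISOString().slice(0, 10);
}

// Replace the <lastmod> of the <url> entry whose <loc> equals targetLoc; other entries are untouched.
function setLastmod(sitemapXml, targetLoc, date) {
  return sitemapXml.replace(/<url>[\s\S]*?<\/url>/g, (block) => {
    if (!block.includes(`<loc>${targetLoc}</loc>`)) return block;
    return block.replace(/<lastmod>[^<]*<\/lastmod>/, `<lastmod>${toLastmod(date)}</lastmod>`);
  });
}

const before = [
  '<url><loc>https://your-domain.com/</loc><lastmod>2025-01-08</lastmod></url>',
  '<url><loc>https://your-domain.com/pricing</loc><lastmod>2025-01-08</lastmod></url>',
].join('\n');
console.log(setLastmod(before, 'https://your-domain.com/pricing', new Date()));
```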
### Priority Guidelines
- **1.0**: Homepage and most important pages
- **0.9**: Key marketing pages (features, pricing)
- **0.8**: Secondary marketing pages (affiliate, docs)
- **0.7**: Tertiary pages (affiliate-signup)
- **0.3**: Legal pages (terms, privacy)
### Change Frequency Guidelines
- **daily**: Homepage (content changes frequently)
- **weekly**: Features, pricing, docs (regular updates)
- **monthly**: Affiliate pages (occasional updates)
- **yearly**: Legal pages (rare changes)
## Testing
To verify that both files are served correctly:
```bash
# Test sitemap endpoint
curl https://your-domain.com/sitemap.xml
# Test robots.txt endpoint
curl https://your-domain.com/robots.txt
# Validate sitemap XML structure
curl https://your-domain.com/sitemap.xml | xmllint --format -
```
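
The same checks can be scripted so they also verify the headers described in "Server Routes Updated". This is a sketch under assumptions: `checkSeoEndpoints` is a hypothetical name, and it relies on the global `fetch` available in Node 18 and later unless another implementation is passed in.

```javascript
// Expected content types, taken from the "Server Routes Updated" section.
const CHECKS = [
  { path: '/sitemap.xml', contentType: 'application/xml' },
  { path: '/robots.txt', contentType: 'text/plain' },
];

// fetchImpl defaults to the global fetch; it is injectable so the check can run offline.
async function checkSeoEndpoints(baseUrl, fetchImpl = fetch) {
  const results = [];
  for (const check of CHECKS) {
    const response = await fetchImpl(new URL(check.path, baseUrl));
    const type = response.headers.get('content-type') || '';
    results.push({
      path: check.path,
      ok: response.status === 200 && type.startsWith(check.contentType),
      cacheControl: response.headers.get('cache-control'),
    });
  }
  return results;
}

// Example usage against a deployed site (replace the placeholder domain first):
// checkSeoEndpoints('https://your-domain.com').then(console.log);
```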
## Security Notes
- The sitemap and robots.txt are publicly accessible by design
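- Neither file contains secrets, but robots.txt does list the paths of the admin and authenticated areas, so those routes must not rely on being hidden
- Caching headers help reduce server load
- The robots.txt asks compliant crawlers not to crawl sensitive areas; it is not access control, and a disallowed URL can still appear in search results if other sites link to it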
## Additional SEO Recommendations
1. **Add meta tags** to each page for SEO
2. **Create structured data** (JSON-LD) for rich snippets
3. **Optimize page titles and meta descriptions**
4. **Create a blog** and add blog posts to the sitemap
5. **Generate sitemap indexes** if you have multiple sitemaps (e.g., separate sitemaps for different content types)
6. **Use canonical URLs** to prevent duplicate content issues
7. **Implement Open Graph tags** for better social media sharing
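
For recommendation 5, a sitemap index is a small XML file that points at the individual sitemaps. A minimal sketch follows, assuming hypothetical file names (`sitemap-pages.xml`, `sitemap-blog.xml`) and a hypothetical `buildSitemapIndex` helper; neither exists in the project yet.

```javascript
// Build a sitemap index that references several sitemap files under one base URL.
// Paths are assumed to contain no characters that need XML escaping.
function buildSitemapIndex(baseUrl, sitemapPaths, lastmod) {
  const entries = sitemapPaths.map((path) => [
    '  <sitemap>',
    `    <loc>${new URL(path, baseUrl).href}</loc>`,
    `    <lastmod>${lastmod}</lastmod>`,
    '  </sitemap>',
  ].join('\n'));

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
    '',
  ].join('\n');
}

const sitemapIndex = buildSitemapIndex(
  'https://your-domain.com',
  ['/sitemap-pages.xml', '/sitemap-blog.xml'], // hypothetical file names
  '2025-01-08',
);
console.log(sitemapIndex);
```

With an index in place, the `Sitemap:` line in robots.txt and the Search Console submission would point at the index file instead of a single sitemap.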