AI-Powered URL Selection
Uses Claude AI to intelligently select relevant URLs from sitemaps based on your specific objectives.
SmartCrawler uses Claude AI to intelligently select and analyze web pages based on your specific objectives. Automatically discover sitemaps, select relevant URLs, and get detailed content analysis.
Finds and parses XML sitemaps across multiple domains automatically.
Analyzes scraped content with AI to extract objective-specific insights and structured data.
Crawls multiple websites in a single session with intelligent domain handling.
Scrolls through pages to capture JavaScript-rendered content that traditional crawlers miss.
Saves results as JSON with structured entities for further analysis and processing.
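To make the sitemap discovery step more concrete, here is a minimal sketch of one common convention: reading the `Sitemap:` lines that a site declares in its robots.txt. This is an illustration only, not SmartCrawler's actual implementation; the function name `list_sitemaps` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SmartCrawler): print the sitemap URLs
# declared in robots.txt text read from stdin, one per line.
list_sitemaps() {
  # Drop carriage returns, keep lines whose field name is "Sitemap"
  # (matched case-insensitively), then strip the field name and any
  # trailing whitespace so only the URL remains.
  tr -d '\r' \
    | grep -i '^[[:space:]]*sitemap:' \
    | sed -E 's/^[[:space:]]*[^:]*:[[:space:]]*//; s/[[:space:]]+$//'
}

# Example (requires network access, so it is left commented out):
# curl -fsSL "https://example.com/robots.txt" | list_sitemaps
```

Sites that do not list a sitemap in robots.txt often still serve one at `/sitemap.xml`, so a fuller version would probably try that path as a fallback.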
export ANTHROPIC_API_KEY="your-api-key-here"
# geckodriver runs in the foreground; start it in a separate terminal and leave it running
geckodriver --port 4444
smart-crawler --objective "Find pricing information" --domains "example.com" --max-urls 5
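Before the first crawl, it can help to confirm that the three prerequisites from the commands above are in place. The sketch below assumes only what those commands show: an `ANTHROPIC_API_KEY` environment variable and the `geckodriver` and `smart-crawler` executables on your PATH. The function name `preflight` is hypothetical and not part of SmartCrawler.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check: report every missing prerequisite and
# return non-zero if anything is missing.
preflight() {
  local status=0
  local cmd

  # The API key must be exported and non-empty.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    status=1
  fi

  # Both executables used in the quick start must be on PATH.
  for cmd in geckodriver smart-crawler; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd not found on PATH" >&2
      status=1
    fi
  done

  return "$status"
}

# Example:
# preflight && echo "ready to crawl"
```

The check only confirms that `geckodriver` is installed, not that it is already listening on port 4444.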
Research competitor pricing across multiple stores
  smart-crawler \
    --objective "Find product pricing, discounts, and shipping costs" \
    --domains "shop1.com,shop2.com,competitor.com" \
    --max-urls 15 \
    --output pricing-research.json

Extract contact information and team details
  smart-crawler \
    -o "Find company contact information, team members, and office locations" \
    -d "company.com" \
    -m 8 --delay 2000 -v

Find API docs and developer resources
  smart-crawler \
    --objective "Find API documentation, integration guides, and developer resources" \
    --domains "docs.example.com,api.service.com" \
    --max-urls 20 --output api-docs.json
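Once a run that uses `--output` finishes, a quick sanity check is to confirm that the file exists and parses as JSON before passing it to other tools. The exact output schema is not described here, so this sketch checks well-formedness only. It assumes `python3` is installed; `is_valid_json` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file is non-empty and parses as JSON.
is_valid_json() {
  # -s: file exists and has a size greater than zero.
  # json.tool exits non-zero when the input is not valid JSON.
  [ -s "$1" ] && python3 -m json.tool "$1" >/dev/null 2>&1
}

# Example (file name taken from the pricing example above):
# is_valid_json pricing-research.json && echo "output is well-formed JSON"
```

Running `python3 -m json.tool pricing-research.json` without the redirect pretty-prints the file, which is a simple way to see which fields a given run produced.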