Frequently Asked Questions
Everything you need to know about Wag-Tail AI Gateway
Getting Started
New to Wag-Tail? Start with our installation guide and basic configuration.
Installation Guide

Integration & Setup
How does Wag-Tail integrate with existing API gateways?
Wag-Tail can be deployed in several ways:
- Standalone deployment: Replace your existing LLM integration completely
- Behind existing gateways: Deploy as a microservice behind Kong, AWS API Gateway, etc.
- Sidecar pattern: Deploy alongside existing services in Kubernetes
- Lambda/serverless: Deploy as serverless functions for auto-scaling
Wag-Tail exposes standard REST APIs that integrate with any HTTP client or API gateway.
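As a sketch of what that integration looks like, here is a minimal client request builder. The `/v1/chat/completions` path, port 8000, and Bearer-token header are assumptions based on the OpenAI-compatible API and defaults mentioned elsewhere in this FAQ; check your deployment's actual values.

```python
# Sketch of calling Wag-Tail's OpenAI-compatible REST API from any HTTP
# client. Endpoint path, port, and header names are illustrative assumptions.
import json

GATEWAY_URL = "http://localhost:8000"  # assumed default port from this FAQ

def build_chat_request(prompt: str, model: str, api_key: str):
    """Build the URL, headers, and JSON body for a chat completion call."""
    url = f"{GATEWAY_URL}/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return url, headers, json.dumps(body)

url, headers, body = build_chat_request("Hello", "gpt-3.5-turbo", "sk-test")
# Send with any HTTP client, e.g. requests.post(url, headers=headers, data=body)
```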
What are the system requirements and deployment options?
Minimum Requirements:
- Python 3.8+ or Docker
- 512MB RAM (1GB+ recommended)
- 1 CPU core (2+ recommended for production)
- Network access to LLM providers
Deployment Options:
- Docker containers (recommended)
- Kubernetes with Helm charts
- AWS Lambda, Azure Functions, Google Cloud Run
- Traditional VMs or bare metal
- Cloud-native PaaS platforms
Can I migrate from existing LLM integrations gradually?
Yes! Wag-Tail supports gradual migration strategies:
- Parallel deployment: Run both systems simultaneously
- Traffic splitting: Route a percentage of traffic to Wag-Tail
- Feature-based migration: Migrate specific use cases first
- Environment-based: Start with dev/staging environments
The OpenAI-compatible API means minimal code changes for existing integrations.
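The traffic-splitting strategy above can be sketched with a deterministic hash-based bucket, so each user consistently lands on the same backend during migration. The function and names here are illustrative, not part of Wag-Tail's API.

```python
# Sketch of percentage-based traffic splitting for a gradual migration:
# route a stable fraction of users to Wag-Tail, the rest to the legacy path.
import hashlib

def route_for(user_id: str, wagtail_percent: int) -> str:
    """Deterministically bucket a user into 'wagtail' or 'legacy'."""
    digest = hashlib.sha256(user_id.encode()).digest()
    bucket = digest[0] * 100 // 256  # stable value in 0..99
    return "wagtail" if bucket < wagtail_percent else "legacy"

# The same user always lands in the same bucket, so sessions stay consistent.
```

Start with a small percentage, watch error rates and latency, and ramp up as confidence grows.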
Configuration
How do I configure LLM provider fallback chains?
Configure fallback chains in the YAML configuration:
```yaml
routing:
  chains:
    - name: "production"
      providers:
        - provider: "openai"
          models: ["gpt-4", "gpt-3.5-turbo"]
          priority: 1
        - provider: "anthropic"
          models: ["claude-3-sonnet"]
          priority: 2
        - provider: "google"
          models: ["gemini-pro"]
          priority: 3
  health_check:
    enabled: true
    interval: 30s
    timeout: 10s
```
What authentication methods are supported?
Wag-Tail supports multiple authentication methods:
- API Keys: Simple token-based authentication
- JWT tokens: For stateless authentication
- OAuth 2.0: Integration with existing identity providers
- mTLS: Certificate-based authentication
- Custom headers: Flexible authentication patterns
Enterprise edition includes SAML and LDAP integration.
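On the wire, the header-based methods above reduce to attaching the right HTTP headers. The header names shown (`X-API-Key`, `Authorization: Bearer`) are common conventions used here as assumptions, not a confirmed Wag-Tail contract; consult your gateway configuration.

```python
# Illustrative sketch of building auth headers for two common methods.
def auth_headers(method: str, credential: str) -> dict:
    if method == "api_key":
        return {"X-API-Key": credential}       # assumed header name
    if method == "jwt":
        return {"Authorization": f"Bearer {credential}"}
    raise ValueError(f"unsupported method: {method}")
```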
How do I set up caching for better performance?
Configure caching with Redis or in-memory storage:
```yaml
cache:
  type: "redis"       # or "memory"
  host: "localhost"
  port: 6379
  ttl: 3600           # 1 hour
  max_size: "100MB"
  rules:
    - pattern: "gpt-*"
      ttl: 1800       # 30 minutes for GPT models
    - pattern: "claude-*"
      ttl: 3600       # 1 hour for Claude models
```
Caching reduces costs and improves response times for repeated queries.
Security & Compliance
How does Wag-Tail handle sensitive data and PII?
Wag-Tail includes comprehensive PII protection:
- PII Detection: Automatic detection of emails, phone numbers, SSNs, etc.
- Data Masking: Replace sensitive data with tokens
- Request Blocking: Block requests containing PII
- Audit Logging: Track all PII detection events
- Custom Patterns: Define organization-specific sensitive patterns
Enterprise edition includes advanced DLP (Data Loss Prevention) features.
What security certifications does Wag-Tail support?
Wag-Tail is designed to support various compliance frameworks:
- SOC 2 Type II: Comprehensive audit logging and controls
- GDPR: Data minimization and privacy controls
- HIPAA: Healthcare data protection (Enterprise)
- ISO 27001: Information security management
- FedRAMP: Government cloud security (Enterprise)
Contact our compliance team for certification assistance.
How do I prevent prompt injection attacks?
Wag-Tail includes built-in prompt security:
- Jailbreak Detection: Identify common prompt injection patterns
- Content Filtering: Block harmful or inappropriate content
- Rate Limiting: Prevent abuse and DoS attacks
- Input Sanitization: Clean and validate input prompts
- Custom Rules: Define organization-specific security policies
Performance & Scaling
What are the performance benchmarks and scaling limits?
Performance Benchmarks:
- Latency: <5ms overhead for cached responses
- Throughput: 10,000+ requests/second per instance
- Memory: ~200MB base footprint
- CPU: Low CPU usage for most operations
Scaling Options:
- Horizontal scaling with load balancers
- Auto-scaling in Kubernetes
- Serverless scaling with cloud functions
- Enterprise: Multi-region deployment
How do I optimize costs across multiple LLM providers?
Wag-Tail provides several cost optimization features:
- Smart Routing: Route to the cheapest available provider
- Caching: Reduce redundant API calls
- Model Selection: Automatic model selection based on complexity
- Usage Analytics: Track costs per group/user/model
- Budget Controls: Set spending limits and alerts
Enterprise edition includes advanced cost analytics and optimization.
Can Wag-Tail handle high-availability deployments?
Yes, Wag-Tail is designed for high availability:
- Stateless Design: Easy horizontal scaling
- Health Checks: Built-in health monitoring
- Circuit Breakers: Automatic failure handling
- Load Balancing: Distribute traffic across instances
- Multi-AZ: Deploy across availability zones
Troubleshooting
Common installation and startup issues
Docker Issues:
- Ensure Docker is running and up to date
- Check port 8000 is not already in use
- Verify environment variables are set correctly
Python Issues:
- Use Python 3.8 or higher
- Install requirements: pip install -r requirements.txt
- Check virtual environment activation
Configuration Issues:
- Validate YAML syntax with online tools
- Check API keys are correctly set
- Verify network connectivity to LLM providers
How to debug API request failures?
Enable Debug Logging:
```yaml
logging:
  level: DEBUG
  format: "detailed"
  output: "console"
```
Check Common Issues:
- API key validity and permissions
- Rate limiting from providers
- Network connectivity and firewall rules
- Request format and headers
- Provider-specific error codes
Health Check Endpoint:
Use GET /health to check system status and provider connectivity.
Still Have Questions?
Join our active community for technical support, feature discussions, and best practices sharing.