Support & Testing Engineer - Complete Guide¶
Comprehensive guide for Support and Testing Engineers at Appgain
Role Overview¶
Support & Testing Engineers at Appgain play a dual role: ensuring customer satisfaction through technical support and system quality through comprehensive testing. Combining the two gives a holistic approach to maintaining the reliability, performance, and user experience of our multi-platform ecosystem.
Specialized Learning¶
Additional Resources¶
- Appgain Documentation
- Parse Server Documentation
- MongoDB Documentation
- Docker Documentation
- Postman Documentation
- Testing Flow - Comprehensive testing procedures and workflows
- WhatsApp Warming Flow Guide - Complete WhatsApp campaign setup and warmup procedures
Support Infrastructure¶
Support Tools¶
- Slack: Internal team communication
- Confluence: Knowledge base and documentation
- Jira: Bug tracking and issue management
- Grafana: System monitoring and dashboards
- Cyberduck: File transfer tool for AWS S3 and server file system access
Client Communication Channels¶
- WhatsApp Support: +20 111 998 5526 - Direct WhatsApp support for urgent issues
- Email Support: support@appgain.io - Email support for detailed inquiries and documentation
- Support Portal: Freescout ticketing system for structured support requests
Account Creation & Management¶
- Account Creation Portal: https://suitcreator.appgain.io/ - UI for creation of new client accounts
- Coupon Code: BFL2mMXN - Use this coupon code for account creation
- Account Storage: All created accounts are automatically stored in the Client Accounts Spreadsheet for tracking and management
Slack Support Channel Management¶
- Customer Slack Sign-up: https://join.slack.com/t/appgainiocs/signup - Send this link to clients to add them to the Slack support channel
- Purpose: Direct communication channel for real-time support and issue resolution
- Process:
- Create client account using the suit creator portal
- Send the Slack signup link to the client
- Add client to appropriate Slack channels based on their needs
- Monitor and respond to client inquiries in Slack
Client Onboarding Procedures¶
For the complete and up-to-date process, see the dedicated page: Client Onboarding Procedures
Monitoring Systems¶
- Prometheus: Metrics collection and alerting
- Grafana: Visualization and dashboards
- Loki: Log aggregation and search
- Status Page: Public system status updates
System Monitoring URLs¶
- Prometheus Targets - Quick check of the status (up or down) of every target; also provides other options and stats about the Prometheus instance itself
- Grafana Dashboards [ask your direct manager for access] - Shows metrics data in a visual, understandable way
- Admin Dashboard [Support Engineer access] - Comprehensive admin interface for support operations, user management, and system monitoring
- Admin-Server Dashboard [auth required; only system admins have access] - Quick actions and custom checks (WIP)
Testing Infrastructure¶
Testing Tools¶
- Postman: API testing and automation
- Selenium: Web application testing
- JMeter: Performance testing
Testing Environments¶
- Development: Local testing environment
- Staging: Pre-production testing environment
- Production: Live environment monitoring
- CI/CD: Automated testing in pipelines
Daily System Check¶
Morning System Health Check (9:00 AM)¶
# 1. Check overall system status
curl -s https://status.instabackend.io/ | grep -i "status"
# 2. Count the Prometheus targets that report UP (compare with the expected total)
curl -s "http://monitor.instabackend.io:9090/a../targets" | jq '.data.activeTargets[] | select(.health == "up") | .labels.instance' | wc -l
# 3. Check critical services health
# (manual) Review the #system-status channel in Slack
# 4. Daily Status Page Check
# Open https://status.instabackend.io/ and confirm all services are up
# This is a critical daily check for support engineers
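The target check in step 2 counts healthy targets but does not name the unhealthy ones. The following is a minimal sketch of a helper that does, assuming the response has the same `.data.activeTargets[]` shape the jq filter in step 2 already relies on. The function names `down_targets` and `report_targets` are hypothetical, and the targets URL is whatever step 2 uses.

```shell
# down_targets: read a Prometheus targets response on stdin and print the
# instance label of every target whose health is not "up".
# (Hypothetical helper; assumes jq is installed, as the checks above do.)
down_targets() {
  jq -r '.data.activeTargets[] | select(.health != "up") | .labels.instance'
}

# report_targets: read the same response on stdin, print a short summary,
# and return non-zero when the response is empty, invalid, or lists a
# target that is not up.
report_targets() {
  local body down
  body=$(cat)
  if [ -z "$body" ]; then
    echo "No response from Prometheus"
    return 2
  fi
  down=$(printf '%s' "$body" | down_targets) || return 2
  if [ -z "$down" ]; then
    echo "All targets UP"
  else
    echo "Targets not UP:"
    echo "$down"
    return 1
  fi
}

# Usage (URL as in step 2 of the morning check):
#   curl -s "$TARGETS_URL" | report_targets
```

Reading the whole body first lets the helper tell "no targets down" apart from "no answer at all", which a bare count cannot do.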
Evening Maintenance Check (6:00 PM)¶
# 1. Review daily error logs
tail -n 100 /var/log/appgain/error.log | grep "$(date +%Y-%m-%d)"
# 2. Check backup status
ls -la /backups/$(date +%Y-%m-%d)/
# 3. Verify SSL certificate expiration
echo | openssl s_client -servername appgain.io -connect appgain.io:443 2>/dev/null | openssl x509 -noout -dates
# 4. Monitor resource usage trends
curl -s "http://monitor.instabackend.io:9090/a../query_range?query=cpu_usage&start=$(date -d '1 hour ago' +%s)&end=$(date +%s)&step=60" | jq '.data.result[0].values'
# 5. Check for any pending alerts
curl -s "http://monitor.instabackend.io:9093/a../alerts" | jq '.data[] | select(.status.state == "active")'
# 6. Run end-to-end tests
npm run test:e2e
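Step 3 prints the certificate's validity dates and leaves the "how many days are left" arithmetic to the reader. A minimal sketch of that conversion follows, assuming GNU `date` (already required by the `date -d` calls above). The function name `cert_days_left` is hypothetical.

```shell
# cert_days_left NOT_AFTER_LINE [NOW_EPOCH]
# NOT_AFTER_LINE is the "notAfter=..." line printed by
# `openssl x509 -noout -enddate`; NOW_EPOCH defaults to the current time.
# Prints the number of whole days until expiry, truncated toward zero.
cert_days_left() {
  local not_after expiry now
  not_after=${1#notAfter=}
  now=${2:-$(date +%s)}
  expiry=$(date -u -d "$not_after" +%s) || return 1
  echo $(( (expiry - now) / 86400 ))
}

# Usage:
#   line=$(echo | openssl s_client -servername appgain.io -connect appgain.io:443 2>/dev/null \
#            | openssl x509 -noout -enddate)
#   cert_days_left "$line"
```

A small number here is the concrete meaning of "expiring soon" in the checklist; the exact warning threshold is a team decision and is not fixed by this sketch.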
Weekly System Review (Every Monday)¶
# 1. Generate weekly performance report
curl -s "http://monitor.instabackend.io:9090/a../query_range?query=uptime&start=$(date -d '7 days ago' +%s)&end=$(date +%s)&step=3600" > weekly_uptime_report.json
# 2. Check for security updates
apt list --upgradable | grep security
# 3. Review and rotate logs
logrotate -f /etc/logrotate.d/appgain
# 4. Verify backup integrity
sha256sum /backups/$(date +%Y-%m-%d)/*.tar.gz
# 5. Update status page
curl -X POST "https://status.instabackend.io/a../incidents" \
-H "Content-Type: application/json" \
-d '{"status": "resolved", "message": "Weekly maintenance completed"}'
# 6. Run full test suite
npm run test:full
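Step 4 prints checksums but has nothing to compare them against, so on its own it cannot detect a corrupted archive. One possible approach, sketched below, is to write a checksum manifest when the backup is created and re-check it during the weekly review. The manifest name `SHA256SUMS` and both function names are assumptions, not part of the existing backup job.

```shell
# write_manifest DIR
# Record the SHA-256 checksum of every .tar.gz in DIR (run right after the
# backup completes). The manifest file name is a hypothetical choice.
write_manifest() {
  ( cd "$1" && sha256sum -- *.tar.gz > SHA256SUMS )
}

# verify_backups DIR
# Re-compute the checksums and compare them with the manifest.
# Exits non-zero if any archive is missing or has changed.
verify_backups() {
  ( cd "$1" && sha256sum -c SHA256SUMS )
}

# Usage:
#   write_manifest "/backups/$(date +%Y-%m-%d)"     # at backup time
#   verify_backups "/backups/$(date +%Y-%m-%d)"     # during the weekly review
```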
System Check Checklist¶
- Prometheus Targets: All targets showing as UP
- API Health: All endpoints responding within 200ms
- Database: MongoDB and Redis connections stable
- Disk Space: >20% free space on all servers
- Memory Usage: <80% utilization
- CPU Load: <70% average load
- Network: All services accessible
- SSL Certificates: Valid and not expiring soon
- Backups: Daily backups completed successfully
- Error Logs: No critical errors in last 24 hours
- Queue Lengths: Notification queues processing normally
- Delivery Rates: Email and push notification success rates >95%
- Test Coverage: >80% code coverage maintained
- Test Reliability: >95% test pass rate
- Performance Tests: All performance benchmarks met
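Several checklist items are numeric thresholds, so they can be checked mechanically. The sketch below covers the disk and memory limits listed above, assuming a Linux host with `df`, `free`, and `awk`. All function names are hypothetical, and the disk figure is approximate because it is derived from the used-capacity column of `df`.

```shell
# check_max NAME VALUE LIMIT: pass when VALUE is strictly below LIMIT.
check_max() {
  if awk -v v="$2" -v l="$3" 'BEGIN { exit !(v < l) }'; then
    echo "OK $1: $2 (must be < $3)"
  else
    echo "FAIL $1: $2 (must be < $3)"
    return 1
  fi
}

# check_min NAME VALUE LIMIT: pass when VALUE is strictly above LIMIT.
check_min() {
  if awk -v v="$2" -v l="$3" 'BEGIN { exit !(v > l) }'; then
    echo "OK $1: $2 (must be > $3)"
  else
    echo "FAIL $1: $2 (must be > $3)"
    return 1
  fi
}

# Approximate percentage of free space on the filesystem holding PATH (default /).
disk_free_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Percentage of physical memory in use, from the "Mem:" row of free(1).
mem_used_pct() {
  free | awk '/^Mem:/ { printf "%.0f\n", $3 / $2 * 100 }'
}

# Usage, with the limits from the checklist:
#   check_min disk_free_pct "$(disk_free_pct /)" 20
#   check_max mem_used_pct "$(mem_used_pct)" 80
```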
Daily Testing Process¶
This document outlines the daily testing process, including each API collection to run, the expected outcomes, and troubleshooting steps if issues arise. Replace any placeholder variables (e.g., {{SUIT_ID}}, {{APP_API_KEY}}) with their actual values in your environment.
Environment: Appgain.io
1. WhatsApp Lite Daily Check¶
Purpose: Verify that the WhatsApp Lite container is up and responding correctly.
Steps to Execute:
- Open the Daily WhatsApp Lite Check collection in Postman
- Select the appropriate environment
- Run the collection
Expected Results:
- All container health-check endpoints should return a 200 OK status
- WhatsApp messages should be sent successfully
Troubleshooting:
If you encounter "Log in first" or "evaluation failed" errors:
- Open the Initialize WhatsApp Lite Account collection
- Locate the "Initialize WhatsApp Lite Account" request
- Retrieve the base URL from the environment ({{base_url}})
- Append /whatsapp-login to the base URL
- Send the request and scan the returned QR code with the WhatsApp Lite app
- Retry the Daily WhatsApp Lite Check
2. Daily Notify SMS Check¶
Purpose: Ensure the SMS notification service is operational and delivering messages.
Steps to Execute:
- Open the Daily Notify SMS Check collection in Postman
- Select the appropriate environment
- Run the collection
Expected Results:
- Each SMS send request should return a 202 Accepted status (or similar successful code)
- Messages should appear in the SMS gateway dashboard
Troubleshooting:
If SMS requests fail, check:
- API key validity ({{APP_API_KEY}})
- SMS gateway credentials and quotas (sms_count in the appgaincp database)
- Network connectivity to the SMS provider
3. Automation Daily Check (iKhair, Nabolia)¶
Purpose: Validate that the automation workflows for iKhair and Nabolia are triggering correctly.
Steps to Execute:
- Use the Appgain.io environment and run both requests
Expected Results:
- Each workflow endpoint should return a 200 OK response
- The response body should contain execution logs or identifiers confirming that automation tasks ran
Troubleshooting:
If an endpoint returns an automator error, verify that the automator is listed in the Automators section:
Opt-In automator:
- Opt-In Request
- Required variables: {{SUIT_ID}}, {{ENTER_SEGMENT_ID}}, {{TRIGGER_POINT_NAME}}, {{APP_API_KEY}}
Opt-Out automator:
- Opt-Out Request
- Required variables: {{SUIT_ID}}, {{EXIT_SEGMENT_ID}}, {{TRIGGER_POINT_NAME}}, {{APP_API_KEY}}
Re-run the Opt-In/Opt-Out request and then re-test the automation endpoint.
4. Daily App Push Notifications Check¶
Purpose: Confirm that push notifications are sent and received by users.
Steps to Execute:
- Open the Daily App Push Notifications collection in Postman
- Provide the suit_id, apiKey, and userId in the request body
- Send the notification request
Expected Results:
- The API should return a 200 OK status
- The targeted user should receive the push notification on their device
Troubleshooting:
If notifications do not arrive:
- Verify that userId exists in the following collections in your database: _User, NotificationChannels, Installation
- Check for errors in the push service logs
- Confirm that the device token is valid and not expired
Important Notes:¶
- Always use the latest environment in Postman → Appgain.io
- After fixing any issue, re-run the affected collection to confirm resolution
- Document any anomalies or persistent failures in the daily QA report
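The collections above are run by hand in Postman. If they are exported to JSON files, they could also be run unattended with Newman, Postman's command-line collection runner. The following is a minimal sketch under that assumption: the function name, the file names in the usage note, and the `NEWMAN` override variable are all hypothetical.

```shell
# run_daily_checks ENV_FILE COLLECTION_FILE...
# Runs each exported collection with Newman against the same environment
# file and prints PASS or FAIL per collection. Returns non-zero if any run
# failed. NEWMAN overrides the runner binary (a hypothetical convenience
# for dry runs).
run_daily_checks() {
  local env_file collection failed
  env_file=$1
  shift
  failed=0
  for collection in "$@"; do
    if "${NEWMAN:-newman}" run "$collection" -e "$env_file" >/dev/null 2>&1; then
      echo "PASS $collection"
    else
      echo "FAIL $collection"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (file names are placeholders for your own exports):
#   run_daily_checks appgain-io.postman_environment.json \
#     daily-whatsapp-lite-check.json daily-notify-sms-check.json
```

Any FAIL line still needs the manual troubleshooting steps described for that collection, for example re-scanning the WhatsApp Lite QR code.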
Key Responsibilities¶
1. Customer Support¶
- Ticket Management: Handle customer support tickets efficiently
- Issue Resolution: Troubleshoot and resolve technical issues
- Customer Communication: Provide clear and helpful responses
- Escalation Management: Escalate complex issues to appropriate teams
2. System Monitoring¶
- Performance Monitoring: Monitor system performance and health
- Alert Management: Respond to system alerts and notifications
- Log Analysis: Analyze logs to identify and resolve issues
- Capacity Planning: Monitor resource usage and plan for growth
3. Test Planning & Strategy¶
- Test Strategy: Develop comprehensive testing strategies
- Test Planning: Plan test execution and resource allocation
- Risk Assessment: Identify testing risks and mitigation strategies
- Quality Gates: Define quality gates and acceptance criteria
4. Automated Testing¶
- Unit Testing: Develop and maintain unit tests
- Integration Testing: Test component integration
- End-to-End Testing: Automate user journey testing
- API Testing: Comprehensive API testing and validation
5. Manual Testing¶
- Exploratory Testing: Manual testing for edge cases
- User Acceptance Testing: Validate user requirements
- Regression Testing: Ensure existing functionality works
- Cross-platform Testing: Test across different platforms
6. Performance & Security Testing¶
- Performance Testing: Load and stress testing
- Security Testing: Vulnerability assessment and penetration testing
- Accessibility Testing: Ensure applications are accessible
- Compatibility Testing: Test across different browsers and devices
7. Documentation & Knowledge Management¶
- Knowledge Base: Maintain and update support documentation
- Troubleshooting Guides: Create and update troubleshooting procedures
- FAQ Management: Keep FAQs current and helpful
- Process Documentation: Document support processes and procedures
8. Technical Troubleshooting¶
- API Issues: Troubleshoot API-related problems
- Database Issues: Resolve database connectivity and performance issues
- Integration Problems: Fix third-party integration issues
- Platform Issues: Resolve platform-specific problems
Technical Stack¶
Support Tools¶
- Freescout: Customer support platform
- Slack: Team communication
- Confluence: Knowledge management
- Jira: Issue tracking
- Postman: API testing
Testing Frameworks¶
- Jest: Unit and integration testing
- Cypress: End-to-end testing
- Postman: API testing and automation
- Selenium: Web application testing
- JMeter: Performance testing
Monitoring Tools¶
- Prometheus: Metrics collection
- Grafana: Visualization
- Loki: Log aggregation
- Alertmanager: Alert management
- Status Page: Public status
Backend Servers Logging¶
Loki Log Aggregation System¶
Loki is our centralized log aggregation system that collects, stores, and queries logs from all backend services. It's designed for high availability and scalability, making it perfect for our microservices architecture.
Key Features of Loki:¶
- High Performance: Efficient log storage and querying
- Scalability: Handles large volumes of log data
- Cost-Effective: Optimized storage for log data
- Real-time Queries: Fast log search and filtering
- Integration: Works seamlessly with Grafana for visualization
Log Sources:¶
- Appgain Server: Main business logic and CRM functionality
- Parse Server: Backend-as-a-Service for mobile apps
- Notify Service: Multi-channel notification system
- Automator Engine: Workflow automation and triggers
- Admin Server: Management interface and analytics
- API Gateway: Request routing and authentication
- Task Queue: Background job processing
Accessing Microservice Logs¶
To get logs for any microservice, use the following Grafana Explore link:
Grafana Explore - Microservice Logs
How to Use:¶
- Replace Service Name: Change admin-server in the URL to any service name:
  - appgain-server
  - parse-server
  - notify-service
  - automator-engine
  - api-gateway
  - task-queue
- Time Range: Adjust the time range in the URL:
  - now-1h for last hour
  - now-6h for last 6 hours
  - now-24h for last 24 hours
- Query Examples:
Logging Best Practices¶
Structured Logging¶
// Good logging practice
logger.info('User authentication successful', {
userId: user.id,
service: 'auth-service',
timestamp: new Date().toISOString(),
requestId: req.headers['x-request-id']
});
// Error logging with context
logger.error('Database connection failed', {
error: error.message,
service: 'appgain-server',
database: 'mongodb',
timestamp: new Date().toISOString()
});
Log Levels¶
- DEBUG: Detailed information for debugging
- INFO: General information about application flow
- WARN: Warning messages for potential issues
- ERROR: Error messages for failed operations
- FATAL: Critical errors that may cause system failure
Video Resources¶
Backend Logging Training Video¶
Development Tools¶
- Git: Version control for test code
- Docker: Containerized testing environments
- Jenkins: CI/CD pipeline integration
- Jira: Test case and bug management
- Confluence: Test documentation
Development Platforms¶
- iKhair: Donation payment platform support
- RetailGain: Retail platform support
- Shrinkit: E-commerce platform support
- Appgain Core: Core platform support
Success Metrics¶
Support Metrics¶
- Response Time: < 2 hours average response time
- Resolution Time: < 24 hours average resolution time
- Customer Satisfaction: > 4.5/5 satisfaction rating
- First Contact Resolution: > 80% FCR rate
Quality Metrics¶
- Test Coverage: > 80% code coverage
- Bug Detection Rate: > 90% bugs caught before production
- Test Automation: > 70% automated test coverage
- Regression Prevention: 100% critical regressions prevented
System Metrics¶
- System Uptime: 99.9% availability
- Alert Response: < 15 minutes alert response time
- Issue Resolution: > 95% issues resolved within SLA
- Knowledge Base: > 90% coverage of common issues
Performance Metrics¶
- Test Execution Time: < 30 minutes for full test suite
- Test Reliability: > 95% test pass rate
- CI/CD Integration: 100% automated testing in pipelines
- Release Quality: Zero critical bugs in production releases
Integration Points¶
Customer Integration¶
- Customer Platforms: iKhair, RetailGain, Shrinkit users
- Communication Channels: Email, chat, phone support
- Feedback Systems: Customer satisfaction surveys
- Escalation Paths: Technical team escalation procedures
Development Integration¶
- Frontend Teams: React, Next.js application testing
- Backend Teams: API and database testing
- Mobile Teams: iOS and Android application testing
- DevOps Teams: CI/CD pipeline integration
System Integration¶
- Development Teams: Frontend, Backend, DevOps teams
- Monitoring Systems: Prometheus, Grafana, Loki
- Documentation: Confluence, technical documentation
- Issue Tracking: Jira, bug tracking systems
Platform Integration¶
- iKhair Platform: Donation payment platform testing
- RetailGain Platform: Retail platform testing
- Shrinkit Platform: E-commerce platform testing
- Appgain Core: Core platform testing
Daily Operations¶
Morning Routine¶
# Check system status
curl https://status.instabackend.io/
# Review overnight alerts
curl http://monitor.instabackend.io:9090/a../alerts
# Check support tickets
freescout tickets --unassigned
# Review system metrics
grafana dashboard --name "System Health"
# Check test execution status
jest --coverage --watchAll=false
# Review overnight test results
cypress run --headless
# Check CI/CD pipeline status
jenkins --job "test-pipeline" --status
# Review bug reports
jira --filter "Testing Bugs"
Support Workflow¶
# Process support tickets
freescout tickets --priority high
# Investigate issues
curl http://api.appgain.io/health
mongo --eval "db.stats()"
# Update knowledge base
confluence --page "Troubleshooting Guide"
# Escalate complex issues
jira issue --create --type "Bug"
Testing Workflow¶
# Run unit tests
npm test
# Run integration tests
npm run test:integration
# Run end-to-end tests
npm run test:e2e
# Run performance tests
jmeter -n -t performance-test.jmx
Monitoring & Maintenance¶
# Monitor system performance
curl "http://monitor.instabackend.io:9090/a../query?query=response_time"
# Check service logs
docker logs appgain-server --tail 100
# Update status page
statuspage --update "All systems operational"
# Backup support data
freescout backup --date $(date +%Y%m%d)
# Monitor test metrics
curl "http://monitor.instabackend.io:9090/a../query?query=test_execution_time"
# Update test documentation
confluence --page "Test Strategy"
# Backup test data
tar -czf test-data-$(date +%Y%m%d).tar.gz test-data/
# Clean up test environments
docker system prune -f
Project Examples¶
1. Customer Onboarding Support¶
- Goal: Smooth customer onboarding experience
- Process: Setup assistance, configuration support, training
- Metrics: Onboarding success rate, time to first value, customer satisfaction
- Tools: Documentation, video tutorials, live support
2. Platform Migration Support¶
- Goal: Support customers during platform migrations
- Process: Migration planning, execution support, post-migration assistance
- Metrics: Migration success rate, downtime minimization, customer satisfaction
- Tools: Migration guides, rollback procedures, 24/7 support
3. API Integration Support¶
- Goal: Help customers integrate with Appgain APIs
- Process: API documentation, code examples, troubleshooting
- Metrics: Integration success rate, time to integration, API usage
- Tools: Postman collections, SDK documentation, code samples
4. Automated Testing Pipeline¶
- Goal: Fully automated testing in CI/CD pipeline
- Technology: Jest, Cypress, Postman, Jenkins
- Integration: GitLab CI/CD, automated deployments
- Metrics: Test coverage, execution time, reliability
5. Cross-platform Testing¶
- Goal: Ensure applications work across all platforms
- Technology: Selenium, BrowserStack, mobile testing
- Integration: iOS, Android, web applications
- Metrics: Platform compatibility, user experience consistency
6. Performance Testing Suite¶
- Goal: Ensure applications meet performance requirements
- Technology: JMeter, Artillery, custom performance tests
- Integration: Load testing, stress testing, monitoring
- Metrics: Response time, throughput, resource utilization
Troubleshooting¶
Common Issues¶
- API Authentication: Check API keys and authentication headers
- Database Connectivity: Verify connection strings and network access
- Push Notifications: Troubleshoot FCM/APNs configuration
- Email Delivery: Check SMTP settings and email provider configuration
- Test Flakiness: Address intermittent test failures
- Environment Issues: Resolve test environment problems
- Performance Degradation: Identify and fix performance bottlenecks
- Integration Problems: Fix test integration issues
Debug Commands¶
# Check API status
curl -H "Authorization: Bearer $API_KEY" http://api.appgain.io/health
# Test database connection
mongo --eval "db.runCommand('ping')"
# Check push notification status
curl -X POST http://push.appgain.io/status
# Verify email configuration
telnet smtp.appgain.io 587
# Debug Jest tests
jest --verbose --detectOpenHandles
# Debug Cypress tests
cypress open --config video=false
# Debug API tests
newman run test-collection -e test-env
# Monitor test performance
jest --coverage --verbose
Learning Path¶
Week 1: Foundation¶
- Complete support and testing foundation courses
- Learn Appgain's platform architecture
- Understand support tools and processes
- Set up testing environment
- Learn testing tools and frameworks
- Understand testing methodologies
- Shadow experienced support and testing engineers
Week 2: Hands-on¶
- Handle basic support tickets
- Learn troubleshooting procedures
- Understand escalation processes
- Practice customer communication
- Write first automated tests
- Set up CI/CD integration
- Learn manual testing techniques
- Practice bug reporting
Week 3: Advanced¶
- Handle complex technical issues
- Contribute to knowledge base
- Work with development teams
- Monitor system health
- Implement comprehensive test suites
- Set up performance testing
- Learn security testing
- Optimize test execution
Week 4: Independence¶
- Handle all support scenarios
- Mentor new support engineers
- Improve support processes
- Contribute to system improvements
- Deploy testing to production
- Monitor test quality
- Mentor team members
- Improve testing processes
Video Resources & Tutorials¶
Support Training Videos¶
Customer Support Training¶
Testing Training Videos¶
Testing Automation Training¶
Quick Navigation¶
- System Architecture? → Common Knowledge
- Foundation Knowledge? → Foundation Courses
- Learning Resources? → Learning Resources
- Support? → Support & Contacts
Support & Testing Engineers combine technical expertise with customer service skills to ensure platform quality and provide excellent user support.