Troubleshooting Guide
Common issues and solutions for CertifyClouds.
Azure Authentication Issues
"Failed to authenticate with Azure"
Symptoms:
- Discovery scans fail immediately
- "No Azure credentials available" error
Causes & Solutions:
- Managed Identity not enabled
  - Fix: Enable system-assigned or user-assigned managed identity
- Service Principal credentials expired
  - Fix: Rotate service principal secret
- Wrong environment variables
  - Check AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID
  - Verify AZURE_USE_MANAGED_IDENTITY=true if using managed identity
- Network connectivity
  - Ensure outbound access to login.microsoftonline.com
  - Check NSG/firewall rules
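The environment variable checks above can be scripted. The following is a minimal sketch, not part of CertifyClouds: the function name check_azure_env is hypothetical, and it assumes that either AZURE_USE_MANAGED_IDENTITY=true is set or all three service principal variables are required.

```python
import os


def check_azure_env(env):
    """Return a list of problems found in the Azure auth variables.

    Assumption: managed identity needs none of the service principal
    variables, while service principal auth needs all three.
    """
    problems = []
    use_mi = env.get("AZURE_USE_MANAGED_IDENTITY", "").strip().lower() == "true"
    if use_mi:
        return problems
    for name in ("AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID"):
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value != value.strip():
            # Stray whitespace often sneaks in when values are copy-pasted.
            problems.append(f"{name} has leading or trailing whitespace")
    return problems


if __name__ == "__main__":
    for problem in check_azure_env(os.environ):
        print("WARNING:", problem)
```

Run it inside the container so that it sees the same environment as the application.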
Key Vault Access Issues
"Access denied" or "Forbidden" for Key Vault
Symptoms:
- Some vaults show errors in scan results
- "AuthorizationPermissionDenied" error
Solutions:
- Run the setup script
- Check RBAC vs Access Policies
  - Some vaults use RBAC, others use Access Policies
  - The setup script handles both
- Verify role assignments
"Network unreachable" for Key Vault
Symptoms:
- Vaults with firewall enabled fail
- Timeout errors during scan
Solutions:
- Add subnet to firewall
- Use Private Endpoints
  - Configure private endpoints for Key Vault
- Update DNS resolution
- Check service endpoints
  - Ensure Microsoft.KeyVault service endpoint is enabled on subnet
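To check DNS resolution, one option is to resolve the vault hostname from inside the container and see whether it maps to a private address. With a private endpoint and working private DNS, the vault name is expected to resolve to a private IP. This is a minimal sketch using only the Python standard library; the function names and the vault name in the usage note are placeholders.

```python
import ipaddress
import socket


def is_private_address(ip):
    """True if ip falls in a private (non-public) address range."""
    return ipaddress.ip_address(ip).is_private


def resolve_vault(hostname):
    """Resolve hostname and pair each address with a private/public flag."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    return [(addr, is_private_address(addr)) for addr in addresses]
```

For example, resolve_vault("my-vault.vault.azure.net") returns a list of (address, is_private) pairs. A public address for a vault that should be reached through a private endpoint suggests the DNS configuration needs attention.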
License Issues
"License validation failed"
Symptoms:
- Application shows "License required" message
- API returns 403 with license error
Solutions:
- Verify license key
  - Should be in format CC-XXXX-XXXX-XXXX
- Check network access
  - Should return HTTP 200
- Check license expiry
- Cloudflare blocking
  - If deployed in Azure/AWS, Cloudflare Bot Fight Mode may block
  - Contact support for whitelisting
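A quick format check can rule out copy-paste damage to the license key before investigating the network. The sketch below assumes each X in CC-XXXX-XXXX-XXXX is an uppercase letter or digit, which the format above does not specify. The function name is hypothetical.

```python
import re

# Assumption: each X is an uppercase letter or a digit.
LICENSE_KEY_PATTERN = re.compile(r"CC-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}")


def license_key_problem(key):
    """Return a description of the problem, or None if the key looks well formed."""
    if key != key.strip():
        return "key has leading or trailing whitespace"
    if not LICENSE_KEY_PATTERN.fullmatch(key):
        return "key does not match CC-XXXX-XXXX-XXXX"
    return None
```

This only checks the shape of the key. It cannot tell whether the key is valid or expired.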
"License expired"¶
Solutions:
- Contact sales@certifyclouds.com to renew
- Receive new license key
- Update CERTIFYCLOUDS_LICENSE_KEY environment variable
- Restart the application
Database Issues
"Cannot connect to database"
Symptoms:
- Application fails to start
- "Connection refused" errors
Solutions:
- Verify PostgreSQL is running
- Check connection string
- Test connectivity
- Check firewall rules
  - For Azure PostgreSQL, ensure VNet integration or public access
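The connection string and connectivity checks can be combined in a short script. This sketch assumes DATABASE_URL is a URL of the form postgresql://user:password@host:port/dbname and that special characters in the password are percent-encoded. It tests only TCP reachability, not credentials. The function names are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def db_host_port(database_url):
    """Extract (host, port) from a postgresql:// URL, defaulting to port 5432."""
    parts = urlsplit(database_url)
    if not parts.hostname:
        raise ValueError("DATABASE_URL has no host component")
    return parts.hostname, parts.port or 5432


def can_connect(host, port, timeout=5.0):
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect(*db_host_port(url)) from inside the container separates network problems ("Connection refused", timeouts) from authentication problems.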
"Database migration failed"¶
Solutions:
- Check application logs for specific error
- Ensure database user has CREATE/ALTER permissions
- Contact support with migration error details
Performance Issues
Scans are slow
Symptoms:
- Discovery takes >10 minutes
- Timeout errors
Solutions:
- Reduce concurrent workers
- Use delta scans
  - Full scans refresh everything
  - Delta scans only fetch changes
- Filter subscriptions
  - Scan only needed subscriptions
  - Use DISCOVERY_ALLOWED_SUBSCRIPTIONS
- Check rate limiting
  - Azure ARM has rate limits
  - Reduce workers if hitting 429 errors
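For custom scripts that call Azure ARM alongside CertifyClouds, the usual way to handle 429 responses is to honor the Retry-After header and otherwise back off exponentially. The sketch below illustrates that general technique. It is not a description of how CertifyClouds itself retries, and the function name and defaults are hypothetical.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retrying a request that returned HTTP 429.

    attempt is the zero-based retry count; retry_after is the raw
    Retry-After header value, if the response carried one.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After may also be an HTTP date; fall back to backoff.
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

The jitter spreads retries from many workers over time, which is also why reducing the worker count helps when 429 errors appear.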
Application is slow
Solutions:
- Increase container resources
  - Minimum: 1 vCPU, 2GB RAM
  - Recommended: 2 vCPU, 4GB RAM
- Check database performance
  - Upgrade PostgreSQL tier if needed
  - Review slow query logs
- Enable caching
Rotation Issues (PRO)
"No matches found" for rotation
Symptoms:
- App Registrations discovered but no Key Vault matches
Solutions:
- Enable hint caching
  - Improves matching accuracy
- Add tags to secrets
- Use consistent naming
  - Name secrets similarly to App Registration
"Rotation failed - permission denied"
Solutions:
- Grant Graph API permissions
- Grant Key Vault write permissions
Sync Issues (PRO)
"AWS credentials invalid"
Solutions:
- Verify access key/secret in AWS IAM console
- Create new access key if needed
- Check IAM policy allows required operations
"Sync timeout"
Solutions:
- Check network connectivity to AWS/GCP
- Reduce concurrent sync operations
- Verify endpoint URLs are correct
Alert Issues
Emails not received
Solutions:
- Check SMTP configuration
- Verify SendGrid API key (if using SendGrid)
- Check spam/junk folder
- Review application logs for email errors
Webhooks not delivered
Solutions:
- Verify webhook URL
  - Test URL is accessible from container
  - Check for typos
- Test endpoint manually
- Check firewall rules
  - Ensure outbound HTTPS is allowed
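The URL and endpoint checks above can be approximated with the Python standard library. This is a sketch under assumptions: the function names are hypothetical, and the JSON body is a placeholder rather than the payload CertifyClouds actually sends.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlsplit


def webhook_url_problems(url):
    """Return a list of obvious problems with a webhook URL (typos, missing host)."""
    problems = []
    if url != url.strip():
        problems.append("URL has leading or trailing whitespace")
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if not parts.hostname:
        problems.append("URL has no host")
    return problems


def post_test_payload(url, timeout=10.0):
    """POST a small placeholder JSON body and return the HTTP status code."""
    body = json.dumps({"test": True}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The endpoint was reached but rejected the request.
        return err.code
```

Run post_test_payload from inside the container. A timeout or connection error (raised as urllib.error.URLError) points to firewall or DNS problems, while a 4xx status means the endpoint was reached but rejected the request.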
Login Issues
"Account locked"
Solutions:
- Wait for automatic unlock (default: 15 minutes)
- Admin unlock
  - Another admin can unlock via Settings > Users
- Reset via CLI (if no other admin)
"Invalid credentials" but password is correct
Solutions:
- Check username is correct (case-sensitive)
- Verify SSO isn't overriding local auth
- Reset password if needed
Container Issues
Container crashes on startup
Solutions:
- Check logs
- Verify environment variables
  - Required: DATABASE_URL, CERTIFYCLOUDS_LICENSE_KEY
- Check resource limits
  - Ensure sufficient CPU/memory
"Exit code 255" on ACI
Cause: Bash entrypoint script issues in ACI
Solution: Start the container with a direct Python command instead of a shell script entrypoint.
Getting Help
If these solutions don't resolve your issue:
- Collect information:
  - CertifyClouds version: curl http://localhost:8080/health
  - Error messages from logs
  - Steps to reproduce
- Contact support:
  - Email: support@certifyclouds.com
  - Include collected information
- Response times:
  - STARTER: 24-48 hours
  - PRO: 4-8 hours