Troubleshooting Guide

Common issues and solutions for CertifyClouds.


Azure Authentication Issues

"Failed to authenticate with Azure"

Symptoms:

  • Discovery scans fail immediately
  • "No Azure credentials available" error

Causes & Solutions:

  1. Managed Identity not enabled

    # Verify managed identity is assigned
    az vm show --name <vm-name> --resource-group <rg> --query identity
    
    Fix: Enable system-assigned or user-assigned managed identity

  2. Service Principal credentials expired

    # Check credential expiry
    az ad sp credential list --id <app-id>
    
    Fix: Rotate service principal secret

  3. Wrong environment variables

    • Check AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID
    • Verify AZURE_USE_MANAGED_IDENTITY=true if using managed identity

  4. Network connectivity

    • Ensure outbound access to login.microsoftonline.com
    • Check NSG/firewall rules
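
The environment variable check can be scripted so that secret values never reach the terminal. This is a minimal sketch, assuming a POSIX shell in the same environment as the application. The function name check_azure_auth_env is hypothetical, and treating either a complete service principal triple or AZURE_USE_MANAGED_IDENTITY=true as sufficient is an assumption, not documented CertifyClouds behavior.

```shell
# Report which Azure auth variables are set, without printing their values.
# Assumption: either all three service principal variables or
# AZURE_USE_MANAGED_IDENTITY=true is enough for authentication.
check_azure_auth_env() {
    missing=""
    for name in AZURE_CLIENT_ID AZURE_CLIENT_SECRET AZURE_TENANT_ID; do
        # Indirect variable lookup that works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name: set"
        else
            echo "$name: MISSING"
            missing="$missing $name"
        fi
    done

    if [ "${AZURE_USE_MANAGED_IDENTITY:-}" = "true" ]; then
        echo "AZURE_USE_MANAGED_IDENTITY: true"
        return 0
    fi

    # Succeed only when no service principal variable is missing.
    [ -z "$missing" ]
}
```

Call it as check_azure_auth_env || echo "Azure credentials incomplete". Only variable names and set/MISSING markers are printed, so the output is safe to paste into a support ticket.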

Key Vault Access Issues

"Access denied" or "Forbidden" for Key Vault

Symptoms:

  • Some vaults show errors in scan results
  • "AuthorizationPermissionDenied" error

Solutions:

  1. Run the setup script

    ./setup-certifyclouds-access.sh --principal-id $PRINCIPAL_ID --apply
    

  2. Check RBAC vs Access Policies

    • Some vaults use RBAC, others use Access Policies
    • The setup script handles both

  3. Verify role assignments

    az role assignment list --assignee $PRINCIPAL_ID --all
    

"Network unreachable" for Key Vault

Symptoms:

  • Vaults with firewall enabled fail
  • Timeout errors during scan

Solutions:

  1. Add subnet to firewall

    az keyvault network-rule add \
      --name <vault-name> \
      --subnet /subscriptions/.../subnets/<subnet>
    

  2. Use Private Endpoints

    • Configure private endpoints for Key Vault
    • Update DNS resolution

  3. Check service endpoints

    • Ensure Microsoft.KeyVault service endpoint is enabled on subnet

License Issues

"License validation failed"

Symptoms:

  • Application shows "License required" message
  • API returns 403 with license error

Solutions:

  1. Verify license key

    echo $CERTIFYCLOUDS_LICENSE_KEY

    Should be in format CC-XXXX-XXXX-XXXX

  2. Check network access

    # -i prints the response status line and headers
    curl -i https://license.certifyclouds.com/health

    Should return HTTP 200

  3. Check license expiry

    curl http://localhost:8080/system/license/status

  4. Cloudflare blocking

    • If deployed in Azure/AWS, Cloudflare Bot Fight Mode may block license validation requests
    • Contact support for whitelisting
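
A stray space or quote in the key is easy to miss when reading echo output, so the format check can be automated. This is a minimal sketch. The function name license_format_ok is hypothetical, and it assumes each X in CC-XXXX-XXXX-XXXX is an uppercase letter or digit, which the format above does not spell out.

```shell
# Return 0 if the argument looks like CC-XXXX-XXXX-XXXX.
# Assumption: each X is an uppercase letter or a digit; adjust the
# character class if your keys use a different alphabet.
license_format_ok() {
    printf '%s\n' "$1" |
        grep -Eq '^CC-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}$'
}
```

Call it as license_format_ok "$CERTIFYCLOUDS_LICENSE_KEY" || echo "unexpected license key format". A failure here only means the key does not match the assumed pattern. The license server remains the authority on validity.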

"License expired"

Solutions:

  1. Contact sales@certifyclouds.com to renew
  2. Receive new license key
  3. Update CERTIFYCLOUDS_LICENSE_KEY environment variable
  4. Restart the application

Database Issues

"Cannot connect to database"

Symptoms:

  • Application fails to start
  • "Connection refused" errors

Solutions:

  1. Verify PostgreSQL is running

    # Docker Compose deployment
    docker-compose ps
    # Azure Database for PostgreSQL flexible server
    az postgres flexible-server show --name <server> --resource-group <rg>
    

  2. Check connection string

    echo $DATABASE_URL
    # Format: postgresql://user:password@host:5432/dbname
    

  3. Test connectivity

    psql "$DATABASE_URL" -c "SELECT 1"
    

  4. Check firewall rules

    • For Azure PostgreSQL, ensure VNet integration or public access is configured
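
Echoing DATABASE_URL prints the password, which is awkward in shared terminals and support tickets. The sketch below extracts only the host and port so they can be tested separately. The function name db_host_port is hypothetical, and it assumes the postgresql://user:password@host:5432/dbname format shown above, with any special characters in the credentials percent-encoded. IPv6 literal hosts are not handled.

```shell
# Print "host port" parsed from a PostgreSQL connection URL without
# echoing the credentials. The port defaults to 5432 when omitted.
db_host_port() {
    rest=${1#*://}       # drop the scheme
    rest=${rest%%/*}     # drop /dbname and everything after it
    rest=${rest%%\?*}    # drop query parameters when there is no /dbname
    rest=${rest##*@}     # drop user:password@
    host=${rest%%:*}
    port=${rest##*:}
    if [ "$port" = "$rest" ]; then
        port=5432        # no ":port" part was present
    fi
    echo "$host $port"
}
```

With the PostgreSQL client tools installed, the result can be passed to pg_isready:

    set -- $(db_host_port "$DATABASE_URL")
    pg_isready -h "$1" -p "$2"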

"Database migration failed"

Solutions:

  1. Check application logs for specific error
  2. Ensure database user has CREATE/ALTER permissions
  3. Contact support with migration error details

Performance Issues

Scans are slow

Symptoms:

  • Discovery takes >10 minutes
  • Timeout errors

Solutions:

  1. Reduce concurrent workers

    DISCOVERY_MAX_WORKERS=3  # Reduce from default 5
    

  2. Use delta scans

    • Full scans refresh everything
    • Delta scans only fetch changes

  3. Filter subscriptions

    • Scan only needed subscriptions
    • Use DISCOVERY_ALLOWED_SUBSCRIPTIONS

  4. Check rate limiting

    • Azure ARM has rate limits
    • Reduce workers if hitting 429 errors
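
To judge whether throttling is the cause, it helps to count how often 429 appears in the logs before and after changing DISCOVERY_MAX_WORKERS. This is a minimal sketch. The function name count_throttled is hypothetical, and it assumes the application writes the HTTP status code 429 into its log lines, which this guide does not confirm.

```shell
# Count input lines that contain 429 as a standalone number, so values
# such as 14290 are not counted. Reads log text from standard input.
count_throttled() {
    # grep -c exits non-zero when the count is 0; keep the exit status clean.
    grep -Ec '(^|[^0-9])429([^0-9]|$)' || true
}
```

Call it as docker logs certifyclouds-app 2>&1 | count_throttled. A count that drops after lowering the worker setting suggests the scan was being rate limited.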

Application is slow

Solutions:

  1. Increase container resources

    • Minimum: 1 vCPU, 2GB RAM
    • Recommended: 2 vCPU, 4GB RAM

  2. Check database performance

    • Upgrade PostgreSQL tier if needed
    • Review slow query logs

  3. Enable caching

    ENABLE_HINT_CACHING=true
    


Rotation Issues (PRO)

"No matches found" for rotation

Symptoms:

  • App Registrations discovered but no Key Vault matches

Solutions:

  1. Enable hint caching (ENABLE_HINT_CACHING=true)

    • Improves matching accuracy

  2. Add tags to secrets

    az keyvault secret set-attributes \
      --vault-name <vault> \
      --name <secret> \
      --tags app-id=<app-id>

  3. Use consistent naming

    • Name secrets similarly to App Registration

"Rotation failed - permission denied"

Solutions:

  1. Grant Graph API permissions

    az ad app permission add \
      --id $CLIENT_ID \
      --api 00000003-0000-0000-c000-000000000000 \
      --api-permissions 1bfefb4e-e0b5-418b-a88f-73c46d2cc8e9=Role
    
    az ad app permission admin-consent --id $CLIENT_ID
    

  2. Grant Key Vault write permissions

    az role assignment create \
      --assignee $PRINCIPAL_ID \
      --role "Key Vault Secrets Officer" \
      --scope /subscriptions/<sub-id>
    


Sync Issues (PRO)

"AWS credentials invalid"

Solutions:

  1. Verify access key/secret in AWS IAM console
  2. Create new access key if needed
  3. Check IAM policy allows required operations

"Sync timeout"

Solutions:

  1. Check network connectivity to AWS/GCP
  2. Reduce concurrent sync operations
  3. Verify endpoint URLs are correct

Alert Issues

Emails not received

Solutions:

  1. Check SMTP configuration

    echo $SMTP_HOST
    echo $SMTP_PORT
    echo $SMTP_USERNAME
    

  2. Verify SendGrid API key (if using SendGrid)

  3. Check spam/junk folder

  4. Review application logs for email errors

Webhooks not delivered

Solutions:

  1. Verify webhook URL

    • Test URL is accessible from container
    • Check for typos

  2. Test endpoint manually

    curl -X POST <webhook-url> \
      -H "Content-Type: application/json" \
      -d '{"test": true}'

  3. Check firewall rules

    • Ensure outbound HTTPS is allowed

Login Issues

"Account locked"

Solutions:

  1. Wait for automatic unlock (default: 15 minutes)

  2. Admin unlock

    • Another admin can unlock via Settings > Users

  3. Reset via CLI (if no other admin)

    docker exec -it certifyclouds-app python3 -m scripts.unlock_user --username admin
    

"Invalid credentials" but password is correct

Solutions:

  1. Check username is correct (case-sensitive)
  2. Verify SSO isn't overriding local auth
  3. Reset password if needed

Container Issues

Container crashes on startup

Solutions:

  1. Check logs

    docker logs certifyclouds-app
    az containerapp logs show --name <app> --resource-group <rg>
    

  2. Verify environment variables

    • Required: DATABASE_URL, CERTIFYCLOUDS_LICENSE_KEY

  3. Check resource limits

    • Ensure sufficient CPU/memory
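
A pre-flight check for the required variables can catch a missing value before the container starts. This is a minimal sketch for a POSIX shell. The function name require_env is hypothetical, and it should only be given trusted variable names because it uses eval for the lookup.

```shell
# Print the names of any listed variables that are unset or empty.
# Returns 0 only when every listed variable has a value.
require_env() {
    rc=0
    for name in "$@"; do
        # Indirect lookup; values are never printed.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            rc=1
        fi
    done
    return "$rc"
}
```

Call it as require_env DATABASE_URL CERTIFYCLOUDS_LICENSE_KEY || exit 1 in the shell or wrapper script that launches the container.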

"Exit code 255" on ACI

Cause: Bash entrypoint script issues in ACI

Solution: Use direct Python command, not shell script:

    CMD ["python", "-m", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8080"]


Getting Help

If these solutions don't resolve your issue:

  1. Collect information:

    • CertifyClouds version: curl http://localhost:8080/health
    • Error messages from logs
    • Steps to reproduce

  2. Contact support:

    • Email: support@certifyclouds.com
    • Include collected information

  3. Response times:

    • STARTER: 24-48 hours
    • PRO: 4-8 hours
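
The information listed above can be gathered into one file before emailing support. This is a minimal sketch, assuming a Docker deployment with the container name certifyclouds-app used elsewhere in this guide. The function name collect_support_info, the default file name, and the 200-line log tail are all hypothetical choices.

```shell
# Write the health response and recent container logs to a single file.
# Each step tolerates failure so a partial report is still produced.
collect_support_info() {
    out=${1:-certifyclouds-support.txt}
    {
        echo "== health =="
        curl -s -m 10 http://localhost:8080/health 2>&1 ||
            echo "(health endpoint unreachable)"
        echo
        echo "== last 200 log lines =="
        docker logs --tail 200 certifyclouds-app 2>&1 ||
            echo "(docker logs unavailable)"
    } > "$out"
    echo "$out"
}
```

Running collect_support_info prints the path of the file it wrote. Review the file for secrets such as connection strings before attaching it, and add the steps to reproduce by hand.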