Keycloak in Production: High Availability and Performance Tuning

Sep 18, 2024

Keycloak is a powerful open-source identity and access management solution, but running it in production requires careful consideration of scalability, performance, and reliability. This guide covers enterprise-grade deployment strategies.

High Availability Architecture

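For production, run Keycloak as a cluster behind a load balancer. The Deployment below starts three replicas that share one PostgreSQL database and form a cache cluster over JGroups (port 7800), using DNS-based discovery against a headless Service named keycloak-headless so that session data is shared across pods: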

apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak
  namespace: identity
spec:
  replicas: 3
  selector:
    matchLabels:
      app: keycloak
  template:
    metadata:
      labels:
        app: keycloak
    spec:
      containers:
      - name: keycloak
        image: quay.io/keycloak/keycloak:23.0.1
        args:
        - start
        - --cache-stack=kubernetes
        - --hostname-strict=false
        - --http-enabled=true
        # TLS is terminated at the load balancer, so trust its X-Forwarded-* headers
        - --proxy=edge
        - --import-realm
        env:
        - name: KC_DB
          value: postgres
        - name: KC_DB_URL
          value: jdbc:postgresql://postgres:5432/keycloak
        - name: KC_DB_USERNAME
          value: keycloak
        - name: KC_DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: keycloak-db-secret
              key: password
        - name: KEYCLOAK_ADMIN
          value: admin
        - name: KEYCLOAK_ADMIN_PASSWORD
          valueFrom:
            secretKeyRef:
              name: keycloak-admin-secret
              key: password
        - name: JAVA_OPTS_APPEND
          value: "-Djgroups.dns.query=keycloak-headless"
        ports:
        - containerPort: 8080
          name: http
        - containerPort: 7800
          name: jgroups
        readinessProbe:
          httpGet:
            path: /realms/master
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /realms/master
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 30
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
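
The -Djgroups.dns.query=keycloak-headless setting above only resolves if a headless Service with that name exists, and the manifest does not show one. A minimal sketch, assuming the same identity namespace and app: keycloak label; the PodDisruptionBudget, its name, and the minAvailable: 2 threshold are illustrative additions rather than part of the original setup:

```yaml
# Headless Service: DNS returns one A record per Keycloak pod, which
# JGroups DNS_PING uses to discover cluster members.
apiVersion: v1
kind: Service
metadata:
  name: keycloak-headless
  namespace: identity
spec:
  clusterIP: None
  # Assumption: publish pod addresses before readiness passes so that
  # starting pods are discoverable by their peers.
  publishNotReadyAddresses: true
  selector:
    app: keycloak
  ports:
  - name: jgroups
    port: 7800
    targetPort: jgroups
---
# Keep at least two replicas running during node drains and upgrades.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: keycloak-pdb   # hypothetical name
  namespace: identity
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: keycloak
```

With three replicas, minAvailable: 2 lets voluntary disruptions take down one pod at a time.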

Database Optimization

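Keycloak's performance depends heavily on database configuration. The following PostgreSQL settings are a starting point for a small dedicated instance on SSD storage (random_page_cost = 1.1 and effective_io_concurrency = 200 assume SSDs); scale the memory values to the RAM actually available: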

# PostgreSQL configuration for Keycloak
max_connections = 200
shared_buffers = 256MB
effective_cache_size = 1GB
maintenance_work_mem = 64MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200

# Per-query memory and WAL sizing (these values match the PostgreSQL defaults)
work_mem = 4MB
max_wal_size = 1GB
min_wal_size = 80MB

Database Scaling Tip: Consider read replicas for reporting and analytics queries that run outside Keycloak, which reduces load on the primary database. Keycloak itself connects through the single JDBC URL configured in KC_DB_URL, so its own traffic still goes to the primary.
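
The max_connections = 200 setting above has to cover every Keycloak pod's connection pool at once. A sketch of capping the pool per pod through Keycloak's db-pool-* options, assuming the cluster can grow to 10 replicas; the sizes are illustrative, not measured values:

```yaml
# Fragment for the Keycloak container's env list.
# Budget: 10 replicas x pool max 15 = 150 connections,
# which stays under max_connections = 200.
- name: KC_DB_POOL_INITIAL_SIZE
  value: "5"
- name: KC_DB_POOL_MIN_SIZE
  value: "5"
- name: KC_DB_POOL_MAX_SIZE
  value: "15"
```

Leave some connections spare for backups, migrations, and administrative sessions.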

Performance Tuning

JVM tuning is crucial for Keycloak performance in production environments:

JVM Configuration

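# JVM optimization for Keycloak
# The heap must fit inside the container memory limit with headroom for
# metaspace, thread stacks, and direct buffers. A 2g heap needs a limit of
# roughly 3Gi, more than the 2Gi limit shown in the Deployment.
# JAVA_OPTS_APPEND is a single variable: merge in -Djgroups.dns.query=... if you set it here.
JAVA_OPTS_APPEND="-XX:+UseG1GC \
  -XX:MaxGCPauseMillis=100 \
  -XX:+UseStringDeduplication \
  -XX:+OptimizeStringConcat \
  -XX:+UseCompressedOops \
  -Xms2g -Xmx2g \
  -XX:MetaspaceSize=96M \
  -XX:MaxMetaspaceSize=256m \
  -Djava.net.preferIPv4Stack=true \
  -Djava.awt.headless=true"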
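
In the Deployment, these flags would be passed through the container's env list. A sketch of one way to do that, assuming the discovery property from the Deployment must be kept; heap flags are left out here and should be sized to the container memory limit:

```yaml
# Fragment for the Keycloak container's env list. Define JAVA_OPTS_APPEND
# once, so the discovery property and the JVM flags share a single value.
# The >- folded scalar joins the lines below with single spaces.
- name: JAVA_OPTS_APPEND
  value: >-
    -Djgroups.dns.query=keycloak-headless
    -XX:+UseG1GC
    -XX:MaxGCPauseMillis=100
    -XX:+UseStringDeduplication
    -XX:MetaspaceSize=96M
    -XX:MaxMetaspaceSize=256m
```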

Security Hardening

Production Keycloak deployments require additional security measures:

Network Security

TLS termination: Use certificates from a trusted CA and terminate TLS at the load balancer.
Network policies: Restrict traffic between pods and limit external access.
Firewall rules: Allow only the necessary ports (8080 for HTTP, 8443 for HTTPS, 7800 for clustering).
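
TLS termination at the load balancer could look like the following Ingress. This is a sketch under several assumptions: an NGINX ingress controller, a hypothetical hostname sso.example.com, a hypothetical TLS Secret named keycloak-tls, and a ClusterIP Service named keycloak in front of the pods. Keycloak also has to trust the forwarded headers (the --proxy=edge option in 23.x).

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: keycloak
  namespace: identity
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - sso.example.com          # hypothetical hostname
    secretName: keycloak-tls   # hypothetical TLS Secret
  rules:
  - host: sso.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: keycloak     # assumed Service in front of the pods
            port:
              number: 8080
```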

Access Control

# NetworkPolicy for Keycloak
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: keycloak-network-policy
  namespace: identity
spec:
  podSelector:
    matchLabels:
      app: keycloak
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          # Label that Kubernetes sets automatically on every namespace
          kubernetes.io/metadata.name: ingress-nginx
    ports:
    - protocol: TCP
      port: 8080
  - from:
    - podSelector:
        matchLabels:
          app: keycloak
    ports:
    - protocol: TCP
      port: 7800
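  egress:
  # PostgreSQL
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: database
    ports:
    - protocol: TCP
      port: 5432
  # JGroups traffic to the other Keycloak pods
  - to:
    - podSelector:
        matchLabels:
          app: keycloak
    ports:
    - protocol: TCP
      port: 7800
  # DNS lookups, needed for cluster discovery and for resolving the database host
  - ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53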

Monitoring and Observability

Implement comprehensive monitoring for Keycloak instances:

Key Metrics to Monitor

Authentication metrics: Login success/failure rates, response times
System metrics: CPU, memory, GC performance
Database metrics: Connection pool usage, query performance
Cache metrics: Hit rates, eviction rates

# Prometheus ServiceMonitor for Keycloak
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: keycloak-metrics
  namespace: identity
spec:
  selector:
    matchLabels:
      app: keycloak
  endpoints:
  # "metrics" must be the name of a port on the selected Service, not a container port
  - port: metrics
    interval: 30s
    path: /metrics
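
A ServiceMonitor selects Services rather than pods, so it needs a Service carrying the app: keycloak label and a port named metrics. A sketch, assuming a Service named keycloak (a hypothetical name) and Keycloak 23 started with --metrics-enabled=true, which serves /metrics on the regular HTTP port:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: keycloak          # hypothetical name
  namespace: identity
  labels:
    app: keycloak         # matched by the ServiceMonitor's selector
spec:
  selector:
    app: keycloak
  ports:
  - name: metrics         # referenced by the ServiceMonitor's endpoints[].port
    port: 8080
    targetPort: http      # the container port named "http" in the Deployment
```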

Backup and Disaster Recovery

Implement robust backup strategies for Keycloak data:

Database Backups

# Automated PostgreSQL backup
apiVersion: batch/v1
kind: CronJob
metadata:
  name: keycloak-db-backup
  # Same namespace as keycloak-db-secret and the postgres Service
  namespace: identity
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: postgres-backup
            image: postgres:15
            command:
            - /bin/bash
            - -c
            - |
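              set -euo pipefail
              BACKUP_FILE=/backup/keycloak-$(date +%Y%m%d_%H%M%S).sql.gz
              pg_dump -h postgres -U keycloak -d keycloak | gzip > "$BACKUP_FILE"
              # Upload to S3 or other storage. The stock postgres image does not ship
              # the aws CLI, and /backup needs a mounted volume.
              aws s3 cp "$BACKUP_FILE" s3://backup-bucket/keycloak/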
            env:
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: keycloak-db-secret
                  key: password
          restartPolicy: OnFailure

Best Practice: Test your backup and recovery procedures regularly. Automate the process and verify that backups can be restored to a clean environment.
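
One way to exercise the restore path is a Job that loads the newest dump into a throwaway database. This is a sketch under stated assumptions: the dumps sit on a PersistentVolumeClaim named keycloak-backups, and a scratch PostgreSQL instance is reachable as postgres-scratch with the same keycloak role and password. Both names are hypothetical.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: keycloak-restore-check     # hypothetical name
  namespace: identity
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: restore-check
        image: postgres:15
        command:
        - /bin/bash
        - -c
        - |
          set -euo pipefail
          # Newest dump on the shared backup volume
          LATEST=$(ls -1t /backup/keycloak-*.sql.gz | head -n 1)
          createdb keycloak_restore_check
          gunzip -c "$LATEST" | psql -v ON_ERROR_STOP=1 -d keycloak_restore_check
          # Sanity check: fail the Job if the restored realm table is empty
          psql -d keycloak_restore_check -tAc "SELECT count(*) FROM realm" | grep -qv '^0$'
        env:
        - name: PGHOST
          value: postgres-scratch   # hypothetical throwaway PostgreSQL instance
        - name: PGUSER
          value: keycloak
        - name: PGPASSWORD
          valueFrom:
            secretKeyRef:
              name: keycloak-db-secret
              key: password
        volumeMounts:
        - name: backup
          mountPath: /backup
      volumes:
      - name: backup
        persistentVolumeClaim:
          claimName: keycloak-backups   # hypothetical PVC holding the dumps
```

A failed Job then signals a dump that cannot be restored, before it is needed in an incident.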

Scaling Strategies

As your organization grows, consider these scaling approaches:

Horizontal Pod Autoscaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: keycloak-hpa
  # Must be in the same namespace as the Deployment it scales
  namespace: identity
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: keycloak
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
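  # Memory utilization is measured against the pod's memory request. A JVM with
  # a fixed heap (-Xms equal to -Xmx) holds its memory regardless of load, so
  # check that this target sits above the idle footprint or the HPA stays scaled out.
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80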
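
Every pod that joins or leaves the cluster makes the remaining nodes redistribute cached session data, so rapid scaling in both directions is best avoided. A sketch of damping it with the behavior field of autoscaling/v2; the windows and step sizes are illustrative assumptions to tune against real load:

```yaml
# Fragment to add under the HorizontalPodAutoscaler's spec.
behavior:
  scaleUp:
    stabilizationWindowSeconds: 60
    policies:
    - type: Pods
      value: 2              # add at most two pods per minute
      periodSeconds: 60
  scaleDown:
    # Require ten minutes of sustained low load before removing pods
    stabilizationWindowSeconds: 600
    policies:
    - type: Pods
      value: 1              # remove at most one pod every five minutes
      periodSeconds: 300
```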

This production-ready Keycloak setup provides the foundation for enterprise identity management with high availability, performance, and security.
