Upgrade Postgresql-ha Helm Chart for GoodData 2.1.0

If you’re using an external PostgreSQL database (that is, you deployed the gooddata-cn helm chart with the option deployPostgresHA: false), you can skip this step because it is not relevant for your deployment.

If you’re using the embedded PostgreSQL database (with deployPostgresHA: true set in values.yaml), you will need to perform the following process to upgrade the postgresql-ha helm chart from version 8.6.13 to 9.1.5. The upgrade includes a migration of the PostgreSQL database from version 11 to 14. Due to the nature of the upgrade, this action will cause a service disruption. Please schedule a maintenance window for this operation.

Steps:

  1. If you have not done so already, create a JSON dump of your GoodData.CN 2.0.x organization layout at /tmp/organization-layout.json. See Back Up the Organization.
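
    A minimal sketch, assuming the 2.0.x layout API endpoint and a valid API token in a hypothetical API_TOKEN variable (see Back Up the Organization for the authoritative procedure):

    curl --header "Authorization: Bearer $API_TOKEN" \
      "https://<your-organization-hostname>/api/layout/organization" \
      > /tmp/organization-layout.json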

  2. Set up your shell environment. Make sure you adjust the following values to match your deployment:

    # namespace GoodData.CN is deployed to
    export NAMESPACE=gooddata-cn
    # name of helm release used to deploy GoodData.CN 
    export HELM_RELEASE=release-name
    # path to helm values file used to deploy GoodData.CN
    export HELM_VALUES_FILE="values-gooddata-cn.yaml"
    # postgres name as specified in values.yaml postgresql-ha.nameOverride
    export PG_NAME=db
    # PG-HA admin user as defined in values.yaml postgresql-ha.postgresql.username
    export PGUSER=postgres
    # PG-HA admin user password as defined in values.yaml postgresql-ha.postgresql.password  
    export PGPASSWORD=$(cat pg_password.txt)
    export PGHOST=${HELM_RELEASE}-${PG_NAME}-postgresql-0
    # helm release name of temporary PG in destination version started to execute pg_dump
    export TMP_PGDUMP_RELEASE=tmp-pg-dump
    export TMP_PGDUMP_POD=${TMP_PGDUMP_RELEASE}-postgresql-0
    # location of dumps in temporary container
    export DUMP_LOCATION=/bitnami/postgresql/dumps
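
    Before continuing, it is worth a quick sanity check that these variables point at an existing pod:

    # should show the primary PostgreSQL pod in Running state
    kubectl get pod -n $NAMESPACE $PGHOST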
    
  3. Disable access to your GoodData.CN application:

    helm upgrade --namespace $NAMESPACE --version 2.0.1 \
      --wait --timeout 7m -f $HELM_VALUES_FILE \
      --set metadataApi.replicaCount=0 \
      --set sqlExecutor.replicaCount=0 \
      --set dex.replicaCount=0 \
      $HELM_RELEASE gooddata/gooddata-cn
    

    Note that once the command finishes, users will see an Internal Server Error message when trying to access any organization in the deployment.
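
    You can confirm that the deployments corresponding to metadataApi, sqlExecutor, and dex were scaled down by checking that their READY count shows 0/0:

    kubectl get deployments -n $NAMESPACE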

  4. Deploy a temporary container to dump your data into:

    cat << EOT > /tmp/values-${TMP_PGDUMP_RELEASE}.yaml
    auth:
      postgresPassword: dumpdata
    primary:
      persistence:
        enabled: false
    readReplicas:
      replicaCount: 0
    EOT
    
    helm upgrade --install --namespace $NAMESPACE --version 11.6.6 \
      --wait --timeout 2m --values /tmp/values-${TMP_PGDUMP_RELEASE}.yaml \
      ${TMP_PGDUMP_RELEASE} bitnami/postgresql     
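
    To confirm that the temporary pod is up before you continue, you can wait for its Ready condition (the pod name follows the chart's default naming, as assumed in TMP_PGDUMP_POD above):

    kubectl wait --for=condition=Ready pod -n $NAMESPACE $TMP_PGDUMP_POD --timeout=120s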
    
  5. Enable network access from the temporary tmp-pg-dump-postgresql pod to the production PostgreSQL pods:

    cat << EOT > /tmp/network-access-tmp-pg-2-prod-pg.yaml
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      namespace: $NAMESPACE
      name: $TMP_PGDUMP_RELEASE-ingress
    spec:
      ingress:
        - from:
          - namespaceSelector:
              matchLabels:
                kubernetes.io/metadata.name: $NAMESPACE
            podSelector:
              matchLabels:
                app.kubernetes.io/instance: $TMP_PGDUMP_RELEASE
                app.kubernetes.io/name: postgresql
          ports:
            - port: 5432
              protocol: TCP
      podSelector:
        matchLabels:
          app.kubernetes.io/component: postgresql
          app.kubernetes.io/instance: $HELM_RELEASE
          app.kubernetes.io/name: $PG_NAME
      policyTypes:
        - Ingress
    EOT
    
    kubectl apply -f /tmp/network-access-tmp-pg-2-prod-pg.yaml
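
    To verify that the policy works, you can check connectivity from the temporary pod to the production primary with pg_isready, which ships with the Bitnami PostgreSQL image:

    PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}')
    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- pg_isready -h $PGHOST_IP -p 5432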
    
  6. List available databases in the postgres-ha deployment:

    kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\l" 
    

    Pick the databases to be preserved. Use their names as input for the USER_DBS_TO_TRANSFER variable in the next step. Note that:

    • databases md, execution, and dex are always preserved
    • databases template0 and template1 must always be skipped

  7. Dump the databases you want to preserve; first edit the USER_DBS_TO_TRANSFER and USER_ROLES_BY_PG_HA values:

    # space-separated list of user DBs to be dumped; system DBs md, execution and dex are included automatically
    export USER_DBS_TO_TRANSFER="tigerdb"
    # space-separated list of user-defined PG roles delivered by the postgresql-ha helm chart;
    # these roles are excluded from the roles dump because they are created automatically during PG-HA ecosystem provisioning;
    # roles repmgr, postgres and executor are excluded automatically
    export USER_ROLES_BY_PG_HA=""
    
    cat << "EOT" > ./dump-pg-dbs.sh
    #!/bin/bash
    set -x
    set -e
    
    PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}')   
    
    kubectl -n $NAMESPACE exec $TMP_PGDUMP_POD -- mkdir -p $DUMP_LOCATION
    # exclude all the roles created automatically by:
    #   - PG-HA chart 
    #   - postgres installation
    #   - GD chart
    time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "env PGPASSWORD=$PGPASSWORD \
      pg_dumpall -h $PGHOST_IP -U $PGUSER --roles-only > /tmp/0_dump_pg_roles_all.sql"
    ROLES_TO_EXCLUDE="repmgr postgres executor"
    if [[ "$USER_ROLES_BY_PG_HA" != "" ]]; then
      ROLES_TO_EXCLUDE="${ROLES_TO_EXCLUDE} $USER_ROLES_BY_PG_HA"
    fi
    ROLES_TO_EXCLUDE=$(echo $ROLES_TO_EXCLUDE | tr ' ' '|')
    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "grep -i -v \
      -E \"^(CREATE|ALTER) ROLE (${ROLES_TO_EXCLUDE})\" \
      /tmp/0_dump_pg_roles_all.sql | gzip > $DUMP_LOCATION/0_dump_pg_roles.sql.gz"
    
    # dump selected databases
    ALL_DBS_TO_TRANSFER="md execution dex ${USER_DBS_TO_TRANSFER}"
    ITER_ID=1
    for db in $ALL_DBS_TO_TRANSFER; do
      DUMP_DEST_FILE="$DUMP_LOCATION/${ITER_ID}_dump_pg_db_${db}.sql.gz"
      echo "Creating dump of DB ${db} to file ${DUMP_DEST_FILE}"
      time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "env PGPASSWORD=$PGPASSWORD \
        pg_dump -h $PGHOST_IP -U $PGUSER --quote-all-identifiers --create ${db} | gzip > $DUMP_DEST_FILE"
      ITER_ID=$(($ITER_ID+1))
    done
    EOT
    
    chmod 754 dump-pg-dbs.sh
    ./dump-pg-dbs.sh
    

    The script first dumps only the roles, which are PostgreSQL instance-level definitions. After that, the databases are dumped one by one.

    Before moving forward, verify that the temporary pod contains all the requested dumps, that is:

    • 3 dumps (files) for the md, dex, and execution databases
    • 1 dump for roles
    • 1 dump for each database specified in the USER_DBS_TO_TRANSFER variable

    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "ls -la $DUMP_LOCATION"
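
    Optionally, you can also verify that each archive is a valid gzip file:

    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "for f in $DUMP_LOCATION/*.gz; do gzip -t \$f && echo \$f OK; done"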
    
  8. (Optional) Download the dumps to your local machine for backup:

    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- bash -c "cd $DUMP_LOCATION/../; tar cf dumps.tar dumps/*"
    kubectl cp -n $NAMESPACE --retries=-1 $TMP_PGDUMP_POD:/bitnami/postgresql/dumps.tar ./dumps.tar
    
    # verify that the data was transferred completely - compare the MD5 hashes
    kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- md5sum $DUMP_LOCATION/../dumps.tar
    md5sum ./dumps.tar
    
  9. Remove postgresql-ha:

    helm upgrade --namespace $NAMESPACE --version 2.0.1 \
      --wait --timeout 2m -f $HELM_VALUES_FILE \
      --set metadataApi.replicaCount=0 \
      --set sqlExecutor.replicaCount=0 \
      --set dex.replicaCount=0 \
      --set deployPostgresHA=false \
      $HELM_RELEASE gooddata/gooddata-cn
    
  10. Remove the persistent volume claims and back up the related persistent volumes:

    cat << "EOT" > ./backup-pvs.sh
    #!/bin/bash
    set -x
    set -e
    
    PV_NAME_BACKUP_PREFIX=pv_name_backup
    PVC_BACKUP_PREFIX=pvc_bck
    PVC_PV_LINES=$(kubectl get pvc -n $NAMESPACE --sort-by=.metadata.name \
    --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}" \
    -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.volumeName}{"\n"}{end}')
    
    echo "$PVC_PV_LINES" | while read -r line; do
      PVC_INSTANCE=$(echo $line | cut -f 1 -d ' ')
      PV_INSTANCE=$(echo $line | cut -f 2 -d ' ')
    
      # make sure PV is not deleted by k8s after PVC is removed
      echo "Setting Retain policy for PV=${PV_INSTANCE}"
      kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    
      # remember PV name for rollback purposes
      PV_NAME_BACKUP_FILE="${PV_NAME_BACKUP_PREFIX}_${PV_INSTANCE}"
      echo "Creating PV name backup as $PV_NAME_BACKUP_FILE"
      touch $PV_NAME_BACKUP_FILE
    
      # Backup PVC definition for rollback purposes
      PVC_BCK_FILE="${PVC_BACKUP_PREFIX}_$PVC_INSTANCE.yaml"
      echo "Creating PVC backup file $PVC_BCK_FILE"
      kubectl get pvc -n $NAMESPACE $PVC_INSTANCE -o yaml > $PVC_BCK_FILE
    
      # delete PVC
      echo "Deleting PVC $PVC_INSTANCE"
      kubectl delete pvc -n $NAMESPACE $PVC_INSTANCE
    done
    EOT
    
    chmod 754 backup-pvs.sh
    ./backup-pvs.sh
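
    Before moving on, check that the volumes now have the Retain reclaim policy and that the script produced the backup files:

    kubectl get pv | grep Retain
    ls -la pvc_bck_* pv_name_backup_*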
    
  11. Upgrade the chart to 2.1.0; this includes an upgrade of the postgresql-ha chart to 9.x.x:

    helm upgrade --namespace $NAMESPACE --version 2.1.0 \
      --wait --timeout 7m -f $HELM_VALUES_FILE \
      --set metadataApi.replicaCount=0 \
      --set sqlExecutor.replicaCount=0 \
      --set dex.replicaCount=0 \
      --set deployPostgresHA=true \
      $HELM_RELEASE gooddata/gooddata-cn
    
  12. Verify the PostgreSQL version:

    kubectl exec -n $NAMESPACE $PGHOST -c postgresql -- psql --version
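
    The output should report a 14.x server, similar to the following (the exact minor version may differ):

    psql (PostgreSQL) 14.4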
    
  13. Restore DB data:

    cat << "EOT" > ./restore-data.sh
    #!/bin/bash
    set -x
    set -e
    
    PGHOST_IP=$(kubectl get pod -n $NAMESPACE $PGHOST --template '{{.status.podIP}}')   
    DUMP_FILES=$(kubectl exec -n $NAMESPACE $TMP_PGDUMP_POD -- ls $DUMP_LOCATION)
    
    for dump_file in $DUMP_FILES; do
       echo "Restoring dump ${dump_file}"
       time kubectl exec -it -n $NAMESPACE $TMP_PGDUMP_POD -c postgresql -- bash -c "\
         gzip -cd $DUMP_LOCATION/$dump_file | \
         env PGPASSWORD=$PGPASSWORD psql -h $PGHOST_IP -U $PGUSER"
    done
    EOT
    
    chmod 754 restore-data.sh
    ./restore-data.sh      
    
  14. Verify that all databases exist in the PostgreSQL instance:

    kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\l" 
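
    You can also confirm that the restored roles are present:

    kubectl -n $NAMESPACE exec $PGHOST -- env PGPASSWORD=$PGPASSWORD psql -U postgres -c "\du"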
    
  15. Enable the application:

     helm upgrade --namespace $NAMESPACE --version 2.1.0 \
       --wait --timeout 7m -f $HELM_VALUES_FILE \
      $HELM_RELEASE gooddata/gooddata-cn
    
  16. Test the GoodData.CN deployment:

    • Is it possible to log in?
    • Do dashboard reports compute?

  17. Remove the network policy:

    kubectl delete -f /tmp/network-access-tmp-pg-2-prod-pg.yaml
    
  18. Remove the temporary PostgreSQL instance deployed to create the DB dumps:

    helm uninstall --namespace $NAMESPACE $TMP_PGDUMP_RELEASE
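
    If you no longer need them, you can also remove the temporary files created during the upgrade:

    rm /tmp/values-${TMP_PGDUMP_RELEASE}.yaml /tmp/network-access-tmp-pg-2-prod-pg.yaml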
    

Rollback

In case something goes wrong, you can use the persistent volumes backed up in step 10 to revert to version 2.0.1.

Steps:

  1. Disable access to the application and remove postgresql-ha:

     helm upgrade --namespace $NAMESPACE --version 2.1.0 \
       --wait --timeout 2m -f $HELM_VALUES_FILE \
       --set metadataApi.replicaCount=0 \
       --set sqlExecutor.replicaCount=0 \
       --set dex.replicaCount=0 \
       --set deployPostgresHA=false \
      $HELM_RELEASE gooddata/gooddata-cn
    
  2. Remove the new PVCs and back up the new PVs:

    cat << "EOT" > ./remove-new-pvc.sh
    #!/bin/bash
    set -x
    set -e
    
    PV_NAME_BACKUP_PREFIX=pv14_name_backup
    PVC_BACKUP_PREFIX=pvc14_bck
    PVC_PV_LINES=$(kubectl get pvc -n $NAMESPACE --sort-by=.metadata.name \
    --selector="app.kubernetes.io/component=postgresql,app.kubernetes.io/instance=${HELM_RELEASE},app.kubernetes.io/name=${PG_NAME}" \
    -o jsonpath='{range .items[*]}{@.metadata.name}{" "}{@.spec.volumeName}{"\n"}{end}')
    
    
    echo "$PVC_PV_LINES" | while read -r line; do
      PVC_INSTANCE=$(echo $line | cut -f 1 -d ' ')
      PV_INSTANCE=$(echo $line | cut -f 2 -d ' ')
    
      # make sure PV is not deleted by k8s after PVC is removed
      echo "Setting Retain policy for PV=${PV_INSTANCE}"
      kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
      PV_NAME_BACKUP_FILE="${PV_NAME_BACKUP_PREFIX}_${PV_INSTANCE}"
    
      # remember PV name for possible investigation purposes
      echo "Creating PV name backup as $PV_NAME_BACKUP_FILE"
      touch $PV_NAME_BACKUP_FILE
    
      # Backup PVC definition for rollback purposes
      PVC_BCK_FILE="${PVC_BACKUP_PREFIX}_$PVC_INSTANCE.yaml"
      echo "Creating PVC backup file $PVC_BCK_FILE"
      kubectl get pvc -n $NAMESPACE $PVC_INSTANCE -o yaml > $PVC_BCK_FILE
    
      # delete PVC
      echo "Deleting PVC $PVC_INSTANCE"
      kubectl delete pvc -n $NAMESPACE $PVC_INSTANCE
    done
    EOT
    
    chmod 754 remove-new-pvc.sh
    ./remove-new-pvc.sh
    
  3. Restore the original PVCs and bind them to the backed-up PVs:

    cat << "EOT" > ./restore-orig-pvc.sh
    #!/bin/bash
    set -x
    set -e
    
    PV_NAME_BACKUP_PREFIX=pv_name_backup
    PVC_BACKUP_PREFIX=pvc_bck
    
    # prepare the original PVs to be re-bound by the restored PVCs
    for bck_pv in $(ls ${PV_NAME_BACKUP_PREFIX}*); do
      PV_INSTANCE=${bck_pv#"${PV_NAME_BACKUP_PREFIX}_"}
      echo "Making PV ${PV_INSTANCE} available"
      kubectl patch pv "$PV_INSTANCE" --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
      echo "Setting Retain policy for PV=${PV_INSTANCE} to delete"
      kubectl patch pv "$PV_INSTANCE" -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
    done
    
    # restore original PVCs
    for bck_pvc in $(ls ${PVC_BACKUP_PREFIX}*); do
      PVC_INSTANCE=${bck_pvc#"${PVC_BACKUP_PREFIX}_"}
      echo "Creating PVC ${PVC_INSTANCE}"
      kubectl create -f $bck_pvc
    done
    EOT
    
    chmod 754 restore-orig-pvc.sh
    ./restore-orig-pvc.sh
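
    Verify that the restored PVCs are Bound to the original PVs before reinstalling the application:

    kubectl get pvc -n $NAMESPACE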
    
  4. Install the application in the original version:

     helm upgrade --namespace $NAMESPACE --version 2.0.1 \
       --wait --timeout 2m -f $HELM_VALUES_FILE \
       --set metadataApi.replicaCount=0 \
       --set sqlExecutor.replicaCount=0 \
       --set dex.replicaCount=0 \
       --set deployPostgresHA=true \
      $HELM_RELEASE gooddata/gooddata-cn
    
  5. Verify the PostgreSQL version; it should report 11.x again:

    kubectl exec -n $NAMESPACE $PGHOST -c postgresql -- psql --version
    
  6. Enable the application:

     helm upgrade --namespace $NAMESPACE --version 2.0.1 \
       --wait --timeout 7m -f $HELM_VALUES_FILE \
      $HELM_RELEASE gooddata/gooddata-cn
    
  7. Test the GoodData.CN deployment after rollback:

    • Is it possible to log in?
    • Do dashboard reports compute?