In the world of Apache Airflow, data pipelines are only as secure as the credentials used to fetch them. For a long time, Personal Access Tokens (PATs) have been the standard for syncing DAGs and plugins from private Git repositories. However, in modern Kubernetes-based Airflow deployments, SSH keys have proven to be the superior choice for both security and maintainability.
This article explains why your Airflow cluster should be using SSH and provides a step-by-step guide to making the transition.
🔐 The PAT Problem in Airflow
When you use a PAT to sync your DAGs, you often end up with a URL that looks like this in your values.yaml:
https://<username>:<personal_access_token>@github.com/<organization>/<repository>.git
This approach has major security flaws:
- Credential Leaks: The token is hardcoded in plain text. It can be accidentally committed to Git history or exposed in Helm chart logs.
- No Machine Locking: A PAT works from any machine that holds it. There is no way to restrict it to just your Airflow cluster.
- Tied to Individuals: PATs are often tied to specific developer accounts. If that person leaves and their account or token is revoked, the Airflow sync breaks immediately.
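To gauge how exposed an existing setup is, one option is to scan your values files for URLs that embed credentials. The sketch below is a minimal example, not a full secret scanner: the function name find_embedded_credentials is hypothetical, and the pattern only covers the https://user:token@host form shown above.

```shell
# Print every line (with file name and line number when several files are
# given) that contains an HTTPS URL with inline credentials, i.e.
# https://user:secret@host/...
# The exit status is grep's: 0 if at least one match was found, 1 if none.
find_embedded_credentials() {
  grep -nE 'https://[^/@[:space:]]+:[^/@[:space:]]+@' "$@"
}
```

Running find_embedded_credentials values.yaml prints the offending lines, if any. SSH-style URLs such as git@github.com:YourOrg/YourRepo.git are not flagged.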
🔥 The SSH Advantage for Airflow
SSH keys use asymmetric cryptography, which fundamentally changes how your cluster authenticates:
1. The Key Never Leaves the Cluster
With SSH, only the public key is shared with GitHub. The private key lives in a Kubernetes Secret, where access is governed by the cluster's RBAC rules (and by encryption at rest, if you have enabled it).
2. Built-in Permission Handling
Modern Airflow Helm charts (such as the Bitnami chart or the community chart) have built-in support for SSH key secrets. The chart mounts the key and handles file permissions for you (e.g., setting the key file mode to 0600), which matters because OpenSSH refuses private keys that other users can read.
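For context, this is roughly what that permission step amounts to when done by hand. It is a minimal sketch, not what any particular chart runs: install_private_key is a hypothetical helper name, and the paths in the usage note are placeholders.

```shell
# Copy a private key to its destination with owner-only read/write
# permissions (0600). OpenSSH will not use a private key file that
# group or other users can read.
install_private_key() {
  src=$1
  dest=$2
  install -m 0600 "$src" "$dest"
}
```

For example, install_private_key airflow-deploy-key "$HOME/.ssh/airflow-deploy-key" leaves a copy that ls -l reports as -rw-------.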
3. Deploy Keys: Scoped and Read-Only
By using GitHub Deploy Keys, you give your Airflow cluster access to only the repository it needs, and only for reading DAGs. This adheres to the Principle of Least Privilege.
🛠️ Steps to Migrate Airflow to SSH
Follow these steps to transition your Airflow cluster from PAT to SSH keys safely.
Step 1: Generate a Dedicated SSH Key Pair
On your local machine, generate a key without a passphrase (for automation):
ssh-keygen -t ed25519 -C "airflow-sync-key" -f airflow-deploy-key -N ""
Step 2: Add the Public Key to GitHub
- Copy the content of airflow-deploy-key.pub.
- Go to your GitHub repo → Settings → Deploy keys.
- Add the key, give it a title like airflow-gke-dags, and keep it Read-only.
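Before loading the key into the cluster, it may be worth confirming that GitHub accepts it. The sketch below uses git ls-remote with GIT_SSH_COMMAND so that only the new key is offered during authentication. check_deploy_key is a hypothetical helper name, and the repository URL in the usage note is the same placeholder used in Step 4.

```shell
# List the remote branches of REPO using only the given private key.
# IdentitiesOnly=yes stops ssh from also trying keys held by an agent,
# so a success here means the deploy key itself was accepted.
# Assumption: the key path contains no spaces (GIT_SSH_COMMAND is
# interpreted by a shell).
check_deploy_key() {
  key=$1
  repo=$2
  GIT_SSH_COMMAND="ssh -i $key -o IdentitiesOnly=yes" \
    git ls-remote --heads "$repo"
}
```

A call such as check_deploy_key ./airflow-deploy-key git@github.com:YourOrg/YourRepo.git should list branch refs if the deploy key was added to that repository, and fail with a permission error otherwise.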
Step 3: Create a Kubernetes Secret
Upload your private key to your Airflow namespace:
kubectl create secret generic airflow-git-ssh-key \
  --namespace=airflow \
  --from-file=ssh-private-key=airflow-deploy-key
Step 4: Update values.yaml
Replace your HTTPS PAT URLs with SSH URLs and reference the secret:
dags:
  repositories:
    - repository: "git@github.com:YourOrg/YourRepo.git"
      branch: "main"
  existingSshKeySecret: "airflow-git-ssh-key"
  existingSshKeySecretKey: "ssh-private-key"
Step 5: Redeploy and Verify
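Before upgrading, a quick check that the Secret from Step 3 exists and contains the expected key can save a failed rollout. This is a minimal sketch: secret_key_present is a hypothetical helper, and it only confirms that the field is non-empty, not that the stored key is valid.

```shell
# Succeed (exit 0) if SECRET in namespace NS has a non-empty data field KEY.
# Secret data comes back base64-encoded; it is not decoded or printed here.
# Assumption: KEY contains no dots (dots would need escaping in jsonpath).
secret_key_present() {
  ns=$1
  secret=$2
  key=$3
  value=$(kubectl get secret "$secret" --namespace "$ns" \
    -o "jsonpath={.data.$key}") || return 1
  [ -n "$value" ]
}
```

With the names used in this guide, the call is secret_key_present airflow airflow-git-ssh-key ssh-private-key. A non-zero exit status means the Secret is missing or the key name does not match what values.yaml references.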
Run your Helm upgrade and check the logs of your git-sync or load-dags containers:
kubectl logs <airflow-pod-name> --namespace=airflow -c git-sync
Conclusion
Switching from PATs to SSH isn't just a technical preference; it's a critical security upgrade for your data infrastructure. By eliminating plain-text tokens and moving to repository-scoped, read-only Deploy Keys, you ensure your DAGs are synced securely and reliably.
The verdict: Stop hardcoding tokens. Start using SSH. 🗝️