AWS
This guide walks through deploying NKP on AWS with nic (the Nebari Infrastructure Core CLI), from an empty AWS account to a running cluster you can manage and tear down.
What your team gets
When nic deploy finishes, your team will have all standard NKP services plus:
- A managed Kubernetes cluster ready for workloads (AWS EKS, multi-AZ by default).
- Optional shared storage (EFS) that pods on any node can mount when enabled in your config.
Prerequisites
AWS account
You'll need access to an AWS account you can deploy into. If you don't have one, sign up for AWS.
- Region: pick one where EKS is available and has at least two availability zones. The example uses us-west-2a and us-west-2b.
- Quotas: confirm your region has enough EKS service quota for the cluster you're about to create. New accounts often need an increase.
- Cost: NKP does not fit within the free tier; EKS, NAT gateways, and EFS all bill from day one.
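One way to check the two-zone requirement is to ask EC2 which zones are available in the region. This is a minimal sketch, assuming the AWS CLI is installed and your credentials are already configured; the helper names available_azs and has_two_azs are made up for illustration.

```shell
# Hypothetical helper: print the available zone names in a region, one per line.
#   $1 = region
available_azs() {
  aws ec2 describe-availability-zones \
    --region "$1" \
    --filters Name=state,Values=available \
    --query 'AvailabilityZones[].ZoneName' \
    --output text | tr -s '[:space:]' '\n'
}

# Hypothetical helper: succeed only if the region has at least two zones.
has_two_azs() {
  count=$(available_azs "$1" | grep -c . || true)
  [ "$count" -ge 2 ]
}
```

For example, has_two_azs us-west-2 && echo "region is usable" prints the message only when two or more zones are reported.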
IAM permissions
Running nic deploy (and nic destroy) needs permissions across S3, EC2, EKS, IAM, EFS, SSM, CloudWatch Logs, and Elastic Load Balancing.
For a first deployment, admin credentials are the fastest path. For production, create a customer-managed policy from this least-privilege IAM policy and attach it to your deploy user or role.
The policy grants the minimum for a complete cluster with all optional features (EFS, brownfield VPC adoption). If you don't use those, you can omit the corresponding permissions.
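Creating and attaching the customer-managed policy can be sketched with two AWS CLI calls. This assumes you have saved the least-privilege policy as a local JSON file and that the deploy role already exists; the helper name and its arguments are hypothetical.

```shell
# Hypothetical helper: create a customer-managed policy from a local JSON file
# and attach it to an existing IAM role.
#   $1 = policy name, $2 = path to the policy JSON, $3 = role name
create_and_attach_policy() {
  policy_arn=$(aws iam create-policy \
    --policy-name "$1" \
    --policy-document "file://$2" \
    --query 'Policy.Arn' \
    --output text) || return 1
  aws iam attach-role-policy --role-name "$3" --policy-arn "$policy_arn"
}
```

If you deploy as an IAM user rather than a role, aws iam attach-user-policy with --user-name is the equivalent second call.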
Install nic
Follow the Install NIC guide to download and install the nic CLI for your platform.
GitOps repository
See GitOps repository in the Prepare to deploy guide.
Secrets and credentials
From inside your GitOps repo clone, download the template:
cd /path/to/your-gitops-repo
curl -o .env https://raw.githubusercontent.com/nebari-dev/nebari-infrastructure-core/main/.env.example
Then uncomment and fill in the AWS block and the GitOps tokens. For .gitignore setup and GitOps token configuration, see Secrets and credentials in the Prepare to deploy guide.
AWS credentials
NIC uses the AWS Go SDK's standard credential chain, so any of these work:
- AWS_PROFILE=<name>: points to an SSO or static profile in ~/.aws/config. For SSO, run aws sso login --profile <name> first so the session is valid.
- AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY: static IAM user keys. Use only if you've created an IAM user with the required permissions.
- AWS_SESSION_TOKEN (in addition to the above): required if your keys are temporary (e.g., copied from the AWS access portal's "Access keys" popup).
Pick one pattern for .env:
AWS_PROFILE=my-sso-profile # SSO / IAM Identity Center
# OR static keys:
# AWS_ACCESS_KEY_ID=AKIA...
# AWS_SECRET_ACCESS_KEY=...
# AWS_SESSION_TOKEN=... # only if keys are temporary
Verify the credentials work before deploying:
set -a; source .env; set +a # export the variables so the aws CLI can read them
aws sts get-caller-identity
You should see your AWS account ID and the role/user the credentials resolve to.
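If the identity check fails, it can help to confirm which credential pattern your shell actually has loaded. The sketch below only inspects environment variables; it makes no claim about which source the SDK would prefer when several are set. The helper name credential_pattern is hypothetical.

```shell
# Hypothetical helper: report which credential pattern the environment uses.
credential_pattern() {
  if [ -n "${AWS_PROFILE:-}" ] && [ -n "${AWS_ACCESS_KEY_ID:-}" ]; then
    echo "conflict"            # both patterns set; keep only one in .env
  elif [ -n "${AWS_PROFILE:-}" ]; then
    echo "profile"
  elif [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    if [ -n "${AWS_SESSION_TOKEN:-}" ]; then
      echo "temporary-keys"
    else
      echo "static-keys"
    fi
  else
    echo "none"                # nothing set, or a key pair is incomplete
  fi
}
```

A result of conflict or none means the .env file does not match any single pattern from the list above.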
Cost considerations
An NKP deployment provisions several AWS services that bill from day one. Check the AWS pricing pages for current rates in your region.
- EKS control plane: flat per-cluster hourly rate.
- EC2 instances in your node groups: instance hours per running node.
- EBS root volumes on each node: per-GB-month for node disks.
- NAT gateways: one per AZ (minimum two, since EKS requires two AZs). Hourly rate per gateway plus per-GB data processing.
- NLBs: the AWS Load Balancer Controller creates one for ingress. Hourly rate plus Load Balancer Capacity Units (LCU).
- EFS: per-GB-month for shared cluster storage. bursting is the default and is cheap at low utilization.
- VPC interface endpoints: around nine created by default. Hourly rate per endpoint per AZ, plus per-GB data processing.
- KMS: one customer-managed key for EKS secrets encryption. Per-key-month plus per-request.
- CloudWatch Logs: EKS control-plane log ingestion and storage.
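Two of the items above scale with the number of availability zones, which is easy to underestimate. The arithmetic can be sketched as follows, assuming every interface endpoint is placed in every AZ and using nine as the default endpoint count from the list. The helper name is hypothetical, and it counts billable units only; multiply by the current rates from the AWS pricing pages yourself.

```shell
# Hypothetical helper: count the hourly-billed units that grow with AZ count.
#   $1 = number of AZs, $2 = number of interface endpoints (default 9)
hourly_billed_units() {
  azs=$1
  endpoints=${2:-9}
  echo "nat_gateways=$azs"                        # one NAT gateway per AZ
  echo "endpoint_az_pairs=$((endpoints * azs))"   # each endpoint bills per AZ
}
```

With the minimum of two AZs, hourly_billed_units 2 prints nat_gateways=2 and endpoint_az_pairs=18.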
Configuration
Pick the starter config that matches your DNS setup:
- aws-config.yaml for any DNS provider: you'll create A/CNAME records manually at deploy time using values nic prints.
- aws-config-with-dns.yaml for Cloudflare-hosted domains: nic creates the records automatically.
Download the one you want into the directory you'll deploy from:
# Pick one:
curl -O https://raw.githubusercontent.com/nebari-dev/nebari-infrastructure-core/main/examples/aws-config.yaml
curl -O https://raw.githubusercontent.com/nebari-dev/nebari-infrastructure-core/main/examples/aws-config-with-dns.yaml
In later steps, <config-file> refers to the local copy you just created (e.g., aws-config.yaml).
At minimum, edit these fields:
project_name: my-cluster # lowercase alphanumeric
domain: nebari.example.com # a hostname you own
certificate:
  acme:
    email: you@example.com # required for Let's Encrypt; renewal notices go here
git_repository:
  url: "https://github.com/<your-org>/<your-gitops-repo>.git"
  path: clusters/my-cluster # subdirectory in the repo; conventionally matches project_name
  auth:
    token_env: GIT_TOKEN # matches the GIT_TOKEN set in .env
cluster:
  aws:
    region: us-west-2 # an EKS-supported region
    availability_zones:
      - us-west-2a # at least two AZs
      - us-west-2b
# Only if you used aws-config-with-dns.yaml:
dns:
  cloudflare:
    zone_name: example.com # your Cloudflare zone (parent of `domain`)
To use aws-config-with-dns.yaml, generate a Cloudflare API token with Zone:Read and DNS:Edit permissions on the zone in dns.cloudflare.zone_name. See Cloudflare DNS for how to add it to .env.
For the full schema (brownfield VPC adoption, custom KMS keys, advanced node-group options, Longhorn, log types), see the NIC configuration reference.
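Before deploying, you can sanity-check the region and availability_zones values you entered. The sketch below is a local check only and is not part of nic; it relies on the fact that standard zone names are the region name plus a single letter. The helper name check_zones is hypothetical.

```shell
# Hypothetical helper: check that at least two zones are given and that each
# one belongs to the region.
#   $1 = region, remaining arguments = availability zones
check_zones() {
  region=$1
  shift
  if [ "$#" -lt 2 ]; then
    echo "need at least two availability zones, got $#" >&2
    return 1
  fi
  for az in "$@"; do
    case "$az" in
      "$region"[a-z]) ;;                                  # e.g. us-west-2a
      *) echo "$az is not a zone in $region" >&2; return 1 ;;
    esac
  done
}
```

For the example config, check_zones us-west-2 us-west-2a us-west-2b succeeds silently.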
Deploy and verify
Follow Deploy a cluster to deploy and verify. The first deployment takes at least 30 minutes (about 10 for the EKS control plane, then ArgoCD syncing).
After deploy, DNS handling depends on which config you chose: aws-config.yaml requires a manual A/CNAME record; aws-config-with-dns.yaml creates DNS records automatically. See Cloudflare DNS for details.
Verifying on AWS also needs the AWS CLI, because the generated kubeconfig calls aws eks get-token to authenticate. If kubectl fails with Token has expired and refresh failed, your AWS SSO session timed out (default 8 hours); run aws sso login --profile <aws-profile> and retry.
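The SSO recovery step can be wrapped so that it logs in again only when the session is actually invalid. This is a minimal sketch, assuming an SSO profile; the helper name ensure_sso_session is hypothetical.

```shell
# Hypothetical helper: re-authenticate only when the session is no longer valid.
#   $1 = AWS profile name
ensure_sso_session() {
  if aws sts get-caller-identity --profile "$1" >/dev/null 2>&1; then
    echo "session for $1 is still valid"
  else
    aws sso login --profile "$1"
  fi
}
```

Run ensure_sso_session <aws-profile> before retrying kubectl.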
IAM roles nic creates in your account
nic deploy creates a few IAM roles in your account so the cluster, the nodes, and the in-cluster controllers can each do their AWS work safely:
| Role | Purpose |
|---|---|
| EKS cluster role | Lets EKS create the network interfaces, security groups, and log groups the cluster needs. |
| EKS node-group role | Grants nodes the standard EKS worker, ECR-read-only, and CNI policies. |
| AWS Load Balancer Controller role | Lets the in-cluster controller create and manage load balancers for ingress. |
| EBS CSI driver role | Lets the in-cluster driver mount EBS volumes into pods. |
| EFS CSI driver role (if EFS enabled) | Lets the in-cluster driver mount EFS into pods. |
First sign-in
See Keycloak authentication for first sign-in.
Update an existing deployment
To change something about a running cluster (scale a node group, add a GPU group, change tags, switch EFS throughput), edit your config and re-run the deploy commands as described in Update a cluster.
Changing region, project_name, or vpc_cidr_block triggers destructive resource recreation. Treat these as one-way decisions.
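A quick guard against changing one of these fields by accident is to compare them between the previous and the edited config before re-deploying. The sketch below uses grep rather than a YAML parser, so it assumes each key appears once per file; both helper names are hypothetical, and nic itself does not provide this check as far as this guide describes.

```shell
# Hypothetical helper: print the first "key: value" line for a key, with
# indentation and trailing comments removed.
#   $1 = key, $2 = config file
config_line() {
  grep -E "^[[:space:]]*$1:" "$2" | head -n 1 |
    sed -e 's/[[:space:]]*#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Print each destructive key whose value differs between two config files.
#   $1 = previous config, $2 = edited config
destructive_changes() {
  for key in region project_name vpc_cidr_block; do
    if [ "$(config_line "$key" "$1")" != "$(config_line "$key" "$2")" ]; then
      echo "$key"
    fi
  done
}
```

Empty output means none of the three fields changed. If the config is tracked in git, git show HEAD:<config-file> can supply the previous copy.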
Destroy
Run the destroy commands as described in Destroy a cluster.
nic destroy removes EKS, node groups, EFS, VPC components, and the state bucket. If a resource fails to delete (commonly, a leftover load balancer from the cluster's ingress), remove it manually in the AWS console before retrying with --force.
Always confirm in the AWS console that no orphan resources remain. NAT gateways, load balancers, and EBS volumes can keep billing if they're not cleaned up.
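The same check can be done from the command line. This is a minimal sketch, assuming the AWS CLI and valid credentials; it lists every matching resource in the region, not only the ones nic created, so review the output before deleting anything. The helper name list_possible_orphans is hypothetical.

```shell
# Hypothetical helper: list resources in a region that commonly keep billing.
#   $1 = region
list_possible_orphans() {
  echo "== NAT gateways =="
  aws ec2 describe-nat-gateways --region "$1" \
    --filter Name=state,Values=available \
    --query 'NatGateways[].NatGatewayId' --output text
  echo "== Load balancers =="
  aws elbv2 describe-load-balancers --region "$1" \
    --query 'LoadBalancers[].LoadBalancerName' --output text
  echo "== Unattached EBS volumes =="
  aws ec2 describe-volumes --region "$1" \
    --filters Name=status,Values=available \
    --query 'Volumes[].VolumeId' --output text
}
```

For example, list_possible_orphans us-west-2 prints three sections; an empty section means nothing of that type was found.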