# Talos Cluster on Proxmox - Terraform Configuration

This Terraform project creates and provisions a Talos Kubernetes cluster on Proxmox VE with an integrated Proxmox Cloud Controller Manager (CCM) and Container Storage Interface (CSI) driver.
## Features

- 🚀 Automated VM provisioning on Proxmox VE
- ☁️ Proxmox Cloud Controller Manager - Native Proxmox integration for Kubernetes
- 💾 Proxmox CSI Driver - Dynamic volume provisioning using Proxmox storage
- 🔄 High Availability - Multi-node control plane with optional VIP
- 🌐 Flexible networking - DHCP or static IP configuration
- 📦 Full stack deployment - From VMs to running Kubernetes cluster
## Prerequisites

- Proxmox VE server with API access
- Terraform >= 1.0
- SSH access to the Proxmox node
- Network requirements:
  - Available IP addresses for VMs (DHCP or static)
  - Network connectivity between VMs
  - Access to download the Talos ISO (for initial setup)
## Quick Start

### 1. Create terraform.tfvars

Create a `terraform.tfvars` file with your Proxmox and cluster configuration:

```hcl
# Proxmox Connection
proxmox_endpoint = "https://proxmox.example.com:8006"
proxmox_username = "root@pam"
proxmox_password = "your-password"
proxmox_node     = "pve"

# Proxmox API Tokens (required for CCM/CSI)
proxmox_ccm_token_secret = "your-ccm-token-secret"
proxmox_csi_token_secret = "your-csi-token-secret"

# Cluster Configuration
cluster_name     = "talos-cluster"
cluster_endpoint = "https://10.0.0.100:6443"

# VM Configuration
controlplane_count = 3
worker_count       = 2

# Network (DHCP - IPs will be auto-assigned)
# For static IPs, see the advanced configuration below
```
### 2. Initialize and Apply

```shell
terraform init
terraform plan
terraform apply
```
### 3. Get Cluster Access

```shell
# Get talosconfig
terraform output -raw talosconfig > ~/.talos/config

# Get kubeconfig
terraform output -raw kubeconfig > ~/.kube/config

# Verify cluster
talosctl version --nodes <controlplane-ip>
kubectl get nodes
```
### 4. Verify Proxmox Integration

```shell
# Check CCM is running
kubectl get pods -n kube-system | grep proxmox-cloud-controller

# Check CSI is running
kubectl get pods -n csi-proxmox

# View available storage classes
kubectl get storageclass

# Create a test PVC
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: proxmox-data
EOF
```
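One way to confirm the test claim was actually provisioned, and then clean it up, is sketched below. These are standard `kubectl` commands against the `test-pvc` name from the example above; note that if the storage class uses `WaitForFirstConsumer` volume binding, the claim stays `Pending` until a pod mounts it, and the wait below will time out.

```shell
# Wait for the test PVC to be provisioned and bound (times out after 60s;
# stays Pending under WaitForFirstConsumer binding until a pod uses it)
kubectl wait --for=jsonpath='{.status.phase}'=Bound pvc/test-pvc --timeout=60s

# Inspect the claim and the dynamically created PV backing it
kubectl get pvc test-pvc -o wide

# Remove the test claim (the backing Proxmox volume is deleted with it)
kubectl delete pvc test-pvc
```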
## Configuration Options

### Basic Configuration

| Variable | Description | Default |
|---|---|---|
| `proxmox_endpoint` | Proxmox API endpoint | - |
| `proxmox_username` | Proxmox username | `root@pam` |
| `proxmox_password` | Proxmox password | - |
| `proxmox_insecure` | Allow insecure Proxmox API connections | `true` |
| `proxmox_ssh_user` | SSH user for the Proxmox node | `root` |
| `proxmox_node` | Proxmox node name | - |
| `proxmox_storage` | Storage location for VM disks | `local` |
| `proxmox_network_bridge` | Network bridge for VMs | `vmbr40` |
| `proxmox_ccm_token_secret` | Proxmox API token for CCM (sensitive) | - |
| `proxmox_csi_token_secret` | Proxmox API token for CSI (sensitive) | - |
| `cluster_name` | Talos cluster name | - |
| `cluster_endpoint` | Cluster API endpoint | - |
| `vm_id_prefix` | Starting VM ID prefix | `800` |
| `talos_version` | Talos version to use | `v1.9.1` |
| `talos_iso_url` | Custom Talos ISO URL | `""` (uses default) |
### Network Configuration

| Variable | Description | Default |
|---|---|---|
| `controlplane_ips` | Static IPs for control plane nodes | `[]` (DHCP) |
| `worker_ips` | Static IPs for worker nodes | `[]` (DHCP) |
| `gateway` | Default gateway (required for static IPs) | `""` |
| `netmask` | Network mask in CIDR notation | `24` |
| `nameservers` | DNS nameservers | `["1.1.1.1", "8.8.8.8"]` |
| `cluster_vip` | Virtual IP for HA control plane | `""` (disabled) |
### Proxmox Integration

| Variable | Description | Default |
|---|---|---|
| `proxmox_region` | Region identifier for CCM | `proxmox` |
### VM Resources

| Variable | Description | Default |
|---|---|---|
| `controlplane_count` | Number of control plane nodes | `3` |
| `worker_count` | Number of worker nodes | `2` |
| `controlplane_cpu` | CPU cores per control plane | `2` |
| `controlplane_memory` | Memory (MB) per control plane | `4096` |
| `controlplane_disk_size` | Disk size (GB) per control plane | `20` |
| `worker_cpu` | CPU cores per worker | `4` |
| `worker_memory` | Memory (MB) per worker | `8192` |
| `worker_disk_size` | Disk size (GB) per worker | `10` |
## Static IP Configuration

For production deployments, use static IPs. All three parameters (IPs, gateway, and netmask) must be configured together:

```hcl
# Control plane IPs
controlplane_ips = [
  "10.0.0.101",
  "10.0.0.102",
  "10.0.0.103"
]

# Worker IPs
worker_ips = [
  "10.0.0.104",
  "10.0.0.105"
]

# Network settings (required for static IPs)
gateway     = "10.0.0.1"              # Default gateway
netmask     = 24                      # CIDR notation (e.g., 24 = 255.255.255.0)
nameservers = ["1.1.1.1", "8.8.8.8"]  # DNS servers

# Use a VIP for the control plane endpoint
cluster_vip      = "10.0.0.100"
cluster_endpoint = "https://10.0.0.100:6443"
```

**Important:** When using static IPs, you must configure:

- `controlplane_ips` and/or `worker_ips` - List of IP addresses
- `gateway` - Network gateway IP address
- `netmask` - Network mask in CIDR notation (default: `24`)
- `nameservers` - DNS servers (default: `["1.1.1.1", "8.8.8.8"]`)

If any of these are missing, the nodes will use DHCP instead.
## High Availability Setup

For an HA control plane, configure a virtual IP:

```hcl
cluster_vip        = "10.0.0.100"
cluster_endpoint   = "https://10.0.0.100:6443"
controlplane_count = 3  # Minimum 3 for HA
```
## Custom Talos Version

```hcl
talos_version = "v1.9.1"

# Or use a custom ISO URL
talos_iso_url = "https://custom-mirror.com/talos.iso"
```
## Advanced Configuration

### Custom Storage Backend

```hcl
proxmox_storage = "ceph-storage"  # or "nfs-backup", etc.
```

### Custom Network Bridge

```hcl
proxmox_network_bridge = "vmbr1"
```

### Custom VM ID Range

```hcl
vm_id_prefix = 1000  # VMs will be 1000, 1001, 1002, etc.
```
## Proxmox API Token Setup

The CCM and CSI drivers require Proxmox API tokens for authentication. Generate the tokens in Proxmox:

1. Navigate to **Datacenter → Permissions → API Tokens**
2. Create a token for CCM with appropriate permissions
3. Create a token for CSI with storage permissions
4. Add the token secrets to your `terraform.tfvars`:

```hcl
proxmox_ccm_token_secret = "your-ccm-api-token-secret"
proxmox_csi_token_secret = "your-csi-api-token-secret"
```
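The same tokens can be created from the Proxmox shell with `pveum` instead of the web UI. The user, role, and token names below are illustrative, and the privilege lists are a sketch based on the upstream CCM/CSI projects' documentation; adjust them for your Proxmox version and security policy.

```shell
# CSI: needs to manage disks on the target datastores
pveum role add CSI -privs "VM.Audit VM.Config.Disk Datastore.Allocate Datastore.AllocateSpace Datastore.Audit"
pveum user add kubernetes-csi@pve
pveum aclmod / -user kubernetes-csi@pve -role CSI
pveum user token add kubernetes-csi@pve csi -privsep 0

# CCM: read-only access to VM metadata is sufficient
pveum role add CCM -privs "VM.Audit"
pveum user add kubernetes-ccm@pve
pveum aclmod / -user kubernetes-ccm@pve -role CCM
pveum user token add kubernetes-ccm@pve ccm -privsep 0
```

Each `pveum user token add` prints the token secret once; that value is what goes into `proxmox_ccm_token_secret` / `proxmox_csi_token_secret`.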
## Architecture

The project creates:

1. **Control Plane VMs** (default: 3)
   - Run Kubernetes control plane components
   - Can schedule workload pods if configured
   - Participate in the etcd cluster
   - Run the Proxmox CCM for cloud provider integration

2. **Worker VMs** (default: 2)
   - Run application workloads
   - Join the cluster automatically
   - Support CSI for dynamic volume provisioning

3. **Talos Configuration**
   - Machine secrets and certificates
   - Node-specific configurations
   - Client configurations (talosconfig, kubeconfig)
   - Cloud provider configuration for CCM integration

4. **Proxmox Integration**
   - **CCM (Cloud Controller Manager):** Provides node lifecycle management and metadata
   - **CSI (Container Storage Interface):** Enables dynamic PV provisioning from Proxmox storage
## Workflow

1. **VM Creation:** VMs are created in Proxmox with the Talos ISO attached
2. **Boot to Maintenance:** VMs boot into Talos maintenance mode
3. **Configuration Apply:** Terraform applies Talos machine configurations with cloud-provider settings
4. **Cluster Bootstrap:** The first control plane node bootstraps the cluster
5. **Node Join:** The remaining nodes join automatically
6. **Kubeconfig Generation:** Cluster credentials are generated
7. **CCM Installation:** The Proxmox Cloud Controller Manager is deployed (if enabled)
8. **CSI Installation:** The Proxmox CSI driver and storage class are deployed (if enabled)
## Proxmox Integration Details

### Cloud Controller Manager (CCM)

The CCM provides:

- **Node Management:** Automatic node registration with Proxmox metadata
- **Node Labels:** Topology labels (region, zone, instance-type)
- **Node Lifecycle:** Proper handling of node additions and removals

Nodes are automatically labeled with:

```text
node.kubernetes.io/instance-type: proxmox
topology.kubernetes.io/region: <proxmox_region>
topology.kubernetes.io/zone: <proxmox_node>
```
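These labels can be shown as extra columns with `kubectl get`'s standard `-L`/`--label-columns` flag, which is a quick way to verify the CCM is labeling nodes:

```shell
# Show the CCM-applied topology labels as columns in the node listing
kubectl get nodes -L topology.kubernetes.io/region,topology.kubernetes.io/zone,node.kubernetes.io/instance-type
```

The same labels can then be used in a `nodeSelector` or topology spread constraint to pin or spread workloads across Proxmox nodes.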
### Container Storage Interface (CSI)

The CSI driver provides:

- **Dynamic Provisioning:** Automatically creates volumes in Proxmox storage
- **Volume Expansion:** Support for expanding PVCs
- **Multiple Storage Backends:** Use any Proxmox storage (LVM, ZFS, Ceph, NFS, etc.)

Example usage:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: proxmox-data
```
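A claim like this is consumed by mounting it in a pod; a minimal sketch (the pod name, image, and mount path are illustrative, only `claimName` must match the PVC above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-data-consumer      # illustrative name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep infinity"]
      volumeMounts:
        - name: data
          mountPath: /data    # illustrative mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-data    # matches the PVC above
```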
## Accessing the Cluster

### Talos CLI

```shell
# Export talosconfig
terraform output -raw talosconfig > ~/.talos/config

# Get cluster members
talosctl get members

# Get service status
talosctl services

# Access logs
talosctl logs kubelet
```

### Kubernetes CLI

```shell
# Export kubeconfig
terraform output -raw kubeconfig > ~/.kube/config

# Get cluster info
kubectl cluster-info
kubectl get nodes -o wide
kubectl get pods -A

# Check Proxmox integrations
kubectl get pods -n kube-system | grep proxmox
kubectl get pods -n csi-proxmox
kubectl get storageclass
```
## Maintenance

### Upgrading Talos

```shell
# Update the talos_version variable in terraform.tfvars:
#   talos_version = "v1.9.2"
# then apply the change:
terraform apply

# Or upgrade manually:
talosctl upgrade --image ghcr.io/siderolabs/installer:v1.9.2
```
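When upgrading manually, a cautious pattern is to go node by node rather than all at once; `talosctl upgrade` takes a `--nodes` flag for this (the IP and node name below are illustrative):

```shell
# Upgrade a single node, then wait for it to rejoin before moving on
talosctl upgrade --nodes 10.0.0.101 --image ghcr.io/siderolabs/installer:v1.9.2
kubectl wait --for=condition=Ready node/<node-name> --timeout=10m
```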
### Scaling Workers

```shell
# Update worker_count in terraform.tfvars:
#   worker_count = 5
# then apply the change:
terraform apply
```

### Removing the Cluster

```shell
terraform destroy
```
## Troubleshooting

### VMs not getting IP addresses

For DHCP:

- Check the Proxmox network bridge configuration
- Verify a DHCP server is running on the network
- Ensure VMs are connected to the correct network bridge

For static IPs:

- Verify all required parameters are set: `controlplane_ips`/`worker_ips`, `gateway`, and `netmask`
- Check that the IPs are in the correct subnet
- Ensure the gateway IP is correct and reachable
- Verify there are no IP conflicts with existing devices
### Cannot connect to nodes

- Verify firewall rules allow port 50000 (Talos API)
- Check VM networking in Proxmox
- Ensure nodes are in maintenance mode: `talosctl version --nodes <ip>`
### Bootstrap fails

- Check that the control plane IPs are correct
- Verify `cluster_endpoint` is accessible
- Review logs: `talosctl logs etcd`
### ISO upload fails

- Verify SSH access to the Proxmox node
- Check `/var/lib/vz/template/iso/` permissions
- Manually upload the ISO if needed
### CCM/CSI not working

- Verify the Proxmox API token secrets are correct
- Check that the tokens have appropriate permissions in Proxmox
- Review the template logs for CCM/CSI configuration
## Project Structure

```text
.
├── main.tf               # Main VM, Talos, CCM/CSI resources
├── variables.tf          # Input variables
├── outputs.tf            # Output values
├── versions.tf           # Provider versions (Talos, Proxmox, Helm, K8s)
├── locals.tf             # Local values
├── terraform.tfvars      # Your configuration (create this)
├── templates/
│   ├── install-disk-and-hostname.yaml.tmpl
│   ├── static-ip.yaml.tmpl       # Static IP configuration
│   ├── node-labels.yaml.tmpl
│   └── vip-config.yaml.tmpl
└── files/
    ├── cp-scheduling.yaml
    └── cloud-provider.yaml
```
## References

- Talos Documentation
- Talos Terraform Provider
- Proxmox Terraform Provider
- Proxmox CCM
- Proxmox CSI
- Siderolabs Contrib Examples
## License

Based on examples from [siderolabs/contrib](https://github.com/siderolabs/contrib).