Add multi-tenant DNS server for external and tenant use
220
DNS-SERVER-GUIDE.md
Normal file
@ -0,0 +1,220 @@
# Multi-Tenant DNS Server Setup

## 🎯 Overview

This DNS server provides:

- **External DNS** for connectvm.cloud (public access)
- **Multi-tenant DNS** for dev, prod, and QA teams
- **High Availability** with 3 replicas
- **Metrics** for monitoring

## 🌐 DNS Zones Configured

All records in the zones below currently resolve to 160.30.114.10.

### 1. Main Domain: `connectvm.cloud`
**Public DNS for main services**

Available records:
- `connectvm.cloud` → 160.30.114.10
- `www.connectvm.cloud` → 160.30.114.10
- `rancher.connectvm.cloud` → 160.30.114.10
- `paste.connectvm.cloud` → 160.30.114.10
- `fleet.connectvm.cloud` → 160.30.114.10
- `hello.connectvm.cloud` → 160.30.114.10
- `dns.connectvm.cloud` → 160.30.114.10
- `mail.connectvm.cloud` → 160.30.114.10 (also the MX target for the domain)
- `*.connectvm.cloud` → 160.30.114.10 (wildcard)

### 2. Dev Team: `dev.connectvm.cloud`
**Development team's DNS zone**

Available records:
- `app1.dev.connectvm.cloud`
- `app2.dev.connectvm.cloud`
- `api.dev.connectvm.cloud`
- `web.dev.connectvm.cloud`
- `dashboard.dev.connectvm.cloud`
- `jenkins.dev.connectvm.cloud`
- `gitlab.dev.connectvm.cloud`
- `db.dev.connectvm.cloud`
- `redis.dev.connectvm.cloud`
- `*.dev.connectvm.cloud` (wildcard)

### 3. Prod Team: `prod.connectvm.cloud`
**Production team's DNS zone**

Available records:
- `api.prod.connectvm.cloud`
- `web.prod.connectvm.cloud`
- `app.prod.connectvm.cloud`
- `admin.prod.connectvm.cloud`
- `portal.prod.connectvm.cloud`
- `db.prod.connectvm.cloud`
- `cache.prod.connectvm.cloud`
- `monitoring.prod.connectvm.cloud`
- `*.prod.connectvm.cloud` (wildcard)

### 4. QA Team: `qa.connectvm.cloud`
**QA/Testing team's DNS zone**

Available records:
- `test.qa.connectvm.cloud`
- `staging.qa.connectvm.cloud`
- `selenium.qa.connectvm.cloud`
- `automation.qa.connectvm.cloud`
- `reports.qa.connectvm.cloud`
- `db.qa.connectvm.cloud`
- `*.qa.connectvm.cloud` (wildcard)

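Each zone ends with a wildcard record, so it can help to see which names a wildcard is expected to cover. The sketch below is a simplified model of the standard closest-encloser rule for DNS wildcards, not CoreDNS code; `wildcard_lookup` and the `DEV_RECORDS` dictionary (a small subset of the dev zone) are hypothetical names used only for illustration. The actual server behaviour is worth confirming with `dig`.

```python
def wildcard_lookup(qname, records):
    """Return the value for qname, falling back to a wildcard at the closest encloser."""
    qname = qname.rstrip(".").lower()
    if qname in records:
        return records[qname]
    labels = qname.split(".")
    for i in range(1, len(labels)):
        encloser = ".".join(labels[i:])
        if encloser in records:
            # Closest existing ancestor found: only its direct wildcard may answer.
            return records.get("*." + encloser)
    return None


# Illustrative subset of the dev zone, keyed by fully qualified owner name.
DEV_RECORDS = {
    "dev.connectvm.cloud": "160.30.114.10",
    "app1.dev.connectvm.cloud": "160.30.114.10",
    "*.dev.connectvm.cloud": "160.30.114.10",
}

print(wildcard_lookup("anything.dev.connectvm.cloud", DEV_RECORDS))
print(wildcard_lookup("x.app1.dev.connectvm.cloud", DEV_RECORDS))
```

Under this model a name below an existing record, such as `x.app1.dev.connectvm.cloud`, is not covered by `*.dev.connectvm.cloud`, because `app1.dev.connectvm.cloud` exists and has no wildcard of its own.
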
## 🔧 External Access

### DNS Server IPs:
- **Primary**: `160.30.114.10:30053` (UDP/TCP, NodePort)
- **Internal**: `10.96.100.100:53` (ClusterIP)

### Configure Clients to Use DNS:

**Linux/Mac:**
```bash
# Add to /etc/resolv.conf
# Note: resolv.conf cannot specify a port, so the resolver queries port 53.
# This only works if port 53 on 160.30.114.10 is forwarded to NodePort 30053.
nameserver 160.30.114.10

# Or query the NodePort directly with dig:
dig @160.30.114.10 -p 30053 rancher.connectvm.cloud
```

**Windows:**
```powershell
# Set DNS server (Windows also queries port 53; see the note for Linux/Mac)
netsh interface ip set dns "Ethernet" static 160.30.114.10
```

**Kubernetes Pods:**
```yaml
spec:
  dnsPolicy: None
  dnsConfig:
    nameservers:
      - 10.96.100.100
    searches:
      - connectvm.cloud
      - dev.connectvm.cloud
```

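The pod `dnsConfig` lists two search domains, and their order affects which zone answers a short name. The sketch below approximates how a glibc-style stub resolver expands a name with a search list; `candidate_names` is a hypothetical helper, and `ndots=1` is an assumption (the glibc default when no `options` are set). Other resolvers, such as musl, may behave differently.

```python
def candidate_names(name, searches, ndots=1):
    """Return the fully qualified names a glibc-style resolver would try, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search list is applied.
        return [name]
    expanded = [f"{name}.{domain}." for domain in searches]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    return expanded + [absolute]


SEARCHES = ["connectvm.cloud", "dev.connectvm.cloud"]
for candidate in candidate_names("app1", SEARCHES):
    print(candidate)
```

With the search list in that order, `app1` is first tried as `app1.connectvm.cloud.`, which the main zone's wildcard would likely answer before the dev zone is ever consulted. Both currently point at the same address, so the difference only matters once the zones diverge.
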
## 📊 Monitoring

Access DNS metrics:
- **URL**: https://dns.connectvm.cloud/metrics
- **Prometheus endpoint**: Port 9153

Metrics include:
- Query count
- Query types
- Response codes
- Cache hit/miss rates
- Zone transfer stats

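As a rough illustration of reading these metrics, the sketch below sums request counters per zone from Prometheus text output. The metric name `coredns_dns_requests_total` and its `zone` label are assumed from CoreDNS's Prometheus plugin, the `SAMPLE` text is invented for the example rather than captured from this server, and `requests_per_zone` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Invented sample in Prometheus text exposition format (not real output).
SAMPLE = """\
# TYPE coredns_dns_requests_total counter
coredns_dns_requests_total{family="1",proto="udp",server="dns://:53",type="A",zone="dev.connectvm.cloud."} 12
coredns_dns_requests_total{family="1",proto="tcp",server="dns://:53",type="A",zone="dev.connectvm.cloud."} 3
coredns_dns_requests_total{family="1",proto="udp",server="dns://:53",type="A",zone="."} 40
"""

LINE = re.compile(r"^coredns_dns_requests_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$")
LABEL = re.compile(r'(\w+)="([^"]*)"')


def requests_per_zone(text):
    """Sum the request counter over all label combinations, grouped by zone."""
    totals = defaultdict(float)
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip comments and other metrics
        labels = dict(LABEL.findall(match.group("labels")))
        totals[labels.get("zone", "")] += float(match.group("value"))
    return dict(totals)


print(requests_per_zone(SAMPLE))
```

In practice the text would come from the metrics endpoint on port 9153 instead of the embedded sample.
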
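## 🔒 Security Features

- **Zone isolation**: Each tenant has a separate DNS zone
- **No zone transfers**: Zones are served read-only from the ConfigMap and no transfer is configured
- **Query logging**: Queries to the public, tenant, and upstream server blocks are logged (the `cluster.local` block has no `log` directive)
- **Caching**: Upstream answers are cached for up to 300 seconds, which reduces upstream load; this is caching, not per-client rate limiting
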
## 🛠️ Management

### Add New Record:

1. Edit the `custom-dns-config` ConfigMap in `dns-server.yaml`
2. Add the record to the appropriate zone file
3. Increment the SOA serial number (the CoreDNS `file` plugin reloads a zone only when its serial changes)
4. Git push → Fleet auto-deploys

**Example** - Add new dev app (in `dev.connectvm.cloud.db`):
```
newapp      IN  A       160.30.114.10
```

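The serials in the zone files (for example `2025111501`) look like the common `YYYYMMDDNN` convention. Assuming that convention, the increment step could be sketched as follows; `next_serial` is a hypothetical helper, not part of this repository.

```python
from datetime import date


def next_serial(current, today=None):
    """Return the next YYYYMMDDNN serial, always greater than the current one."""
    today = today or date.today()
    base = int(today.strftime("%Y%m%d")) * 100
    if current >= base:
        # Already edited today (or the serial runs ahead): just add one.
        return current + 1
    # First edit of the day: start at revision 01.
    return base + 1


print(next_serial(2025111501, date(2025, 11, 15)))
print(next_serial(2025111501, date(2025, 11, 16)))
```

The helper never returns a smaller number than the current serial, since a serial that goes backwards would not be treated as a newer zone version.
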
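### Add New Tenant Zone:

1. Create a new zone file key in the ConfigMap
2. Add a server block for the zone to the Corefile
3. Add the new key to the `items` list of the `config-volume` volume in the Deployment, otherwise the zone file is not mounted
4. Add an `NS` delegation for the tenant to `connectvm.cloud.db`, following the existing tenant delegations, and increment that zone's serial
5. Git push
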
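To keep a new tenant consistent with the existing ones, the zone file and the Corefile server block can be generated from the same pattern. This is a minimal sketch that mirrors the layout of the dev, prod, and QA zones; `render_zone`, `render_server_block`, and the tenant name `ops` are hypothetical and only illustrate the shape of the output.

```python
def render_zone(tenant, serial, ip="160.30.114.10"):
    """Render a minimal tenant zone file in the style of the existing tenant zones."""
    origin = f"{tenant}.connectvm.cloud."
    lines = [
        f"$ORIGIN {origin}",
        "$TTL 3600",
        f"@ IN SOA ns1.connectvm.cloud. admin.{origin} (",
        f"    {serial} ; Serial",
        "    3600 ; Refresh",
        "    1800 ; Retry",
        "    604800 ; Expire",
        "    3600 ) ; Negative Cache TTL",
        "@ IN NS ns1.connectvm.cloud.",
        f"@ IN A {ip}",
        f"* IN A {ip}",
    ]
    return "\n".join(lines) + "\n"


def render_server_block(tenant):
    """Render the matching Corefile server block."""
    zone = f"{tenant}.connectvm.cloud"
    return (
        f"{zone}:53 {{\n"
        "    errors\n"
        "    log\n"
        f"    file /etc/coredns/{zone}.db {zone}\n"
        "    prometheus :9153\n"
        "}\n"
    )


print(render_zone("ops", 2025111501))
print(render_server_block("ops"))
```

The generated text still has to be pasted into the ConfigMap and Corefile by hand (or by a templating step of your choice) before pushing.
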
## 🧪 Testing

Test DNS resolution:
```bash
# Test main domain
dig @160.30.114.10 -p 30053 rancher.connectvm.cloud

# Test dev tenant
dig @160.30.114.10 -p 30053 app1.dev.connectvm.cloud

# Test prod tenant
dig @160.30.114.10 -p 30053 api.prod.connectvm.cloud

# Test QA tenant
dig @160.30.114.10 -p 30053 test.qa.connectvm.cloud

# Test wildcard
dig @160.30.114.10 -p 30053 anything.dev.connectvm.cloud
```

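Where `dig` is not available, the same check can be approximated with the Python standard library. The sketch below hand-encodes a single A query and reads only the response header; `build_query`, `response_summary`, and `query_udp` are hypothetical helpers, and the default server and port are taken from this guide. It does not decode the answer records.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Encode a single-question DNS query (qtype 1 = A, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def response_summary(response):
    """Return (rcode, answer_count) from the 12-byte DNS header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def query_udp(name, server="160.30.114.10", port=30053, timeout=3.0):
    """Send the query over UDP and summarise the reply (needs network access)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        response, _addr = sock.recvfrom(4096)
    return response_summary(response)
```

A call such as `query_udp("app1.dev.connectvm.cloud")` returns the response code and the number of answers; a response code of 0 with at least one answer indicates the name resolved.
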
## 📱 Use Cases

### For Dev Team:
```bash
# Deploy app with custom DNS
kubectl create deployment myapp --image=nginx
kubectl expose deployment myapp --port=80

# Access via: myapp.dev.connectvm.cloud
# The name resolves through the *.dev.connectvm.cloud wildcard to 160.30.114.10;
# an Ingress (or similar route) for that host is still needed to reach the Service.
```

### For Prod Team:
```
# Production API endpoint
api.prod.connectvm.cloud → Production API server
```

### For QA Team:
```
# Automated testing
selenium.qa.connectvm.cloud → Selenium Grid
automation.qa.connectvm.cloud → Test Runner
```

## 🚀 High Availability

- **3 replicas**, spread across different nodes where possible
- **Anti-affinity** rules for pod distribution (a preferred rule, so co-location is still allowed if nodes are scarce)
- **Auto-restart** on failure
- **Health checks**: liveness probe every 10 seconds, readiness probe every 5 seconds

## 📝 DNS Server Info

- **Software**: CoreDNS 1.11.1
- **Protocol**: DNS (UDP/TCP port 53 in the cluster, NodePort 30053 externally)
- **Upstream**: Google DNS (8.8.8.8, 8.8.4.4)
- **Cache TTL**: 300 seconds (upstream answers)
- **Zone TTL**: 3600 seconds (1 hour)

## 🎯 Architecture

```
External Queries (port 30053)
         ↓
   NodePort Service
         ↓
 DNS Pods (3 replicas)
         ↓
┌─────────────┬──────────────┬─────────────┐
│  Main Zone  │   Dev Zone   │  Prod Zone  │
│  QA Zone    │   K8s DNS    │  Upstream   │
└─────────────┴──────────────┴─────────────┘
```

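The last step in the diagram, choosing a zone for each query, can be pictured as a longest-suffix match over the zones declared in the Corefile. The sketch below illustrates that idea and is not CoreDNS's own code; `pick_zone` is a hypothetical helper and `ZONES` simply lists the server blocks from `dns-server.yaml`.

```python
# Zones of the server blocks in the Corefile, written as absolute names.
ZONES = [
    "connectvm.cloud.",
    "dev.connectvm.cloud.",
    "prod.connectvm.cloud.",
    "qa.connectvm.cloud.",
    "cluster.local.",
    ".",
]


def pick_zone(qname, zones=ZONES):
    """Return the most specific zone that is a suffix of qname, or None."""
    qname = qname.lower().rstrip(".") + "."
    best = None
    for zone in zones:
        # The root zone matches everything; others must match on a label boundary.
        matches = zone == "." or qname == zone or qname.endswith("." + zone)
        if matches and (best is None or len(zone) > len(best)):
            best = zone
    return best


for name in ("app1.dev.connectvm.cloud", "rancher.connectvm.cloud", "example.com"):
    print(name, "->", pick_zone(name))
```

Matching on label boundaries matters here: a name like `xdev.connectvm.cloud` belongs to the main zone, not to the dev tenant.
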
## 🔄 Updates

All DNS updates happen via GitOps:
1. Edit zone files in Git
2. Push to Gitea
3. Fleet auto-deploys in ~15 seconds
4. DNS records updated automatically

No manual DNS server management needed!
378
dns-server.yaml
Normal file
@ -0,0 +1,378 @@
apiVersion: v1
kind: Namespace
metadata:
  name: dns-server
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-dns-config
  namespace: dns-server
data:
  Corefile: |
    # ================================================
    # External DNS Server for connectvm.cloud
    # Handles public DNS queries and tenant DNS
    # ================================================

    # Main domain - connectvm.cloud (Public)
    connectvm.cloud:53 {
        errors
        log
        file /etc/coredns/connectvm.cloud.db connectvm.cloud
        prometheus :9153
    }

    # Tenant: dev-team (Development Team)
    dev.connectvm.cloud:53 {
        errors
        log
        file /etc/coredns/dev.connectvm.cloud.db dev.connectvm.cloud
        prometheus :9153
    }

    # Tenant: prod-team (Production Team)
    prod.connectvm.cloud:53 {
        errors
        log
        file /etc/coredns/prod.connectvm.cloud.db prod.connectvm.cloud
        prometheus :9153
    }

    # Tenant: qa-team (QA Team)
    qa.connectvm.cloud:53 {
        errors
        log
        file /etc/coredns/qa.connectvm.cloud.db qa.connectvm.cloud
        prometheus :9153
    }

    # Internal Kubernetes DNS
    cluster.local:53 {
        errors
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
    }

    # Forward all other queries to upstream DNS
    .:53 {
        errors
        log
        # health (default :8080/health) and ready (default :8181/ready) serve the
        # endpoints that the Deployment's liveness and readiness probes call.
        health
        ready
        forward . 8.8.8.8 8.8.4.4
        cache 300
        prometheus :9153
    }

  # Main domain zone file
  connectvm.cloud.db: |
    $ORIGIN connectvm.cloud.
    $TTL 3600
    @           IN  SOA ns1.connectvm.cloud. admin.connectvm.cloud. (
                    2025111501 ; Serial
                    7200       ; Refresh
                    3600       ; Retry
                    1209600    ; Expire
                    3600 )     ; Negative Cache TTL

    ; Name Servers
    @           IN  NS  ns1.connectvm.cloud.
    @           IN  NS  ns2.connectvm.cloud.
    ns1         IN  A   160.30.114.10
    ns2         IN  A   160.30.114.10

    ; Main Services (Public)
    @           IN  A   160.30.114.10
    www         IN  A   160.30.114.10
    rancher     IN  A   160.30.114.10
    paste       IN  A   160.30.114.10
    fleet       IN  A   160.30.114.10
    hello       IN  A   160.30.114.10
    dns         IN  A   160.30.114.10

    ; Tenant Delegations
    dev         IN  NS  ns1.connectvm.cloud.
    prod        IN  NS  ns1.connectvm.cloud.
    qa          IN  NS  ns1.connectvm.cloud.

    ; Email
    @           IN  MX  10 mail.connectvm.cloud.
    mail        IN  A   160.30.114.10

    ; Wildcard
    *           IN  A   160.30.114.10

  # Dev Team Tenant Zone
  dev.connectvm.cloud.db: |
    $ORIGIN dev.connectvm.cloud.
    $TTL 3600
    @           IN  SOA ns1.connectvm.cloud. admin.dev.connectvm.cloud. (
                    2025111501 ; Serial
                    3600       ; Refresh
                    1800       ; Retry
                    604800     ; Expire
                    3600 )     ; Negative Cache TTL

    ; Name Server
    @           IN  NS  ns1.connectvm.cloud.

    ; Dev Team Applications
    @           IN  A   160.30.114.10
    app1        IN  A   160.30.114.10
    app2        IN  A   160.30.114.10
    api         IN  A   160.30.114.10
    web         IN  A   160.30.114.10
    dashboard   IN  A   160.30.114.10
    jenkins     IN  A   160.30.114.10
    gitlab      IN  A   160.30.114.10

    ; Development databases
    db          IN  A   160.30.114.10
    redis       IN  A   160.30.114.10

    ; Wildcard for dev team
    *           IN  A   160.30.114.10

  # Production Team Tenant Zone
  prod.connectvm.cloud.db: |
    $ORIGIN prod.connectvm.cloud.
    $TTL 3600
    @           IN  SOA ns1.connectvm.cloud. admin.prod.connectvm.cloud. (
                    2025111501 ; Serial
                    3600       ; Refresh
                    1800       ; Retry
                    604800     ; Expire
                    3600 )     ; Negative Cache TTL

    ; Name Server
    @           IN  NS  ns1.connectvm.cloud.

    ; Production Applications
    @           IN  A   160.30.114.10
    api         IN  A   160.30.114.10
    web         IN  A   160.30.114.10
    app         IN  A   160.30.114.10
    admin       IN  A   160.30.114.10
    portal      IN  A   160.30.114.10

    ; Production infrastructure
    db          IN  A   160.30.114.10
    cache       IN  A   160.30.114.10
    monitoring  IN  A   160.30.114.10

    ; Wildcard for prod team
    *           IN  A   160.30.114.10

  # QA Team Tenant Zone
  qa.connectvm.cloud.db: |
    $ORIGIN qa.connectvm.cloud.
    $TTL 3600
    @           IN  SOA ns1.connectvm.cloud. admin.qa.connectvm.cloud. (
                    2025111501 ; Serial
                    3600       ; Refresh
                    1800       ; Retry
                    604800     ; Expire
                    3600 )     ; Negative Cache TTL

    ; Name Server
    @           IN  NS  ns1.connectvm.cloud.

    ; QA/Testing Applications
    @           IN  A   160.30.114.10
    test        IN  A   160.30.114.10
    staging     IN  A   160.30.114.10
    selenium    IN  A   160.30.114.10
    automation  IN  A   160.30.114.10
    reports     IN  A   160.30.114.10

    ; QA infrastructure
    db          IN  A   160.30.114.10

    ; Wildcard for QA team
    *           IN  A   160.30.114.10
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dns-server
  namespace: dns-server
  labels:
    app: dns-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: dns-server
  template:
    metadata:
      labels:
        app: dns-server
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - dns-server
                topologyKey: kubernetes.io/hostname
      containers:
        - name: coredns
          image: coredns/coredns:1.11.1
          args: ["-conf", "/etc/coredns/Corefile"]
          volumeMounts:
            - name: config-volume
              mountPath: /etc/coredns
              readOnly: true
          ports:
            - containerPort: 53
              name: dns
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
            - containerPort: 9153
              name: metrics
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 8181
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 3
          resources:
            requests:
              memory: "256Mi"
              cpu: "200m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
      volumes:
        - name: config-volume
          configMap:
            name: custom-dns-config
            items:
              - key: Corefile
                path: Corefile
              - key: connectvm.cloud.db
                path: connectvm.cloud.db
              - key: dev.connectvm.cloud.db
                path: dev.connectvm.cloud.db
              - key: prod.connectvm.cloud.db
                path: prod.connectvm.cloud.db
              - key: qa.connectvm.cloud.db
                path: qa.connectvm.cloud.db
---
# External DNS Service (NodePort for external access)
apiVersion: v1
kind: Service
metadata:
  name: dns-external
  namespace: dns-server
  labels:
    app: dns-server
  annotations:
    metallb.universe.tf/allow-shared-ip: dns
spec:
  selector:
    app: dns-server
  type: NodePort
  ports:
    - port: 53
      targetPort: 53
      nodePort: 30053
      protocol: UDP
      name: dns-udp
    - port: 53
      targetPort: 53
      nodePort: 30053
      protocol: TCP
      name: dns-tcp
---
# Internal DNS Service (ClusterIP for internal use)
apiVersion: v1
kind: Service
metadata:
  name: dns-internal
  namespace: dns-server
  labels:
    app: dns-server
spec:
  selector:
    app: dns-server
  type: ClusterIP
  clusterIP: 10.96.100.100
  ports:
    - port: 53
      targetPort: 53
      protocol: UDP
      name: dns-udp
    - port: 53
      targetPort: 53
      protocol: TCP
      name: dns-tcp
---
# Metrics Service
apiVersion: v1
kind: Service
metadata:
  name: dns-metrics
  namespace: dns-server
  labels:
    app: dns-server
spec:
  selector:
    app: dns-server
  type: ClusterIP
  ports:
    - port: 9153
      targetPort: 9153
      protocol: TCP
      name: metrics
---
# Web UI/Metrics Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dns-metrics-ingress
  namespace: dns-server
  annotations:
    cert-manager.io/cluster-issuer: "selfsigned-issuer"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - dns.connectvm.cloud
      secretName: dns-metrics-tls
  rules:
    - host: dns.connectvm.cloud
      http:
        paths:
          - path: /metrics
            pathType: Prefix
            backend:
              service:
                name: dns-metrics
                port:
                  number: 9153