Description
redis_cache stores DNS responses in a shared Redis-compatible backend (Redis, Valkey, or any RESP-protocol server) so that multiple CoreDNS instances can amortize upstream lookups across the fleet — for example several pods in a Kubernetes cluster, or a fleet of node-local-dns daemons. It is intended to sit behind the built-in cache plugin, which stays as the L1 (in-process) cache; redis_cache is the L2 (networked) cache.
If the Redis backend is unreachable, the plugin becomes a no-op and lookups continue to flow
through the rest of the chain. Writes never block the DNS reply (they run in a fire-and-forget
goroutine on a detached context). Reads are bounded by the read budget set under timeout read
(default 500ms): the GET + TTL pipeline, pool wait, and any retries all share that single
budget, so a stalled Redis adds at most one read timeout to a single DNS reply before the
plugin falls through. Read and write errors are surfaced via the get_errors_total and
set_errors_total metrics so a broken cache is distinguishable from a cold one.
Each response is cached for the duration of its record TTL, clamped into a configurable range:
max(min, min(record_TTL, max)). Defaults are 1h max for positive responses and 30m max
for denials, both with no minimum floor; raise or lower either bound via the success and
denial directives.
Syntax
redis_cache [ZONES...] {
success MAX_TTL [MIN_TTL]
denial MAX_TTL [MIN_TTL]
endpoint ENDPOINT
read_endpoint ENDPOINT [ENDPOINT...]
key_prefix STRING
db NUMBER
sentinel MASTER_NAME SENTINEL_ADDR [SENTINEL_ADDR...]
cluster SEED_ADDR [SEED_ADDR...]
read_from latency|random|primary
username USERNAME
password PASSWORD
sentinel_username USERNAME
sentinel_password PASSWORD
timeout {
connect DURATION
read DURATION
write DURATION
}
pool {
size N
min_idle N
max_idle N
max_active N
max_idle_time DURATION
max_lifetime DURATION
wait_timeout DURATION
}
retries {
max N
min_backoff DURATION
max_backoff DURATION
}
tcp_keepalive DURATION
tls
tls_cert PATH
tls_key PATH
tls_ca PATH
tls_verify_chain BOOL
tls_verify_hostname BOOL
resolver ADDRESS
}
Each sub-directive can be omitted; when present, its own arguments are required. Bare
redis_cache with no block attempts to connect to 127.0.0.1:6379 with default TTL
bounds — useful only against a sidecar Redis on localhost; production deployments must
specify at least one of endpoint, sentinel, or cluster. The chosen topology mode
determines which other directives are valid; the parser errors at load time on conflicting
combinations:
- cluster mode rejects endpoint, read_endpoint, sentinel, and any db other than 0 (Redis Cluster only supports DB 0). Seed addresses come from cluster; the rest of the topology is discovered via CLUSTER SLOTS.
- sentinel mode rejects endpoint and read_endpoint: the master and replicas are discovered via Sentinel.
- Default mode (neither cluster nor sentinel): writes go to endpoint. With no read_endpoint, the same client serves reads. With exactly one read_endpoint, a client for that replica serves reads. With ≥2, each GET picks a replica at random. Rejects read_from and sentinel_username / sentinel_password.
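The mode rules above can be sketched as a load-time check. This is a minimal illustration in Go, not the plugin's parser: the config struct, its field names, and the error messages are hypothetical, and only the conflicts stated in this document are encoded.

```go
package main

import (
	"errors"
	"fmt"
)

// config mirrors a subset of the directives. Struct and field names are
// hypothetical; empty values mean "directive not set".
type config struct {
	Endpoint         string
	ReadEndpoints    []string
	SentinelMaster   string
	ClusterSeeds     []string
	DB               int
	ReadFrom         string
	SentinelUsername string
	SentinelPassword string
}

// validate rejects the conflicting combinations listed in the text.
func validate(c config) error {
	cluster := len(c.ClusterSeeds) > 0
	sentinel := c.SentinelMaster != ""
	switch {
	case cluster:
		if c.Endpoint != "" || len(c.ReadEndpoints) > 0 || sentinel {
			return errors.New("cluster mode rejects endpoint, read_endpoint, and sentinel")
		}
		if c.DB != 0 {
			return errors.New("cluster mode supports only db 0")
		}
	case sentinel:
		if c.Endpoint != "" || len(c.ReadEndpoints) > 0 {
			return errors.New("sentinel mode rejects endpoint and read_endpoint")
		}
		if c.ReadFrom != "" {
			return errors.New("read_from is only valid in cluster mode")
		}
	default:
		if c.ReadFrom != "" {
			return errors.New("read_from is only valid in cluster mode")
		}
		if c.SentinelUsername != "" || c.SentinelPassword != "" {
			return errors.New("sentinel credentials require sentinel mode")
		}
	}
	return nil
}

func main() {
	fmt.Println(validate(config{ClusterSeeds: []string{"valkey-cluster-0:6379"}, DB: 1}))
	fmt.Println(validate(config{Endpoint: "10.0.0.1:6379", ReadEndpoints: []string{"10.0.0.2:6379"}}))
}
```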
- ZONES (positional): zones to cache for. Defaults to the surrounding server-block zones.
- success MAX_TTL [MIN_TTL]: override TTL bounds for positive responses. MAX_TTL caps the cache duration (default 1h). MIN_TTL sets a floor (default 0): when the upstream record TTL is shorter than this value, the cache duration is raised to this floor. Each value accepts a Go duration (30s, 1h) or a bare integer (seconds); sub-second values like 500ms are rejected.
- denial MAX_TTL [MIN_TTL]: same as success but for negative responses (NXDOMAIN/NODATA). Defaults: MAX_TTL 30m, MIN_TTL 0.
- endpoint: write endpoint address (default 127.0.0.1:6379). Accepts IPs or hostnames. If a port is omitted, 6379 is assumed.
- read_endpoint: one or more read-only replica addresses. GETs route here, SETs go to endpoint. With ≥2 replicas, each GET picks one at random.
- key_prefix STRING: namespace prefix for cache keys (default cdrc). Keys are stored as <key_prefix>:<hex>; the : separator is appended automatically. Set to "" to disable the prefix entirely (bare hex keys on a dedicated instance). A trailing : in the configured value is trimmed, so key_prefix mycache and key_prefix mycache: are equivalent.
- db NUMBER: Redis logical database index for the data plane. Default 0. Not allowed in cluster mode (Redis Cluster supports only DB 0).
- sentinel: enable Sentinel mode. The master group name is mandatory and must be followed by one or more Sentinel addresses. The plugin discovers the current master and replicas via Sentinel (single quorum subscription); writes go to the master, reads pick a replica at random per GET.
- cluster: enable Cluster mode. Takes one or more seed node addresses; the smart client discovers the full topology via CLUSTER SLOTS.
- read_from: replica routing strategy in cluster mode. Only valid when cluster is set.
  - latency (default): pick the replica with the lowest measured RTT.
  - random: pick a random replica.
  - primary: read only from primaries (no replica reads).
- username: ACL username for the data plane (primary, replicas, or cluster nodes). Optional.
- password: AUTH password for the data plane. Optional.
- sentinel_username: ACL username for the Sentinel API. Optional; only used in sentinel mode.
- sentinel_password: AUTH password for the Sentinel API. Optional; only used in sentinel mode.
- timeout: Redis connection and operation timeouts:
  - connect: TCP dial timeout (default: 1s).
  - read: per-command read timeout (default: 500ms).
  - write: per-command write timeout (default: 2s).
- pool: connection-pool tuning. Counts are non-negative integers.
  - size N: maximum sockets per client (default 10 × runtime.GOMAXPROCS(0)).
  - min_idle N: minimum idle sockets to keep warm (default 0).
  - max_idle N: maximum idle sockets (default 0 = unlimited).
  - max_active N: hard cap on total open sockets including in-use (default 0 = unlimited).
  - max_idle_time DURATION: close a connection that has been idle for this long (default 30m). Set to less than your load balancer / NAT idle drop window.
  - max_lifetime DURATION: force-recycle any connection older than this regardless of activity (default 0 = no limit).
  - wait_timeout DURATION: how long a query waits for a free pool connection before erroring (default 500ms).
- retries: retry behavior for transient network errors:
  - max N: number of retries per operation (default 1); 0 disables retries.
  - min_backoff DURATION: initial backoff between retries (default 8ms, the go-redis default).
  - max_backoff DURATION: cap on backoff between retries (default 512ms, the go-redis default).
  - Constraint: min_backoff must not exceed max_backoff when both are set.
- tcp_keepalive DURATION: TCP keepalive probe interval (default: Go’s built-in). Set below your NAT / firewall / mesh idle-drop window to prevent silent kills.
- tls: enable TLS. Takes no arguments. Verifies the server cert against the OS trust store. Use tls_ca to override the trust store, tls_cert / tls_key for mTLS. Implicitly enabled by any other tls_* directive; bare tls is only needed when no other TLS knob is set. The TLS config applies to every connection the plugin opens (Sentinel API, master, replicas, cluster nodes); bundle CAs if planes use different roots.
- tls_cert PATH: PEM client certificate for mTLS. Must be paired with tls_key.
- tls_key PATH: PEM private key matching tls_cert.
- tls_ca PATH: PEM CA file used to verify the server certificate. Replaces the OS trust store when set; use only when your server’s cert chains to a CA the OS doesn’t ship.
- tls_verify_chain BOOL: verify the server certificate chains to a trusted root. Default on. Set to off to disable all server-cert verification (chain and hostname); use only for development or fully-trusted networks. Accepts on/off, true/false, yes/no, 1/0.
- tls_verify_hostname BOOL: verify the server cert’s SAN/CN matches the dialed hostname. Default on. Setting it to off is a workaround for topologies where the dialed name cannot match the cert SAN (per-pod certs, Cluster MOVED redirects, Sentinel master/replica discovery, VIP fronting); chain verification still runs. Properly-issued certs should not require this. Has no effect when tls_verify_chain is off. See the per-pod certs example under Examples.
- resolver ADDRESS: DNS server to use for resolving Redis endpoint hostnames instead of the system resolver. Useful in deployments where CoreDNS itself intercepts the system resolver (e.g. node-local-dns) and resolving the Redis service name through it would create a circular dependency. Set this to an upstream DNS service IP. Port defaults to 53.
Authentication
The data plane (Redis nodes) and the Sentinel API authenticate independently — credentials across the two planes may be the same or different. In each plane the auth mode follows the standard Redis convention:
- neither set → unauthenticated.
- password only → legacy AUTH <password> (matches requirepass on any version, or authenticates as the default user on ACL-enabled servers).
- username + password → full ACL auth (Redis 6+ for the data plane, Sentinel 6.2+ for the Sentinel API).
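The three cases map onto the two forms of the Redis AUTH command. The Go sketch below shows that mapping; authCommand is a hypothetical helper, and treating a username without a password as an error is an assumption, since the text does not cover that combination.

```go
package main

import (
	"errors"
	"fmt"
)

// authCommand returns the AUTH command arguments for one plane, or nil when
// the plane is unauthenticated. Hypothetical helper for illustration.
func authCommand(username, password string) ([]string, error) {
	switch {
	case username == "" && password == "":
		return nil, nil // neither set: no AUTH is sent
	case username == "":
		return []string{"AUTH", password}, nil // legacy single-argument form
	case password == "":
		// Assumption: a username without a password is a config error.
		return nil, errors.New("username set without password")
	default:
		return []string{"AUTH", username, password}, nil // ACL form
	}
}

func main() {
	for _, c := range [][2]string{{"", ""}, {"", "s3cret"}, {"dns-cache", "s3cret"}} {
		cmd, err := authCommand(c[0], c[1])
		fmt.Println(cmd, err)
	}
}
```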
Cache key isolation
The cache key is xxhash64(qclass || qtype || DO || CD || lowercase(qname)), namespaced
by key_prefix. All five components are mixed into the hash and re-verified after each
GET — a mismatch is treated as a miss, self-healed via async eviction, and reported via
coredns_redis_cache_collisions_total.
Practical guarantees this gives operators running mixed-client traffic:
- IN and CHAOS lookups (e.g. version.bind.) never share a slot with normal Internet queries for the same qname.
- DNSSEC-aware (DO=1) and non-DNSSEC clients keep separate entries: neither receives the other’s response with extra or missing RRSIG / NSEC records.
- DNSSEC-validating (CD=0) and validation-bypassing (CD=1) queries are isolated. A CD=1 query for a DNSSEC-bogus name cannot poison the cache against a CD=0 client that would have received SERVFAIL from a validating upstream.
Known Compatibility
The plugin speaks only standard RESP commands (AUTH, GET, SET … EX, TTL, EXPIRE,
PING, plus CLUSTER SLOTS in cluster mode and SENTINEL get-master-addr-by-name in
Sentinel mode), so it is expected to work with any reasonably complete Redis-protocol
implementation.
Metrics
If monitoring is enabled (via the prometheus directive) the following metrics are exported:
- coredns_redis_cache_hits_total{server} - The count of cache hits from Redis.
- coredns_redis_cache_misses_total{server} - The count of cache misses from Redis.
- coredns_redis_cache_get_errors_total{server,reason} - The count of errors when reading entries from Redis. See Error reasons below for the reason buckets.
- coredns_redis_cache_set_errors_total{server,reason} - The count of errors when adding entries to Redis. Same reason buckets as get_errors_total.
- coredns_redis_cache_encode_errors_total{server} - The count of DNS messages that could not be serialized to wire format and so were not cached.
- coredns_redis_cache_response_mismatches_total{server} - The count of upstream replies whose question did not match the original request and were therefore refused for caching (the reply itself is still passed to the client). Non-zero suggests a misbehaving forwarder upstream or an attempted cache-poisoning probe.
- coredns_redis_cache_collisions_total{server} - The count of cache hits whose stored entry did not match the request (qname/qtype/qclass/DO/CD all re-verified after GET; mismatched entries are treated as a miss and asynchronously evicted). Should be zero in normal operation. The only innocent trigger is a statistical xxhash64 collision, which is ≈2⁻⁶⁴ per pair and effectively never fires at any plausible cache size. A non-zero value therefore points to a bug to investigate (Redis returning the wrong key’s value, in-process mutation of cached bytes, or a coding error in this plugin), not something to ignore.
The server label indicates which server handled the request, see the metrics plugin for details.
Error reasons
get_errors_total and set_errors_total are bucketed by reason:
- timeout - context deadline / cancellation, a network timeout, or a connection-pool wait timeout. Look at Redis latency / CPU, pool sizing, and the configured timeout read / pool wait_timeout budgets.
- connection - non-timeout network failures: dial refused, connection reset, EOF mid-op. Look at connectivity (DNS, firewall, route), and whether Redis is up and accepting connections.
- other - RESP-level errors (NOAUTH, WRONGPASS, parse failures, unhandled MOVED, etc.) or anything that isn’t a network error. Typically a configuration or code issue rather than a transient outage.
Examples
Examples after the first show only the redis_cache { ... } block unless noted; wrap it in the same
. { cache {...} … forward . … } shape as the first example. They also omit
success / denial: reuse the values from the first example or rely on the defaults documented
in the directive list.
Local L1 plus a shared Redis L2:
. {
cache {
success 9984 30
denial 9984 5
}
redis_cache {
endpoint redis.cache.svc.cluster.local:6379
success 1h 1m
denial 30m 30s
}
forward . 8.8.8.8:53
}
Writes to a known master, reads random-balanced across explicit replicas:
redis_cache {
endpoint 10.0.0.1:6379
read_endpoint 10.0.0.2:6379 10.0.0.3:6379
password secretPass
}
Sentinel with separate data-plane and Sentinel-API passwords:
redis_cache {
sentinel mymaster 10.0.0.1:26379 10.0.0.2:26379 10.0.0.3:26379
password masterReplicaPass
sentinel_password sentinelPass
}
Redis 6+ ACL (username + password):
redis_cache {
endpoint redis.cache.svc.cluster.local:6379
username dns-cache
password s3cret
}
Cluster mode for capacity scaling beyond a single node’s RAM:
redis_cache {
cluster valkey-cluster-0:6379 valkey-cluster-1:6379 valkey-cluster-2:6379
password secretPass
read_from latency
}
Kubernetes note: the smart client connects directly to every primary and replica the seeds advertise via CLUSTER SLOTS. If nodes advertise pod IPs (chart default), ensure they’re routable from CoreDNS pods, or set cluster-announce-hostname on each node so the announced addresses match what resolver resolves.
TLS — server-only, OS trust store, no client cert:
redis_cache {
endpoint redis.example.com:6380
tls
password s3cret
}
TLS — server-only, internal CA:
redis_cache {
endpoint redis.example.com:6380
tls_ca /etc/ssl/certs/redis-ca.pem
password s3cret
}
TLS — mTLS:
redis_cache {
endpoint redis.cache.svc.cluster.local:6379
username dns-cache
password s3cret
tls_cert /etc/redis/tls/client.crt
tls_key /etc/redis/tls/client.key
tls_ca /etc/redis/tls/ca.pem
}
TLS — Kubernetes Redis Cluster with per-pod certs. Workaround for setups where issuing
certs whose SAN matches the dialed name is not practical: a StatefulSet-deployed
Redis/Valkey cluster typically presents per-pod certs (SAN =
<pod>.<headless-svc>.<ns>.svc.cluster.local), the client dials a service name, and
Cluster MOVED redirects further route to peers whose SANs won’t match anything
pre-declared. Chain verification still applies to every peer:
redis_cache {
cluster redis-cluster-0.redis-cluster-headless.cache.svc.cluster.local:6379 \
redis-cluster-1.redis-cluster-headless.cache.svc.cluster.local:6379 \
redis-cluster-2.redis-cluster-headless.cache.svc.cluster.local:6379
tls_ca /etc/redis/tls/ca.pem
tls_verify_hostname off
password s3cret
}
Same workaround applies to Sentinel-discovered masters/replicas and HA-proxy/VIP fronting a fleet of per-pod certs. Prefer issuing certs whose SAN covers the dialed name where you control the PKI.
Kubernetes node-local-dns. When CoreDNS itself intercepts the cluster DNS VIP, resolving
the Redis service name through it would loop. Use resolver to point at the upstream
kube-dns; __PILLAR__CLUSTER__DNS__ is substituted by node-local-dns at runtime:
.:53 {
errors
cache {
success 9984 30
denial 9984 5
}
redis_cache {
endpoint k8s-dns-cache-redis-master.k8s-dns-cache.svc.cluster.local:6379
read_endpoint k8s-dns-cache-redis-replicas.k8s-dns-cache.svc.cluster.local:6379
password secretPass
success 1h 1m
denial 30m 30s
resolver __PILLAR__CLUSTER__DNS__
}
forward . __PILLAR__UPSTREAM__SERVERS__
}
Building
Add this line to CoreDNS’s plugin.cfg. It must appear after the cache:cache line so
the in-process cache runs as L1 and redis_cache as L2:
cache:cache
redis_cache:github.com/dragoangel/coredns-redis-cache-plugin
Then run go get "github.com/dragoangel/coredns-redis-cache-plugin@latest", followed by go generate coredns.go && go build, in the CoreDNS source tree.
See Also
See the Redis and Valkey sites for backend details.
Spiritual successor to miekg/redis (directive redisc, archived November 2025).