Logs
Pipeline
graph LR
    subgraph K8s Nodes
        C[Container logs\n/var/log/containers]
        A[Audit logs\n/var/log/audit/kube]
        T[Talos system logs\nTCP :5170]
    end
    subgraph NAS
        D[Docker containers\nvia docker_sd]
    end
    FB[Fluent-bit\nDaemonSet]
    PT[Promtail\nDocker container]
    Loki[(Loki)]
    C --> FB
    A --> FB
    T -->|json_lines| FB
    D --> PT
    FB -->|source label| Loki
    PT --> Loki
Fluent-bit
Fluent-bit runs as a DaemonSet on every K8s node and collects three log streams:
| Stream | Source | Parser |
|---|---|---|
| Container logs | /var/log/containers/*.log (tail) | containerd (RFC3339) |
| Audit logs | /var/log/audit/kube/*.log (tail) | JSON |
| Talos system logs | TCP :5170 | JSON |
All three streams are tagged with a source field (kubernetes, audit, or talos) before being shipped to Loki.
Fixed ClusterIP
The Fluent-bit Service is assigned a fixed ClusterIP (10.96.0.20) so Talos nodes can forward logs to a stable address.
Talos machine config is baked in at provision time and cannot follow a changing IP, which is why the address is pinned.
Talos nodes forward both kernel and service logs to 10.96.0.20:5170 in json_lines format. The TCP input tags them talos.*.
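To make the json_lines framing concrete, here is a minimal Python sketch of what travels over the TCP connection: one JSON object per line, newline-terminated. The field names (`msg`, `talos-service`) and the tag `talos.system` are illustrative assumptions, not taken from this setup, and the decoder only imitates what a receiver does when it splits the stream and tags each record.

```python
import json


def encode_json_lines(records):
    """Serialize records as newline-delimited JSON (one object per line)."""
    lines = (json.dumps(r, separators=(",", ":")) + "\n" for r in records)
    return "".join(lines).encode("utf-8")


def decode_json_lines(payload, tag="talos.system"):
    """Split a received byte stream into records and attach a tag.

    The tag value is hypothetical; the real input tags records talos.*.
    """
    decoded = []
    for line in payload.decode("utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        decoded.append({"tag": tag, "record": json.loads(line)})
    return decoded


if __name__ == "__main__":
    sample = [{"msg": "node ready", "talos-service": "kubelet"}]
    wire = encode_json_lines(sample)
    print(wire)
    print(decode_json_lines(wire))
```

Because each record ends with a newline, the receiver can process a long-lived TCP stream record by record without any length prefix.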
Audit log ordering
Audit log entries older than 5 minutes are ignored. On log rotation, the rotated file starts replaying from its beginning, and Loki rejects out-of-order entries, so stale entries are dropped at the Fluent-bit level before they are shipped.
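The age check can be sketched in a few lines of Python. This is an illustration of the rule only, not the actual Fluent-bit filter: it assumes the entry time is read from the `requestReceivedTimestamp` field of a Kubernetes audit event, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=5)
TIME_FIELD = "requestReceivedTimestamp"  # assumed source of the entry time


def parse_rfc3339(ts):
    """Parse an RFC3339 UTC timestamp such as 2024-01-01T12:00:00.000000Z."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def drop_stale(events, now, max_age=MAX_AGE):
    """Keep only events whose timestamp is at most max_age older than now."""
    return [e for e in events if now - parse_rfc3339(e[TIME_FIELD]) <= max_age]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    replayed = {TIME_FIELD: "2024-01-01T12:00:00.000000Z"}
    print(drop_stale([replayed], now))  # the replayed entry is filtered out
```

Entries replayed from the start of a rotated file carry their original timestamps, so they fall outside the window and never reach Loki.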
Promtail
Promtail runs as a Docker container on the NAS and ships logs from all NAS Docker containers to Loki.
Why Promtail instead of Fluent-bit?
Fluent-bit already covers the K8s nodes, but it lacks native Docker service discovery. Promtail's docker_sd config discovers all running containers through the Docker socket and attaches container metadata (name, image, labels) as Loki labels, so no per-container job has to be maintained by hand. That makes it the better fit for the NAS, where containers come and go.
Loki
Loki runs in distributed mode (write / read / backend). Chunks and index are stored in MinIO. Retention is 28 days.
No authentication
auth_enabled: false is set, so any pod in the cluster can write or read logs without credentials.
Ingestion rate limits are raised above defaults to handle cluster log volume:
| Limit | Value |
|---|---|
| Per-instance rate | 20 MB/s |
| Per-instance burst | 40 MB |
| Per-stream rate | 10 MB/s |
| Per-stream burst | 40 MB |
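How a rate and a burst interact can be illustrated with a token bucket. The sketch below is a generic model, not Loki's implementation: it treats the burst as the bucket capacity in MB and the rate as the refill speed in MB/s, and the class name is hypothetical.

```python
class TokenBucket:
    """Generic token bucket: capacity = burst (MB), refill = rate (MB/s)."""

    def __init__(self, rate_mb_per_s, burst_mb):
        self.rate = rate_mb_per_s
        self.burst = burst_mb
        self.tokens = burst_mb  # start with a full bucket
        self.last = 0.0

    def allow(self, size_mb, now):
        """Return True if a push of size_mb is admitted at time now (seconds)."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_mb <= self.tokens:
            self.tokens -= size_mb
            return True
        return False


if __name__ == "__main__":
    stream = TokenBucket(rate_mb_per_s=10, burst_mb=40)  # per-stream values
    print(stream.allow(40, now=0.0))  # a full burst is admitted
    print(stream.allow(10, now=0.0))  # bucket is empty, push is rejected
    print(stream.allow(10, now=1.0))  # one second of refill covers 10 MB
```

Under this model a stream can absorb a short 40 MB spike, but its sustained throughput stays capped at 10 MB/s.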
Querying
Always use | json
The source field lives in the log body, not in a Loki stream label, so a stream selector cannot match it. Use | json to parse the line first, then filter on source.
Talos logs also carry talos_service and talos_level fields after JSON parsing, enabling fine-grained filtering:
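A small Python helper can show the shape of such a query: a stream selector, then `| json`, then one label filter per parsed field. The selector `{job="fluent-bit"}` and the field values are assumptions for illustration, since the actual stream labels are not listed here.

```python
def build_query(selector, **filters):
    """Build a LogQL query that parses JSON before filtering on body fields."""
    parts = [selector, "| json"]
    for name, value in filters.items():
        # Escape backslashes and double quotes inside the filter value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'| {name}="{escaped}"')
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical selector and values; adjust to the labels in use.
    print(build_query('{job="fluent-bit"}', source="talos",
                      talos_service="kubelet", talos_level="warn"))
```

The filters come after `| json` because source, talos_service, and talos_level only exist as labels once the log line has been parsed.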