Centralized logging in OpenShift provides unified log aggregation and management for comprehensive system observability through automated log collection, processing, and analysis. These centralized logging architectures enable cluster-wide log visibility, troubleshooting capabilities, and compliance reporting for complex container orchestration environments.
Log aggregation collects distributed log streams from nodes, pods, and system components into central repositories for unified analysis and long-term retention. This centralization eliminates the need for node-specific log access and enables cluster-wide log search.
Event-driven log processing applies real-time stream processing to log data for immediate analysis and alert generation based on log content. This stream-processing architecture supports proactive problem detection and automated response mechanisms.
[Diagram: Centralized logging architecture with log collection, processing, and storage]
The standard OpenShift logging stack consists of several components:
Fluentd/Vector collects and processes logs from all nodes. These log collectors run as a DaemonSet on every node and gather both container and system logs.
Elasticsearch stores and indexes the logs for fast search and analysis. OpenShift uses Elasticsearch as its primary log storage engine with optimized indexing strategies.
Kibana provides a web interface for log visualization, search, and analysis. Users can build complex queries and configure dashboards for log monitoring.
Log forwarding routes logs to external systems such as SIEM tools or cloud log services; a minimal sketch of the forwarding API follows below.
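The Cluster Logging Operator exposes forwarding through the ClusterLogForwarder resource. A minimal sketch, assuming a reachable external Elasticsearch endpoint (the URL and output name are placeholders):

```yaml
apiVersion: logging.openshift.io/v1
kind: ClusterLogForwarder
metadata:
  name: instance
  namespace: openshift-logging
spec:
  outputs:
    # Hypothetical external log store
    - name: remote-elasticsearch
      type: elasticsearch
      url: https://logs.example.com:9200
  pipelines:
    # Route application and audit logs to the external output
    - name: forward-logs
      inputRefs:
        - application
        - audit
      outputRefs:
        - remote-elasticsearch
```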
Log lifecycle management spans the collection, processing, storage, indexing, and archival phases, with defined retention policies and storage optimization.
OpenShift offers several log collection strategies, chosen according to use case and performance requirements.
DaemonSet-based log collection implements node-level log aggregation through pod agents on every cluster node for complete log coverage:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: openshift-logging
spec:
  selector:
    matchLabels:
      name: fluent-bit
  template:
    metadata:
      labels:
        name: fluent-bit
    spec:
      # Tolerate control-plane taints so logs are also collected from master nodes
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:latest
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            # OpenShift nodes use CRI-O, so container logs live under
            # /var/log/pods rather than /var/lib/docker/containers
            - name: pods
              mountPath: /var/log/pods
              readOnly: true
            - name: config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: pods
          hostPath:
            path: /var/log/pods
        - name: config
          configMap:
            name: fluent-bit-config
```

Sidecar container logging enables application-specific log processing through dedicated logging containers within application pods:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webapp-with-logging
spec:
  containers:
    - name: webapp
      image: my-webapp:latest
      volumeMounts:
        - name: app-logs
          mountPath: /app/logs
    - name: log-processor
      image: fluent/fluent-bit:latest
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
        - name: fluent-config
          mountPath: /fluent-bit/etc/
  volumes:
    - name: app-logs
      emptyDir: {}
    - name: fluent-config
      configMap:
        name: app-log-config
```

Container runtime integration captures container STDOUT/STDERR streams for standardized application log collection:
```
# Fluentd configuration for container logs
<source>
  @type tail
  @id container-input
  path "/var/log/containers/*.log"
  pos_file "/var/log/fluentd-containers.log.pos"
  tag kubernetes.*
  read_from_head true
  <parse>
    @type multi_format
    <pattern>
      format json
      time_key time
      time_format %Y-%m-%dT%H:%M:%S.%NZ
    </pattern>
    <pattern>
      format /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
      time_format %Y-%m-%dT%H:%M:%S.%N%:z
    </pattern>
  </parse>
</source>
```

Effective log processing is essential for extracting valuable information from raw log data.
Log parsing engines extract structure from unstructured log data:
```
# Fluentd parser for Apache access logs
<parse>
  @type regexp
  expression /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$/
  time_format %d/%b/%Y:%H:%M:%S %z
</parse>
```

```toml
# Vector configuration for JSON logs
[transforms.parse_json]
type = "remap"
inputs = ["container_logs"]
source = '''
. = parse_json!(.message)
.timestamp = parse_timestamp!(.timestamp, format: "%Y-%m-%dT%H:%M:%S%.fZ")
.level = upcase!(.level)
'''
```

Log enrichment extends log entries with metadata such as pod names, namespace information, and node details:
```
# Fluentd Kubernetes metadata plugin
<filter kubernetes.**>
  @type kubernetes_metadata
  @id filter_kube_metadata
  kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"
  verify_ssl "#{ENV['KUBERNETES_VERIFY_SSL'] || true}"
  preserve_json_log true
  merge_json_log true
  de_dot false
  annotation_match [ ".*" ]
</filter>
```

Log normalization standardizes log formats and field names across different log sources:
```
# Normalize log levels across sources
<filter **>
  @type record_transformer
  enable_ruby true
  <record>
    level ${record["level"]&.downcase == "warn" ? "warning" : record["level"]&.downcase}
    component ${record.dig("kubernetes", "labels", "app") || "unknown"}
    node_name ${record.dig("kubernetes", "node_name")}
  </record>
</filter>
```

Multi-line log processing handles stack traces and multi-line JSON logs:
```
# Fluent Bit multi-line handling for Java logs
[PARSER]
    Name        java_multiline
    Format      regex
    Regex       (?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(?<level>\w+)\s+(?<message>.*)
    Time_Key    time
    Time_Format %Y-%m-%d %H:%M:%S.%L

[INPUT]
    Name             tail
    Path             /var/log/java-app/*.log
    # The built-in 'java' multiline parser concatenates stack-trace
    # continuation lines into a single event
    multiline.parser java

[FILTER]
    # Apply the regex parser to the joined line to extract time and level
    Name     parser
    Match    *
    Key_Name log
    Parser   java_multiline
```
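For illustration, a typical Java stack trace spans several physical lines that form one logical event (hypothetical output):

```
2024-01-15 10:23:45.123 ERROR Unhandled exception in request handler
java.lang.NullPointerException: Cannot invoke "String.length()"
    at com.example.web.RequestHandler.handle(RequestHandler.java:42)
    at com.example.web.Server.dispatch(Server.java:87)
```

Without multi-line handling, every `at ...` line would be indexed as a separate event; with the parser above, the whole trace arrives as a single record.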
Efficient log storage and retention are critical for performance and cost management.

```yaml
# Elasticsearch index template for logs
apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-template
data:
  template.json: |
    {
      "index_patterns": ["app-logs-*"],
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "index.refresh_interval": "30s",
        "index.codec": "best_compression"
      },
      "mappings": {
        "properties": {
          "@timestamp": { "type": "date" },
          "level": { "type": "keyword" },
          "message": { "type": "text", "analyzer": "standard" },
          "kubernetes.namespace": { "type": "keyword" },
          "kubernetes.pod": { "type": "keyword" }
        }
      }
    }
```
```yaml
# ILM policy for log retention
apiVersion: v1
kind: ConfigMap
metadata:
  name: ilm-policy
data:
  policy.json: |
    {
      "policy": {
        "phases": {
          "hot": {
            "actions": {
              "rollover": { "max_size": "10GB", "max_age": "1d" }
            }
          },
          "warm": {
            "min_age": "2d",
            "actions": {
              "shrink": { "number_of_shards": 1 },
              "allocate": { "number_of_replicas": 0 }
            }
          },
          "cold": {
            "min_age": "7d",
            "actions": {
              "allocate": { "number_of_replicas": 0 }
            }
          },
          "delete": {
            "min_age": "30d",
            "actions": { "delete": {} }
          }
        }
      }
    }
```

| Storage tier | Age | Hardware | Replication | Use case |
|---|---|---|---|---|
| Hot | 0-2 days | SSD, high CPU | 1 replica | Active search and analysis |
| Warm | 2-7 days | SSD/HDD, moderate CPU | 0 replicas | Occasional search |
| Cold | 7-30 days | HDD, low CPU | 0 replicas | Archive access |
| Delete | > 30 days | - | - | Automatic deletion |
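The tier boundaries in this table map onto Elasticsearch data-tier node roles, which ILM uses when migrating indices between hardware pools. A minimal ECK nodeSets sketch, assuming dedicated hot/warm/cold hardware (names and counts are illustrative):

```yaml
nodeSets:
  # SSD-backed nodes receive new writes and serve hot searches
  - name: hot
    count: 3
    config:
      node.roles: ["data_hot", "data_content", "ingest"]
  # Slower nodes hold older, rarely queried indices
  - name: warm
    count: 2
    config:
      node.roles: ["data_warm"]
  - name: cold
    count: 1
    config:
      node.roles: ["data_cold"]
```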
OpenShift Logging offers extensive search capabilities for effective log analysis.
Error logs from the last 24 hours:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "ERROR" } },
        { "range": { "@timestamp": { "gte": "now-24h" } } }
      ]
    }
  }
}
```

Application-specific logs with a namespace filter:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "kubernetes.namespace": "production" } },
        { "wildcard": { "kubernetes.labels.app": "frontend-*" } }
      ]
    }
  }
}
```

```bash
# OpenShift CLI log queries

# Follow logs of a specific deployment and container
oc logs -f deployment/my-app -c container-name

# Logs selected by label
oc logs -l app=frontend --tail=100

# Logs from all containers matching a selector
oc logs --selector='app=backend' --all-containers=true

# Time-filtered logs (via Kibana/Elasticsearch)
curl -X GET "elasticsearch:9200/app-logs-*/_search" -H 'Content-Type: application/json' -d '
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2024-01-01T00:00:00",
        "lte": "2024-01-02T00:00:00"
      }
    }
  },
  "sort": [
    { "@timestamp": { "order": "desc" } }
  ]
}'
```
```toml
# Vector configuration for real-time streaming
[sources.kubernetes]
type = "kubernetes_logs"

[transforms.filter_errors]
type = "filter"
inputs = ["kubernetes"]
condition = '.level == "ERROR" || .level == "FATAL"'

[sinks.websocket]
type = "websocket"
inputs = ["filter_errors"]
uri = "ws://log-viewer:8080/logs"
# The websocket sink needs an explicit encoding
[sinks.websocket.encoding]
codec = "json"
```

Security is a critical aspect of centralized logging, especially where log data is sensitive.
```yaml
# ClusterRole for log viewers
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: log-viewer
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list"]
---
# RoleBinding for project-specific log access
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: project-log-viewers
  namespace: my-project
subjects:
  - kind: User
    name: developer@company.com
    apiGroup: rbac.authorization.k8s.io
  - kind: Group
    name: development-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: log-viewer
  apiGroup: rbac.authorization.k8s.io
```
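The same namespace-scoped binding can also be created from the CLI; referencing a ClusterRole with `oc adm policy add-role-to-user` produces a RoleBinding in the target namespace (the user name is illustrative):

```bash
# Grant the log-viewer ClusterRole within my-project only
oc adm policy add-role-to-user log-viewer developer@company.com -n my-project
```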
```yaml
# Elasticsearch with TLS encryption
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: logging-cluster
spec:
  version: 8.5.0
  http:
    tls:
      selfSignedCertificate:
        disabled: false
  transport:
    tls:
      selfSignedCertificate:
        disabled: false
  nodeSets:
    - name: masters
      count: 3
      config:
        xpack.security.enabled: true
        xpack.security.transport.ssl.enabled: true
        xpack.security.http.ssl.enabled: true
```
```
# Fluentd PII scrubbing (fluent-plugin-record-modifier)
<filter **>
  @type record_modifier
  <record>
    # Chain both substitutions in one expression: duplicate keys inside a
    # <record> block would override each other
    message ${record["message"].gsub(/\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b/, "****-****-****-****").gsub(/\b[\w\.-]+@[\w\.-]+\.\w+\b/, "***@***.***")}
  </record>
</filter>
```

```toml
# Vector configuration for sensitive-data masking
[transforms.mask_pii]
type = "remap"
inputs = ["logs"]
source = '''
# Mask credit card numbers
.message = replace!(.message, r'\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b', "****-****-****-****")
# Mask email addresses
.message = replace!(.message, r'\b[\w\.-]+@[\w\.-]+\.\w+\b', "***@***.***")
# Mask IP addresses
.message = replace!(.message, r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b', "***.***.***.***")
'''
```
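The masking rules can be verified with Vector's built-in unit tests (`vector test`). A minimal sketch against the `mask_pii` transform above, using a made-up sample message:

```toml
[[tests]]
name = "mask_credit_card_numbers"

[[tests.inputs]]
insert_at = "mask_pii"
type = "log"
[tests.inputs.log_fields]
message = "payment with card 4111-1111-1111-1111"

[[tests.outputs]]
extract_from = "mask_pii"
[[tests.outputs.conditions]]
type = "vrl"
source = '.message == "payment with card ****-****-****-****"'
```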
Log-based alerts extend metrics-based monitoring with content-based intelligence.

```
# Elasticsearch Watcher for log alerts
PUT _watcher/watch/error_rate_alert
{
  "trigger": {
    "schedule": { "interval": "1m" }
  },
  "input": {
    "search": {
      "request": {
        "search_type": "query_then_fetch",
        "indices": ["app-logs-*"],
        "body": {
          "query": {
            "bool": {
              "must": [
                { "match": { "level": "ERROR" } },
                { "range": { "@timestamp": { "gte": "now-5m" } } }
              ]
            }
          },
          "aggs": {
            "error_count": {
              "value_count": { "field": "level" }
            }
          }
        }
      }
    }
  },
  "condition": {
    "compare": {
      "ctx.payload.aggregations.error_count.value": { "gt": 10 }
    }
  },
  "actions": {
    "send_slack": {
      "slack": {
        "account": "monitoring",
        "message": {
          "to": "#alerts",
          "text": "High error rate detected: {{ctx.payload.aggregations.error_count.value}} errors in the last 5 minutes"
        }
      }
    }
  }
}
```
```
# Fluentd Prometheus plugin: expose record throughput as a counter
<filter **>
  @type prometheus
  <metric>
    name fluentd_output_status_num_records_total
    type counter
    desc The total number of outgoing records
    <labels>
      tag ${tag}
      hostname ${hostname}
    </labels>
  </metric>
</filter>
```
```toml
# Vector Prometheus metrics: derive counters from log events with the
# log_to_metric transform, then expose them via prometheus_exporter
[transforms.log_metrics]
type = "log_to_metric"
inputs = ["logs"]

[[transforms.log_metrics.metrics]]
# One increment per processed log event
type = "counter"
field = "message"
name = "log_events_total"
[transforms.log_metrics.metrics.tags]
level = "{{level}}"

[sinks.prometheus]
type = "prometheus_exporter"
inputs = ["log_metrics"]
address = "0.0.0.0:9090"
```
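These exporter metrics can feed the cluster's Prometheus alerting. A sketch using the monitoring.coreos.com PrometheusRule API (threshold and labels are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: logging-pipeline-alerts
  namespace: openshift-logging
spec:
  groups:
    - name: logging.rules
      rules:
        # Fire when the collector stops forwarding records for 10 minutes
        - alert: LogPipelineStalled
          expr: rate(fluentd_output_status_num_records_total[10m]) == 0
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Log collector has stopped forwarding records"
```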
For large OpenShift clusters, dedicated optimizations are required.

```
# Fluentd buffer configuration
<match **>
  @type elasticsearch
  host elasticsearch.logging.svc.cluster.local
  port 9200
  <buffer>
    @type file
    path /var/log/fluentd-buffers/kubernetes.system.buffer
    flush_mode interval
    retry_type exponential_backoff
    flush_thread_count 2
    flush_interval 5s
    retry_forever true
    retry_max_interval 30
    chunk_limit_size 2M
    queue_limit_length 8
    overflow_action block
  </buffer>
</match>
```

```toml
# Vector buffer configuration
[sinks.elasticsearch]
type = "elasticsearch"
inputs = ["processed_logs"]
endpoints = ["http://elasticsearch:9200"]

[sinks.elasticsearch.batch]
max_bytes = 1048576
max_events = 1000
timeout_secs = 5

[sinks.elasticsearch.buffer]
type = "disk"
max_size = 268435456
when_full = "block"
```
```
# Elasticsearch index optimization (note: index.codec is a static setting
# and can only be changed while an index is closed)
PUT /app-logs-*/_settings
{
  "refresh_interval": "30s",
  "number_of_replicas": 0,
  "max_result_window": 50000
}

# Force-merge older indices
POST /app-logs-2024.01.*/_forcemerge?max_num_segments=1
```
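Whether compression and force-merging actually pay off can be checked with the `_cat` APIs, which report per-index size and segment counts:

```
GET /_cat/indices/app-logs-*?v&h=index,docs.count,store.size
GET /_cat/segments/app-logs-2024.01.01?v
```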
```yaml
# Elasticsearch node configuration
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: logging
spec:
  version: 8.5.0
  nodeSets:
    - name: masters
      count: 3
      config:
        node.roles: ["master"]
        xpack.ml.enabled: false
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              resources:
                requests:
                  memory: 2Gi
                  cpu: 1
                limits:
                  memory: 2Gi
                  cpu: 2
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms1g -Xmx1g"
    - name: data
      count: 3
      config:
        node.roles: ["data", "ingest"]
      podTemplate:
        spec:
          containers:
            - name: elasticsearch
              resources:
                requests:
                  memory: 4Gi
                  cpu: 2
                limits:
                  memory: 4Gi
                  cpu: 4
              env:
                - name: ES_JAVA_OPTS
                  value: "-Xms2g -Xmx2g"
      volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
            storageClassName: fast-ssd
```

Modern logging architectures require integration with a variety of external systems.
```toml
# Splunk HTTP Event Collector
[sinks.splunk]
type = "splunk_hec"
inputs = ["security_logs"]
endpoint = "https://splunk.company.com:8088"
default_token = "${SPLUNK_TOKEN}"
compression = "gzip"
[sinks.splunk.encoding]
codec = "json"

# QRadar SIEM integration: ship events over a raw TCP socket
# (Vector has no dedicated syslog sink)
[sinks.qradar]
type = "socket"
inputs = ["security_logs"]
address = "qradar.company.com:514"
mode = "tcp"
[sinks.qradar.encoding]
codec = "json"
```
```toml
# AWS CloudWatch Logs
[sinks.cloudwatch]
type = "aws_cloudwatch_logs"
inputs = ["app_logs"]
group_name = "openshift-cluster"
stream_name = "{{ kubernetes.namespace }}/{{ kubernetes.pod }}"
region = "us-east-1"
[sinks.cloudwatch.auth]
access_key_id = "${AWS_ACCESS_KEY_ID}"
secret_access_key = "${AWS_SECRET_ACCESS_KEY}"

# Google Cloud Logging
[sinks.gcp_stackdriver]
type = "gcp_stackdriver_logs"
inputs = ["app_logs"]
log_id = "openshift-logs"
project_id = "my-gcp-project"
resource.type = "k8s_container"
```
```toml
# Apache Kafka for a data lake
[sinks.kafka]
type = "kafka"
inputs = ["all_logs"]
bootstrap_servers = "kafka-cluster:9092"
topic = "openshift-logs"
compression = "snappy"
[sinks.kafka.encoding]
codec = "json"

# Archival to object storage; Vector has no Parquet codec, so write
# compressed JSON and convert downstream (e.g. with Spark or Athena)
[sinks.archive]
type = "aws_s3"
inputs = ["historical_logs"]
bucket = "log-archive-bucket"
key_prefix = "year=%Y/month=%m/day=%d/"
compression = "gzip"
[sinks.archive.encoding]
codec = "json"
```

Effective cost management is essential for a sustainable logging strategy.
```
# Level-based sampling: keep 1 in 10 DEBUG events
# (fluent-plugin-sampling-filter; assumes DEBUG events were retagged
# upstream, e.g. with rewrite_tag_filter)
<filter app.debug.**>
  @type sampling_filter
  interval 10
</filter>

# Adaptive sampling: keep every ERROR event, ~10% of everything else
<filter **>
  @type record_transformer
  enable_ruby true
  <record>
    sampled ${rand < (record["level"] == "ERROR" ? 1.0 : 0.1) ? "true" : "false"}
  </record>
</filter>
```
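The adaptive filter only marks events; a subsequent grep filter then drops the unselected ones:

```
# Drop events the sampler did not select
<filter **>
  @type grep
  <exclude>
    key sampled
    pattern /^false$/
  </exclude>
</filter>
```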
```
# Index template with compression
PUT _index_template/cost_optimized_logs
{
  "index_patterns": ["logs-*"],
  "template": {
    "settings": {
      "index.codec": "best_compression",
      "index.refresh_interval": "60s",
      "index.number_of_replicas": 0,
      "index.query.default_field": ["message", "kubernetes.namespace"]
    },
    "mappings": {
      "dynamic": false,
      "properties": {
        "@timestamp": {"type": "date"},
        "level": {"type": "keyword"},
        "message": {
          "type": "text",
          "index": false,
          "store": true
        },
        "kubernetes.namespace": {"type": "keyword"},
        "kubernetes.pod": {"type": "keyword"}
      }
    }
  }
}
```
```toml
# Track log volume per namespace
[transforms.add_cost_labels]
type = "remap"
inputs = ["kubernetes_logs"]
source = '''
# Fallible lookups are coalesced with ?? so missing env vars fall back cleanly
.cost_center = get_env_var("COST_CENTER_" + upcase!(.kubernetes.namespace)) ?? "default"
.log_size_bytes = length(string!(.message))
'''

# Turn the size field into a Prometheus counter via log_to_metric
[transforms.cost_metrics]
type = "log_to_metric"
inputs = ["add_cost_labels"]

[[transforms.cost_metrics.metrics]]
type = "counter"
field = "log_size_bytes"
name = "log_volume_bytes_total"
increment_by_value = true
[transforms.cost_metrics.metrics.tags]
namespace = "{{kubernetes.namespace}}"
cost_center = "{{cost_center}}"

[sinks.cost_exporter]
type = "prometheus_exporter"
inputs = ["cost_metrics"]
address = "0.0.0.0:9091"
```
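The exported counter can then be queried in Prometheus to attribute ingest volume per team, for example:

```
# Log volume in bytes per namespace over the last hour
sum by (namespace) (increase(log_volume_bytes_total[1h]))
```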