PART-3 – Monitoring

Since we want to know how our ECK-managed cluster is performing, we will also monitor it with the built-in "Stack Monitoring" feature.

Prepare Filebeat For Monitoring The Cluster

For this setup you can use https://github.com/elastic/cloud-on-k8s/tree/master/config/recipes/beats as a reference. Since our "ELK" stack lives in its own namespace and uses a different name, a few lines have to be changed:

# Filebeat may take a while until it reaches a healthy state
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: filebeat
  # change it if needed
  namespace: elk
spec:
  type: filebeat
  version: 7.10.0
  # change the references if needed
  elasticsearchRef:
    name: elk
  kibanaRef:
    name: kibana
  config:
    monitoring:
      # we use the internal collection feature (the one that is NOT deprecated); it sends the metrics to Elasticsearch
      enabled: true
      # GET / on Elasticsearch shows the cluster UUID; without it,
      # Stack Monitoring will show you a "Standalone Cluster"
      cluster_uuid: n2KDDWUMS2q4h8P5F3_Z8Q
      elasticsearch:
        hosts: ["https://elk-es-http:9200"]
        username: ${MONITORED_ES_USERNAME}
        password: ${MONITORED_ES_PASSWORD}
        # TODO: use es' ca from secret
        ssl.verification_mode: none
    filebeat:
      autodiscover:
        providers:
        - type: kubernetes
          node: ${NODE_NAME}
          hints:
            enabled: true
            default_config:
              type: container
              paths:
              - /var/log/containers/*${data.kubernetes.container.id}.log
          templates:
            - condition:
                contains:
                  common.k8s.elastic.co/type: elasticsearch
              config:
                - module: elasticsearch
                  server:
                    enabled: true
                    var.paths:
                      - /var/log/containers/*${data.kubernetes.container.id}.log
                  gc:
                    var.paths:
                      - /var/log/containers/*${data.kubernetes.container.id}.log
                  # if you have audit-logging enabled and a proper license (trial or enterprise)
                  audit:
                    var.paths:
                      - /var/log/containers/*${data.kubernetes.container.id}.log
                  slowlog:
                    var.paths:
                      - /var/log/containers/*${data.kubernetes.container.id}.log
                  deprecation:
                    var.paths:
                      - /var/log/containers/*${data.kubernetes.container.id}.log
            - condition:
                contains:
                  common.k8s.elastic.co/type: kibana
              config:
                - module: kibana
                  log:
                    var.paths:
                      - /var/log/containers/*${data.kubernetes.container.id}.log
    processors:
    - add_cloud_metadata: {}
    - add_host_metadata: {}
    - drop_event:
        # we drop all events whose log line matches one of the patterns below (relevant when audit logging is on and you have the proper license ;)
        # this is very rudimentary and will only keep things running for the first few hours - you have to set a proper configuration via the API
        # to restrict what gets logged. Search for "elasticsearch audit ignore policies".
        # Without ignore policies you start the loop of doom ;) - Elasticsearch creates logs, Filebeat ingests the logs,
        # Elasticsearch creates even more logs, Filebeat cannot keep up and does not close file handles, ...
        when:
          and:
            - contains:
                common.k8s.elastic.co/type: elasticsearch
            - or:
                - regexp:
                    message: '"action":".*data/write'
                - regexp:
                    message: '"action":".*data\/read'
                - regexp:
                    message: '"action":".*monitor'
    - add_tags:
        # we add a nice tag
        # note: "or" is a condition that takes a list of conditions, so it wraps "contains", not the other way around
        when:
          or:
            - contains:
                common.k8s.elastic.co/type: elasticsearch
            - contains:
                common.k8s.elastic.co/type: kibana
        tags: [ "compliance" ]
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: filebeat
        automountServiceAccountToken: true
        terminationGracePeriodSeconds: 30
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true # provides richer host metadata
        containers:
        - name: filebeat
          securityContext:
            runAsUser: 0
            # If using Red Hat OpenShift uncomment this:
            #privileged: true
          volumeMounts:
          - name: varlogcontainers
            mountPath: /var/log/containers
          - name: varlogpods
            mountPath: /var/log/pods
          - name: varlibdockercontainers
            mountPath: /var/lib/docker/containers
          - name: data
            mountPath: /usr/share/filebeat/data
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MONITORED_ES_USERNAME
              value: elastic
            - name: MONITORED_ES_PASSWORD
              valueFrom:
                secretKeyRef:
                  key: elastic
                  # change if needed
                  name: elk-es-elastic-user
          resources:
            requests:
              cpu: 100m 
              memory: 1024Mi
            limits:
              cpu: 100m
              memory: 1024Mi
        volumes:
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers
        - name: varlogpods
          hostPath:
            path: /var/log/pods
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        # data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
        - name: data
          hostPath:
            path: /var/lib/filebeat-data
            type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - namespaces
  - pods
  verbs:
  - get
  - watch
  - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  # change it if needed
  namespace: elk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  # change it if needed
  namespace: elk
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
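
The TODO in the manifest above (ssl.verification_mode: none) can be resolved by mounting the CA certificate that ECK publishes in the <cluster-name>-es-http-certs-public secret. The following is a sketch of the relevant excerpt to merge into the Beat manifest, assuming the cluster is still named "elk"; the mount path and volume name are examples:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: filebeat
  namespace: elk
spec:
  config:
    monitoring:
      elasticsearch:
        # replaces ssl.verification_mode: none
        ssl.certificate_authorities:
          - /mnt/elastic/ca.crt
  daemonSet:
    podTemplate:
      spec:
        containers:
        - name: filebeat
          volumeMounts:
          - name: es-ca
            mountPath: /mnt/elastic
            readOnly: true
        volumes:
        - name: es-ca
          secret:
            # ECK creates this secret for the "elk" cluster; it contains ca.crt
            secretName: elk-es-http-certs-public

This is an excerpt, not a standalone manifest - merge it into the Filebeat resource above rather than applying it on its own.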

Prepare Metricbeat For Monitoring The Cluster

Metricbeat can use substantial resources in larger environments!

Metricbeat will scrape metrics from pods that have a certain label set.
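
The Metricbeat autodiscover conditions in the manifest below match pods labeled stackmonitoring: elasticsearch (and kibana/logstash respectively), so the label has to be added to the pods you want monitored. A sketch for an Elasticsearch resource, assuming the cluster and nodeSet names from this series; adjust names, version, and count to your setup:

# excerpt - add the label to the podTemplate of your existing Elasticsearch manifest
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elk
  namespace: elk
spec:
  version: 7.10.0
  nodeSets:
  - name: default
    count: 3
    podTemplate:
      metadata:
        labels:
          # must match the autodiscover condition in the Metricbeat manifest
          stackmonitoring: elasticsearch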

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: metricbeat
  namespace: elk
spec:
  type: metricbeat
  version: 7.10.0
  elasticsearchRef:
    # in a production environment you should use a dedicated monitoring stack;
    # here we use the same stack for simplicity
    name: elk
  config:
    monitoring:
      # we use the internal collection feature (the one that is NOT deprecated); it sends the metrics to Elasticsearch
      enabled: true
      # GET / on Elasticsearch shows the cluster UUID; if you do not define it,
      # Stack Monitoring will show you a "Standalone Cluster"
      cluster_uuid: n2KDDWUMS2q4h8P5F3_Z8Q
      elasticsearch:
        hosts: ["https://elk-es-http:9200"]
        username: ${MONITORED_ES_USERNAME}
        password: ${MONITORED_ES_PASSWORD}
        ssl.verification_mode: none
    metricbeat:
      autodiscover:
        providers:
          - type: kubernetes
            scope: cluster
            node: ${NODE_NAME}
            hints:
              enabled: true
            templates:
              # this will monitor elasticsearch pods that have this label
              - condition:
                  contains:
                    kubernetes.labels.stackmonitoring: elasticsearch
                config:
                  - module: elasticsearch
                    metricsets:
                      - ccr
                      - cluster_stats
                      - enrich
                      - index
                      - index_recovery
                      - index_summary
                      - ml_job
                      - node
                      - node_stats
                      - pending_tasks
                      - shard
                    period: 10s
                    hosts: "https://${data.host}:9200"
                    username: ${MONITORED_ES_USERNAME}
                    password: ${MONITORED_ES_PASSWORD}
                    # WARNING: disables TLS as the default certificate is not valid for the pod FQDN
                    # TODO: switch this to "certificate" when available: https://github.com/elastic/beats/issues/8164
                    ssl.verification_mode: "none"
                    # so the metrics land in a ".monitoring-*" index, which is used by Kibana's Stack Monitoring app
                    xpack.enabled: true
              # monitoring kibana pods
              - condition:
                  contains:
                    kubernetes.labels.stackmonitoring: kibana
                config:
                  - module: kibana
                    metricsets:
                      - stats
                      - status
                    period: 10s
                    hosts: "https://${data.host}:5601"
                    username: ${MONITORED_ES_USERNAME}
                    password: ${MONITORED_ES_PASSWORD}
                    # WARNING: disables TLS as the default certificate is not valid for the pod FQDN
                    # TODO: switch this to "certificate" when available: https://github.com/elastic/beats/issues/8164
                    ssl.verification_mode: "none"
                    xpack.enabled: true
              # monitoring logstash pods
              - condition:
                  contains:
                    kubernetes.labels.stackmonitoring: logstash
                config:
                  - module: logstash
                    metricsets:
                      - node
                      - node_stats
                    period: 10s
                    hosts: "http://${data.host}:9600"
                    #username: ${MONITORED_ES_USERNAME}
                    #password: ${MONITORED_ES_PASSWORD}
                    # WARNING: disables TLS as the default certificate is not valid for the pod FQDN
                    # TODO: switch this to "certificate" when available: https://github.com/elastic/beats/issues/8164
                    #ssl.verification_mode: "none"
                    xpack.enabled: true
      modules:
      - module: system
        period: 10s
        metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        process:
          include_top_n:
            by_cpu: 5
            by_memory: 5
        processes:
        - .*
      - module: system
        period: 1m
        metricsets:
        - filesystem
        - fsstat
        processors:
        - drop_event:
            when:
              regexp:
                system:
                  filesystem:
                    mount_point: ^/(sys|cgroup|proc|dev|etc|host|lib)($|/)
      - module: kubernetes
        period: 10s
        host: ${NODE_NAME}
        hosts:
        - https://${NODE_NAME}:10250
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        ssl:
          verification_mode: none
        metricsets:
        - node
        - system
        - pod
        - container
        - volume
    processors:
    - add_cloud_metadata: {}
    - add_host_metadata: {}
    logging.json: true
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: metricbeat
        automountServiceAccountToken: true
        # required to read /etc/beat.yml
        securityContext:
          runAsUser: 0
        containers:
        - args:
          - -e
          - -c
          - /etc/beat.yml
          - -system.hostfs=/hostfs
          name: metricbeat
          volumeMounts:
          - mountPath: /hostfs/sys/fs/cgroup
            name: cgroup
          - mountPath: /var/run/docker.sock
            name: dockersock
          - mountPath: /hostfs/proc
            name: proc
          env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: MONITORED_ES_USERNAME
            value: elastic
          - name: MONITORED_ES_PASSWORD
            valueFrom:
              secretKeyRef:
                key: elastic
                # change it according to your cluster name
                name: elk-es-elastic-user
          # Metricbeat's RAM usage can peak quite high - observe it and set a proper value. The default of 200Mi is not enough
          # once more than a few pods are being monitored
          resources:
            requests:
              cpu: 100m 
              memory: 2048Mi
            limits:
              cpu: 100m
              memory: 2048Mi
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true # provides richer host metadata
        securityContext:
          runAsUser: 0
        terminationGracePeriodSeconds: 30
        volumes:
        - name: cgroup
          hostPath:
            path: /sys/fs/cgroup
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
        - name: proc
          hostPath:
            path: /proc
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: metricbeat
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - namespaces
  - events
  - pods
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "extensions"
  resources:
  - replicasets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - deployments
  - replicasets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/stats
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metricbeat
  namespace: elk
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metricbeat
subjects:
- kind: ServiceAccount
  name: metricbeat
  # change it if needed
  namespace: elk
roleRef:
  kind: ClusterRole
  name: metricbeat
  apiGroup: rbac.authorization.k8s.io
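
The same labeling applies to the other stack components: the autodiscover conditions above look for stackmonitoring: kibana and stackmonitoring: logstash. A sketch for Kibana, with example names and version:

# excerpt - add the label to the podTemplate of your existing Kibana manifest
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
  namespace: elk
spec:
  version: 7.10.0
  count: 1
  elasticsearchRef:
    name: elk
  podTemplate:
    metadata:
      labels:
        # must match the autodiscover condition in the Metricbeat manifest
        stackmonitoring: kibana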

Proceed to PART-4

Last edited: January 5, 2021
