PART-5 - Elasticsearch Templates And Policies

Because we use Logstash to analyse logs, we have to prepare index templates and ILM policies.

Setting Up ILM-Policies And Index-Templates

For this setup I created an index-template matching the index-pattern logstash_syslog_ilm-*-*. The field mapping follows the ECS mapping (https://github.com/elastic/ecs/tree/master/generated/elasticsearch/7), so you can e.g. query filebeat, logstash and metricbeat indices simultaneously, assuming you set up a proper index-pattern in Kibana.

Default Templates

For the shards of fresh indices to be placed on the hot-nodes, we need a default index-template and a default legacy index template.

I name them 01-default-template so they are listed on top, and they are kept very simple. The important part is that they have a higher order/priority than every other template, so that in case of a conflict they override settings set by e.g. filebeat-managed templates.

Legacy Default Index Template

PUT _template/01-default-template
{
  "version": 1,
  "order": 1000,
  "index_patterns": [
    "*"
  ],
  "settings": {
    "index": {
      "number_of_shards": "2",
      "number_of_replicas": "1",
      "routing": {
        "allocation": {
          "require": {
            "data": "hot"
          },
          "total_shards_per_node": "2"
        }
      }
    }
  },
  "aliases": {},
  "mappings": {}
}
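
How the order value plays out for legacy templates can be sketched in a few lines of Python. This is only a rough model, not Elasticsearch code: matching legacy templates are applied from the lowest to the highest order, so a setting from the template with the higher order wins. The beats_like template and its values are made up for illustration.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict in which values from override win, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def effective_settings(templates):
    """Apply templates from lowest to highest order, so the highest order wins."""
    result = {}
    for template in sorted(templates, key=lambda t: t["order"]):
        result = deep_merge(result, template["settings"])
    return result


# Hypothetical Beats-style legacy template with a low order value.
beats_like = {
    "order": 1,
    "settings": {"index": {"number_of_shards": "1", "refresh_interval": "5s"}},
}
# The relevant parts of the legacy 01-default-template from above.
default = {
    "order": 1000,
    "settings": {
        "index": {
            "number_of_shards": "2",
            "routing": {"allocation": {"require": {"data": "hot"}}},
        }
    },
}

settings = effective_settings([default, beats_like])
print(settings["index"]["number_of_shards"])  # 2 (from the default template)
print(settings["index"]["refresh_interval"])  # 5s (only set by the other template)
```

Settings that only the lower-order template defines survive the merge; only conflicting keys are overridden.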

New Index Template

Create a component-template 01-default-template

PUT _component_template/01-default-template
{
  "version": 1,
  "template": {
    "settings": {
      "index": {
        "number_of_shards": "2",
        "number_of_replicas": "1",
        "routing": {
          "allocation": {
            "require": {
              "data": "hot"
            },
            "total_shards_per_node": "2"
          }
        }
      }
    }
  }
}

and then create the index-template using the component-template. (It seems you unfortunately cannot use * as index-pattern anymore in new index-templates, so I added only the pattern that the Logstash setup described later will use.)

PUT _index_template/01-default-template
{
  "version": 1,
  "priority": 1000,
  "index_patterns": [
    "logstash*"
  ],
  "composed_of": [
    "01-default-template"
  ]
}
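
One thing that may be worth double-checking at this point: unlike legacy templates, new (composable) index-templates are not merged with each other. When several of them match a new index, only the one with the highest priority is applied, and a missing priority counts as 0. The sketch below imitates that selection for the two composable templates used in this article. The index name is a made-up example, and Python's fnmatch is only a stand-in for Elasticsearch's own wildcard matching.

```python
from fnmatch import fnmatchcase


def winning_template(index_name, templates):
    """Return the name of the matching template with the highest priority, or None."""
    matching = [
        t for t in templates
        if any(fnmatchcase(index_name, pattern) for pattern in t["index_patterns"])
    ]
    if not matching:
        return None
    # A template without an explicit priority is treated as priority 0.
    return max(matching, key=lambda t: t.get("priority", 0))["name"]


templates = [
    {"name": "01-default-template", "priority": 1000, "index_patterns": ["logstash*"]},
    {"name": "logstash-syslog", "index_patterns": ["logstash_syslog_ilm-*-*"]},
]

# Hypothetical index name that matches both patterns.
print(winning_template("logstash_syslog_ilm-2020.12.19-000001", templates))
# prints: 01-default-template
```

If your cluster behaves like this sketch, the logstash* default template would shadow the more specific logstash-syslog template. The simulate index API (POST /_index_template/_simulate_index/<index-name>) shows which settings and mappings an index with a given name would really get.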

ILM Policy For The Syslog-Index

You can simply create it via Kibana; I named it logstash-syslog. If you set the "Data allocation" to the recommended option, you won't see the allocate key that appears in the request below.

PUT _ilm/policy/logstash-syslog
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "50gb",
            "max_age": "1d"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "forcemerge": {
            "max_num_segments": 1,
            "index_codec": "best_compression"
          },
          "set_priority": {
            "priority": 50
          },
          "shrink": {
            "number_of_shards": 1
          },
          "allocate": {
            "require": {
              "data": "warm"
            }
          }
        }
      },
      "cold": {
        "min_age": "2d",
        "actions": {
          "freeze": {},
          "set_priority": {
            "priority": 25
          },
          "allocate": {
            "require": {
              "data": "cold"
            }
          }
        }
      },
      "delete": {
        "min_age": "3d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}

As you see I'm old school and I want daily indices. They are rolled over either when they reach a size of 50GB or when they are older than 1 day.
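
To get a feeling for the timing of this policy, here is a small Python sketch. It relies on one fact about ILM: when the hot phase uses rollover, the min_age of the later phases is counted from the rollover, not from index creation. The creation date is a made-up example, and the sketch assumes the rollover is triggered by max_age after exactly one day. Real transitions happen a little later, because ILM only checks its conditions periodically.

```python
from datetime import datetime, timedelta

# min_age values of the phases after "hot", taken from the policy above.
PHASE_MIN_AGE = {
    "warm": timedelta(days=1),
    "cold": timedelta(days=2),
    "delete": timedelta(days=3),
}


def phase_schedule(rolled_over_at):
    """Earliest point in time at which each phase can start for a rolled-over index."""
    return {phase: rolled_over_at + age for phase, age in PHASE_MIN_AGE.items()}


created = datetime(2020, 12, 19)       # hypothetical creation time of the index
rolled = created + timedelta(days=1)   # assumption: rollover triggered by max_age
schedule = phase_schedule(rolled)

for phase, when in schedule.items():
    print(phase, when.date())
# warm 2020-12-21
# cold 2020-12-22
# delete 2020-12-23
```

So under these assumptions an index exists for roughly four days in total: one day as write index, then one day each before it moves to warm, cold and finally gets deleted.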

A note about max_size: it is calculated on the size of all primary shards! So if you have 2 shards and 1 replica, the rollover will happen after each of the primary shards has about 25GB, which is good, as we later shrink the index to only 1 shard with 1 replica. After the shrinking the shard has a size of roughly 50GB. In "Index Management" you always see the "Storage Size", which sums up the storage usage of primary and replica shards!
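
The arithmetic behind this note, as a tiny Python sketch. It assumes the documents are spread evenly over the primary shards, which is only approximately true in practice.

```python
def rollover_sizes(max_size_gb, primary_shards, replicas):
    """Return (size per primary shard, total storage incl. replicas) at rollover time."""
    per_primary = max_size_gb / primary_shards
    total_storage = max_size_gb * (1 + replicas)
    return per_primary, total_storage


per_primary, total_storage = rollover_sizes(max_size_gb=50, primary_shards=2, replicas=1)
print(per_primary)    # 25.0 GB in each primary shard when the rollover triggers
print(total_storage)  # 100 GB, roughly what "Storage Size" would show
```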

Logstash-Syslog Index Template

It consists of multiple component-templates:

PUT _index_template/logstash-syslog
{
  "index_patterns": [
    "logstash_syslog_ilm-*-*"
  ],
  "composed_of": [
    "syslog-mappings-dynamic",
    "syslog-mappings",
    "syslog-settings-general",
    "syslog-settings-ilm",
    "syslog-settings-shards",
    "additional-mappings-apache",
    "additional-mappings-geo",
    "additional-mappings-http",
    "additional-mappings-nginx",
    "additional-mappings-syslog5424"
  ]
}

mappings-dynamic

I used the content of dynamic_templates from https://github.com/elastic/ecs/blob/master/generated/elasticsearch/7/template.json (or download the release zip and copy and paste it from the file). I enabled numeric_detection but disabled date_detection. The request should look similar to this (shortened):

PUT _component_template/syslog-mappings-dynamic
{
  "version": 1,
  "template": {
    "mappings": {
      "_routing": {
        "required": false
      },
      "numeric_detection": true,
      "dynamic": true,
      "_source": {
        "excludes": [],
        "includes": [],
        "enabled": true
      },
      "dynamic_templates": [
        {
          "booleans": {
            "mapping": {
              "type": "boolean"
            },
            "match_mapping_type": "boolean"
          }
        },
        [...]
      ],
      "date_detection": false
    }
  }
}

mappings

Again from the ECS blob, I used the content of the properties mappings. As there are a lot of mappings, you probably also have to include some settings (here index.mapping.total_fields.limit) in this component-template to even be able to save it. Mine looks (again shortened) like:

PUT _component_template/syslog-mappings
{
  "version": 3,
  "template": {
    "settings": {
      "index": {
        "mapping": {
          "total_fields": {
            "limit": "10000"
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "container": {
          "properties": {
            "image": {
              "properties": {
                "name": {
                  "ignore_above": 1024,
                  "type": "keyword"
                },
                "tag": {
                  "ignore_above": 1024,
                  "type": "keyword"
                }
              }
            },
            "name": {
            [...]
            }
          }
        }
      }
    }
  }
}

settings-general

A very short template with some general settings

PUT _component_template/syslog-settings-general
{
  "version": 1,
  "template": {
    "settings": {
      "index": {
        "refresh_interval": "15s",
        "translog": {
          "durability": "async"
        },
        "auto_expand_replicas": "false",
        "max_docvalue_fields_search": "200"
      }
    }
  }
}

settings-ilm

This template defines which ILM-policy should be used

PUT _component_template/syslog-settings-ilm
{
  "version": 1,
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "logstash-syslog",
          "rollover_alias": "logstash_syslog_ilm"
        }
      }
    }
  }
}
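
For context on the rollover_alias: a rollover creates the next index behind the alias by incrementing the numeric suffix of the current index name, zero-padded to six digits. The Python sketch below only imitates that naming rule. The index name is a made-up example that fits the logstash_syslog_ilm-*-* pattern, and date math in index names is ignored here.

```python
import re


def next_rollover_name(index_name):
    """Imitate the rollover naming rule: increment the trailing number, padded to 6 digits."""
    match = re.fullmatch(r"(.*-)(\d+)", index_name)
    if match is None:
        raise ValueError("index name must end with '-' followed by a number")
    prefix, counter = match.groups()
    return f"{prefix}{int(counter) + 1:06d}"


# Hypothetical name of the first index behind the alias logstash_syslog_ilm.
print(next_rollover_name("logstash_syslog_ilm-2020.12.19-000001"))
# prints: logstash_syslog_ilm-2020.12.19-000002
```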

settings-shards

Some sharding settings: we always want a minimum of 2 shards, each with 1 replica. In warm/cold we shrink the index down to 1 shard, so the shard has a max size of about 50GB.

PUT _component_template/syslog-settings-shards
{
  "version": 1,
  "template": {
    "settings": {
      "index": {
        "number_of_shards": "2",
        "number_of_replicas": "1",
        "routing": {
          "allocation": {
            "total_shards_per_node": "2"
          }
        }
      }
    }
  }
}

So if you want to roll over in the ILM-policy after a max size of 100GB and want to use the shrinking feature and so on, you should use 4 shards (4x25GB = 100GB) and shrink them to two shards. A shard should not have more than 50GB in size!
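
This rule of thumb can be written down as a small Python helper. It is only a sketch under the assumptions of this article: about 25GB per shard in the hot phase and at most 50GB per shard after shrinking. The function name and the hot_shard_gb parameter are made up. The loop accounts for the shrink restriction that the target shard count has to be a factor of the source shard count.

```python
import math

MAX_SHARD_GB = 50  # upper bound per shard used in this article


def plan_shards(max_size_gb, hot_shard_gb=25):
    """Return (shards in the hot phase, shards after shrink) for a rollover max_size."""
    shrink_target = math.ceil(max_size_gb / MAX_SHARD_GB)
    hot_shards = math.ceil(max_size_gb / hot_shard_gb)
    # Shrink only works if the target count is a factor of the source count.
    while hot_shards % shrink_target != 0:
        hot_shards += 1
    return hot_shards, shrink_target


print(plan_shards(50))   # (2, 1) -> the setup used in this article
print(plan_shards(100))  # (4, 2) -> the example from the paragraph above
```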

additional-mappings

Just a few additional mappings that can be used in every other index-template too. For example, if you want to use the Geo-IP feature from Logstash and have dynamic numeric detection on, it can happen that geo.postal_code is detected as a number, but this field can also have entries with string characters (which will create a mapping error).

Strangely my "path_unmatch": "*.postal_code" configs in the dynamic-mapping-template do not work as expected - didn't have time yet to inspect what I'm doing wrong there 😐
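
Why the postal code is a problem with numeric detection can be imitated in a few lines of Python. This is a very rough stand-in for dynamic mapping, not how Elasticsearch implements it: the first value decides the field type, and a later value that does not fit a numeric type is rejected. The postal codes are made-up examples.

```python
def detect_type(value, numeric_detection=True):
    """Rough stand-in for dynamic mapping of a JSON string value."""
    if numeric_detection:
        for caster, mapped_type in ((int, "long"), (float, "float")):
            try:
                caster(value)
                return mapped_type
            except ValueError:
                continue
    return "string"


def index_values(values, explicit_type=None):
    """Return (field type, error); the first value fixes the type unless one is given."""
    mapped = explicit_type
    for value in values:
        if mapped is None:
            mapped = detect_type(value)
        elif mapped in ("long", "float") and detect_type(value) == "string":
            return mapped, f"mapping error for {value!r}"
    return mapped, None


postal_codes = ["10115", "SW1A 1AA"]  # hypothetical geo.postal_code values
print(index_values(postal_codes))
# ('long', "mapping error for 'SW1A 1AA'")
print(index_values(postal_codes, explicit_type="keyword"))
# ('keyword', None)
```

With an explicit keyword mapping, numeric detection never gets to decide the type of the field.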

The additional-mappings-geo template, for example, looks like:

PUT _component_template/additional-mappings-geo
{
  "version": 1,
  "template": {
    "mappings": {
      "dynamic_templates": [],
      "properties": {
        "geo": {
          "type": "object",
          "properties": {
            "continent_name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "region_iso_code": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "city_name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "country_iso_code": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "country_name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "location": {
              "type": "geo_point"
            },
            "region_name": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "postal_code": {
              "ignore_above": 1024,
              "type": "keyword"
            },
            "region_code": {
              "ignore_above": 1024,
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}

Logstash-Filebeat Index Template

To create your own ECS compliant Filebeat template, you first have to export the template and then create component-templates similar to the syslog ones. To export the Filebeat template, you can install Filebeat somewhere, look for the binary and run e.g.:

# export the beats templates via: 
...beat.exe export template --es.version 7.10.0 > ...beat.template.json

Then use the different parts of the ...beat.template.json to create your component-templates. In the end you can create your index-template. Mine looks like:

PUT _index_template/logstash-filebeat
{
  "index_patterns": [
    "logstash_filebeat_ilm-*-*"
  ],
  "composed_of": [
    "filebeat-mappings-dynamic",
    "filebeat-mappings",
    "filebeat-settings-general",
    "filebeat-settings-ilm",
    "filebeat-settings-query",
    "filebeat-settings-shards",
    "additional-mappings-apache",
    "additional-mappings-geo",
    "additional-mappings-http",
    "additional-mappings-nginx",
    "additional-mappings-syslog5424"
  ]
}

As you can see I also reuse some of the templates that I created for the syslog one before.
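
Splitting the exported template into parts could be sketched in Python as below. The layout of the exported file is an assumption here (a legacy-style template with top-level settings and mappings keys), the sample dict is a tiny made-up stand-in for the real export, and the keys of the returned dict are just labels. How you split the settings further (general, ilm, query, shards) is up to you.

```python
def split_exported_template(exported):
    """Split a legacy-style template dict into component-template request bodies."""
    mappings = exported.get("mappings", {})
    # Everything except the field definitions: dynamic_templates, date_detection, ...
    dynamic_part = {key: value for key, value in mappings.items() if key != "properties"}
    return {
        "mappings-dynamic": {"template": {"mappings": dynamic_part}},
        "mappings": {"template": {"mappings": {"properties": mappings.get("properties", {})}}},
        "settings": {"template": {"settings": exported.get("settings", {})}},
    }


# Tiny made-up stand-in for the content of the exported template file.
sample = {
    "index_patterns": ["filebeat-*"],
    "settings": {"index": {"refresh_interval": "5s"}},
    "mappings": {
        "date_detection": False,
        "dynamic_templates": [],
        "properties": {"message": {"type": "text"}},
    },
}

parts = split_exported_template(sample)
print(sorted(parts))
# ['mappings', 'mappings-dynamic', 'settings']
```

For the real file you would load it with json.load and send each body with PUT _component_template/<name>.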

Proceed to PART-6

Last edited: December 19, 2020
