Simple Logstash, Elasticsearch and Kibana example


Hello everyone! Now that we know how to use Logstash from Post 1 and Post 2, this time we will load world cities data, apply a few filters, transform it, and store the results in Elasticsearch for analysis.

ELK Download

wget https://artifacts.elastic.co/downloads/logstash/logstash-[version-os-arch].tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-[version-os-arch].tar.gz
wget https://artifacts.elastic.co/downloads/kibana/kibana-[version-os-arch].tar.gz

Setup

tar -xvzf logstash-[version-os-arch].tar.gz
tar -xvzf elasticsearch-[version-os-arch].tar.gz
tar -xvzf kibana-[version-os-arch].tar.gz

Update the .profile file located at ~ (do cd ~ to get there).

The .profile file contents are as follows:

...
export LOGSTASH_HOME=/path/to/logstash-version
export ES_HOST_HOME=/path/to/elasticsearch-version
export KIBANA_HOME=/path/to/kibana-version
PATH=$PATH:$LOGSTASH_HOME/bin:$ES_HOST_HOME/bin:$KIBANA_HOME/bin
export PATH
...

Then do a source .profile from the home directory.

Then try running elasticsearch from the terminal; it should start a server at localhost:9200. Similarly, kibana starts at localhost:5601.
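As a quick sanity check (a minimal sketch, assuming the PATH entries above have been sourced), start both services and poke Elasticsearch with curl:

elasticsearch &          # should come up on localhost:9200
kibana &                 # should come up on localhost:5601
curl localhost:9200      # returns a small JSON banner with the cluster name and version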

Alternatively, if you do not wish to flood the .profile file, use .bash_aliases instead.

cd ~
touch .bash_aliases
echo "alias logstash7=/path/to/logstash-version/bin/logstash" > ~/.bash_aliases
echo "alias es6=/path/to/elastic-search-version/bin/elasticsearch" >> ~/.bash_aliases
echo "alias kibana6=/path/to/kibana-version/bin/kibana" >> ~/.bash_aliases

WorldCities Data Download

You can download the data in CSV format from here.

Input Data

input {
    # read the whole CSV from the beginning; sincedb_path => /dev/null means it is re-read on every run
    file {
        path => "/home/ashish/worldcities.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}

filter {
    # parse each line into named columns
    csv {
        separator => ","
        skip_header => true
        columns => ["city","city_ascii","lat","lng","country","iso2","iso3","admin_name","capital","population","id"]
    }

    # copy lat/lng into a nested location object and drop fields we no longer need
    mutate {
        add_field => {
            "[location][lat]" => "%{lat}"
            "[location][lon]" => "%{lng}"
        }
        remove_field => ["lat", "lng", "path", "message", "host"]
    }
}

output {
    # index each row into Elasticsearch, using the csv id column as the document id
    elasticsearch {
        hosts => ["localhost:9200"]
        index => "world-cities"
        document_id => "%{id}"
    }
}

We are using the file input plugin to read the CSV file from the beginning. As lines arrive from the input source, we apply the csv filter plugin, and the resulting documents are dumped into the Elasticsearch host. But before we do that, let's examine our input data.

  • city - The name of the city/town as a Unicode string (e.g. Goiânia)
  • lat/lng - The latitude and longitude of the city.
  • country - The name of the city/town's country.
  • capital - Indicates whether the city is the country's capital.
  • population - An estimate of the city's urban population. Only available for some (prominent) cities. If the urban population is not available, the municipal population is used.

reference

Filter

All the fields together form a document, but we can transform lat and lng into a nested location object. This will later help us easily run geo distance queries on the index, once the location field is mapped as geo_point in the Elasticsearch index.

{
    "city": "",
    "location": {
        "lat": "a double",
        "lon": "a double"
    }
}

We have used the mutate filter here to combine the values of two columns into a new nested object on the same row. With the input and filter stages set up, let's focus on the output: we redirect it to an Elasticsearch index. As mentioned before, the location field will be defined as geo_point for performing geo distance queries.
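Assuming the pipeline above is saved as worldcities.conf (the filename is just an example), it can be run with the logstash binary we put on the PATH earlier:

# -f points Logstash at the pipeline configuration file shown above
logstash -f /path/to/worldcities.conf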

If I do

GET world-cities/_mapping

I get a descriptive JSON response, but in that big structure I am only interested in the location key, which looks something like this:

...
      "location" : {
        "properties" : {
          "lat" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "lon" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
...

This doesn't quite look like a proper location field entry, so we need to change the mapping. Rather, before we run our Logstash script, let's create the index with the proper mapping. The rest of the fields look alright; only location needs fixing. Since this is just an example, we first delete the index:

DELETE world-cities

Then creating an index with proper mapping:

PUT world-cities
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

Now when you execute the get mapping request for the index, it should return JSON like below:

{
   ...
   "location" : {
       "type" : "geo_point"
   }
   ...
}

Having the location field set up properly will let us run geo queries on the index easily.
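For example, a geo_distance query over this mapping could look like the sketch below; the 200km radius and the coordinates (roughly around New Delhi) are just illustrative values:

POST world-cities/_search
{
  "query": {
    "bool": {
      "filter": {
        "geo_distance": {
          "distance": "200km",
          "location": {
            "lat": 28.61,
            "lon": 77.21
          }
        }
      }
    }
  }
}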

Using Kibana

Let's find all the cities of a country by its iso2 code (IN in this example):

POST world-cities/_search?size=0
{
  "query": {
    "query_string": {
      "default_field": "iso2",
      "query": "IN"
    }
  },
  "aggs": {
    "city_agg": {
      "terms": {
        "field": "city.keyword"
      }
    }
  }
}
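The interesting part of the response is the aggregation result. A terms aggregation returns only the top 10 buckets by default (increase its size parameter if you want more cities), so the shape is roughly as below; the bucket key and counts here are placeholders:

{
  ...
  "aggregations" : {
    "city_agg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 123,
      "buckets" : [
        { "key" : "some city", "doc_count" : 1 },
        ...
      ]
    }
  }
}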

Adding Index and populating maps
