Simple Logstash, Elasticsearch and Kibana example
Hello everyone! Now that we know how to use Logstash from Post 1 and Post 2, this time we will load world cities data, apply a few filters, transform it, and store the results in Elasticsearch to analyze them.
ELK Download
wget https://artifacts.elastic.co/downloads/logstash/logstash-[version-os-arch].tar.gz
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-[version-os-arch].tar.gz
wget https://artifacts.elastic.co/downloads/kibana/kibana-[version-os-arch].tar.gz
Setup
tar -xvzf logstash-[version-os-arch].tar.gz
tar -xvzf elasticsearch-[version-os-arch].tar.gz
tar -xvzf kibana-[version-os-arch].tar.gz
Update the contents of the .profile file located at ~ (or do cd ~ and open .profile). The relevant lines are as follows:
...
export LOGSTASH_HOME=/path/to/logstash-version
export ES_HOST_HOME=/path/to/elasticsearch-version
export KIBANA_HOME=/path/to/kibana-version
PATH=$PATH:$LOGSTASH_HOME/bin:$ES_HOST_HOME/bin:$KIBANA_HOME/bin
export PATH
...
Do a source .profile from the home directory. Then try running elasticsearch from the terminal; it should start the server at localhost:9200. Similarly, kibana should start at localhost:5601.
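For example, a quick sanity check could look like the following (the curl check assumes Elasticsearch has finished starting up):

cd ~
source .profile
elasticsearch            # runs in the foreground on localhost:9200
# in a second terminal:
curl localhost:9200      # should return a small JSON blob with cluster and version info
kibana                   # runs in the foreground on localhost:5601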
Alternatively, if you do not wish to clutter the .profile file, use .bash_aliases instead.
cd ~
touch .bash_aliases
echo "alias logstash7=/path/to/logstash-version/bin/logstash" > ~/.bash_aliases
echo "alias es6=/path/to/elastic-search-version/bin/elasticsearch" >> ~/.bash_aliases
echo "alias kibana6=/path/to/kibana-version/bin/kibana" >> ~/.bash_aliases
WorldCities Data Download
You can download the data in CSV format from here.
Input Data
input {
  file {
    path => "/home/ashish/worldcities.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    skip_header => true
    columns => ["city","city_ascii","lat","lng","country","iso2","iso3","admin_name","capital","population","id"]
  }

  mutate {
    add_field => {
      "[location][lat]" => "%{lat}"
      "[location][lon]" => "%{lng}"
    }
    remove_field => ["lat", "lng", "path", "message", "host"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "world-cities"
    document_id => "%{id}"
  }
}
We are using the file input plugin to read the CSV file from the beginning. As each line arrives from the input source we apply the csv filter plugin, and all of this data is then written to the Elasticsearch host. But before we do that, let's examine our input data.
- city - The name of the city/town as a Unicode string (e.g. Goiânia).
- lat/lng - The latitude and longitude of the city/town.
- country - The name of the city/town's country.
- capital - The country's capital.
- population - An estimate of the city's urban population. Only available for some (prominent) cities. If the urban population is not available, the municipal population is used.
Filter
We let all the fields come together into a document, but lat and lng we transform into a nested object. This will later help us easily run geo-distance queries on the index, once the location field is mapped as geo_point in the Elasticsearch index.
{
  "city": "",
  "location": {
    "lat": "a double",
    "lon": "a double"
  }
}
We have used the mutate filter here to convert the values from two columns into a new nested object on the same event. With the input and filter stages set up, let's focus on the output. We are going to send the output to an Elasticsearch index. As mentioned before, the location field will be defined as geo_point for performing geo-distance queries.
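Assuming the pipeline above is saved as worldcities.conf (a file name chosen here just for illustration), it can be run like this:

logstash -f /path/to/worldcities.conf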
If I do
GET world-cities/_mapping
I get a descriptive JSON, but in that big structure I am only interested in the location key, which looks something like this:
...
"location" : {
  "properties" : {
    "lat" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    },
    "lon" : {
      "type" : "text",
      "fields" : {
        "keyword" : {
          "type" : "keyword",
          "ignore_above" : 256
        }
      }
    }
  }
}
...
That doesn't quite look like a proper location field mapping, so we need to change it. Or rather, before we run our Logstash script, let's create the index with the proper mapping. The rest of the fields look alright; only location needs fixing, so let's fix its mapping as below, first by deleting the index (since this is just an example):
DELETE world-cities
Then create the index with the proper mapping:
PUT world-cities
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}
Now when you execute the get-mapping request for the index, it should return JSON like below:
{
  ...
  "location" : {
    "type" : "geo_point"
  }
  ...
}
Having the location field set up properly will now help us run geo queries easily on the index. Remember to re-run the Logstash pipeline after recreating the index, so that the data is indexed again under the new mapping.
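For instance, a geo-distance query against the new mapping could look like the sketch below (the distance and coordinates are only illustrative values):

GET world-cities/_search
{
  "query": {
    "geo_distance": {
      "distance": "200km",
      "location": {
        "lat": 28.61,
        "lon": 77.21
      }
    }
  }
}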
Using Kibana
Let's find all the cities of a country by its ISO2 code:
POST world-cities/_search?size=0
{
  "query": {
    "query_string": {
      "default_field": "iso2",
      "query": "IN"
    }
  },
  "aggs": {
    "city_agg": {
      "terms": {
        "field": "city.keyword"
      }
    }
  }
}
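This matches every document whose iso2 field is IN (India, in this example) and buckets the results by city name using a terms aggregation on city.keyword; size=0 suppresses the individual hits, so only the aggregation buckets come back (by default the terms aggregation returns the top 10 buckets).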