Monday, June 20, 2016

Developing Tibco BW interfaces with Elasticsearch for trace logs – PART 1

Writing trace logs to SQL Server is old-fashioned.

Elasticsearch gives many powerful options to search and visualize data:
1. Based on the Lucene engine, it's possible to perform a free-text search on the data.
2. Visualize data easily with Kibana (which connects to the Elasticsearch DB).
3. Create alerts based on the data.
4. Many more!

In this first part, we'll get to know the Elasticsearch platform a bit.

Getting started

 

Download and install:

Elasticsearch, Kibana, Sense.

Download curl for Windows:


 

Start Elasticsearch server and configure mappings


Background


Elasticsearch stores documents, which need to be sent in JSON format:

a. Each document set should have a configured mapping.
In the example below, a mapping is created with an index named "monitors" and a type named "monitor". Types can be reused across several indexes.

b. It's good to have one date field, which will be used as the "Time-field" in Kibana.

c. Analyzed index (the default) – enables full-text search on the field. It's possible to define various analyzers.
Non-analyzed index – full-text search is disabled for the field; the exact term must be used when searching for the doc.

More info here: http://stackoverflow.com/questions/12836642/analyzers-in-elasticsearch

d. It's possible to write documents with fields that don't exist in the mapping.

e. It's impossible to modify existing mappings (only to drop & create), and dropping a mapping means deleting all of its documents.


f. term vs. match – the match query applies the same standard analyzer to the search term and will therefore match what is stored in the index. The term query does not apply any analyzer to the search term, so it will only look for that exact term in the index (see the sketch after this list).

g. It's important to partition data by dates (using index templates), for instance by creating date-based indexes (all using the same type).
Please note that after an index template is created, Elasticsearch waits for a first document to be inserted in order to automatically build the index.
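
To make points c and f concrete, here's a minimal sketch in Sense syntax (Sense is introduced further below) against the "monitors" mapping we're about to create, where OpName is analyzed and ProcessGroup is not_analyzed. The match query analyzes the search term too, so it finds any document whose OpName contains the token "test":

POST monitors/monitor/_search
{
  "query": {
    "match": { "OpName": "test" }
  }
}

The term query skips analysis, so it matches only when the stored ProcessGroup value is exactly "test":

POST monitors/monitor/_search
{
  "query": {
    "term": { "ProcessGroup": "test" }
  }
}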


Go for it


1. Open a command line and start the Elasticsearch server (default port is 9200):

[Elasticsearch installation]\bin\elasticsearch.bat


2. Open another command line and create the mapping by running the following (entered as a single line):

curl -XPOST "localhost:9200/monitors" -d "{ \"settings\" : { \"number_of_shards\" : 1 }, \"mappings\" : { \"monitor\" : { \"properties\" : { \"ProcessGroup\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"ProcessName\": { \"type\": \"string\", \"index\": \"analyzed\" }, \"OpName\": { \"type\": \"string\", \"index\": \"analyzed\" }, \"Domain\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"TraceType\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"TraceDateTime\": { \"type\": \"date\", \"format\": \"yyyy-MM-dd HH:mm:ss\" }, \"PatientID\": { \"type\": \"string\", \"index\": \"analyzed\" }, \"MessageDateTime\": { \"type\": \"string\" }, \"ApplicationCode\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"SrcMessageID\": { \"type\": \"string\", \"index\": \"analyzed\" }, \"ProcessID\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"OpID\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"OpParentID\": { \"type\": \"string\", \"index\": \"not_analyzed\" }, \"HostName\": { \"type\": \"string\", \"index\": \"not_analyzed\" } } } } }"




3. Insert this test document:

{
  "ProcessGroup": "test",
  "ProcessName": "test",
  "OpName": "test",
  "Domain": "test",
  "TraceType": "Info",
  "TraceDateTime": "2016-04-04 04:46:47",
  "PatientID": "test",
  "MessageDateTime": "2016-04-04 04:46:47",
  "ApplicationCode": "test",
  "SrcMessageID": "54000",
  "ProcessID": "test",
  "OpID": "test",
  "OpParentID": "test",
  "HostName": "ohadavn",
  "req1": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "req2": "yyyyyyyyyyyyyyyyyyyyyyyyyy"
}


By sending this command (again, entered as one line):

curl -XPOST "http://localhost:9200/monitors/monitor/?pretty" -d "{ \"ProcessGroup\": \"test\", \"ProcessName\": \"test\", \"OpName\": \"test\", \"Domain\": \"test\", \"TraceType\": \"Info\", \"TraceDateTime\": \"2016-04-04 04:46:47\", \"PatientID\": \"test\", \"MessageDateTime\": \"2016-04-04 04:46:47\", \"ApplicationCode\": \"test\", \"SrcMessageID\": \"54000\", \"ProcessID\": \"test\", \"OpID\": \"test\", \"OpParentID\": \"test\", \"HostName\": \"ohadavn\", \"req1\": \"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\", \"req2\": \"yyyyyyyyyyyyyyyyyyyyyyyyyy\" }"

Pretty ugly, right? Don't worry, we'll get started with Sense right away...

Start Kibana server and add a new index pattern


1. Start the Kibana server (default port is 5601):

[Kibana installation]\bin\kibana.bat


2. Open a web browser:

http://localhost:5601/

3. Add an index-pattern to the mapping configured previously:

Index name or pattern: monitors*
Time-field name: TraceDateTime



4.  On the top menu, choose "Discover" to view all documents
     (currently, only one exists).
     On the top-right menu, change the time range to "Last 5 years".


 

Sense – a GUI for sending commands to Elasticsearch


In Kibana, open the Sense app (by default at http://localhost:5601/app/sense):


1. Delete the index built before by running:

DELETE /monitors

2. Create Index Template & Mapping:

POST /_template/template_monitors
{
  "template": "monitors*",
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "monitor": {
      "properties": {
        "ProcessGroup": {
          "type": "string",
          "index": "not_analyzed"
        },
        "ProcessName": {
          "type": "string",
          "index": "not_analyzed"
        },
        "OpName": {
          "type": "string",
          "index": "not_analyzed"
        },
        "Domain": {
          "type": "string",
          "index": "not_analyzed"
        },
        "LogLevel": {
          "type": "string",
          "index": "not_analyzed"
        },
        "StartDateTime": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "EndDateTime": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        },
        "PatientID": {
          "type": "string",
          "index": "not_analyzed"
        },
        "MessageDateTime": {
          "type": "string"
        },
        "ApplicationCode": {
          "type": "string",
          "index": "not_analyzed"
        },
        "SrcMessageID": {
          "type": "string",
          "index": "not_analyzed"
        },
        "ProcessID": {
          "type": "string",
          "index": "not_analyzed"
        },
        "OpID": {
          "type": "string",
          "index": "not_analyzed"
        },
        "OpParentID": {
          "type": "string",
          "index": "not_analyzed"
        },
        "HostName": {
          "type": "string",
          "index": "not_analyzed"
        },
        "Status": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
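
To verify the template was stored, you can fetch it back (a quick sanity check):

GET /_template/template_monitors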

3. Insert a test document into an index for the current month (the index will be generated now from the template):

POST monitors-2016-05/monitor
{
  "ProcessGroup": "test",
  "ProcessName": "test",
  "OpName": "test",
  "Domain": "test",
  "LogLevel": "Info",
  "StartDateTime": "2016-05-04 04:46:47",
  "EndDateTime": "2016-05-04 04:47:47",
  "PatientID": "test me please",
  "MessageDateTime": "2016-05-04 04:46:47",
  "ApplicationCode": "test",
  "SrcMessageID": "54000",
  "ProcessID": "test",
  "OpID": "test",
  "OpParentID": "test",
  "HostName": "ohadavn",
  "Status": "10",
  "req1": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "req2": "yyyyyyyyyyyyyyyyyyyyyyyyyy"
}
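
To confirm that the monthly index was indeed generated from the template, list the matching indexes with the _cat API:

GET _cat/indices/monitors*?v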

 Basic useful queries:


 Get "monitor" type mapping:


 GET monitors/_mapping/monitor 

Get all "monitor" documents:


POST monitors*/monitor/_search
{
  "query": {
    "match_all" : {}
  }
}

First 200 documents of the "monitor" type:


GET monitors*/monitor/_search
{
  "from": 0, "size": 200,
  "query": {
    "term": {
      "_type": "monitor"
    }
  }
}

OR

GET monitors*/monitor/_search
{
  "from": 0, "size": 200
}

The last 10 "monitor" documents inserted yesterday or today:


POST monitors*/monitor/_search
{
  "from": 0,
  "size": 10,
  "query": {
          "range": {
            "StartDateTime": {
              "gte": "now-1d/d",
              "lte": "now/d",
              "boost": 2,
              "format": "yyyy-MM-dd HH:mm:ss"
            }
          }
  },
  "sort": {
    "StartDateTime": {
      "order": "desc",
      "ignore_unmapped": "true"
    }
  }
}

Documents of "monitor" type sorted by descending time and between two dates:


POST monitors*/monitor/_search
{
  "query": {
    "range": {
      "StartDateTime": {
        "gte": "2016-04-01 13:59:50",
        "lte": "2016-04-10 13:59:50",
        "boost": 2,
        "format": "yyyy-MM-dd HH:mm:ss"
      }
    }
  },
  "sort": {
    "StartDateTime": {
      "order": "desc",
      "ignore_unmapped": "true"
    }
  }
}

Document by id:


POST monitors*/monitor/_search
{
  "query": {
    "term": {
      "_id": {
        "value": "AVP-jR7M3HcXFbGJa5pk"
      }
    }
  }
}

OR

POST monitors*/monitor/_search
{
  "query": {
    "term": {
      "_id": "AVP-jR7M3HcXFbGJa5pk"
    }
  }
}
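
Alternatively, when the concrete index is known, the document can be fetched directly with the get API (using the same example ID):

GET monitors-2016-05/monitor/AVP-jR7M3HcXFbGJa5pk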

Documents filtered by a condition (filters don't affect scoring), ordered by descending date:


POST monitors*/monitor/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "ProcessName": "myApp" }
      }
    }
  },
  "sort": {
    "StartDateTime": {
      "order": "desc",
      "ignore_unmapped": "true"
    }
  }
}

Contains query:



POST monitors*/monitor/_search
{
  "query": {
      "wildcard": {
        "ProcessName": "*test*"
      }
  },
  "sort": {
    "StartDateTime": {
      "order": "desc",
      "ignore_unmapped": "true"
    }
  }
}
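
Note that wildcards with a leading * are slow on large indexes, because they can't make efficient use of the term dictionary. If a starts-with match is enough, a prefix query is a cheaper alternative (a sketch using the same field):

POST monitors*/monitor/_search
{
  "query": {
    "prefix": { "ProcessName": "test" }
  }
}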


Create snapshot (backup data):


The first command registers a filesystem repository named "dbbackup" (see the path.repo remark at the end of this post); the second creates a snapshot named "snap" inside it:

PUT /_snapshot/dbbackup
{
  "type": "fs",
  "settings": {
      "compress": true,
      "location": "dbbackup"
  }
}

PUT /_snapshot/dbbackup/snap?wait_for_completion=true

Restore snapshot:


POST /_snapshot/dbbackup/snap/_restore
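
A restore will fail if an open index with the same name already exists, so close or delete it first. The restore call also accepts a body; for example, to restore only the monitors indexes:

POST /_snapshot/dbbackup/snap/_restore
{
  "indices": "monitors*"
}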


Delete by query:


Requires plugin installation:

[Elasticsearch installation]\bin\plugin install delete-by-query

DELETE /monitors/monitor/_query
{
  "query": {
    "term": { "ProcessName" : "proc" }
  }
}
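
Before actually deleting, it's worth checking how many documents would match; the _count API accepts the same query body:

POST /monitors/monitor/_count
{
  "query": {
    "term": { "ProcessName": "proc" }
  }
}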


Visualize & Dashboard


I won't get into this subject in this two-part series, but it's really
easy to create various data visualizations and dashboards using Kibana.


Remarks


In order to send commands to Elasticsearch from other clients, edit

[Elasticsearch installation]\config\elasticsearch.yml

and add:

cluster.name: [cluster name]
network.host: ["[server IP address]", "127.0.0.1"]

In order to create snapshots, also add:

path.repo: ["C:\\path\\to\\snapshots"]
