Using InfluxDB to store performance metrics

Maciej Szafraniec

InfluxDB it is a database oriented on time series. It was designed to accept large number of queries and write requests. It is perfect for storing timestamped data – data we would like to query by timestamp rather than index. It supports SQL-like language to easily pull data off as well as JSON requests through HTTP API. Influx uses tags and fields – tags to easily index series of data (you can query by it after where clause) and fields for keeping values (select field_1, field_2 from … )

In order to get or write data you can use influx shell started directly from Linux command line or use HTTP endpoint with proper JSON requests (to save data) or query (to get data). To start a terminal use:

$ influx -port 7086
Visit https://enterprise.influxdata.com to register for updates
Connected to http://localhost:7086 version 1.4.3
InflushDB shell version: 1.0.0
>

We can have multiple ‘databases’ within single influx instance, so each type of metric can be stored in different database. To switch between databases type: use <database>:

InflushDB shell version: 1.0.0
> use jmeter
Using database jmeter
>

Databases must be created manually with create database <name> command.

Within a database data is structured with so called measurements that are equivalents of tables. To list all measurements in database use: show measurements. Measurements are created dynamically upon write request – when where is a request with data to non-existing measurement it will be created automatically with all the fields and tags (also write request with data with non-existing tags or fields will extend this measurement).  To know the structure of a measurement we can use following commands:

·         show tag keys from <measurement> – to list all tags within measurement

·         show field keys from <measurement> – to list all fields and its type within measurement

name: test#1
------------
tagKey
env
successful
transaction
name: test#1
------------
fieldKey    fieldType
build       integer
response    string
value       float

Having this information, we can create a select query to grab information. The simples one is: select * from “test” (note that measurement’s name must be in quotes):

> select * from "test#1"
name: test#1
------------
time                    build   env     response
1460628592046000000     1429    unit    OK
1460628594449000000     1429    unit    OK
1460628598740400000     1429    unit    OK
1460628602446000000     1429    unit    OK
1460628602987009000     1429    unit    OK
1460628612249600000     1429    unit    OK
1460628624798000000     1429    unit    OK
1460628627985000000     1429    unit    OK
1460628684854200000     1429    unit    OK
1460628692048800000     1429    unit    OK

And the more complicated one using fields and tags will be like:

> select response, value from "test#1" where "transaction"='User Authentication'
name: test#1
------------
time                    build   env     response
1460628592046000000     1429    unit    OK

Please notice that tag names are also put into quotes and string values into apostrophes.

Finally, we can query based on timestamp what is a key feature here:

select * from „test#1” where time < 1460628601937000000 – based on given timestamp

select * from „test#1” where time < now() – 10d – list all entries from the last 10 days

Influx has also several aggregation functions especially useful for data analysis, like:

  • COUNT()
  • MEAN()
  • STDDEV()
  • SUM()
  • DISTINCT()

Like regular SQL it supports data selectors like:

  • FIRST()
  • LAST()
  • MAX()
  • PERCENTILE()

And transformations:

  • EXP()
  • COS()
  • ROUND()
> select MEAN("value") from "test#1" where time < 1460628612249600000
name: test#1
------------
time    mean
0       4425.25

Writing the data to InfluxDB can be done in 2 ways. First one is through Influx shell using INSERT statement which goes according to rules:

INSERT <measurement> [,<tag sets>] <field sets> <timestamp>

So our insert example into InfluxDB via CLI looks like;

INSERT test#1, env=”unit” successful=”true” transaction=”abc” value=1111, build=1234, response=”OK” 1439938800000000000

Second option for writing data into InfluxDB is using POST request with JSON through HTTP API. Request shall be sent to

<influx_host>/db/<database>/series?u=<username>&p=<password[MA1] [MA2] > with a body in a form of JSON:

[{  
"measurement": 'test#1',  
"tags": 
    {
'env': 'unit',  
'transaction': 'abc',
'successful': 'True'
    },  
"time": 1460628601937000000,  
   	"fields": 
    {
'build': 1234,  
         	'value': 2990.0,  
    		'response': 'OK'
    }  
}]  

Instead of putting credentials into request basic HTTP authentication can be used. Also, structure of a body allows to save multiple point within single request. Time can be specified in JSON body but in case of its lack it will be generated automatically. As mentioned before Influx columns (tags or fields) can be created dynamically so we can add them whenever we want without any error – existing series would be filled with nulls in these columns.

Here is python example how to write data to InfluxDB using HTTP API:

from influxdb import InfluxDBClient

host = 'influx_host'
port = 8086 
username = 'username'
password = 'password'
json = [{
    "measurement": 'test#1',
    "tags":
        {
            'env': 'unit',
            'transaction': 'abc',
            'successful': 'True'
        },
    "time": 1460628601937000000,
    "fields":
        {
            'build': 1234,
            'value': 2990.0,
            'response': 'OK'
        }
}]

client = InfluxDBClient(host, port, username, password)
client.write_points(points=json, database='jmeter')

In InfluxDB measurements, tag keys, tag values and field keys are always strings. This is the reason why any math operation cannot be performed on tag values. Different from this are field values where different data types can be stored:

·         Float – default numerical value, when we request value=2990.0 or value=2990 it will be stored as float

·         Integer – to force value be saved as integer it must have an ‘i’ at the end: value=2990i

·         String – value put in quotes will be treated as string: value=”2990”

·         Boolean – TRUE is represented by t, T, true, True, or TRUE; FALSE is represented by f, F, false, False, or FALSE

Type of filed value is set during first write request.

 Summary

Influx seems to be a very decent DB for storing time-oriented large amount of data.  According to external benchmarks it is much more efficient comparing to i.e. Cassandra in terms of write throughput or response times. Additionally, many visualizing data tools like Grafana have a built-in support for pulling out data what makes charts creation very easy.

Poznaj mageek of j‑labs i daj się zadziwić, jak może wyglądać praca z j‑People!

Skontaktuj się z nami