Task #2221

closed

Experiment with a new monitoring/metrics tool: influxdata (influxdb)

Added by Andrea Dell'Amico over 9 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Assignee: _InfraScience Systems Engineer
Category: System Application
Target version:
Start date: Feb 10, 2016
Due date:
% Done: 100%
Estimated time:
Infrastructure: Development, Pre-Production, Production

Description

The tool is promising:

  • All the components are free software (they sell managed services and consulting)
  • Can do alerts, or delegate them to nagios
  • It has metrics plugins for almost all the software we commonly use

Drawbacks: it does not seem to aggregate logs natively as logstash does, but in the worst case we could use the same input plugins to route the logs to an ELK instance.
Some references:

https://influxdata.com/time-series-platform/
https://www.elastic.co/guide/en/logstash/current/plugins-outputs-influxdb.html
https://www.digitalocean.com/community/tutorials/how-to-analyze-system-metrics-with-influxdb-on-centos-7
https://github.com/rochaporto/collectd-ceph
https://github.com/pbanaszkiewicz/collectd-ganeti (Needs modifications because it only supports KVM right now)

Actions #1

Updated by Andrea Dell'Amico over 9 years ago

  • Related to Task #1854: Tool for usage statistics on DataMiner added
Actions #2

Updated by Andrea Dell'Amico over 9 years ago

  • Status changed from New to In Progress
Actions #3

Updated by Andrea Dell'Amico over 9 years ago

  • % Done changed from 0 to 20

A preliminary dashboard is available here: node0-monitoring.research-infrastructures.eu
Beware that there is no authentication and anyone can edit the dashboards. Access is restricted to the ISTI networks.

Actions #4

Updated by Andrea Dell'Amico over 9 years ago

It seems better suited to collecting application and system metrics, and perhaps sending alerts based on them, while ELK seems better suited to aggregating logs.
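
For reference, a minimal sketch of what a single metric looks like when written to influxdb using its text-based line protocol (`measurement,tag=value field=value timestamp`). The helper names (`format_point`, `_escape`, `_format_field`), the `cpu_load` measurement and the `node0` host tag are hypothetical and only for illustration; this is not taken from our setup.

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape each of the given characters in value."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value) -> str:
    """Render a field value: booleans, integers (suffix i), floats, strings."""
    if isinstance(value, bool):  # must be checked before int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def format_point(measurement, tags, fields, timestamp_ns=None) -> str:
    """Build one line-protocol line (hypothetical helper, assumption)."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable output
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(value)
        for key, value in fields.items()
    )
    line = head + " " + body
    if timestamp_ns is not None:
        line += " " + str(int(timestamp_ns))
    return line


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(format_point("cpu_load", {"host": "node0"}, {"value": 0.64},
                       1455100000000000000))
    # prints: cpu_load,host=node0 value=0.64 1455100000000000000
```

Collectors such as collectd or fluentd would normally produce these lines for us; the sketch only shows the shape of the data that ends up in the database.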

Actions #5

Updated by Andrea Dell'Amico over 9 years ago

http://www.collectd.org and http://www.fluentd.org can talk with influxdb (and other backends) and seem to offer a wide range of input plugins.

Actions #6

Updated by Andrea Dell'Amico over 9 years ago

And it seems that fluentd can collect the syslog output and feed it to influxdb: http://www.fluentd.org/guides/recipes/syslog-influxdb

Actions #7

Updated by Andrea Dell'Amico over 9 years ago

I see that Chronograf is still very limited and the useful features will not be open source. Grafana (http://grafana.org) seems a better alternative that can also read from elasticsearch, prometheus and other databases, so it may be possible to have the same graphical interface for all the monitoring sources.

Actions #8

Updated by Andrea Dell'Amico over 9 years ago

  • Related to deleted (Task #1854: Tool for usage statistics on DataMiner)
Actions #9

Updated by Andrea Dell'Amico over 8 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 20 to 100

As the interesting tools are commercial, and prometheus seems a better solution, influxdb will not be deployed.
