Task #1854
closed
Tool for usage statistics on DataMiner
Added by Gianpaolo Coro over 9 years ago.
Updated about 8 years ago.
Category:
High-Throughput-Computing
Infrastructure:
Production
Description
A tool is required to deeply analyse the requests to DataMiner and their provenance.
This operation is currently done using Linux commands on the wps.statistical server but it could be done more flexibly on DataMiner.
In particular, Andrea Dell'Amico indicated the following tools:
http://piwik.org/
http://www.awstats.org/
Note that this is different from users' accounting, because the information produced will be for statistical purposes only, to be reported during project reviews. Furthermore, this system should also report all requests from non-portal users (e.g. FishBase).
I see that piwik can also ingest log files, so it's the best candidate: it's more modern and more powerful.
- Related to Task #2221: Experiment with a new monitoring/metrics tool: influxdata (influxdb) added
- Related to deleted (Task #2221: Experiment with a new monitoring/metrics tool: influxdata (influxdb))
- Status changed from New to In Progress
Hostname is going to be: analytics.d4science.org, IP 146.48.122.20
- % Done changed from 0 to 40
The basic configuration is ready. The server answers at http://analytics.d4science.org
It authenticates against LDAP but without support for LDAP groups (it can use one group only), so users have to be manually authorized after their first login.
Now I need a procedure to automatically process the dataminer log files.
I'm going to change the dataminer{1,2}-p-d4s.d4science.org nginx configuration to make the access logs available for download from the analytics server. The new configuration entry will be something like:
location /logs {
    alias /var/log/nginx;
    allow 146.48.122.20; # analytics.d4science.org
    deny all;
}
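For illustration only, a minimal Python sketch of how the analytics server might pull the logs exposed by the /logs location above. The plain-HTTP scheme, the access.log file name (nginx's default) and the helper names log_url and fetch_log are assumptions, not the procedure actually deployed.

```python
import urllib.request

# The two hosts named in the comment above (dataminer{1,2}-p-d4s.d4science.org).
DATAMINER_HOSTS = [
    "dataminer1-p-d4s.d4science.org",
    "dataminer2-p-d4s.d4science.org",
]


def log_url(host, filename="access.log"):
    """Build the download URL for one log file under the /logs location.

    Assumptions: plain HTTP, and nginx's default access log file name.
    """
    return f"http://{host}/logs/{filename}"


def fetch_log(host, filename="access.log", timeout=30):
    """Download one log file and return its content as text.

    This only works from an address allowed by the nginx 'allow' rule.
    """
    with urllib.request.urlopen(log_url(host, filename), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling fetch_log(host) for each entry in DATAMINER_HOSTS would return the raw log text, which could then be handed to whatever import step is used.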
- Status changed from In Progress to Feedback
- % Done changed from 40 to 100
I've imported the logs of the last two days. The procedure will run every hour from now on.
The imported information is very partial because these are not real visits, but the URLs are complete and the calling IP address is there.
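As a rough sketch of what can be recovered from those two fields, the Python snippet below extracts the calling IP and the requested URL from access-log lines and counts requests per IP. It assumes the logs use nginx's default "combined" format; the sample lines, addresses and request paths are made up for illustration.

```python
import re
from collections import Counter

# Assumption: nginx "combined" log format, e.g.
# IP - user [time] "METHOD URL PROTOCOL" status size "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return (ip, url) for one access-log line, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group("ip"), match.group("url")


def requests_per_ip(lines):
    """Count the requests made by each calling IP address."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts


# Made-up sample lines (documentation-range IPs, hypothetical request paths).
SAMPLE = [
    '192.0.2.10 - - [26/Mar/2016:10:00:00 +0100] "GET /wps/WebProcessingService?request=GetCapabilities HTTP/1.1" 200 512 "-" "curl/7.0"',
    '192.0.2.10 - - [26/Mar/2016:10:00:05 +0100] "POST /wps/WebProcessingService HTTP/1.1" 200 1024 "-" "curl/7.0"',
    '198.51.100.7 - - [26/Mar/2016:10:01:00 +0100] "GET /wps/WebProcessingService?request=Execute HTTP/1.1" 200 2048 "-" "Java/1.8"',
]

if __name__ == "__main__":
    for ip, count in requests_per_ip(SAMPLE).most_common():
        print(ip, count)
```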
- Assignee changed from Andrea Dell'Amico to Gianpaolo Coro
- Status changed from Feedback to In Progress
- Assignee changed from Gianpaolo Coro to Andrea Dell'Amico
Is it possible to give me access to http://analytics.d4science.org?
I would like to evaluate this solution before we investigate others (e.g. Elasticsearch + Grafana).
You should already have access. The credentials are the same as those for the production portals and Redmine.
All the data logs are frozen at March 26th, but the import script seems to be working. I need to investigate this behaviour.
That is true: the analytics only go up to March. Furthermore, defining a new Goal does not filter the data; it is as if the Goals processing tool were not communicating with the data repository.
- Status changed from In Progress to Closed