Task #12263

closed

Restart Accounting Aggregator

Added by Luca Frosini almost 7 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Application
Target version:
Start date: Jul 27, 2018
Due date:
% Done: 100%
Estimated time:
Infrastructure: Production

Related issues

Related to D4Science Infrastructure - Incident #12203: News Feed search function returns result up to mid June 2018 | Closed | Luca Frosini | Jul 19, 2018 | Jul 20, 2018

Related to D4Science Infrastructure - Task #12270: Shutdown resource-registry until the next upgrade | Closed | Luca Frosini | Jul 30, 2018

Actions #1

Updated by Luca Frosini almost 7 years ago

  • Related to Incident #12203: News Feed search function returns result up to mid June 2018 added
Actions #2

Updated by Luca Frosini almost 7 years ago

  • Subject changed from Restart Accounting Aggregator which is not running from last upgrade to Restart Accounting Aggregator

The smart-executor did not start correctly, as happened for social-indexer-plugin (see #12203).

orientdb01-d4s denied every incoming connection because it had reached its connection limit:

Reached maximum number of concurrent connections (max=1000, current=4559), reject incoming connection from /146.48.123.23:60410

146.48.123.23 is monitoring.research-infrastructures.eu

Moreover, I found that there are connection attempts from 146.48.122.24 (postgresql-srv-dev.d4science.org).

Why are these machines trying to connect to orientdb?

@andrea.dellamico@isti.cnr.it @tommaso.piccioli@isti.cnr.it can you check this?
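To see which hosts are behind the rejected connections, the source addresses in the OrientDB log can be tallied. This is only a minimal sketch: it assumes every rejection line has exactly the format of the message quoted above, and the name `count_rejected_sources` and the sample input are hypothetical, introduced for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the single log line quoted in this ticket;
# other OrientDB versions may word the message differently.
REJECT_RE = re.compile(
    r"Reached maximum number of concurrent connections "
    r"\(max=(?P<max>\d+), current=(?P<current>\d+)\), "
    r"reject incoming connection from /(?P<ip>[\d.]+):(?P<port>\d+)"
)


def count_rejected_sources(lines):
    """Return a Counter mapping source IP -> number of rejected connections."""
    counts = Counter()
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Reached maximum number of concurrent connections "
        "(max=1000, current=4559), reject incoming connection "
        "from /146.48.123.23:60410",
    ]
    # Print the sources, busiest first.
    for ip, n in count_rejected_sources(sample).most_common():
        print(ip, n)
```

Feeding the real log file to the function (for example `count_rejected_sources(open(path))`, with `path` pointing at the server log) would show whether a few hosts dominate the rejections.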

Actions #3

Updated by Luca Frosini almost 7 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 0 to 100

I successfully restarted orientdb and accounting-aggregator.d4science.org smartgears container.

Please note that the container was restarted tonight at midnight.

Actions #4

Updated by Andrea Dell'Amico almost 7 years ago

Luca Frosini wrote:

Moreover, I found that there are connection attempts from 146.48.122.24 (postgresql-srv-dev.d4science.org).

How did you associate 146.48.122.24 with postgresql-srv-dev.d4science.org? That's the IP of accounting-aggregator.d4science.org

Actions #5

Updated by Luca Frosini almost 7 years ago

Andrea Dell'Amico wrote:

Luca Frosini wrote:

Moreover, I found that there are connection attempts from 146.48.122.24 (postgresql-srv-dev.d4science.org).

How did you associate 146.48.122.24 with postgresql-srv-dev.d4science.org? That's the IP of accounting-aggregator.d4science.org

I ran the 'host' command in a shell. Maybe I entered the IP 146.48.123.24 by mistake, which is the one of postgresql-srv-dev.d4science.org.

What about the connections from 'monitoring.research-infrastructures.eu'?
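The reverse lookup discussed above can also be scripted, which avoids retyping addresses by hand. The following is a rough sketch using Python's standard `socket.gethostbyaddr`; the helper name `reverse_lookup` and its `resolver` parameter are hypothetical, and the answer depends on whatever PTR records the DNS currently serves.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the primary host name for an IP, or None if the lookup fails."""
    try:
        # gethostbyaddr returns (hostname, alias list, address list).
        hostname, _aliases, _addresses = resolver(ip)
    except (socket.herror, socket.gaierror):
        return None
    return hostname


# Example (needs DNS access; results depend on the current records):
#   for ip in ("146.48.122.24", "146.48.123.24"):
#       print(ip, "->", reverse_lookup(ip))
```

Looking up both addresses in one loop makes a single-octet slip such as 122 versus 123 easy to spot.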

Actions #6

Updated by Luca Frosini almost 7 years ago

The host was down again because orientdb refuses connections.

Can you please check monitoring.research-infrastructures.eu?

Actions #7

Updated by Luca Frosini almost 7 years ago

The container is restarted every midnight because in the past it leaked memory. Every smart-executor tries to connect to orientdb at start time to check whether there is a scheduling for the plugin.

Actions #8

Updated by Andrea Dell'Amico almost 7 years ago

Luca Frosini wrote:

The host was down again because orientdb refuses connections.

Can you please check monitoring.research-infrastructures.eu?

What am I supposed to check? The nagios test has never changed since we added it: it's an HTTP request to /studio/index.html (port 2048). The check runs every 5 minutes, like all the other HTTP checks.
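For reference, a check of the kind described above boils down to a single short-lived GET. The sketch below is not the actual nagios plugin in use; it only illustrates the idea with Python's standard library. The function names and the `opener` parameter are hypothetical, and it assumes the Studio page is `index.html`. The exit codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import urllib.error
import urllib.request

# Conventional Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def check_http(url, timeout=10, opener=urllib.request.urlopen):
    """Return (exit_code, message) for a single HTTP GET against url."""
    try:
        # The with-block closes the connection as soon as the status is read.
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        return CRITICAL, f"CRITICAL - HTTP {err.code}"
    except (urllib.error.URLError, OSError) as err:
        return CRITICAL, f"CRITICAL - {err}"
    return OK, f"OK - HTTP {status}"


def main(url):
    """Print the check result and return the exit code for the caller."""
    code, message = check_http(url)
    print(message)
    return code
```

A wrapper script would call `sys.exit(main(url))` with the real URL. At one run every 5 minutes this amounts to 288 requests per day, each closed right after the response is read.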

Actions #9

Updated by Luca Frosini almost 7 years ago

I restarted the resource-registry, which is the old version (the new one cannot be used because it impacts all the clients). It could be the one using so many connections.

@roberto.cirillo@isti.cnr.it are there any hosts using the new resource-registry? Otherwise we can stop it during the holiday time.

Actions #10

Updated by Roberto Cirillo almost 7 years ago

Luca Frosini wrote:

I restarted the resource-registry, which is the old version (the new one cannot be used because it impacts all the clients). It could be the one using so many connections.

@roberto.cirillo@isti.cnr.it are there any hosts using the new resource-registry? Otherwise we can stop it during the holiday time.

I don't know. I remember that in the past the home-library-webapp was using it; I've checked now and the handler is no longer present in the home-library webapp.

Actions #11

Updated by Luca Frosini almost 7 years ago

  • Related to Task #12270: Shutdown resource-registry until the next upgrade added