Incident #11632

orientdb01-d4s receives a huge number of connections

Added by Luca Frosini about 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee: _InfraScience Systems Engineer
Category: Application
Target version:
Start date: Apr 12, 2018
Due date:
% Done: 100%
Estimated time:
Infrastructure: Production

Description

From logs I found:

2018-04-12 18:29:28:484 WARNI Reached maximum number of concurrent connections (max=1000, current=5488), reject incoming connection from /146.48.123.23:55173 [OServerNetworkListener]
2018-04-12 18:30:00:024 WARNI Reached maximum number of concurrent connections (max=1000, current=5528), reject incoming connection from /146.48.122.33:47036 [OServerNetworkListener]

146.48.122.33 : social-indexer.d4science.org
146.48.123.23 : monitoring.research-infrastructures.eu
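To see which clients are being rejected most often, the warnings can be tallied per source address. The following is only a minimal sketch: the regular expression is written against the two log lines quoted above, and the function name `count_rejections` and the `SAMPLE` list are hypothetical, introduced for illustration.

```python
import re
from collections import Counter

# Pattern written against the two warning lines quoted in this ticket.
# Other OrientDB versions may word the message differently (assumption).
REJECT_RE = re.compile(
    r"Reached maximum number of concurrent connections "
    r"\(max=(?P<max>\d+), current=(?P<current>\d+)\), "
    r"reject incoming connection from /(?P<ip>[\d.]+):(?P<port>\d+)"
)


def count_rejections(lines):
    """Return a Counter mapping source IP to number of rejected connections."""
    counts = Counter()
    for line in lines:
        match = REJECT_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


# The two lines from the description, used here as sample input.
SAMPLE = [
    "2018-04-12 18:29:28:484 WARNI Reached maximum number of concurrent "
    "connections (max=1000, current=5488), reject incoming connection "
    "from /146.48.123.23:55173 [OServerNetworkListener]",
    "2018-04-12 18:30:00:024 WARNI Reached maximum number of concurrent "
    "connections (max=1000, current=5528), reject incoming connection "
    "from /146.48.122.33:47036 [OServerNetworkListener]",
]

if __name__ == "__main__":
    for ip, rejected in count_rejections(SAMPLE).most_common():
        print(ip, rejected)
```

Run against the full server log (for example by passing an open file object instead of `SAMPLE`), this would show whether the two hosts above are the only sources or just the most recent ones.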

I'll investigate the first to check whether the problem is in smart-executor, but I don't understand the second.

Actions #1

Updated by Andrea Dell'Amico about 7 years ago

It's the Nagios check. I don't know why the connections aren't closed; it has been the same HTTP call for months: orientdb01-d4s.d4science.org:2480/studio/index.html
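For comparison, a probe that explicitly releases its socket after each call could look like the sketch below. This is not the actual Nagios plugin or its configuration, which this ticket does not show; the function name `probe` and the timeout value are assumptions made for illustration.

```python
import http.client


def probe(host, port, path, timeout=10.0):
    """Issue one GET request, return the HTTP status, and always close the socket."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # Ask the server not to keep the connection alive after the response.
        conn.request("GET", path, headers={"Connection": "close"})
        response = conn.getresponse()
        response.read()  # drain the body so the connection can be released cleanly
        return response.status
    finally:
        # Close the client side even if the request raised an exception.
        conn.close()
```

A call such as `probe("orientdb01-d4s.d4science.org", 2480, "/studio/index.html")` would then return the status code of the public Studio page, leaving no open connection on the client side.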

Actions #2

Updated by Andrea Dell'Amico about 7 years ago

The check is currently failing with "connection reset by peer"; it has been behaving this way for a couple of days.

Actions #3

Updated by Luca Frosini about 7 years ago

Before restarting the instance, I also deleted some structures, so the tokens the applications had already obtained are no longer valid and OrientDB resets the connections.
Can we stop the Nagios check and restart it tomorrow? I'll restart social-indexer too.

Actions #4

Updated by Andrea Dell'Amico about 7 years ago

I stopped the Nagios check. By the way, the check did not use any token; that URL should be public.

Actions #5

Updated by Luca Frosini about 7 years ago

@andrea.dellamico@isti.cnr.it or @roberto.cirillo@isti.cnr.it can you give me access to social-indexer.d4science.org?

Actions #6

Updated by Roberto Cirillo about 7 years ago

Luca Frosini wrote:

@andrea.dellamico@isti.cnr.it or @roberto.cirillo@isti.cnr.it can you provide me access to social-indexer.d4science.org

Done. You can access as gcube

Actions #7

Updated by Luca Frosini about 7 years ago

social-indexer.d4science.org restarted. @andrea.dellamico@isti.cnr.it can you restart nagios? Thanks a lot

Actions #8

Updated by Andrea Dell'Amico about 7 years ago

  • Status changed from New to Closed
  • % Done changed from 0 to 100

Done.

Can you share what the problem was?

Actions #9

Updated by Luca Frosini about 7 years ago

@andrea.dellamico@isti.cnr.it I suspect a bug in Resource Registry. I have already investigated the port type used to interact with instances and excluded it. It could instead be the schema port type (which is also the API used by Nagios and HAProxy to monitor the instance), but I still have to investigate it. I'll share the problem as soon as I find it.

Actions #11

Updated by Luca Frosini about 7 years ago

I found a potential bug in the resource registry but it should occur in the actual situation. Anyway, I added an additional line of code to solve it. Please note that it cannot be tested on the dev instance because there are too few contexts there to make it happen.
