Actions
Incident #10373
closed
Authorisation service dev and preprod is broken
Status:
Closed
Priority:
Urgent
Assignee:
_InfraScience Systems Engineer
Category:
-
Target version:
Start date:
Nov 21, 2017
Due date:
Nov 21, 2017
% Done:
100%
Estimated time:
Infrastructure:
Development, Pre-Production
Description
according to @roberto.cirillo@isti.cnr.it and @lucio.lelii@isti.cnr.it the issue is in pgpool
Updated by Massimiliano Assante over 7 years ago
- Is duplicate of Incident #10372: Authorization service: pgpool problem added
Updated by Massimiliano Assante over 7 years ago
- Status changed from New to Closed
Updated by Massimiliano Assante over 7 years ago
- Is duplicate of deleted (Incident #10372: Authorization service: pgpool problem)
Updated by Massimiliano Assante over 7 years ago
- Due date changed from Nov 20, 2017 to Nov 21, 2017
- Status changed from Closed to In Progress
- Start date changed from Nov 20, 2017 to Nov 21, 2017
Authorisation service dev and preprod is broken again today
Updated by Andrea Dell'Amico over 7 years ago
Wut. Sorry, my fault. I didn't remove a broken script.
Updated by Andrea Dell'Amico over 7 years ago
Hm. But on one of the servers only. The other one has open connections since yesterday:
postgres 14025 0.0 0.2 104936 4060 ? S Nov20 0:00 pgpool: authorization authorization auth-d.dev.d4science.org(42564) idle postgres 14027 0.0 0.2 106500 5908 ? S Nov20 0:04 pgpool: authorization authorization auth-d.dev.d4science.org(42235) idle postgres 14028 0.0 0.2 104940 4060 ? S Nov20 0:01 pgpool: authorization authorization auth-d.dev.d4science.org(42242) idle postgres 14030 0.0 0.2 104932 4276 ? S Nov20 0:02 pgpool: authorization authorization auth-d.dev.d4science.org(42240) idle postgres 14031 0.0 0.2 104940 4064 ? S Nov20 0:08 pgpool: authorization authorization auth-d.dev.d4science.org(42236) idle postgres 14032 0.0 0.2 104804 4060 ? S Nov20 0:01 pgpool: authorization authorization auth-d.dev.d4science.org(42241) idle postgres 14034 0.0 0.2 104940 4064 ? S Nov20 0:04 pgpool: authorization authorization auth-d.dev.d4science.org(42238) idle postgres 14035 0.0 0.2 104940 4064 ? S Nov20 0:03 pgpool: authorization authorization auth-d.dev.d4science.org(42239) idle postgres 14036 0.0 0.2 104940 4064 ? S Nov20 0:05 pgpool: authorization authorization auth-d.dev.d4science.org(42237) idle
Updated by Andrea Dell'Amico over 7 years ago
And I don't see any errors in the authorization logs. Can you give any details?
Updated by Massimiliano Assante over 7 years ago
2017-11-21 11:06:35,997 ERROR o.g.p.t.SmartGearsPortalValve [http-bio-9090-exec-270,invoke:62][PORTAL] 667919079 [http-bio-9090-exec-270] ERROR org.gcube.portal.threadlocalexec.SmartGearsPortalValve - Something went wrong in generating token for mister.orange in scope /gcube/preprod/preVRE java.net.SocketException: Connection reset 2017-11-21 11:06:35,997 ERROR o.g.p.t.SmartGearsPortalValve [http-bio-9090-exec-257,invoke:62][PORTAL] 667919079 [http-bio-9090-exec-257] ERROR org.gcube.portal.threadlocalexec.SmartGearsPortalValve - Something went wrong in generating token for mister.orange in scope /gcube/preprod/preVRE at java.net.SocketInputStream.read(SocketInputStream.java:210) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:706) at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1569) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) at org.gcube.common.authorization.client.proxy.DefaultAuthorizationProxy.resolveTokenByUserAndContext(DefaultAuthorizationProxy.java:136) at org.gcube.portal.threadlocalexec.SmartGearsPortalValve.invoke(SmartGearsPortalValve.java:53) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116) at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:333) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:423) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1079) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:620) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:748) java.net.SocketException: Connection reset
Updated by Massimiliano Assante over 7 years ago
The authorisation service is not responding, according to the logs.
2017-11-21 11:08:40,157 ERROR o.g.c.p.PortalContext [http-bio-9090-exec-264,getCurrentUserToken:393][PORTAL] 668043239 [http-bio-9090-exec-264] ERROR org.gcube.common.portal.PortalContext - Error while trying to generate token for user massimiliano.assantein scope /gcube java.net.SocketException: Connection reset at java.net.SocketInputStream.read(SocketInputStream.java:210) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:735) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:678) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:706)
Updated by Andrea Dell'Amico over 7 years ago
- % Done changed from 100 to 0
OK, there were a lot of connections from the authorization that were stuck, some lasting 24 hours. I had to kill postgres to get rid of them:
[...] 2017-11-21 12:17:38 CET [21365-3] authorization@authorization FATAL: terminating connection due to administrator command 2017-11-21 12:17:38 CET [21365-4] authorization@authorization LOG: disconnection: session time: 2:00:17.162 user=authorization database=authorization host=postgres2-d-d4s.d4science.org port=42645 2017-11-21 12:17:38 CET [16462-3] authorization@authorization FATAL: terminating connection due to administrator command 2017-11-21 12:17:38 CET [16462-4] authorization@authorization LOG: disconnection: session time: 8:19:24.323 user=authorization database=authorization host=postgres2-d-d4s.d4science.org port=42629 2017-11-21 12:17:38 CET [10862-3] authorization@authorization FATAL: terminating connection due to administrator command 2017-11-21 12:17:38 CET [10862-4] authorization@authorization LOG: disconnection: session time: 9:16:33.511 user=authorization database=authorization host=postgres2-d-d4s.d4science.org port=42611 2017-11-21 12:17:38 CET [7559-3] authorization@authorization FATAL: terminating connection due to administrator command 2017-11-21 12:17:38 CET [7559-4] authorization@authorization LOG: disconnection: session time: 9:50:02.203 user=authorization database=authorization host=postgres2-d-d4s.d4science.org port=42601 2017-11-21 12:17:38 CET [17013-3] authorization@authorization FATAL: terminating connection due to administrator command 2017-11-21 12:17:38 CET [17013-4] authorization@authorization LOG: disconnection: session time: 24:11:30.680 user=authorization database=authorization host=postgres2-d-d4s.d4science.org port=42592 2017-11-21 12:17:38 CET [17011-3] authorization@authorization FATAL: terminating connection due to administrator command 2017-11-21 12:17:38 CET [17011-4] authorization@authorization LOG: disconnection: session time: 24:11:30.722 user=authorization database=authorization host=postgres2-d-d4s.d4science.org port=42588 [...]
Updated by Andrea Dell'Amico over 7 years ago
- Status changed from In Progress to Feedback
- % Done changed from 0 to 100
Updated by Massimiliano Assante over 7 years ago
- Status changed from Feedback to Closed
Actions