Incident #12214
closed
DataMiners cannot interact with the Storage System
100%
Description
No DataMiner (DM) is currently working in any VRE: every DM gets errors when interacting with the Storage System. The failure occurs when a DM tries to write the output of a computation directly to the (volatile) storage system. Here is the stack trace:
com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=mongo-p-vol.d4science.org:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.ConnectException: Connection refused (Connection refused)}}]
    at com.mongodb.connection.BaseCluster.getDescription(BaseCluster.java:167)
    at com.mongodb.Mongo.getConnectedClusterDescription(Mongo.java:881)
    at com.mongodb.Mongo.createClientSession(Mongo.java:873)
    at com.mongodb.Mongo$3.getClientSession(Mongo.java:862)
    at com.mongodb.Mongo$3.execute(Mongo.java:830)
    at com.mongodb.Mongo$3.execute(Mongo.java:814)
    at com.mongodb.DBCollection.createIndex(DBCollection.java:1623)
    at com.mongodb.DBCollection.createIndex(DBCollection.java:1608)
    at org.gcube.contentmanagement.blobstorage.transport.backend.MongoOperationManager.getMongoInstance(MongoOperationManager.java:93)
    at org.gcube.contentmanagement.blobstorage.transport.backend.MongoOperationManager.initBackend(MongoOperationManager.java:78)
    at org.gcube.contentmanagement.blobstorage.transport.backend.MongoOperationManager.<init>(MongoOperationManager.java:51)
    at org.gcube.contentmanagement.blobstorage.transport.TransportManagerFactory.load(TransportManagerFactory.java:55)
    at org.gcube.contentmanagement.blobstorage.transport.TransportManagerFactory.getTransport(TransportManagerFactory.java:42)
    at org.gcube.contentmanagement.blobstorage.service.operation.Operation.put(Operation.java:164)
    at org.gcube.contentmanagement.blobstorage.service.operation.Upload.doIt(Upload.java:52)
    at org.gcube.contentmanagement.blobstorage.service.operation.Upload.doIt(Upload.java:26)
    at org.gcube.contentmanagement.blobstorage.service.operation.OperationManager.startOperation(OperationManager.java:71)
    at org.gcube.contentmanagement.blobstorage.service.impl.Resource.retrieveRemoteObject(Resource.java:120)
    at org.gcube.contentmanagement.blobstorage.service.impl.Resource.getRemoteObject(Resource.java:110)
    at org.gcube.contentmanagement.blobstorage.service.impl.RemoteResource.RFile(RemoteResource.java:61)
    at org.gcube.contentmanagement.blobstorage.service.impl.RemoteResource.RFile(RemoteResource.java:43)
    at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mapping.OutputsManager.uploadFileOnStorage(OutputsManager.java:183)
    at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mapping.OutputsManager.createOutput(OutputsManager.java:105)
    at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mapping.AbstractEcologicalEngineMapper.run(AbstractEcologicalEngineMapper.java:432)
    at org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mappedclasses.transducerers.ITALIANLP_NER.run(ITALIANLP_NER.java:24)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.n52.wps.algorithm.annotation.AnnotationBinding$ExecuteMethodBinding.execute(AnnotationBinding.java:89)
    at org.n52.wps.server.AbstractAnnotatedAlgorithm.run(AbstractAnnotatedAlgorithm.java:54)
    at org.gcube.data.analysis.wps.ExecuteRequest.call(ExecuteRequest.java:608)
    at org.gcube.data.analysis.wps.ExecuteRequest.call(ExecuteRequest.java:67)
    at org.gcube.common.authorization.library.AuthorizedTasks$1.call(AuthorizedTasks.java:41)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
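The root cause in the trace is `java.net.ConnectException: Connection refused`, which normally means the host answered but nothing was listening on the port. A quick way to confirm this from a DataMiner host, independently of the storage client, could be a plain TCP probe. The following is only a minimal sketch: the helper name `probe_tcp` and the 5-second default timeout are assumptions for illustration, not part of any gCube tooling.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Try a TCP connect and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    (Hypothetical helper, written for this ticket only.)
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # Host reachable, but no process is listening on the port.
        return "refused"
    except socket.timeout:
        # No answer within the timeout (firewall drop, host down, ...).
        return "timeout"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"

# Example call for the instance named in the trace:
#   probe_tcp("mongo-p-vol.d4science.org", 27017)
```

A result of `refused` would be consistent with the mongod process being down while the machine itself is up; `timeout` would point more towards a network or firewall problem.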
Here is a test URL:
http://dataminer0-proto.d4science.org/wps/WebProcessingService?request=Execute&service=WPS&Version=1.0.0&gcube-token=<RprotoToken>&lang=en-US&Identifier=org.gcube.dataanalysis.wps.statisticalmanager.synchserver.mappedclasses.transducerers.LANGUAGE_RECOGNIZER&DataInputs=sentence=North+Korea+has+agreed+to+send+a+delegation+to+next+month+Winter+Olympics+in+South+Korea%2C+the+first+notable+breakthrough+to+come+out+of+a+face-to-face+meeting+Tuesday+between+the+neighboring+nations.;
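For repeated checks it may be handy to assemble the test URL programmatically instead of editing it by hand. The sketch below rebuilds the same query parameters that appear in the URL above (`request`, `service`, `Version`, `gcube-token`, `lang`, `Identifier`, `DataInputs`). The function name `build_execute_url` is hypothetical, and the sketch assumes the WPS endpoint accepts a fully percent-encoded `DataInputs` value (so `=` and `;` are sent as `%3D` and `%3B`), which has not been verified against the DataMiner service.

```python
from urllib.parse import urlencode


def build_execute_url(endpoint: str, token: str, identifier: str, inputs: dict) -> str:
    """Build a WPS Execute URL shaped like the test URL in this ticket.

    (Hypothetical helper; parameter names are copied from the ticket URL.)
    """
    # DataInputs is a sequence of 'name=value;' pairs, as in the ticket URL.
    data_inputs = "".join(f"{name}={value};" for name, value in inputs.items())
    params = {
        "request": "Execute",
        "service": "WPS",
        "Version": "1.0.0",
        "gcube-token": token,
        "lang": "en-US",
        "Identifier": identifier,
        "DataInputs": data_inputs,
    }
    # urlencode percent-encodes every value and joins the pairs with '&'.
    return f"{endpoint}?{urlencode(params)}"
```

The token is passed in as a parameter so that it never has to be pasted into a ticket or a script.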
Related issues
Updated by Andrea Dell'Amico almost 7 years ago
I see that the mongo instance on that server is down.
So it seems we do not have monitoring on those instances. @tommaso.piccioli@isti.cnr.it they were upgraded along with the other mongo instances. I see that mongo killed itself on Jul 19th at 07:23 because of a full disk.
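Since the instance stopped because of a full disk and was not monitored, even a very small periodic check could have raised an alert before mongo went down. The following is a minimal sketch only: the function names, the 90% threshold, and the example data path are assumptions, and a real deployment would more likely plug into whatever monitoring system the infrastructure already uses.

```python
import shutil


def disk_used_percent(path: str) -> float:
    """Return the percentage of used space on the filesystem containing path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def disk_alert(path: str, threshold_percent: float = 90.0) -> bool:
    """Return True when usage is at or above the threshold.

    (Hypothetical helper; the 90% default is an arbitrary example value.)
    """
    return disk_used_percent(path) >= threshold_percent

# Example call, assuming the mongo data directory is /var/lib/mongodb
# (the actual path on mongo-p-vol is not stated in this ticket):
#   disk_alert("/var/lib/mongodb", 90.0)
```

Run from cron or a similar scheduler, a check like this would turn the silent failure described above into an early warning.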
Updated by Andrea Dell'Amico almost 7 years ago
- Blocked by Task #12215: Increase disk space on mongo-p-vol.d4science.org added
Updated by Andrea Dell'Amico almost 7 years ago
- Status changed from New to In Progress
Updated by Andrea Dell'Amico almost 7 years ago
- Status changed from In Progress to Feedback
- % Done changed from 0 to 100
The disk space on the volatile mongo instance has been increased and the server restarted. Does this solve the DataMiners' problem?
Updated by Giancarlo Panichi almost 7 years ago
- Due date set to Jul 20, 2018
- Status changed from Feedback to Closed
Yes, everything seems to work correctly now. I am closing this ticket.