Task #12788
closed
DMPoolManager - The dm-pool-manager-pre.d4science.org has problems when an algorithm is republished
100%
Description
The dm-pool-manager-pre.d4science.org has problems when an algorithm is republished.
Perhaps it might even be necessary to upgrade the machine.
Please can you check?
Thanks
An error occurred while deploying your algorithm
Here are the error details:
Installation failed. Return code=2
Algorithm details:
User: Giancarlo Panichi
Algorithm name: PARAMETERSCHECKER
Staging DataMiner Host: dataminer-ghost-t.pre.d4science.org
Caller VRE: /gcube/preprod/preVRE
Target VRE: /gcube/preprod/preVRE
16:32:44.444 [catalina-exec-1] INFO RequestAccounting: REQUEST START ON dataminer-pool-manager:DataAnalysis(/api/monitor) CALLED FROM giancarlo.panichi@146.48.122.240 IN SCOPE /gcube/preprod/preVRE
16:32:44.446 [catalina-exec-1] INFO RequestAccounting: REQUEST SERVED ON dataminer-pool-manager:DataAnalysis(/api/monitor) CALLED FROM giancarlo.panichi@146.48.122.240 IN SCOPE /gcube/preprod/preVRE FINISHED IN 2 millis
16:32:46.026 [Thread-23] ERROR DMPMJob: Operation failed: Ansible work failed
16:32:46.026 [Thread-23] ERROR DMPMJob: Exception:
org.gcube.dataanalysis.dataminer.poolmanager.service.exceptions.AnsibleException: Ansible work failed
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob.installation(DMPMJob.java:172)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob.execute(DMPMJob.java:207)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.StagingJob.execute(StagingJob.java:32)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob$1.run(DMPMJob.java:88)
    at java.lang.Thread.run(Thread.java:748)
16:32:46.410 [Thread-23] ERROR SendMail: Error in the IO process
java.io.IOException: Server returned HTTP response code: 400 for URL: https://socialnetworking-t.pre.d4science.org/social-networking-library-ws/rest/messages/writeMessageToUsers?gcube-token=04269c7d-dab7-498a-841d-8d38ae2d482b-98187548
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1894)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:263)
    at org.gcube.dataanalysis.dataminer.poolmanager.util.SendMail.sendPostRequest(SendMail.java:165)
    at org.gcube.dataanalysis.dataminer.poolmanager.util.SendMail.sendNotification(SendMail.java:94)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob.execute(DMPMJob.java:223)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.StagingJob.execute(StagingJob.java:32)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob$1.run(DMPMJob.java:88)
    at java.lang.Thread.run(Thread.java:748)
16:32:46.410 [Thread-23] ERROR DMPMJob: Unable to send notification email
org.gcube.dataanalysis.dataminer.poolmanager.util.exception.EMailException: Unable to send email notification
    at org.gcube.dataanalysis.dataminer.poolmanager.util.SendMail.sendNotification(SendMail.java:97)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob.execute(DMPMJob.java:223)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.StagingJob.execute(StagingJob.java:32)
    at org.gcube.dataanalysis.dataminer.poolmanager.service.DMPMJob$1.run(DMPMJob.java:88)
    at java.lang.Thread.run(Thread.java:748)
16:32:46.503 [catalina-exec-2] INFO RequestContextRetriever: retrieving context using token 04269c7d-dab7-498a-841d-8d38ae2d482b-98187548
16:32:46.504 [catalina-exec-2] INFO RequestContextRetriever: retrieved request authorization info org.gcube.common.authorization.library.utils.Caller@684bdaf0 in scope /gcube/preprod/preVRE
Updated by Giancarlo Panichi over 6 years ago
- Assignee changed from Ciro Formisano to Lucio Lelii
I tried to run Ansible from the command line and got these logs:
gcube@dm-pool-manager-pre:~/dataminer-pool-manager/work/62b21472-63b8-4c72-874e-504aad83d55d$ ansible-playbook -v -i inventory.yaml playbook.yaml
Using /etc/ansible/ansible.cfg as config file
/home/gcube/dataminer-pool-manager/work/62b21472-63b8-4c72-874e-504aad83d55d/inventory.yaml did not meet host_list requirements, check plugin documentation if this is unexpected
/home/gcube/dataminer-pool-manager/work/62b21472-63b8-4c72-874e-504aad83d55d/inventory.yaml did not meet script requirements, check plugin documentation if this is unexpected
PLAY [universe] ************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************
ok: [dataminer-ghost-t.pre.d4science.org]
TASK [gcube-algorithm-DMPOOLMANAGERCHECK : Install algorithm DMPOOLMANAGERCHECK] *******************************
fatal: [dataminer-ghost-t.pre.d4science.org]: FAILED! => {"changed": true, "cmd": "/home/gcube/algorithmInstaller/addAlgorithm.sh DMPOOLMANAGERCHECK BLACK_BOX org.gcube.dataanalysis.executor.rscripts.DMPoolManagerCheck /gcube/preprod/preVRE transducerers N https://data1-d.d4science.net/shub/bc867d01-ba3c-427a-9abf-3a542a7916c6 \"DM Pool Manager Check {Published by Giancarlo Panichi (giancarlo.panichi) on 2018/10/26 12:21 GMT}\"", "delta": "0:00:01.509639", "end": "2018-10-26 15:06:12.546733", "msg": "non-zero return code", "rc": 1, "start": "2018-10-26 15:06:11.037094", "stderr": "SLF4J: Class path contains multiple SLF4J bindings.\nSLF4J: Found binding in [jar:file:/home/gcube/tomcat/webapps/wps/WEB-INF/lib/logback-classic-1.1.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: Found binding in [jar:file:/home/gcube/tomcat/webapps/wps/WEB-INF/lib/slf4j-nop-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: Found binding in [jar:file:/home/gcube/tomcat/lib/logback-classic-1.1.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.\nSLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]\nException in thread \"main\" java.lang.ClassNotFoundException: org.gcube.dataanalysis.executor.rscripts.DMPoolManagerCheck\n\tat java.net.URLClassLoader.findClass(URLClassLoader.java:381)\n\tat java.lang.ClassLoader.loadClass(ClassLoader.java:424)\n\tat sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)\n\tat java.lang.ClassLoader.loadClass(ClassLoader.java:357)\n\tat java.lang.Class.forName0(Native Method)\n\tat java.lang.Class.forName(Class.java:264)\n\tat org.gcube.dataanalysis.wps.mapper.ClassGenerator.generateEcologicalEngineClasses(ClassGenerator.java:67)\n\tat org.gcube.dataanalysis.wps.mapper.ClassGenerator.<init>(ClassGenerator.java:29)\n\tat org.gcube.dataanalysis.wps.mapper.DataMinerUpdater.Update(DataMinerUpdater.java:283)\n\tat org.gcube.dataanalysis.wps.mapper.DataMinerUpdater.main(DataMinerUpdater.java:130)", "stderr_lines": ["SLF4J: Class path contains multiple SLF4J bindings.", "SLF4J: Found binding in [jar:file:/home/gcube/tomcat/webapps/wps/WEB-INF/lib/logback-classic-1.1.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]", "SLF4J: Found binding in [jar:file:/home/gcube/tomcat/webapps/wps/WEB-INF/lib/slf4j-nop-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]", "SLF4J: Found binding in [jar:file:/home/gcube/tomcat/lib/logback-classic-1.1.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]", "SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.", "SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]", "Exception in thread \"main\" java.lang.ClassNotFoundException: org.gcube.dataanalysis.executor.rscripts.DMPoolManagerCheck", ...}
to retry, use: --limit @/home/gcube/.ansible_retry/playbook.retry
PLAY RECAP *****************************************************************************************************
dataminer-ghost-t.pre.d4science.org : ok=1 changed=0 unreachable=0 failed=1
Playbook run took 0 days, 0 hours, 0 minutes, 3 seconds
@roberto.cirillo@isti.cnr.it and I have seen that the .jar file is downloaded with an underscore added at the beginning and at the end of the file name:
https://data1-d.d4science.net/shub/bc867d01-ba3c-427a-9abf-3a542a7916c6 _DMPoolManagerCheck.jar_
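The ClassNotFoundException in the log above is what one would expect if the downloaded file is not usable as a JAR or does not hold the expected class. A minimal sketch of a local check, assuming the file has already been saved to disk; the class JarClassCheck and the helper containsClass are hypothetical names for illustration and are not part of the DataMiner installer:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

public class JarClassCheck {

    /**
     * Returns true if the file opens as a JAR and contains the .class
     * entry for the given fully qualified class name.
     */
    public static boolean containsClass(Path jar, String className) {
        // org.example.Foo -> org/example/Foo.class
        String entry = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getJarEntry(entry) != null;
        } catch (IOException e) {
            // Missing file, or content that is not a readable ZIP/JAR archive
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: JarClassCheck <file.jar> <fully.qualified.ClassName>");
            return;
        }
        Path jar = Paths.get(args[0]);
        System.out.println(containsClass(jar, args[1]) ? "class found" : "class NOT found");
    }
}
```

For example, it could be run against the downloaded file with org.gcube.dataanalysis.executor.rscripts.DMPoolManagerCheck as the second argument.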
@lucio.lelii@isti.cnr.it, can you please check what happens to the .jar files?
Thanks
Updated by Ciro Formisano over 6 years ago
I am not an expert on Ansible. However, I don't remember whether republication was meant to be supported. Are we sure that Ansible does not prevent the re-installation of the same algorithm?
Updated by Giancarlo Panichi over 6 years ago
Hi @ciro.formisano@eng.it, yes, republishing and updating algorithms is a requirement, and it has always worked. We have now realized that the problem may be due to the file that is downloaded through UriResolver and StorageHub, so @lucio.lelii@isti.cnr.it will investigate the matter.
Updated by Roberto Cirillo over 6 years ago
- Priority changed from Normal to High
Updated by Roberto Cirillo over 6 years ago
- Related to Upgrade #12739: /gcube/preprod upgrade to gCube 4.13 added
Updated by Giancarlo Panichi over 6 years ago
- Related to Task #12794: Workspace - Download problem with some file extension added
Updated by Giancarlo Panichi over 6 years ago
After the latest corrections, I ran another test.
Now, when the algorithm is updated, the file is written with the right name, but its content seems to be wrong.
Here are the logs:
gcube@dm-pool-manager-pre:~/dataminer-pool-manager/work/db4e55da-1519-4b26-a5ec-ec92ca0809b5$ ansible-playbook -v -i inventory.yaml playbook.yaml
Using /etc/ansible/ansible.cfg as config file
/home/gcube/dataminer-pool-manager/work/db4e55da-1519-4b26-a5ec-ec92ca0809b5/inventory.yaml did not meet host_list requirements, check plugin documentation if this is unexpected
/home/gcube/dataminer-pool-manager/work/db4e55da-1519-4b26-a5ec-ec92ca0809b5/inventory.yaml did not meet script requirements, check plugin documentation if this is unexpected
PLAY [universe] **************************************************************************************************************
TASK [Gathering Facts] *******************************************************************************************************
ok: [dataminer-ghost-t.pre.d4science.org]
TASK [gcube-algorithm-DMPM_UPDATE_CHECKER : Install algorithm DMPM_UPDATE_CHECKER] *******************************************
fatal: [dataminer-ghost-t.pre.d4science.org]: FAILED! => {"changed": true, "cmd": "/home/gcube/algorithmInstaller/addAlgorithm.sh DMPM_UPDATE_CHECKER BLACK_BOX org.gcube.dataanalysis.executor.rscripts.DMPMUpdateChecker /gcube/preprod/preVRE transducerers N https://data-d.d4science.org/shub/da558346-7d1a-4159-8780-a22e63f3c7dc \"DMPM Update Checker {Published by Giancarlo Panichi (giancarlo.panichi) on 2018/10/31 15:04 GMT}\"", "delta": "0:00:01.208403", "end": "2018-10-31 16:15:44.739404", "msg": "non-zero return code", "rc": 1, "start": "2018-10-31 16:15:43.531001", "stderr": "SLF4J: Class path contains multiple SLF4J bindings.\nSLF4J: Found binding in [jar:file:/home/gcube/tomcat/webapps/wps/WEB-INF/lib/slf4j-nop-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: Found binding in [jar:file:/home/gcube/tomcat/webapps/wps/WEB-INF/lib/logback-classic-1.1.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: Found binding in [jar:file:/home/gcube/tomcat/lib/logback-classic-1.1.11.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.\nSLF4J: Actual binding is of type [org.slf4j.helpers.NOPLoggerFactory]\nException in thread \"main\" java.lang.ClassNotFoundException: org.gcube.dataanalysis.executor.rscripts.DMPMUpdateChecker\n\tat java.net.URLClassLoader.findClass(URLClassLoader.java:381)\n\tat java.lang.ClassLoader.loadClass(ClassLoader.java:424)\n\tat sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)\n\tat java.lang.ClassLoader.loadClass(ClassLoader.java:357)\n\tat java.lang.Class.forName0(Native Method)\n\tat java.lang.Class.forName(Class.java:264) ....
to retry, use: --limit @/home/gcube/.ansible_retry/playbook.retry
PLAY RECAP *******************************************************************************************************************
dataminer-ghost-t.pre.d4science.org : ok=1 changed=0 unreachable=0 failed=1
Playbook run took 0 days, 0 hours, 0 minutes, 3 seconds
Here is the wrong file:
https://data-d.d4science.org/shub/da558346-7d1a-4159-8780-a22e63f3c7dc
On StorageHub this file corresponds to the following item:
"@class": "org.gcube.common.storagehub.model.items.GenericFileItem",
"id": "da558346-7d1a-4159-8780-a22e63f3c7dc",
"name": "DMPMUpdateChecker.jar",
"path": "/Home/giancarlo.panichi/Workspace/TestSAI/BlackBox/DMPMUpdateChecker/Target/Deploy/DMPMUpdateChecker.jar",
"parentId": "af27bbba-49d5-4192-abd1-7571fc4902c2",
"parentPath": "/Home/giancarlo.panichi/Workspace/TestSAI/BlackBox/DMPMUpdateChecker/Target/Deploy",
Downloading this file from the Workspace gives the same error.
Therefore, when the SAI updates the algorithm (DMPMUpdateChecker.jar), some error occurs while communicating with StorageHub.
So the operations in this phase must be monitored to catch the error:
FileContainer.copy(folderContainer, "DMPMUpdateChecker.jar");
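One possible way to monitor this step is to compare a digest of the source file with a digest of the copy once the operation has finished. The following is only a sketch under assumptions: the class CopyVerifier and its methods are hypothetical, and how the two input streams are obtained from StorageHub is deliberately left out, since that API is not shown in this ticket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class CopyVerifier {

    /** Computes the SHA-256 digest of everything readable from the stream. */
    public static byte[] sha256(InputStream in) throws IOException {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            md.update(buffer, 0, read);
        }
        return md.digest();
    }

    /** Returns true when both streams carry exactly the same bytes. */
    public static boolean sameContent(InputStream original, InputStream copy) throws IOException {
        return MessageDigest.isEqual(sha256(original), sha256(copy));
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the original file and its copy (assumption:
        // the real streams would come from the workspace items)
        byte[] data = "example jar bytes".getBytes(StandardCharsets.UTF_8);
        byte[] truncated = Arrays.copyOf(data, data.length - 1);

        System.out.println(sameContent(new ByteArrayInputStream(data), new ByteArrayInputStream(data)));
        System.out.println(sameContent(new ByteArrayInputStream(data), new ByteArrayInputStream(truncated)));
    }
}
```

Run as is, the demo prints true for the identical copy and false for the truncated one. A mismatch after the copy would point to the content being altered during the update rather than during the later download.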
Updated by Giancarlo Panichi almost 6 years ago
- Due date set to May 02, 2019
- Status changed from New to Closed
- % Done changed from 0 to 100
Yes, the ticket is 6 months old. Now, there is a new Preproduction infrastructure.