Incident #9757
closed
low upload speed to thredds-pre-d4s.d4science.org
100%
Description
In short, we're experiencing very low upload speed when transferring data from oscar-import.pre.d4science.org to thredds-pre-d4s.d4science.org. By contrast, the same usage scenario between oscar-import.pre.d4science.org and thredds-d-d4s.d4science.org works fine. Only the thredds server endpoint changes; in principle, both hosts should be configured the same way.
Here are the details, for a working usage session (i.e. with the thredds server in dev):
From the client host (oscar-import.pre.d4science.org) I'm uploading a big file (~10GB) to the thredds server using data transfer facilities. I'm using the following call:
curl --verbose -F uploadedFile=@/tmp/oscar-merger/test.nc --header gcube-token:[application-token] "http://thredds-pre-d4s.d4science.org:80/data-transfer-service/gcube/service/REST/FileUpload/thredds/public/netcdf/Oscar?on-existing-file=REWRITE&on-existing-dir=APPEND&create-dirs=true"
Here is what happens, as seen from the client side:
1. curl starts to transfer data at ~100 MB/second, for about 1 to 2 minutes. We observed this with the "dstat" command.
2. For about another 5 minutes, nothing is visible on the client. We guess nginx is buffering the file somewhere (/tmp?).
3. Then, we guess, the data transfer service receives the request and starts writing the file to its final destination. In fact, the file appears in the thredds catalogue (http://thredds-pre-d4s.d4science.org/thredds/catalog/public/netcdf/Oscar/catalog.html), and its size increases steadily until completion. A 'Success' message is returned to the client. The response includes a submission time, which corresponds to the start of step 3, i.e. the exact time when the data transfer service receives the request.
The whole upload process takes about 12 minutes in dev.
Now, moving to preprod (thredds-pre-d4s.d4science.org), we see a different behaviour for step 1:
- The client starts to transfer data at ~100 MB/second. After 15 to 20 seconds, the speed drops to ~100 KB/second, i.e. roughly 1/1000 of the initial rate. netstat shows Send-Q values that stay around 2100000, whereas in the working scenario Send-Q regularly drops to zero.
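For anyone wanting to reproduce the Send-Q observation, here is a minimal sketch of how the value can be pulled out of `netstat -tn` style output. The function name `sendq_for_peer` and the sample addresses are made up for illustration; the column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) is assumed to be the usual net-tools one and may differ on other systems.

```shell
#!/usr/bin/env bash
# Print the Send-Q column of TCP connections whose foreign address starts
# with the given prefix. Reads `netstat -tn` style output on stdin.
# Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
sendq_for_peer() {
  local peer="$1"
  awk -v peer="$peer" '$1 ~ /^tcp/ && index($5, peer) == 1 { print $3 }'
}

# Made-up sample lines (addresses are illustrative only, not from our hosts):
sample='Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0 2100000 10.0.0.5:51234          10.0.0.9:80             ESTABLISHED
tcp        0      0 10.0.0.5:22             10.0.0.7:40022          ESTABLISHED'

# Header lines are skipped because their first field does not start with "tcp".
printf '%s\n' "$sample" | sendq_for_peer "10.0.0.9:80"
```

On the client, the same function could be fed live data, e.g. by piping `netstat -tn` into it with the thredds server's numeric address and port 80 as the prefix, and repeating the call every few seconds while curl is running.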
I don't think we've ever reached steps 2 and 3 in preprod; I waited for 30 minutes and saw no evidence of them (curl was still uploading data at ~100 KB/second), nor any error on the client.
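To put the two rates in perspective, here is a rough back-of-the-envelope calculation. It assumes decimal units (10 GB = 10^10 bytes), constant rates, and that the file is exactly 10 GB, so the figures are only indicative; the helper name `transfer_seconds` is made up for this sketch.

```shell
#!/usr/bin/env bash
# Seconds needed to send a payload at a constant rate (integer division).
transfer_seconds() {
  local size_bytes="$1" rate_bytes_per_sec="$2"
  echo $(( size_bytes / rate_bytes_per_sec ))
}

size=$(( 10 * 1000 * 1000 * 1000 ))   # ~10 GB, decimal units assumed

fast=$(transfer_seconds "$size" $(( 100 * 1000 * 1000 )))  # ~100 MB/second
slow=$(transfer_seconds "$size" $(( 100 * 1000 )))         # ~100 KB/second

echo "at 100 MB/s: ${fast} s"
echo "at 100 KB/s: ${slow} s (~$(( slow / 3600 )) hours)"
```

Under these assumptions, step 1 takes about 100 seconds at the full rate, which is consistent with the 1 to 2 minutes seen in dev, while at ~100 KB/second the same upload would need on the order of a day. That would explain why a 30-minute wait showed no sign of steps 2 and 3.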
Is there anything wrong/misconfigured on the preprod thredds server?
Your help is much appreciated.
Thank you.