Task #1504

closed

Task #1502: Automate the dataminer installation as a loadbalanced service

Install a haproxy instance in front of the dev dataminer services

Added by Andrea Dell'Amico over 9 years ago. Updated over 9 years ago.

Status: Closed
Priority: Normal
Assignee: _InfraScience Systems Engineer
Category: System Application
Start date: Nov 23, 2015
Due date:
% Done: 100%
Estimated time:
Infrastructure: Development

Description

The haproxy instance can use the round robin balancer. It seems that neither cookies nor sessions need to be provided.
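
As an illustration of the balancing policy only (not of the haproxy configuration itself), a minimal Python sketch of a stateless round robin is given here. The backend hostnames are assumed from the dataminer[1:2]-d-d4s.d4science.org naming used later in this ticket, and the helper names `round_robin` and `assign` are hypothetical.

```python
from itertools import cycle

# Assumed backend names, following the dataminer[1:2] pattern in this ticket.
BACKENDS = [
    "dataminer1-d-d4s.d4science.org",
    "dataminer2-d-d4s.d4science.org",
]


def round_robin(backends):
    """Return an endless iterator that rotates through the backends in order."""
    return cycle(backends)


def assign(requests, backends=BACKENDS):
    """Pair each request with the next backend; no cookie or session is consulted."""
    rotation = round_robin(backends)
    return [(request, next(rotation)) for request in requests]


if __name__ == "__main__":
    for request, backend in assign(["req-1", "req-2", "req-3"]):
        print(request, "->", backend)
```

Because no state is kept per client, consecutive requests alternate between the two nodes regardless of who sent them.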


Files

TestDevTokenLoadBalance.txt (29.3 KB) Gianpaolo Coro, Dec 10, 2015 06:07 PM
Actions #1

Updated by Andrea Dell'Amico over 9 years ago

  • Target version changed from 197 to Computational Infrastructure upgrade to smartgears
Actions #2

Updated by Andrea Dell'Amico over 9 years ago

  • Status changed from New to In Progress

Hostname and IP will be: dataminer-d-d4s.d4science.org 146.48.123.63

Actions #3

Updated by Andrea Dell'Amico over 9 years ago

  • Subject changed from Install a haproxy instance in front of the dataminer services to Install a haproxy instance in front of the dev dataminer services
Actions #4

Updated by Andrea Dell'Amico over 9 years ago

  • % Done changed from 0 to 40

Created the VM, starting the configuration.

Actions #5

Updated by Andrea Dell'Amico over 9 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 40 to 100

The haproxy balancer is ready.
The nginx logging configuration on the target host (dataminer2-d-d4s.d4science.org only) has been changed to log both the haproxy IP and the original one. The access_log line is now, for example:

146.48.123.63 forwarded for 146.48.123.149 - - [02/Dec/2015:14:46:54 +0100]  "GET /wps/ HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36"
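
To check from the logs which client actually reached a backend through the balancer, a line in this format could be split with a small parser such as the sketch below. The pattern is inferred from the single sample line above, not from the actual nginx log_format directive, which is not shown in this ticket; `LOG_PATTERN` and `parse_line` are hypothetical names.

```python
import re

# Sample line copied from the comment above.
SAMPLE = (
    '146.48.123.63 forwarded for 146.48.123.149 - - [02/Dec/2015:14:46:54 +0100]  '
    '"GET /wps/ HTTP/1.1" 304 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36"'
)

# Assumed layout: <proxy ip> forwarded for <client ip> - - [time] "request" status size ...
LOG_PATTERN = re.compile(
    r'^(?P<proxy_ip>\S+) forwarded for (?P<client_ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\]\s+"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    return fields


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```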
Actions #6

Updated by Andrea Dell'Amico over 9 years ago

The dataminer[1:2]-d-d4s.d4science.org nodes are still directly reachable on port 80, as requested.

Actions #7

Updated by Gianpaolo Coro over 9 years ago

Calls to dataminer-d-d4s.d4science.org are always sent to dataminer1. @andrea.dellamico@isti.cnr.it could you please take a look?

Actions #8

Updated by Andrea Dell'Amico over 9 years ago

Can you try from a different IP? Even if it's a round robin configuration, haproxy tries to deliver the requests to the same backend when the source does not change.

And a request: what URL should the load balancer check to be sure that the service is working? I'm now testing /wps, but it returns a 302.
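
To see what a candidate check URL returns before choosing one, a probe along these lines could be used. It is a sketch under assumptions: the `probe` and `is_up` helpers are hypothetical, and the rule of accepting any 2xx or 3xx status is chosen here to mirror what haproxy's HTTP check is understood to do by default, which should be verified against the haproxy documentation.

```python
import http.client


def probe(host, path="/wps", port=80, timeout=5.0):
    """Send one GET and return the raw status code; redirects are not followed."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def is_up(status):
    """Assumed rule: any 2xx or 3xx answer counts as healthy, so a 302 passes."""
    return 200 <= status < 400


if __name__ == "__main__":
    # Hostname taken from this ticket; the call needs network access to the dev node.
    status = probe("dataminer1-d-d4s.d4science.org")
    print(status, "up" if is_up(status) else "down")
```

Under that rule a 302 on /wps already counts as a healthy answer, although a URL that returns 200 from the application itself would be a stronger signal.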

Actions #9

Updated by Gianpaolo Coro over 9 years ago

I attach all the http links to test the balancer and the algorithms on Dataminer.

Actions #10

Updated by Andrea Dell'Amico over 9 years ago

Well, you need to choose one for the load balancer :). A light one if possible, since the checks are executed once per second. Maybe we can use some of the others to produce a detailed nagios check.

Actions #12

Updated by Gianpaolo Coro over 9 years ago

According to the WPS specifications, the timeout should be decided by the user's client. Processing can last for days and could theoretically also be executed in "synchronous" mode by the client. Is it possible to set the timeout either to infinite or to 10/20 days?

Actions #13

Updated by Andrea Dell'Amico over 9 years ago

There is no 'infinite timeout' in tcp or http.
We only need to ensure that the proxy keepalive timeout is longer than the client's (or the standard tcp) one.

Actions #14

Updated by Gianpaolo Coro over 9 years ago

Not according to Apache: https://hc.apache.org/httpcomponents-client-4.2.x/tutorial/html/connmgmt.html

Anyway, if infinite is not contemplated by haproxy, is it possible to set it to 20 days at least?

Actions #15

Updated by Andrea Dell'Amico over 9 years ago

Gianpaolo Coro wrote:

Not according to Apache: https://hc.apache.org/httpcomponents-client-4.2.x/tutorial/html/connmgmt.html

Anyway, if infinite is not contemplated by haproxy, is it possible to set it to 20 days at least?

That only says that the client will wait indefinitely. Servers have timeouts too, at both the http and the tcp level. What we are interested in for this case is called http persistent connections, and the way for them to work reliably is to have keepalive doing its job correctly.
Yesterday I raised the haproxy keepalive timeouts so that they should be much longer than the keepalive interval, and your tests, which weren't reliable before, now always complete.
We now need tests that last longer, and only if they fail because of timeouts can we try raising them to an unreasonably high level.

Actions #16

Updated by Gianpaolo Coro over 9 years ago

Since we are talking about a computational service, the longest run I have lasts 7 days in the development environment. I don't know the computational time of all the possible algorithms we will integrate in the future.
Is it possible to quantify the current timeout or shall we wait 7 days to understand if another tuning is needed?

Actions #17

Updated by Andrea Dell'Amico over 9 years ago

Gianpaolo Coro wrote:

Since we are talking about a computational service, the longest run I have lasts 7 days in the development environment. I don't know the computational time of all the possible algorithms we will integrate in the future.
Is it possible to quantify the current timeout or shall we wait 7 days to understand if another tuning is needed?

I only have to check the client behaviour. A test lasting some ten minutes is sufficient. If it fails, I have one last haproxy configuration option to try before falling back to increasing the timeouts.

Actions #18

Updated by Andrea Dell'Amico over 9 years ago

  • Status changed from In Progress to Feedback
  • % Done changed from 80 to 100

It turned out that the http library used by the R jobs does not behave correctly with keep alive. So we are going to use a huge keepalive timeout (60 days) for the time being.

Actions #19

Updated by Gianpaolo Coro over 9 years ago

  • Status changed from Feedback to Closed