HAProxy Setup in Ubuntu

Introduction

The following instructions apply to deploying FileCloud HA in an Ubuntu 22.04 environment. This can easily be adapted to other Linux flavors as well. This example uses HTTP but can be expanded easily to use HTTPS as well. Some of the instructions will have to be adapted to your specific environment. Every HA setup has a load balancer as its core component.

Load Balancer

The load balancer is the component that distributes incoming requests among a group of servers. In this case, the load balancer of choice is HAProxy (http://www.haproxy.org/). HAProxy is a high-performance, battle-tested load balancer that also allows you to scale your FileCloud deployment quickly.

NOTE: Before starting the install, ensure the servers are already available and their IP addresses are known.

Setting up HAProxy

  1. Use the apt-get command to install HAProxy

    apt-get install haproxy
  2. Enable HAProxy to be started by the init script

    vi /etc/default/haproxy

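    Set the ENABLED option to 1 (recent Ubuntu packages start HAProxy through systemd and may ignore this option, in which case the step is harmless):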

    ENABLED=1
  3. Back up the default configuration file and open a new, empty one in its place

    mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.save
    vi /etc/haproxy/haproxy.cfg

  4. Add the following to the empty haproxy.cfg file

    global
        log 127.0.0.1 local0 notice
        maxconn 2000
        user haproxy
        group haproxy

Logging

The log directive specifies the syslog server to which log messages are sent. On Ubuntu, rsyslog is already installed and running, but by default it does not listen on any IP address, so rsyslog must be configured to accept messages on 127.0.0.1 before these logs appear.

The maxconn directive sets the maximum number of concurrent connections that HAProxy will accept. The value of 2000 used here should be tuned according to your server's capacity.

The user and group directives make the HAProxy process run as the specified user and group. These shouldn't be changed.
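The rsyslog change that the log directive depends on is not spelled out in these instructions. A minimal sketch, assuming rsyslog's imudp module, might look like the following. The file name 49-haproxy-udp.conf and the log path /var/log/haproxy.log are illustrative choices, not FileCloud requirements. The script writes the fragment to the current directory so that it can be reviewed before it is installed.

```shell
# Sketch: write an rsyslog drop-in that accepts HAProxy's UDP syslog messages.
# The file name and log path are assumptions; adjust them to your system.
RSYSLOG_FRAGMENT="${RSYSLOG_FRAGMENT:-./49-haproxy-udp.conf}"

cat > "$RSYSLOG_FRAGMENT" <<'EOF'
# Listen for syslog over UDP on the loopback address only (port 514).
module(load="imudp")
input(type="imudp" address="127.0.0.1" port="514")

# HAProxy logs to facility local0 (see "log 127.0.0.1 local0 notice").
local0.*    /var/log/haproxy.log
EOF

echo "Wrote $RSYSLOG_FRAGMENT"
```

After review, the file would typically be copied into /etc/rsyslog.d/ as root and rsyslog restarted with service rsyslog restart.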

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    timeout connect 5000
    timeout client 10000
    timeout server 10000

Host Configuration

This section demonstrates how to specify default values. The values most likely to need adjusting are the timeout directives, which are given in milliseconds when no unit is specified. The timeout connect directive sets the maximum time to wait for a connection attempt to a backend server to succeed.

The timeout client and timeout server directives apply when the client or the server is expected to acknowledge or send data. HAProxy recommends setting the client and server timeouts to the same value.

The retries directive sets the number of times to retry a connection to a backend server after a connection failure.

The option redispatch directive allows a session to be redistributed to another server when a connection fails, so session stickiness is overridden if a backend server goes down.

The names used for the three webservers in these instructions are Ha-WS1, Ha-WS2, Ha-WS3.

listen filecloud 
    bind 0.0.0.0:80
    mode http
    stats enable
    stats uri /haproxy?stats
    stats realm Strictly\ Private
    stats auth  proxyuser:proxypassword
    balance roundrobin
    option http-server-close
    timeout http-keep-alive 3000
    option forwardfor
    server Ha-WS1 xx.xx.xx.xx:80 check
    server Ha-WS2 xx.xx.xx.xx:80 check
    server Ha-WS3 xx.xx.xx.xx:80 check


Additional Notes

The listen section above contains the configuration for both the frontend and the backend. It configures HAProxy to listen on port 80 for filecloud (which is just a name for identifying the application).

The stats directives enable the connection statistics page and protect it with HTTP Basic authentication using the credentials specified by the stats auth directive.

This page can be viewed at the URL given in stats uri; in this case, that is http://<loadbalancerip>/haproxy?stats.
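As a quick check of the statistics page from a shell, the sketch below builds the URL and defines a helper that fetches the page with curl using HTTP Basic authentication. LB_IP and the fetch_stats function name are placeholders introduced for illustration. Appending ;csv to the stats URI asks HAProxy for CSV output instead of HTML.

```shell
# Placeholder load balancer address; replace with your own.
LB_IP="${LB_IP:-127.0.0.1}"
STATS_URL="http://${LB_IP}/haproxy?stats"

# fetch_stats USER:PASSWORD -> prints the statistics in CSV form.
# The credentials must match the "stats auth" line in haproxy.cfg.
fetch_stats() {
    curl -fsS -u "$1" "${STATS_URL};csv"
}

echo "Stats page: ${STATS_URL}"
```

With the sample credentials from the configuration, the helper would be called as fetch_stats proxyuser:proxypassword.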

The balance directive specifies the load balancing algorithm to use. Commonly used options are Round Robin (roundrobin), Static Round Robin (static-rr), Least Connections (leastconn), Source (source), URI (uri) and URL parameter (url_param).

Information about each algorithm can be obtained from the official documentation.

The server directive declares a backend server with the syntax:

server <name> <address>[:port] [param*]

In each server directive, such as server Ha-WS1 xx.xx.xx.xx:80, replace xx.xx.xx.xx with the actual IP address of the corresponding app server node.
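When there are several app server nodes, the server lines can be generated rather than typed by hand. The following sketch assumes three hypothetical addresses from the 192.0.2.0/24 documentation range and the Ha-WSn naming used above. The make_server_lines function name is introduced here for illustration.

```shell
# Hypothetical backend addresses (documentation range); replace with real ones.
BACKENDS="192.0.2.11 192.0.2.12 192.0.2.13"

# Print one "server <name> <address>:80 check" line per backend.
make_server_lines() {
    i=1
    for ip in $BACKENDS; do
        printf '    server Ha-WS%d %s:80 check\n' "$i" "$ip"
        i=$((i + 1))
    done
}

make_server_lines
```

The printed lines can be pasted into the listen filecloud section in place of the xx.xx.xx.xx entries.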

Starting HAProxy

From the command line, start HAProxy using the following command:

service haproxy start
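Before starting or restarting the service, it can help to validate the configuration file. The sketch below wraps HAProxy's check mode (haproxy -c -f <file>) in a small helper. The function name check_haproxy_cfg is introduced here for illustration, and the default path is the one used in these instructions.

```shell
# check_haproxy_cfg [FILE] -> returns 0 only if haproxy accepts the file.
check_haproxy_cfg() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    if ! command -v haproxy >/dev/null 2>&1; then
        echo "haproxy is not installed" >&2
        return 1
    fi
    if [ ! -f "$cfg" ]; then
        echo "configuration file not found: $cfg" >&2
        return 1
    fi
    # -c checks the configuration and exits without starting the proxy.
    haproxy -c -f "$cfg"
}
```

A typical use would be check_haproxy_cfg && service haproxy restart, followed by service haproxy status to confirm that the service is running.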