High Availability Best Practices
The following image identifies two requirements:
- Do NOT add any special characters in the hostname.
- Do NOT add the hostname on the same line as the localhost entry. Add the hostname on a separate line in the /etc/hosts file.
The hostname for the database servers cannot contain spaces, underscores, hyphens, or any other special characters, because these cause authentication to fail.
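The two hostname rules above can be checked before installation. The following is a minimal sketch, not a CloudCenter tool: the function names are hypothetical, and it assumes that "no special characters" means ASCII letters and digits only.

```python
import re

# Assumption: only ASCII letters and digits are acceptable in the hostname.
HOSTNAME_PATTERN = re.compile(r"[A-Za-z0-9]+")


def is_valid_hostname(hostname: str) -> bool:
    """Return True if the hostname has no spaces, underscores, hyphens, or other special characters."""
    return HOSTNAME_PATTERN.fullmatch(hostname) is not None


def hostname_shares_localhost_line(hosts_text: str, hostname: str) -> bool:
    """Return True if the hostname appears on the same /etc/hosts line as localhost."""
    for raw_line in hosts_text.splitlines():
        # Drop trailing comments, then split into the address and its names.
        fields = raw_line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        names = fields[1:]
        if "localhost" in names and hostname in names:
            return True
    return False


# Hypothetical sample: the hostname sits on its own line, as required.
sample_hosts = "127.0.0.1 localhost\n10.0.0.5 dbmaster1\n"
print(is_valid_hostname("dbmaster1"))
print(hostname_shares_localhost_line(sample_hosts, "dbmaster1"))
```

To check a real server, pass the contents of /etc/hosts as `hosts_text`.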
Changing the hostnames after you install and configure the VMs may cause unknown issues.
Be sure to set up each highly available VM in a separate zone to avoid situations that may contribute to a single point of failure (for example, a power outage that affects both VMs if they are situated in the same zone).
The HA setup is not supported with CentOS6 and RHEL6 images for all CloudCenter releases.
The CloudCenter platform only requires one CCM and one database for each deployment. If your enterprise requires High Availability (HA) and data replication:
- You must place the two CCM servers and the two database servers on the same cloud or datacenter.
- The CCM servers are dependent on the database servers being set up and running. Be sure to configure the database servers before configuring the CCM servers.
- Configure the load balancer to dispatch traffic to one of the instances and then enable the session stickiness policy.
- If the primary server fails at any point, the secondary server seamlessly takes over.
- When a failed server comes back online, data is synchronized from the new primary server automatically. The server that was offline then becomes the secondary server.
When used in the CloudCenter context, the PostgreSQL databases follow a master-slave high availability (HA) setup:
- The master server asynchronously sends data changes to the slave server.
- The master server responds to read-only queries while the master server is running.
Two database servers work together to allow a second server to take over quickly if the master server fails:
- Master Server: A server that modifies data. This server carries the load.
- Slave Server: A server that responds to or replicates changes made in the master server.
The PostgreSQL instances launched in the cloud must be capable of handling network routing for your enterprise through the Virtual IP configured by your cloud administrator.
- Make sure that the Virtual IP configuration is accurately routed to the appropriate PostgreSQL instance.
- See Phase 1: Prepare Infrastructure for the CloudCenter requirements, as this process differs for each cloud.
High Availability Considerations
The CloudCenter High Availability (HA) solution:
- Is Synchronous — the transaction is not considered committed until all servers have completed the transaction.
- Ensures that a failover does not lose any data. If one of the CCM servers goes offline, the other server takes over as the primary server and continues to handle all required activities.
- Returns consistent results to both servers as the database and configuration changes on the servers are kept in sync.
To ensure HA, the CloudCenter CCM and PostgreSQL database servers work together to allow a second server to take over quickly if the primary server fails.
To ensure better performance, the network link between the two CCMs in an HA setup should provide at least 100 Mbps of bandwidth and a network latency of less than 20 ms.
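The bandwidth and latency guidance can be expressed as a simple threshold check. This is a minimal sketch with hypothetical names; it assumes that you have already measured the link with a tool of your choice and only compares those measurements against the figures stated above.

```python
# Thresholds taken from the guidance above.
MIN_BANDWIDTH_MBPS = 100.0
MAX_LATENCY_MS = 20.0


def link_problems(bandwidth_mbps: float, latency_ms: float) -> list:
    """Return a list of human-readable problems; an empty list means the link meets the guidance."""
    problems = []
    if bandwidth_mbps < MIN_BANDWIDTH_MBPS:
        problems.append(
            f"bandwidth {bandwidth_mbps} Mbps is below {MIN_BANDWIDTH_MBPS} Mbps"
        )
    # The guidance asks for latency strictly below 20 ms.
    if latency_ms >= MAX_LATENCY_MS:
        problems.append(
            f"latency {latency_ms} ms is not below {MAX_LATENCY_MS} ms"
        )
    return problems


# Hypothetical measurements for the link between the two CCMs.
print(link_problems(bandwidth_mbps=250.0, latency_ms=35.0))
```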
The CloudCenter platform uses Hazelcast in CCM HA setup. Hazelcast recommends clustered VMs be deployed within one region. To adhere to the Hazelcast recommendation, the CloudCenter platform should ideally be deployed in the same region with multiple availability zones.
For two servers (CCM setup or database setup) to work together, the CloudCenter platform only allows one of the servers to modify the data.
- Primary Server: A server that modifies data. This server carries the load.
- Secondary Server: A server that responds to or replicates changes made in the primary server.
To provide HA for the CCM server in a CloudCenter deployment, you must install the following servers and configure them for a primary-secondary replication setup.
- Two database servers: Master database and slave database for replication setup.
- Two CCM servers: Primary CCM and secondary CCM for high availability setup.
When the failed server comes back online, data is synchronized from the new primary server automatically. The server that was offline then becomes the secondary server.
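The role changes described above (the secondary takes over, and the recovered server rejoins as the secondary) can be modeled as a small state machine. This is an illustrative sketch only; the `HaPair` class and the server names are hypothetical and do not reflect how CloudCenter implements failover.

```python
from dataclasses import dataclass


@dataclass
class HaPair:
    """Tracks which server of a two-server HA pair currently holds each role."""
    primary: str
    secondary: str
    secondary_online: bool = True

    def fail_primary(self) -> None:
        """The primary goes offline; the secondary takes over as the new primary."""
        if not self.secondary_online:
            raise RuntimeError("no online server is available to take over")
        self.primary, self.secondary = self.secondary, self.primary
        # The failed server now holds the secondary role but is offline.
        self.secondary_online = False

    def recover(self) -> None:
        """The failed server returns, resynchronizes from the new primary, and stays secondary."""
        self.secondary_online = True


pair = HaPair(primary="ccm1", secondary="ccm2")
pair.fail_primary()   # ccm2 becomes primary
pair.recover()        # ccm1 returns as secondary
print(pair.primary, pair.secondary)
```

Note that the recovered server does not reclaim the primary role, which matches the behavior described in the text.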
CCO High Availability
The CCO HA procedure is intricate and requires deeper DevOps knowledge. First consult with your company's DevOps team and ensure that each requirement in this section is addressed.
To ensure high availability, at least one CCO must be installed and configured in each cloud (region) or datacenter in a CloudCenter deployment. The CCO servers run concurrently behind a load balancer. Each server is active and all servers in the cluster perform orchestration tasks in parallel. If one of the servers in the cluster goes offline, the other active servers continue to handle orchestration tasks. When the offline server comes back online, data is synchronized from the active servers automatically.
The CCO HA setup is done on a per-region basis.
CloudCenter does not support cross-region configuration for CCO HA.
Configure CCO HA directly on the CCO_PRIMARY wizard for a 3-node MongoDB cluster setup.
CCO HA Setup now requires 3 VMs to support a 3-node MongoDB cluster. In addition to CCO_PRIMARY & CCO_SECONDARY, a new role CCO_TERTIARY has been added. See Component Modes and Roles for additional details.
You can configure CCO HA by running the configuration wizard on any one of the three nodes.
- See Per CloudCenter Region Installation (Required) > 4. Install CCO (Required) > > CloudCenter 4.7 for additional context.
You need three CCO servers for a CCO HA setup:
- CCO1 = Configure CCO HA and MongoDB HA
- CCO2 = Configure CCO HA and MongoDB HA
- CCO3 = MongoDB HA (MongoDB minimal requirement)
Be sure to launch all three CCO servers in different availability zones to ensure HA. See Availability Sets and Zones for additional context.
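The source does not explain why the third node is the MongoDB minimum. One likely reason is that MongoDB replica sets elect a primary by majority vote, so three members are the smallest set that survives the loss of one member. The sketch below shows that arithmetic and a check that each server sits in its own zone; the function names, server labels, and zone names are hypothetical.

```python
from typing import Mapping


def majority(members: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Number of members that can be lost while a majority remains."""
    return members - majority(members)


def zones_are_distinct(zone_by_server: Mapping[str, str]) -> bool:
    """Return True if no two servers share an availability zone."""
    return len(set(zone_by_server.values())) == len(zone_by_server)


# Hypothetical placement of the three CCO servers.
placement = {"CCO1": "zone-a", "CCO2": "zone-b", "CCO3": "zone-c"}

print(majority(3), tolerated_failures(3))   # three nodes: majority of 2, survives 1 failure
print(tolerated_failures(2))                # two nodes: cannot lose any member
print(zones_are_distinct(placement))
```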
Using a CCO Load Balancer?
SSL certificates are essential for communication between the CCM and CCO. Be sure to use the TCP protocol for CCO Load Balancer listeners. For example, you can configure a generic load balancer application to use TCP and ensure that the certificate exchange procedure is transparent.
- Use the CloudCenter UI and configure the CCO IP field with the IP address of the load balancer.
If you are adding new CCOs to an existing deployment, replace the previously configured CCO IP address with the IP address of the load balancer.
AMQP High Availability
A load balancer must be placed on top of a RabbitMQ cluster to provide true HA support for a CloudCenter deployment.
The CloudCenter Management Agent retrieves messages from the AMQP server. This is not an instantaneous process. If the AMQP server fails while a message retrieval is in progress, that message may be lost.
The out-of-box CloudCenter solution does not provide clustering or HA. As such, you need to manually add a load balancer in a clustered environment for a CloudCenter deployment to provide true HA. Only using a load balancer without implementing clustering does not ensure a queue exchange between multiple AMQP servers.
Multiple, clustered AMQP servers on the CCO side (without a load balancer) only ensure a clustered environment and do NOT establish any replication in an out-of-box CloudCenter deployment – they only provide a manually managed failover mechanism, if required in case of disaster.
The CloudCenter 4.6 platform does not support HA for Guacamole. Guacamole is a separate component that resides within each AMQP server. See Component Modes and Roles for additional context.
If each AMQP server is placed in a separate region, be aware that the HA solution does not work across regions.
To ensure end-to-end AMQP HA, a CloudCenter deployment requires:
- A clustered AMQP server setup with mirroring – see https://www.rabbitmq.com/clustering.html for additional context on clustering servers.
- A load balancer to manage all AMQP servers – see the relevant documentation for your respective load balancers to configure this setup for your environment.
Verify these requirements before you begin the high availability process:
- Configured and set up the database.
- Set up the AMQP cluster using the DNS name – not the IP address.
- Configured a load balancer to manage the AMQP servers.
- Configured a clustered AMQP server setup with a mirroring policy applied to all CCO queues.
- After setting up a load balancer on top of a RabbitMQ cluster, configure the CCO server to read the AMQP server's IP address.
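The requirement to set up the AMQP cluster with DNS names rather than IP addresses can be verified mechanically. This is a minimal sketch using the Python standard library `ipaddress` module; the function names and the node addresses are hypothetical.

```python
import ipaddress


def is_ip_literal(address: str) -> bool:
    """Return True if the string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(address)
    except ValueError:
        return False
    return True


def nodes_using_ip(node_addresses):
    """Return the cluster node addresses that are IP literals instead of DNS names."""
    return [address for address in node_addresses if is_ip_literal(address)]


# Hypothetical AMQP cluster node list; the second entry violates the requirement.
cluster_nodes = ["amqp1.example.com", "10.0.0.7", "amqp3.example.com"]
print(nodes_using_ip(cluster_nodes))
```

An empty result means every node in the list is referenced by name.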
Guacamole High Availability
Be aware that Guacamole is a separate component. The CloudCenter 4.6 platform does not support HA for Guacamole. See Component Modes and Roles for additional context.
Effective CloudCenter 4.7.0, Cisco supports HA for GUAC. To use this feature, be aware that the connection broker's IP must be published to the CCO so that the same instance is used for reverse connection.