When updating large clusters, one key consideration is the possibility of downtime. Taking clusters offline, even for much-needed improvements, can negatively affect availability, user experience, business operations, and more. As a result, it is critical to avoid downtime—and its associated organizational impacts—during the upgrade process.
This blog describes how we upgraded one customer from an older, unsupported version of Imply to the latest release. The customer deployed their Imply Private clusters on Kubernetes in Google Cloud.
In a typical Imply cluster, we have:
- Master services (Overlord, Coordinator)
- Data services (MiddleManager, Historical)
- Query services (Broker, Router)
- Zookeeper servers
- Deep storage (S3, HDFS, NFS, etc.)
- A metadata database (MySQL/PostgreSQL)
Of these components, the most critical data resides in deep storage and the metadata database. Data related to the other services automatically gets populated once the new cluster starts.
To upgrade versions safely with minimal downtime, use the following procedure:
- Set up a new cluster at the desired version.
- Copy the metadata database from the existing cluster into the new cluster’s metadata database.
- Use the same deep storage as the current cluster, or copy the data from the existing deep storage to the new one.
However, there are a few things that we need to consider:
- If we use different deep storage, we must ensure that the data in the existing and new deep storage is synced correctly.
- There should NOT be any compaction jobs running on the existing cluster that could make changes to the druid_segments table during the transition between metadata databases.
- There is a possibility, especially for large clusters, that a high number of segments in the druid_segments table will increase the time needed to generate and load SQL dump files, resulting in longer downtime. In that scenario, there are several potential solutions:
- Run the following query to find out the number of used and unused segments:
select used, count(1) from druid_segments group by used;
If there are a large number of unused segments (rows where used is 0), we can generate a separate dump file ONLY for the druid_segments table, containing only the used segments:
mysqldump -h <host_name> -u <user_name> -p --single-transaction --skip-add-drop-table --no-create-info --no-create-db <db_name> druid_segments --where="used='1'" > output_file.sql
However, if a large number of segments is still present, we can use a change data capture (CDC) tool that transfers data between the two metadata databases in near real time, reducing the required migration time.
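Whichever approach is used, a quick sanity check before cutover is to compare the used-segment counts in the old and new metadata databases. A minimal sketch, where the hosts, user, and database names are placeholders:
mysql -h <old_host_name> -u <user_name> -p -N -e "select count(*) from druid_segments where used = 1;" <db_name>
mysql -h <new_host_name> -u <user_name> -p -N -e "select count(*) from druid_segments where used = 1;" <db_name>
The two counts should match before the new cluster is pointed at the new metadata database.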
During the entire process, one of the two following scenarios will be in play.
Scenario 1
The old and new clusters use different deep storage and database servers. This scenario would apply if:
- The old deep storage is accessible from the new Imply cluster.
- The old metadata database and the new metadata database both run on MySQL.
- Both the new deep storage and the new metadata database are clean, with no existing data.
Steps:
- Configure the new cluster with a different deep storage path and a new database server address.
- Copy the tables “druid_config, druid_dataSource, druid_supervisors, druid_segments” from the old database to the new one by following the next step:
- Execute a mysqldump on the source metadata database (and the Pivot metadata database, if in use) and then import the dump into the new target metadata database. The output file is given a .sql extension because it contains SQL commands. The -p option in mysqldump prompts for a password.
mysqldump -h <host_name> -u <user_name> -p --single-transaction --skip-add-drop-table --no-create-info --no-create-db <db_name> druid_config druid_dataSource druid_supervisors druid_segments > output_file.sql
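Then import the dump into the new target metadata database (here the host placeholder refers to the new database server):
mysql -h <host_name> -u <user_name> -p <db_name> < output_file.sql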
- Start the new cluster.
- The Coordinator automatically starts reading the old segments’ metadata from the new database, and the Historical nodes then load those segments from the old deep storage (see the load-status check after these steps). Note that the data in the old deep storage is kept intact.
- The old cluster will keep writing to the old metadata database and the old deep storage.
- The new cluster will write to the new metadata database and the new deep storage.
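As a sanity check while the new cluster loads segments, the Coordinator’s load-status API reports the percentage of segments loaded per datasource. A sketch, assuming the default Coordinator port of 8081:
curl http://<coordinator_host>:8081/druid/coordinator/v1/loadstatus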
Once you do this migration, the old and new clusters will share the same data segment files in deep storage for any data ingested before the migration. (Data ingested after the migration will go to different files.) It is essential to avoid running kill tasks (which permanently delete data) on data sources whose segments are shared between the two clusters, because doing so would cause the clusters to delete each other’s data.
If the new Imply cluster shares the same ZooKeeper quorum as the old one, it must use a different base znode path. Configure the property druid.zk.paths.base in common.runtime.properties to a different name, such as /druid-newcluster; the default value is /druid.
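For example, in the new cluster’s common.runtime.properties:
druid.zk.paths.base=/druid-newcluster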
Scenario 2
The old and new clusters use different deep storage and database servers. This scenario would apply if:
- The old deep storage is not accessible from the new Imply cluster.
- The old metadata database and the new metadata database both run on MySQL.
- Both the new deep storage and the new metadata database are clean, with no existing data.
Steps:
- Copy the data from the old deep storage to the new deep storage (consider using a staging area as an intermediate location).
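Since this customer ran on Google Cloud, a hedged sketch for a GCS bucket-to-bucket copy looks like the following (the bucket names are placeholders; other deep storage types would use their own tooling):
gsutil -m rsync -r gs://<old_bucket_name> gs://<new_bucket_name>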
- Configure the new cluster with a different deep storage path and a new database server address.
- Run mysqldump on the source metadata database:
mysqldump -h <host_name> -u <user_name> -p --single-transaction --skip-add-drop-table --no-create-info --no-create-db <db_name> druid_config druid_dataSource druid_supervisors druid_segments > output_file.sql
- In the above mysqldump file, change the segment locations in the druid_segments table to point to the new deep storage location. For bucket-based deep storage such as S3 or GCS, the segment load specs include a bucket name that can be rewritten with sed:
sed -i.bak 's/\\"bucket\\":\\"<old_bucket_name>\\"/\\"bucket\\":\\"<new_bucket_name>\\"/g' output_file.sql
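A quick check that no references to the old bucket remain after the substitution (this should print 0):
grep -o '<old_bucket_name>' output_file.sql | wc -l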
- Copy the tables “druid_config, druid_dataSource, druid_supervisors, druid_rules, druid_segments” from the old database to the new database by following the next step.
- Import the modified source mysqldump file (as seen above) into the new target metadata database.
mysql -h <host_name> -u <user_name> -p <db_name> < output_file.sql
- Drop the druid_rules table from the target MySQL database if it is present; it will be recreated once the new cluster is started.
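A minimal sketch of that drop, using the same placeholders as above:
mysql -h <host_name> -u <user_name> -p -e "DROP TABLE IF EXISTS druid_rules;" <db_name>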
- Start the new cluster.
- The Coordinator automatically starts reading the old segments’ metadata from the new database, and the Historical nodes then load those segments from the new deep storage. Note that the data in the old deep storage is kept intact.
- The old cluster will keep writing to the old database and old deep storage.
- The new cluster will write to the new database and new deep storage.
Important Considerations
- Review the Imply release notes for changes in the new version.
- After reviewing the release notes for parameter changes, make sure each service’s configuration parameters match what the new Imply version expects.
- If the `druid.segmentCache.locations` setting is changed, copy the segment cache over from the existing Imply cluster before restarting services on the new cluster.
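A hedged sketch of that copy, run on each new Historical node and assuming rsync access between the nodes; the host and cache paths are placeholders:
rsync -a <old_historical_host>:<old_segment_cache_path>/ <new_segment_cache_path>/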
- While starting the new cluster, the following startup order is preferable when the segment count is very high:
- Start all the Historical services and wait for the lifecycle to start.
- Start all the master services (Coordinator/Overlord) and wait for the lifecycle to start.
- Start the query services (Broker/Router) and wait for the lifecycle to start.
- Start the MiddleManager services and wait for the lifecycle to start.
- Resume supervisors and tasks.
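Supervisors can be resumed from the console or through the Overlord API. A sketch, assuming the default Overlord port of 8090 and a placeholder supervisor ID:
curl -X POST http://<overlord_host>:8090/druid/indexer/v1/supervisor/<supervisor_id>/resume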
The ultimate goal is to upgrade large Imply clusters while minimizing any potential for disruption or data loss for customers. Don’t remain on outdated versions; instead, update your Imply environment safely and smoothly with this procedure.