Migrating Data from ClickHouse to Imply Polaris

May 02, 2022
Sergio Ferragut

Polaris is Imply’s fully managed database-as-a-service (DBaaS) built on Apache Druid to deliver fast OLAP on both streaming and batch data. In this blog post we describe how to move tables from ClickHouse into Polaris. Since ClickHouse is also an OLAP engine, its data is likely already in a form that can be ingested directly into Polaris. Polaris also supports rollup and secondary partitioning at ingestion time if additional optimization is desired. We’ll review how to export data from ClickHouse in a format that is easy to ingest into Polaris.

Exporting from ClickHouse 

Polaris can import data in newline-delimited JSON format, where each JSON object occupies a single line of the data file. From ClickHouse, such a file can be created in the clickhouse-client CLI with a SELECT ... INTO OUTFILE statement using FORMAT JSONEachRow:

# clickhouse-client

ClickHouse client version 22.4.1.1.
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 22.4.1 revision 54455.

:) SELECT * FROM tutorial.kttm INTO OUTFILE 'data.json' FORMAT JSONEachRow;

SELECT *
FROM tutorial.kttm
INTO OUTFILE 'data.json'
FORMAT JSONEachRow
 
Query id: b4a5301b-09d5-4cb3-b10d-c7494f4eae7b
 
 
202862 rows in set. Elapsed: 0.696 sec. Processed 202.86 thousand rows, 108.91 MB (291.51 thousand rows/s., 156.50 MB/s.)
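Polaris expects strictly one JSON object per line, so it can be worth a quick check that every line of the export parses on its own (pretty-printed or array-wrapped JSON will not load). A minimal sketch of such a check; the two sample rows below are hypothetical stand-ins for the real export:

```shell
# Create a hypothetical two-row stand-in for the exported data.json.
printf '%s\n' \
  '{"timestamp":"2019-08-25T13:00:00Z","session":"S1"}' \
  '{"timestamp":"2019-08-25T13:00:05Z","session":"S2"}' > data.json

# Fail loudly if any line is not a complete, standalone JSON object.
python3 -c '
import json
with open("data.json") as f:
    lines = [l for l in f if l.strip()]
for l in lines:
    json.loads(l)  # raises on any malformed line
print(len(lines), "valid JSON lines")
'
```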

It is a good idea to compress the data before transferring it to the Polaris platform; Polaris can ingest the file in compressed form.

# gzip data.json
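Depending on your ClickHouse version, you may be able to skip the separate gzip step and compress during the export itself; recent releases support a COMPRESSION clause on INTO OUTFILE (and newer ones infer gzip from the .gz extension). A sketch, assuming a version with this support:

```sql
-- Compress on the way out instead of running gzip afterwards.
SELECT *
FROM tutorial.kttm
INTO OUTFILE 'data.json.gz'
COMPRESSION 'gzip'
FORMAT JSONEachRow;
```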

Importing Data into Polaris

Now that we have the data, let’s bring it into Polaris. From the Polaris Home screen, click on “Create a table and load data”.

Give the table a name and click Continue:

Click on “+ Add data”

Drag your file into the window to upload it.

You can also use the “Upload by API” button to get a prebuilt curl call for submitting the data file. Just replace @YOURFILE.json with @data.json.gz, and once the upload completes, select “Choose from uploaded files” to continue.
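As a rough sketch, the prebuilt command takes a shape like the one below; the exact endpoint URL and authentication header come from the “Upload by API” dialog, and the organization name and API key shown here are placeholders:

```shell
# Hypothetical sketch only; copy the real command (URL and credentials
# included) from the "Upload by API" dialog in Polaris.
curl -X POST "https://<your-org>.api.imply.io/v1/files" \
  -H "Authorization: Basic <your-api-key>" \
  -F "file=@data.json.gz"
```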

Once you’ve selected the file to load, click “Continue” and Polaris will sample the data, parse the columns and bring up the ingestion configuration screen:

For the most part, Polaris will map the data types correctly on its own: string and numeric types are identified properly, and JSON arrays are converted to multi-value columns. The one thing that is imperative to verify is that the column selected for __time is the correct one. Polaris uses the __time column to partition the timeline into segments, and it is the primary pruning strategy, so pick the source timestamp that best represents the time of the events you are modeling. In this case there is a single timestamp in the source data, and it is correctly mapped to __time automatically.

To change the data type or source column mapping of any imported column, simply click its column header, edit it, and click Save:

There is no need to change any mapping or data types in this example, so just click on “Start Ingestion” to start the load.

You can view the progress of the ingestion from the Home screen, by selecting “Ingestion Jobs”:

Querying the Data

Once the job completes, we are ready to query. From the Home screen select “SQL”:
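For example, assuming the table was named kttm as above, a first Druid SQL query might count events per hour (the column list here is illustrative):

```sql
-- Hypothetical first query against the newly ingested table.
SELECT TIME_FLOOR(__time, 'PT1H') AS "hour",
       COUNT(*) AS events
FROM "kttm"
GROUP BY 1
ORDER BY "hour";
```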

Conclusion

Moving data from ClickHouse into Polaris is easy, and once the data is there you can do a lot more with it. Check out the Imply documentation to learn about visualization and event streaming in Polaris.
