A First Look at Lumi Loglake: Query Logs Where They Live

May 21, 2026
Matt Morrissey

At Databricks Data + AI Summit, we will preview Imply Lumi Loglake, a new step toward a more decoupled model for observability and machine data.

The idea is simple:

Point Lumi at your logs. Start querying.

Fully separated compute and storage for unstructured logs.

No heavy preprocessing pipelines to build upfront.

No rigid schemas to define before data becomes usable.

No need to move or duplicate data before you can work with it.

With Loglake, Lumi can query unstructured logs directly where they already live, including logs stored in Amazon S3, Delta Lake, Apache Iceberg, and other open storage environments.

Today, the hardest part of observability is not storing data. It is deciding what to keep, what to index, and whether the data will still be operationally usable later when you actually need it.

The problem

Modern observability architectures force teams to make decisions about their telemetry long before they know which questions they will eventually need to answer.

Before they can even run a query, teams are often expected to decide:

  • What data should be retained
  • What data should be dropped
  • What fields should be indexed
  • What pipelines should be built
  • What schemas should be enforced

Those decisions are usually driven by infrastructure cost.

As telemetry volumes continue to grow, fully indexing everything inside always-on observability infrastructure becomes increasingly difficult to justify economically.

So teams compromise.

They reduce retention windows.

They selectively index data.

Or they move historical logs into object storage and lakehouse environments to lower costs and extend retention.

That shift makes sense economically.

But operationally, it introduces a new challenge.

The broader industry trend is moving toward open storage, decoupled compute, and lakehouse architectures for machine data retention. But most observability workflows still depend on rigid preprocessing, schema enforcement, and operational infrastructure designed around indexed data.

The challenge is no longer simply storing telemetry.

The challenge is making that telemetry operationally usable across different tools, workflows, and query models without rebuilding pipelines for every platform.

Introducing Lumi Loglake

Loglake changes that model.

With Lumi Loglake, queries run directly where the data already lives.

Imply Lumi provides a shared query layer across observability tools, lakehouse platforms, and open storage environments.

Lumi combines real-time indexing, elastic compute, and in-place querying into a shared machine data architecture that spans operational observability workloads and open lakehouse environments.

The data does not need to be rehydrated.

Teams do not need to fully structure data or pre-define schemas before it becomes queryable.

And organizations no longer need a separate storage architecture for every observability workflow. The same approach covers raw and unstructured logs stored in object storage, open lakehouse tables, and cloud-native storage environments.

Instead of forcing teams to prepare data before it becomes operationally useful, Loglake makes large-scale telemetry immediately queryable for observability and investigation workflows. Lumi combines schema-on-read query execution with transparent indexing and caching to make large-scale unstructured telemetry interactive without requiring heavy upfront indexing.
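To make the schema-on-read idea concrete, here is a minimal Python sketch. It is not Lumi's implementation or API. The sample log lines, the field names, and the `parse_on_read` helper are all hypothetical, chosen only to show how structure can be applied at query time instead of at ingest.

```python
import re
from typing import Iterable, Iterator

# Hypothetical raw log lines, as they might sit untouched in object storage.
RAW_LOGS = [
    "2026-05-21T10:00:01Z INFO  auth user=alice action=login status=ok",
    "2026-05-21T10:00:02Z ERROR payments order=1842 status=failed latency_ms=930",
    "malformed line with no recognizable structure",
    "2026-05-21T10:00:05Z ERROR auth user=bob action=login status=denied",
]

# The "schema" lives in the query, not in the storage layer (illustrative patterns).
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s*(?P<rest>.*)$")
KV_RE = re.compile(r"(\w+)=(\S+)")


def parse_on_read(lines: Iterable[str]) -> Iterator[dict]:
    """Apply structure lazily, one line at a time, while the query runs."""
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            # Lines that do not fit the pattern are kept as raw text, not rejected.
            yield {"raw": line}
            continue
        record = {
            "ts": match.group("ts"),
            "level": match.group("level"),
            "service": match.group("service"),
            "raw": line,
        }
        # Extract key=value pairs from the remainder of the line.
        record.update(KV_RE.findall(match.group("rest")))
        yield record


# The "query": filter on a field that was never declared before the data was stored.
errors = [r for r in parse_on_read(RAW_LOGS) if r.get("level") == "ERROR"]
for r in errors:
    print(r["ts"], r["service"], r.get("status"))
```

Nothing about the stored lines changes between queries in this sketch. A different question would simply use a different pattern against the same raw data, which is the property that lets indexing and optimization decisions wait until usage patterns are known.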

Most lakehouse architectures still assume data should be structured before it becomes operationally useful. But logs have resisted that model for decades. Loglake shifts observability from a pipeline-first model to a query-first model. Instead of preparing data before it becomes usable, teams can query first and optimize later.

Extending decoupled architectures to machine data

Modern lakehouse platforms already embrace the separation of compute and storage for structured analytics workloads.

Lumi extends that model to operational machine data and unstructured logs.

For teams continuing to operate Splunk environments, Loglake expands retention and lowers infrastructure costs without requiring workflow changes.

For organizations moving toward lakehouse-centric observability architectures, it provides a way to operationalize log data already stored in open storage environments.

This allows organizations to evolve their observability architectures incrementally instead of forcing a complete rip-and-replace transition.

A shared query layer across observability ecosystems

Different observability platforms increasingly operate against the same underlying telemetry.

Lumi allows organizations to retain data once while supporting multiple operational interfaces and workflows across observability and lakehouse ecosystems.

The same underlying datasets can be queried through:

  • Splunk using SPL
  • Grafana using LogQL
  • SQL and lakehouse tooling
  • Additional interfaces over time

Without duplicating storage or rebuilding ingestion pipelines for every tool, organizations can work from a shared telemetry foundation while preserving existing operational workflows.

What this enables

Querying logs in place changes how teams think about observability retention, investigations, and infrastructure cost.

Teams can:

  • Retain significantly larger telemetry datasets
  • Query historical data without archive recovery workflows
  • Work directly with unstructured logs
  • Reduce duplicate pipelines and storage copies
  • Delay indexing and optimization decisions until they are actually needed
  • Use the same datasets across multiple observability tools
  • Extend observability workflows into open lakehouse environments

Instead of deciding upfront how every dataset must be structured, teams can query first and optimize later based on actual usage patterns.

Interested in a demo?

If you are interested in learning more about Loglake, we can walk you through:

  • Connecting Lumi to existing log datasets in object storage
  • Querying unstructured logs without predefined schemas
  • Running investigative workflows without rehydration
  • Querying the same datasets across multiple tools
  • Using open storage as a scalable observability environment

All in just a few minutes.

What’s next

Check out our deep dives on other topics related to Loglake:

  • Why schema-first approaches break down for logs
  • The cost of pipelines, reprocessing, and duplicate storage
  • How performance changes when you do not rely on heavy indexing
  • What workflows look like with and without upfront structuring

If you are attending Databricks Data + AI Summit, stop by the Imply booth to see Lumi Loglake in action.

