Real-Time Intelligence with Microsoft Fabric Eventhouse and KQL

April 21, 2026 · 8 min read

Most data platforms are built for batch. You ingest data, transform it overnight, and deliver yesterday's insights in the morning. But some problems can't wait — fraud detection, IoT monitoring, live user analytics, operational dashboards. These need data flowing continuously with sub-second query response times.

Microsoft Fabric's Real-Time Intelligence workload (centered on Eventhouse) is purpose-built for this. It's essentially Azure Data Explorer (Kusto) integrated natively into Fabric, with Eventstream for ingestion and Power BI for visualization. Here's how to build a production real-time analytics pipeline with it.

The Real-Time Intelligence Architecture

┌──────────────────────────────────────────────────┐
│  Data Sources                                     │
│  ┌─────────┐ ┌──────────┐ ┌───────────────────┐ │
│  │Event Hub│ │IoT Hub   │ │Custom App (REST)  │ │
│  └────┬────┘ └────┬─────┘ └───────┬───────────┘ │
└───────┼───────────┼───────────────┼──────────────┘
        │           │               │
┌───────▼───────────▼───────────────▼──────────────┐
│  Eventstream (Ingestion & Routing)                │
│  ┌──────────────────────────────────────────────┐│
│  │ Transformations: Filter, Project, Aggregate  ││
│  └──────────────┬───────────────────────────────┘│
└─────────────────┼────────────────────────────────┘
                  │
┌─────────────────▼────────────────────────────────┐
│  Eventhouse                                       │
│  ┌──────────────────────────────────────────────┐│
│  │ KQL Database                                  ││
│  │  ├── Raw Events Table (hot: 30d, cold: 1y)   ││
│  │  ├── Materialized View (5-min aggregates)     ││
│  │  └── Function Library (reusable queries)      ││
│  └──────────────────────────────────────────────┘│
└─────────────────┬────────────────────────────────┘
                  │
┌─────────────────▼────────────────────────────────┐
│  Consumption                                      │
│  ┌──────────┐  ┌───────────┐  ┌───────────────┐ │
│  │Power BI  │  │KQL        │  │Fabric         │ │
│  │Real-Time │  │Queryset   │  │Notebooks      │ │
│  │Dashboard │  │           │  │(Spark + KQL)  │ │
│  └──────────┘  └───────────┘  └───────────────┘ │
└──────────────────────────────────────────────────┘

Setting Up Eventhouse

An Eventhouse in Fabric is a container for one or more KQL Databases. Think of it like a SQL Server instance containing databases. Create one in your Fabric workspace, then create a KQL Database inside it.

Table Design for Streaming Data

KQL databases use a columnar storage engine optimized for append-heavy, time-series workloads. Design tables accordingly:

// Create the raw events table
.create table RawEvents (
    EventId: string,
    EventType: string,
    Timestamp: datetime,
    UserId: string,
    SessionId: string,
    DeviceType: string,
    Country: string,
    Properties: dynamic,
    NumericValue: real
)

// Set retention: hot cache for fast queries, total retention for compliance
.alter-merge table RawEvents policy retention 
    softdelete = 365d recoverability = enabled

// Hot cache for recent data (queries on cached data are 10x faster)
.alter table RawEvents policy caching 
    hot = 30d

// Enable streaming ingestion for sub-second latency
.alter table RawEvents policy streamingingestion enable

// Create ingestion mapping for JSON sources
.create table RawEvents ingestion json mapping 'RawEventsMapping' 
    '[{"column":"EventId","path":"$.eventId","datatype":"string"},'
    '{"column":"EventType","path":"$.type","datatype":"string"},'
    '{"column":"Timestamp","path":"$.timestamp","datatype":"datetime"},'
    '{"column":"UserId","path":"$.userId","datatype":"string"},'
    '{"column":"SessionId","path":"$.sessionId","datatype":"string"},'
    '{"column":"DeviceType","path":"$.device","datatype":"string"},'
    '{"column":"Country","path":"$.country","datatype":"string"},'
    '{"column":"Properties","path":"$.properties","datatype":"dynamic"},'
    '{"column":"NumericValue","path":"$.value","datatype":"real"}]'

Materialized Views for Pre-Aggregation

Raw event tables grow fast. Materialized views continuously aggregate data as it arrives — no batch job needed:

// 5-minute event aggregations, updated automatically
.create materialized-view EventAggregates on table RawEvents {
    RawEvents
    | summarize 
        EventCount = count(),
        UniqueUsers = dcount(UserId),
        AvgValue = avg(NumericValue),
        P95Value = percentile(NumericValue, 95)
      by EventType, Country, bin(Timestamp, 5m)
}

// Hourly session metrics — materialized views only allow plain aggregation
// functions, so store raw aggregates and derive ratios at query time
.create materialized-view SessionMetrics on table RawEvents {
    RawEvents
    | summarize 
        SessionCount = dcount(SessionId),
        EventCount = count(),
        FirstEvent = min(Timestamp),
        LastEvent = max(Timestamp)
      by bin(Timestamp, 1h), DeviceType, Country
}

// Derive at query time, e.g.:
// SessionMetrics | extend AvgEventsPerSession = EventCount * 1.0 / SessionCount

Materialized views are incremental — they only process new data since the last materialization. This makes them extremely efficient for dashboards.
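To build intuition for why incremental materialization is cheap, here's a toy sketch in plain Python (my own illustration, not the engine's actual mechanism): each new event touches only its own (EventType, Country, 5-minute bin) bucket, so aggregates stay current without rescanning history.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def bin_5m(ts: datetime) -> datetime:
    """Floor a timestamp to its 5-minute bin, like KQL's bin(Timestamp, 5m)."""
    return ts - timedelta(minutes=ts.minute % 5, seconds=ts.second,
                          microseconds=ts.microsecond)

# Running aggregates keyed by (EventType, Country, bin) — only new events touch them
aggregates = defaultdict(lambda: {"count": 0, "sum": 0.0})

def ingest(event_type: str, country: str, ts: datetime, value: float) -> None:
    key = (event_type, country, bin_5m(ts))
    agg = aggregates[key]
    agg["count"] += 1
    agg["sum"] += value

ingest("click", "US", datetime(2026, 4, 21, 10, 2), 1.0)
ingest("click", "US", datetime(2026, 4, 21, 10, 4), 3.0)
ingest("click", "US", datetime(2026, 4, 21, 10, 7), 5.0)  # lands in the next bin

key = ("click", "US", datetime(2026, 4, 21, 10, 0))
print(aggregates[key]["count"], aggregates[key]["sum"] / aggregates[key]["count"])
# → 2 2.0
```

The real engine does this at the extent level with cursors, but the cost model is the same: work proportional to new data, not total data.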

KQL: The Query Language You Didn't Know You Needed

If you've been writing SQL your whole career, KQL feels weird for about an hour. Then you realize it's better for analytics. The pipe-based syntax reads top-to-bottom like a data pipeline:
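For intuition, here's the same top-to-bottom flow in plain Python — each stage consumes the previous stage's output, which is exactly how a KQL pipeline composes (illustrative only; the engine itself runs columnar, not row-at-a-time):

```python
from collections import Counter

events = [
    {"type": "click", "ts": 10}, {"type": "view", "ts": 20},
    {"type": "click", "ts": 30}, {"type": "view", "ts": 5},
    {"type": "purchase", "ts": 40}, {"type": "click", "ts": 50},
]

# | where ts > 5          — filter rows first
recent = [e for e in events if e["ts"] > 5]
# | summarize Count = count() by type
counts = Counter(e["type"] for e in recent)
# | order by Count desc
ranked = counts.most_common()
print(ranked)  # → [('click', 3), ('view', 1), ('purchase', 1)]
```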

Basic Patterns

// Events in the last hour by type
RawEvents
| where Timestamp > ago(1h)
| summarize Count = count() by EventType
| order by Count desc

// Time series with 5-minute bins
RawEvents
| where Timestamp > ago(24h)
| summarize EventCount = count() by bin(Timestamp, 5m)
| render timechart

// Top 10 users by activity
RawEvents
| where Timestamp > ago(7d)
| summarize EventCount = count(), 
            LastSeen = max(Timestamp)
  by UserId
| top 10 by EventCount
| project UserId, EventCount, LastSeen, 
          DaysSinceLastActive = datetime_diff('day', now(), LastSeen)

Advanced: Anomaly Detection

KQL has built-in time series analysis functions — no Python needed:

// Detect anomalies in event volume by comparing each hour of the last day
// against the average for the same hour of day over the prior two weeks
// (joining current and historical rows on the raw Timestamp would never match)
let baseline = 
    RawEvents
    | where Timestamp between (ago(14d) .. ago(1d))
    | summarize HourlyCount = count() by bin(Timestamp, 1h)
    | summarize ExpectedCount = avg(HourlyCount) by HourOfDay = hourofday(Timestamp);
let current = 
    RawEvents
    | where Timestamp > ago(1d)
    | summarize ActualCount = count() by bin(Timestamp, 1h)
    | extend HourOfDay = hourofday(Timestamp);
current
| join kind=leftouter baseline on HourOfDay
| extend Deviation = (ActualCount - ExpectedCount) * 100.0 / ExpectedCount  // *100.0 avoids integer division
| where abs(Deviation) > 50
| project Timestamp, ActualCount, ExpectedCount, 
          Deviation = round(Deviation, 1),
          Status = iff(Deviation > 0, "SPIKE", "DROP")
| order by abs(Deviation) desc

// Built-in anomaly detection with series_decompose_anomalies
// (it operates on a series, so build one with make-series first)
RawEvents
| where Timestamp > ago(7d)
| make-series Count = count() default = 0 on Timestamp step 1h
| extend (anomalies, score, baseline) = series_decompose_anomalies(Count, 1.5)
| render anomalychart with (anomalycolumns=anomalies)
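The compare-to-baseline pattern is easy to sanity-check in plain Python: average two weeks of hourly counts per hour of day, then flag hours deviating more than 50%. A sketch with synthetic data (not the engine's decomposition algorithm):

```python
from statistics import mean

# Hourly event counts: history keyed by (day, hour), today keyed by hour
history = {(d, h): 100 + (10 if h == 12 else 0) for d in range(14) for h in range(24)}
today = {h: (100 if h != 12 else 320) for h in range(24)}  # big spike at noon

# Baseline: average count per hour-of-day across the history window
baseline = {h: mean(history[(d, h)] for d in range(14)) for h in range(24)}

anomalies = []
for hour, actual in today.items():
    expected = baseline[hour]
    deviation = (actual - expected) * 100.0 / expected
    if abs(deviation) > 50:
        anomalies.append((hour, round(deviation, 1), "SPIKE" if deviation > 0 else "DROP"))

print(anomalies)  # → [(12, 190.9, 'SPIKE')]
```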

Advanced: Sessionization

Turn raw events into user sessions without pre-processing:

// Sessionize events (30-min inactivity gap = new session)
RawEvents
| where Timestamp > ago(24h)
| order by UserId, Timestamp asc
| extend TimeSincePrev = iff(
    prev(UserId) == UserId, 
    Timestamp - prev(Timestamp), 
    timespan(null))
| extend IsNewSession = iff(
    TimeSincePrev > 30m or isnull(TimeSincePrev), 1, 0)
| extend SessionNumber = row_cumsum(IsNewSession)
| summarize 
    SessionStart = min(Timestamp),
    SessionEnd = max(Timestamp),
    EventCount = count(),
    EventTypes = make_set(EventType)
  by UserId, SessionNumber
| extend SessionDuration = SessionEnd - SessionStart
| summarize 
    AvgSessionDuration = avg(SessionDuration),
    AvgEventsPerSession = avg(EventCount),
    SessionCount = count()
  by bin(SessionStart, 1h)
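The same gap-based rule in plain Python, as a cross-check on the logic (30 minutes of inactivity starts a new session; function name is mine):

```python
from datetime import datetime, timedelta

GAP = timedelta(minutes=30)

def sessionize(timestamps):
    """Split one user's timestamps into sessions on gaps longer than GAP."""
    sessions, current = [], []
    for ts in sorted(timestamps):
        if current and ts - current[-1] > GAP:
            sessions.append(current)  # gap exceeded — close the session
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

t0 = datetime(2026, 4, 21, 9, 0)
events = [t0, t0 + timedelta(minutes=10), t0 + timedelta(minutes=20),
          t0 + timedelta(hours=2), t0 + timedelta(hours=2, minutes=5)]

sessions = sessionize(events)
print(len(sessions))                     # → 2
print(sessions[0][-1] - sessions[0][0])  # first session duration: 0:20:00
```

The KQL version does this across all users at once by serializing on (UserId, Timestamp) and using prev() plus row_cumsum() to number the sessions.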

Eventstream: Zero-Code Ingestion

Fabric Eventstream connects sources to Eventhouse without writing code. But for programmatic ingestion, you can also push directly:

// C# — Push events to Eventhouse via queued ingestion
using System.IO;
using System.Text;
using Kusto.Data;
using Kusto.Ingest;

// Queued ingestion targets the ingestion endpoint — the "ingest-" prefixed URI
// shown on the Eventhouse details page, not the query URI
var connectionString = new KustoConnectionStringBuilder(
    "https://ingest-your-eventhouse.kusto.fabric.microsoft.com",
    "YourKQLDatabase")
    .WithAadTokenProviderAuthentication(() => GetAccessToken());

using var ingestClient = KustoIngestFactory.CreateQueuedIngestClient(connectionString);

var properties = new KustoIngestionProperties("YourKQLDatabase", "RawEvents")
{
    Format = DataSourceFormat.json,
    IngestionMapping = new IngestionMapping { IngestionMappingReference = "RawEventsMapping" }
};

// Ingest from a stream
using var stream = new MemoryStream(Encoding.UTF8.GetBytes(jsonPayload));
await ingestClient.IngestFromStreamAsync(stream, properties);

# Python — batch ingest from a Fabric notebook
from azure.kusto.data import KustoConnectionStringBuilder
from azure.kusto.data.data_format import DataSourceFormat
from azure.kusto.ingest import QueuedIngestClient, IngestionProperties

# Queued ingestion also uses the "ingest-" endpoint here
kcsb = KustoConnectionStringBuilder.with_aad_managed_service_identity_authentication(
    "https://ingest-your-eventhouse.kusto.fabric.microsoft.com")

ingest_client = QueuedIngestClient(kcsb)
properties = IngestionProperties(
    database="YourKQLDatabase",
    table="RawEvents",
    data_format=DataSourceFormat.JSON,
    ingestion_mapping_reference="RawEventsMapping"
)

ingest_client.ingest_from_file("events_batch.json", properties)

Building Real-Time Dashboards

Option 1: KQL Queryset → Power BI

Create a KQL Queryset in Fabric, write your queries, and pin visuals directly to a Power BI dashboard. The connection is live — no scheduled refresh:

// Dashboard query: Live event rate
RawEvents
| where Timestamp > ago(1h)
| summarize EventsPerMinute = count() by bin(Timestamp, 1m)
| render timechart with (title="Live Event Rate (per minute)")

Option 2: Power BI DirectQuery to Eventhouse

For more complex dashboards, create a Power BI semantic model with DirectQuery to the KQL Database:

// Create a KQL function for the Power BI model
.create-or-alter function with (folder="PowerBI") GetHourlyMetrics() {
    EventAggregates
    | where Timestamp > ago(30d)
    | summarize 
        TotalEvents = sum(EventCount),
        UniqueUsers = sum(UniqueUsers),
        AvgValue = avg(AvgValue)
      by EventType, Country, bin(Timestamp, 1h)
}

Power BI connects to this function and translates DAX to KQL behind the scenes. Users get an interactive dashboard that refreshes on every interaction.

Monitoring and Operations

Ingestion Health

// Check ingestion latency and failures
.show ingestion failures 
| where FailedOn > ago(24h)
| summarize FailureCount = count() by Table, ErrorCode
| order by FailureCount desc

// Monitor ingestion volume
.show commands-and-queries
| where StartedOn > ago(1h)
| where CommandType == "DataIngestPull"
| summarize IngestCount = count(), 
            AvgDuration = avg(Duration)
  by bin(StartedOn, 5m)

Data Freshness

// Table size and hot-cache footprint per table
.show tables details
| project TableName, TotalRowCount, TotalExtentSize, 
          HotRowCount, HotExtentSize

// How fresh is the data? Check the latest ingestion time
RawEvents
| summarize LastIngested = max(ingestion_time())
| extend LagSeconds = datetime_diff('second', now(), LastIngested)

Capacity Management

KQL Databases in Fabric consume capacity units. Monitor your consumption:

// Query cost estimation
RawEvents
| where Timestamp > ago(7d)
| count
// Check "Query resource consumption" in the query's execution statistics

Architecture Decisions

After building several real-time pipelines in Fabric, here are the key decisions:

When to use Eventhouse vs Lakehouse:

  • Eventhouse: Sub-second queries on streaming/time-series data, anomaly detection, log analytics
  • Lakehouse: Batch ETL, complex joins across many tables, ML feature engineering

When to use materialized views vs Gold tables:

  • Materialized views: Continuous aggregation, real-time dashboards, operational metrics
  • Gold tables (Lakehouse): Complex multi-source joins, ML training data, regulatory reporting

When to use Eventstream vs direct ingestion:

  • Eventstream: Multiple destinations, filtering/routing, no-code requirement
  • Direct ingestion: High throughput (>100K events/sec), custom batching logic, existing Kusto SDK integration
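"Custom batching logic" just means you decide when a buffer becomes an ingest call — by count, byte size, or age. A minimal sketch (the `flush_fn` callback and `EventBatcher` name are mine; in practice the callback would wrap something like `ingest_from_stream`):

```python
import io
import json

class EventBatcher:
    """Accumulate events and flush them in batches to an ingest callback."""
    def __init__(self, flush_fn, max_events=1000):
        self.flush_fn = flush_fn
        self.max_events = max_events
        self.buffer = []

    def add(self, event: dict) -> None:
        self.buffer.append(event)
        if len(self.buffer) >= self.max_events:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        # One JSON-lines payload per batch, handed off as a stream
        payload = "\n".join(json.dumps(e) for e in self.buffer)
        self.flush_fn(io.BytesIO(payload.encode()))
        self.buffer = []

batches = []
batcher = EventBatcher(lambda stream: batches.append(stream.read()), max_events=2)
for i in range(5):
    batcher.add({"eventId": i})
batcher.flush()  # flush the remainder
print(len(batches))  # → 3 batches (2 + 2 + 1 events)
```

Fewer, larger batches mean fewer queued-ingestion operations, which is where direct ingestion earns its keep at high throughput.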

Retention strategy:

  • Hot cache: 7-30 days (depending on query patterns)
  • Total retention: Based on compliance requirements (90 days to 7 years)
  • Use update policy to transform data on ingest and store only aggregated versions long-term

Putting It All Together

A complete real-time intelligence pipeline in Fabric might look like:

  1. Source systems → Event Hub (or IoT Hub)
  2. Eventstream routes events to Eventhouse + archives raw to Lakehouse Bronze
  3. Eventhouse stores events with 30-day hot cache, materialized views update continuously
  4. KQL functions expose clean interfaces for Power BI
  5. Power BI DirectQuery dashboard auto-refreshes from Eventhouse
  6. Fabric notebook runs hourly to move cold data aggregations into Lakehouse Gold

The magic is that all of this lives in one Fabric workspace, governed by one security model, billed on one capacity. No Kafka clusters to manage, no Spark streaming jobs to babysit, no separate monitoring stacks.

Real-time analytics used to be the most operationally expensive part of any data platform. Fabric Eventhouse doesn't eliminate all that complexity — but it absorbs enough of it into the platform that a small team can actually run it in production.
