Skip to content

IoT in Azure (Telemetry table/raw zone)

(figure 1.1) example of real-time (IoT) data architecture

EVENT HUB #1:
In pure SQL terminology:
The first Event Hub can be thought of as as a RawEvents table, where each row inserted into the table represents an event recorded by our IoT device (there may be hundreds or even tens of thousands of IoT devices sending events every second).
This RawEvents table might have the following characteristics:
1. Every device inserts rows as they arrive
2. No validation – this is a raw or LandingZone/table for our data to land in
3. No schema enforcement beyond basic JSON structure
4. High volume, append-only – note: this is append only with no overrides.

A way to think of this as a SQL statement:
INSERT INTO RawEvents (DeviceId, PayloadJson, IngestedAt)
VALUES (…);

Every device must have a unique DeviceId, an associated JsonPayload carrying the data of the event, and the time the JsonPayload arrived in our Event Hub #1.

Let’s talk a bit more about the IngestedAt field because there can be a lot of ambiguity about what “Ingested At” actually means.
For example – is this field recording the time an event was recorded on the device? i.e. the literal time an event occurred, or, is this the time the event landed in our RawEvents (landing zone) table? In our case, IngestedAt represents the time our database actually stored the event, not when the event actually occurred.

To identify when the event actually occurred, we would refer to the JsonPayload where we would find a timestamp with a name like deviceTime.

Example:
{
“deviceTime”: “2026-02-21T23:59:12Z”,
“temperature”: 72.5
}

An important item of note here: IngestedAt field in our landing table is a timestamp field and in pure SQL Server terms, we can think of this as a field we would have SQL Server automatically populate. Example:
ALTER TABLE RawEvents
ADD IngestedAt DATETIME2 NOT NULL
CONSTRAINT DF_RawEvents_IngestedAt DEFAULT (SYSUTCDATETIME());
Which leads to:
INSERT INTO RawEvents (DeviceId, PayloadJson)
VALUES (…);

It is paramount we have both deviceTime and IngestedAt fields for reasons beyond the obvious:

  1. Detect if devices are/were offline
  2. Events can arrive late
  3. Network issues/jitter can and does occur
  4. Measure ingestion latency
  5. Detect out-of-order events

In Azure Event Hub, these fields are automatically recorded and may default to EnqueuedTime and SystemProperties.

We would take the difference between deviceTime and IngestedAt to arrive at a measured difference or calculated latency between our device and ingestion times, which would tell us if our device is lagging behind ingestion.

To help handle the possibility of latency in scenarios where device timestamps are not reliable, we can rely on our ingestion time and index. SQL example:
CREATE CLUSTERED INDEX IX_RawEvents_IngestedAt
ON RawEvents (IngestedAt);

This is a typical method for partitioning and indexing a telemetry table (in this case, our RawEvents table): to cluster by ingestion time.

Join the conversation

Your email address will not be published. Required fields are marked *