Learn IoT data pipeline vocabulary: time-series data, TSDB (InfluxDB, TimescaleDB), data ingestion, telemetry, event streaming from IoT devices, and data retention policies.
1 / 5
A time-series database (TSDB) like InfluxDB is optimised differently from a relational database for IoT data because:
IoT data is almost always append-only: you record a sensor reading, you do not update it. TSDBs exploit this: data is written in time order, stored in time-partitioned (typically columnar) blocks, and compressed automatically. In InfluxDB, features such as continuous queries (replaced by tasks in 2.x), retention policies, and the Flux and InfluxQL query languages make downsampling and windowed aggregation first-class operations.
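To make "downsampling and windowed aggregation" concrete, here is a minimal in-memory sketch in plain Python. It is not InfluxDB code; the function name `downsample` and the sample readings are made up for illustration, and timestamps are assumed to be integer seconds.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, window_s):
    """Average (timestamp, value) points into fixed windows of window_s seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its window.
        window_start = ts - (ts % window_s)
        buckets[window_start].append(value)
    # One averaged point per window, in time order.
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical temperature readings taken every 20 seconds.
readings = [(0, 20.0), (20, 21.0), (40, 22.0), (60, 23.0), (80, 25.0)]
per_minute = downsample(readings, 60)
print(per_minute)  # [(0, 21.0), (60, 24.0)]
```

A TSDB does the same grouping inside the storage engine, which is why it can run continuously over large volumes without pulling raw points into application code.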
2 / 5
Telemetry in an IoT context refers to:
Telemetry (from Greek: tele = remote, metron = measure) is the one-way flow of measurement data from device to cloud. Contrast it with commands/actuation (cloud → device) and twin/shadow documents (which track desired vs. reported device state). IoT platforms such as AWS IoT Core and Azure IoT Hub separate these concerns into distinct message channels; Google Cloud IoT Core did the same before it was retired in August 2023.
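The difference between a telemetry message and a twin/shadow document can be sketched as follows. The field names (`deviceId`, `ts`, `readings`) and both helper functions are hypothetical and do not follow any particular platform's schema.

```python
import json


def telemetry_message(device_id, ts, readings):
    """Build a one-way device-to-cloud telemetry payload (illustrative schema)."""
    return json.dumps(
        {"deviceId": device_id, "ts": ts, "readings": readings}, sort_keys=True
    )


def twin_delta(desired, reported):
    """Return the desired settings that the device has not yet reported."""
    return {key: value for key, value in desired.items() if reported.get(key) != value}


# Telemetry: the device reports what it measured.
msg = telemetry_message("sensor-42", 1700000000, {"temp_c": 21.5})

# Twin/shadow: the cloud states what it wants, the device states what it has.
delta = twin_delta(
    desired={"interval_s": 10, "led": "off"},
    reported={"interval_s": 60, "led": "off"},
)
print(delta)  # {'interval_s': 10}
```

Telemetry only ever flows upward, while the delta is what the device still has to apply to bring its reported state in line with the desired state.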
3 / 5
Event streaming from IoT devices using Apache Kafka or AWS Kinesis is preferred over direct database writes because:
Large IoT device fleets can generate millions of messages per second. A streaming platform acts as a durable buffer: device data is published to Kafka topics or Kinesis streams, and downstream consumers (TSDB ingestion, ML feature pipelines, alerting systems, data lakes) read at their own pace. If a consumer fails, it can resume from its last checkpoint and replay what it missed, whereas a direct database write that fails is simply lost.
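The buffer-and-replay idea can be shown with a toy append-only log and a consumer that tracks its own committed offset. This is a conceptual sketch only: `EventLog` and `Consumer` are invented names, everything lives in memory, and no Kafka or Kinesis client API is used.

```python
class EventLog:
    """Append-only log; records stay available after they are read."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset):
        return list(enumerate(self._records))[offset:]


class Consumer:
    """Reads from the log and remembers the next offset to process."""

    def __init__(self, log):
        self.log = log
        self.committed = 0
        self.processed = []

    def poll(self, fail_on=None):
        for offset, record in self.log.read_from(self.committed):
            if record == fail_on:
                raise RuntimeError(f"consumer crashed on {record!r}")
            self.processed.append(record)
            self.committed = offset + 1  # checkpoint after each success


log = EventLog()
for reading in ["r1", "r2", "r3"]:
    log.append(reading)

consumer = Consumer(log)
try:
    consumer.poll(fail_on="r2")  # simulated crash partway through
except RuntimeError:
    pass

consumer.poll()  # resumes at the checkpoint; r2 and r3 are still in the log
print(consumer.processed)  # ['r1', 'r2', 'r3']
```

Because the log keeps records independently of any consumer, a second consumer could start at offset 0 and read the same data without affecting the first.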
4 / 5
A data retention policy in an IoT TSDB serves to:
IoT data at full resolution (e.g., 1-second sensor readings) becomes expensive to store over years. Retention policies, combined with downsampling, automate lifecycle management: keep raw data for 30 days, hourly averages for 1 year, daily summaries forever. InfluxDB calls these 'retention policies' (a per-bucket retention period in 2.x); TimescaleDB uses 'data retention policies', set with the add_retention_policy() function, alongside continuous aggregates for the downsampled tiers.
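As a rough illustration of what a retention policy does, the sketch below drops raw points older than a cutoff. It is an in-memory analogy, not the InfluxDB or TimescaleDB API; `apply_retention`, `RAW_RETENTION`, and the sample data are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

RAW_RETENTION = timedelta(days=30)  # matches the "raw data for 30 days" tier


def apply_retention(points, now, keep_for):
    """Keep only (timestamp, value) points newer than now - keep_for."""
    cutoff = now - keep_for
    return [(ts, value) for ts, value in points if ts >= cutoff]


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
raw = [
    (now - timedelta(days=45), 19.0),  # older than 30 days: dropped
    (now - timedelta(days=10), 21.0),
    (now - timedelta(days=1), 22.0),
]
kept = apply_retention(raw, now, RAW_RETENTION)
print([value for _, value in kept])  # [21.0, 22.0]
```

In a real deployment the database runs this on a schedule and drops whole time partitions at once, and the hourly or daily summaries are computed before the raw data expires.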
5 / 5
TimescaleDB differs from InfluxDB in that:
TimescaleDB is built as a PostgreSQL extension, which is a major advantage for teams already familiar with SQL: joins, window functions, and the wider PostgreSQL ecosystem work out of the box. Hypertables automatically partition data by time (and optionally by a second, "space" dimension). InfluxDB's purpose-built engine can be more efficient for pure time-series workloads, but it offers far more limited relational capabilities.
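The idea behind time partitioning can be sketched by routing each row to a fixed-width time chunk. This is a conceptual model, not TimescaleDB's actual implementation; `chunk_start`, the 7-day `CHUNK_INTERVAL`, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

CHUNK_INTERVAL = timedelta(days=7)  # assumed chunk width for this sketch
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts, interval=CHUNK_INTERVAL):
    """Return the start of the fixed-width time chunk that contains ts."""
    index = (ts - EPOCH) // interval  # whole intervals since the epoch
    return EPOCH + index * interval


rows = [
    (datetime(2024, 1, 1, tzinfo=timezone.utc), "sensor-1", 20.5),
    (datetime(2024, 1, 2, tzinfo=timezone.utc), "sensor-2", 21.0),
    (datetime(2024, 1, 20, tzinfo=timezone.utc), "sensor-1", 19.8),
]

# Group rows by the chunk they belong to.
chunks = defaultdict(list)
for ts, device, value in rows:
    chunks[chunk_start(ts)].append((ts, device, value))

print(len(chunks))  # 2: the first two rows share a chunk, the third does not
```

Queries restricted to a time range then only need to touch the chunks that overlap it, and expired data can be removed by dropping whole chunks.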