Data Normalization
Overview of how the API and MCP tools normalize sensor data for cross-service comparison and LLM-friendly queries.
Overview
Sensors from different providers (Purple Air, AirNow, NDBC) report data in different formats. The API normalizes data so that:
- Temperature from Purple Air (Celsius) and AirNow (often Fahrenheit) both appear as air_temperature_c in celsius
- PM2.5 from Purple Air and AirNow use the same parameter pm2_5_ug_per_m3 and unit µg/m³
- Timestamps are ISO 8601 UTC
- Measurements use a canonical array format:
[{parameter, value, unit, unit_original}]
This enables LLMs and clients to compare readings across sensors without extra parsing or unit conversion.
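As an illustration of the normalization described above, here is a minimal sketch of how a Fahrenheit temperature could be mapped to a canonical measurement entry. The function names and the "fahrenheit" label for unit_original are assumptions made for this example; the document only shows "celsius", and the API's actual implementation may differ.

```python
def fahrenheit_to_celsius(value_f: float) -> float:
    """Standard Fahrenheit-to-Celsius conversion."""
    return (value_f - 32.0) * 5.0 / 9.0


def normalize_temperature(value: float, unit_original: str) -> dict:
    """Build one canonical measurement entry for an air temperature reading.

    The accepted unit labels ("celsius", "fahrenheit") are assumptions
    for this sketch, not confirmed API values.
    """
    unit = unit_original.lower()
    if unit == "fahrenheit":
        value_c = fahrenheit_to_celsius(value)
    elif unit == "celsius":
        value_c = value
    else:
        raise ValueError(f"unsupported temperature unit: {unit_original}")
    return {
        "parameter": "air_temperature_c",
        "value": round(value_c, 2),
        "unit": "celsius",
        "unit_original": unit,
    }


entry = normalize_temperature(72.5, "fahrenheit")
print(entry["value"], entry["unit"], entry["unit_original"])
```

Keeping unit_original next to the converted value lets a client tell whether a reading was converted or reported natively.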
Data Types (Sensors and Feeds)
Each sensor and feed has a data_types array listing the canonical parameter IDs it collects.
Examples:
| sensor_type | Typical data_types |
|---|---|
| Purple Air | pm2_5_ug_per_m3, pm10_ug_per_m3, air_temperature_c, relative_humidity_percent, air_pressure_hpa |
| NDBC buoy | wind_speed_ms, wind_direction_deg, air_temperature_c, water_temperature_c, air_pressure_hpa, wave_height_m |
| AirNow | pm2_5_ug_per_m3, ozone_ppb, pm10_ug_per_m3, aqi, air_temperature_c |
Filtering: Use ?data_types=air_temperature_c,pm2_5_ug_per_m3 to list only sensors that collect those parameters.
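A small sketch of building that filter query from a list of parameter IDs. The host and the /api/v1/sensors list path are assumptions, since this document names only the readings endpoints.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com"  # hypothetical host
SENSORS_PATH = "/api/v1/sensors"      # assumed list endpoint, not named in this document


def sensors_url(data_types: list[str]) -> str:
    """Return a sensor-list URL filtered by canonical parameter IDs."""
    # safe="," keeps the comma separator readable instead of encoding it as %2C
    query = urlencode({"data_types": ",".join(data_types)}, safe=",")
    return f"{BASE_URL}{SENSORS_PATH}?{query}"


print(sensors_url(["air_temperature_c", "pm2_5_ug_per_m3"]))
```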
Raw vs. Clean Data
| Format | Use case | Source |
|---|---|---|
| cleaned (default) | Cross-service comparison, LLM queries | clean_records — normalized measurements |
| raw | Debugging, audit, original source | raw_records — original payload from upstream API |
Query params: data_format=cleaned or data_format=raw on:
- GET /api/v1/sensors/:id/readings/latest
- GET /api/v1/sensors/:id/readings
- MCP sensor_cleaned_stream argument
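The following sketch shows one way a client might select the format on the two readings endpoints listed above. The host and the helper name readings_url are hypothetical; the paths and the data_format values come from this document.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.example.com"  # hypothetical host
VALID_FORMATS = {"cleaned", "raw"}    # the two values documented above


def readings_url(sensor_id: str, data_format: str = "cleaned", latest: bool = False) -> str:
    """Build a readings URL for one sensor with the requested data_format."""
    if data_format not in VALID_FORMATS:
        raise ValueError(f"data_format must be one of {sorted(VALID_FORMATS)}")
    path = f"/api/v1/sensors/{quote(sensor_id, safe='')}/readings"
    if latest:
        path += "/latest"
    return f"{BASE_URL}{path}?{urlencode({'data_format': data_format})}"


print(readings_url("abc-123", latest=True))
print(readings_url("abc-123", data_format="raw"))
```

Validating data_format on the client avoids a round trip for a misspelled value; how the server itself responds to unknown values is not specified here.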
Canonical Measurement Format
Cleaned measurements follow:
[
{ "parameter": "air_temperature_c", "value": 22.5, "unit": "celsius", "unit_original": "celsius" },
{ "parameter": "pm2_5_ug_per_m3", "value": 12.3, "unit": "µg/m³", "unit_original": "µg/m³" }
]
Legacy responses may still use a flat object: {"air_temperature_c": 22.5, "pm2_5_ug_per_m3": 12.3}.
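Because both shapes can occur, a client may want to read them through one code path. This is a minimal sketch, assuming the measurements arrive either as the canonical array or as the legacy flat object shown above; as_flat is a hypothetical helper name.

```python
from typing import Union


def as_flat(measurements: Union[list, dict]) -> dict:
    """Return a {parameter: value} mapping from either response shape."""
    if isinstance(measurements, dict):
        # Legacy flat object: already keyed by parameter
        return dict(measurements)
    # Canonical array: one entry per parameter
    return {m["parameter"]: m["value"] for m in measurements}


canonical = [
    {"parameter": "air_temperature_c", "value": 22.5, "unit": "celsius", "unit_original": "celsius"},
    {"parameter": "pm2_5_ug_per_m3", "value": 12.3, "unit": "µg/m³", "unit_original": "µg/m³"},
]
legacy = {"air_temperature_c": 22.5, "pm2_5_ug_per_m3": 12.3}

print(as_flat(canonical) == as_flat(legacy))  # True
```

Flattening discards unit and unit_original, so it suits value comparisons only; code that needs units should keep the canonical array.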
Parameter Filter
For time-series readings, use ?parameter=air_temperature_c to return only that parameter in each reading. To request several parameters, separate them with commas: ?parameter=air_temperature_c,pm2_5_ug_per_m3.
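To make the filter's effect concrete, here is a client-side sketch that applies the same comma-separated selection to a canonical measurement array. It is an illustration only, not the server's implementation, and filter_measurements is a hypothetical name.

```python
def filter_measurements(measurements: list, parameter: str) -> list:
    """Keep only entries whose parameter is named in the comma-separated string."""
    wanted = {p.strip() for p in parameter.split(",") if p.strip()}
    return [m for m in measurements if m["parameter"] in wanted]


reading = [
    {"parameter": "air_temperature_c", "value": 22.5, "unit": "celsius", "unit_original": "celsius"},
    {"parameter": "pm2_5_ug_per_m3", "value": 12.3, "unit": "µg/m³", "unit_original": "µg/m³"},
]

print(filter_measurements(reading, "air_temperature_c"))
```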
Provenance
Readings include provenance for traceability:
{
"provenance": {
"sensor_id": "uuid",
"service_id": "ndbc",
"source_id": "46108"
}
}
Common Canonical Parameters
| Parameter | Unit | Description |
|---|---|---|
| air_temperature_c | celsius | Air temperature |
| water_temperature_c | celsius | Water temperature |
| pm2_5_ug_per_m3 | µg/m³ | PM2.5 concentration |
| pm10_ug_per_m3 | µg/m³ | PM10 concentration |
| ozone_ppb | ppb | Ozone |
| aqi | dimensionless | Air Quality Index |
| wind_speed_ms | m/s | Wind speed |
| wind_direction_deg | degrees | Wind direction |
| air_pressure_hpa | hPa | Atmospheric pressure |
| relative_humidity_percent | percent | Relative humidity |
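The table above can be mirrored as a lookup so a client can check that a cleaned measurement carries the expected unit. This is a sketch: the mapping copies the table's values, while the CANONICAL_UNITS and has_expected_unit names are hypothetical.

```python
# Parameter-to-unit mapping copied from the table above
CANONICAL_UNITS = {
    "air_temperature_c": "celsius",
    "water_temperature_c": "celsius",
    "pm2_5_ug_per_m3": "µg/m³",
    "pm10_ug_per_m3": "µg/m³",
    "ozone_ppb": "ppb",
    "aqi": "dimensionless",
    "wind_speed_ms": "m/s",
    "wind_direction_deg": "degrees",
    "air_pressure_hpa": "hPa",
    "relative_humidity_percent": "percent",
}


def has_expected_unit(measurement: dict) -> bool:
    """Return True if the measurement's unit matches the table for its parameter."""
    expected = CANONICAL_UNITS.get(measurement["parameter"])
    return expected is not None and measurement["unit"] == expected


print(has_expected_unit({"parameter": "wind_speed_ms", "value": 4.2, "unit": "m/s"}))
```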