Data Normalization

Overview of how the API and MCP tools normalize sensor data for cross-service comparison and LLM-friendly queries.


Overview

Sensors from different providers (Purple Air, AirNow, NDBC) report data in different formats. The API normalizes data so that:

  • Temperature from Purple Air (Celsius) and AirNow (often Fahrenheit) is reported as air_temperature_c in celsius
  • PM2.5 from Purple Air and AirNow uses the same parameter, pm2_5_ug_per_m3, and the same unit, µg/m³
  • Timestamps are ISO 8601 UTC
  • Measurements use a canonical array format: [{parameter, value, unit, unit_original}]

This lets LLMs and other clients compare readings across sensors without extra parsing or unit conversion.
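
The conversion described above can be sketched as follows. This is a minimal illustration, not the API's actual implementation: the function names and the "fahrenheit" value for unit_original are assumptions made for the example.

```python
from datetime import datetime, timezone


def fahrenheit_to_celsius(value_f: float) -> float:
    """Convert a Fahrenheit temperature to Celsius."""
    return (value_f - 32.0) * 5.0 / 9.0


def normalize_temperature(value: float, unit_original: str) -> dict:
    """Return one canonical measurement entry for an air temperature reading.

    The unit_original strings accepted here are assumptions for this sketch.
    """
    if unit_original == "fahrenheit":
        value_c = fahrenheit_to_celsius(value)
    elif unit_original == "celsius":
        value_c = value
    else:
        raise ValueError(f"unsupported temperature unit: {unit_original}")
    return {
        "parameter": "air_temperature_c",
        "value": round(value_c, 2),
        "unit": "celsius",
        "unit_original": unit_original,
    }


def to_iso_utc(ts: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string."""
    return ts.astimezone(timezone.utc).isoformat()
```

With this sketch, a 72.5 °F reading and a 22.5 °C reading both produce an air_temperature_c entry with value 22.5, differing only in unit_original.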


Data Types (Sensors and Feeds)

Each sensor and feed has a data_types array listing the canonical parameter IDs it collects.

Examples:

| sensor_type | Typical data_types |
| --- | --- |
| Purple Air | pm2_5_ug_per_m3, pm10_ug_per_m3, air_temperature_c, relative_humidity_percent, air_pressure_hpa |
| NDBC buoy | wind_speed_ms, wind_direction_deg, air_temperature_c, water_temperature_c, air_pressure_hpa, wave_height_m |
| AirNow | pm2_5_ug_per_m3, ozone_ppb, pm10_ug_per_m3, aqi, air_temperature_c |

Filtering: Use ?data_types=air_temperature_c,pm2_5_ug_per_m3 to list only sensors that collect those parameters.
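
As a rough sketch, the query string for this filter can be built as below. The helper names are hypothetical. The local check assumes the filter matches sensors that collect all of the listed parameters, which the text above suggests but does not state outright.

```python
from urllib.parse import urlencode


def data_types_query(data_types: list[str]) -> str:
    """Build the data_types query string from canonical parameter IDs."""
    # safe="," keeps the separating commas unescaped in the output.
    return "?" + urlencode({"data_types": ",".join(data_types)}, safe=",")


def collects_all(sensor: dict, required: list[str]) -> bool:
    """Check locally whether a sensor's data_types covers every required ID.

    Assumes "match all" semantics; the server-side rule may differ.
    """
    return set(required) <= set(sensor.get("data_types", []))
```

collects_all is useful mainly as a client-side double check on a sensor object that has already been fetched.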


Raw vs. Clean Data

| Format | Use case | Source |
| --- | --- | --- |
| cleaned (default) | Cross-service comparison, LLM queries | clean_records (normalized measurements) |
| raw | Debugging, audit, original source | raw_records (original payload from upstream API) |

Select the format with data_format=cleaned or data_format=raw on:

  • GET /api/v1/sensors/:id/readings/latest
  • GET /api/v1/sensors/:id/readings
  • MCP sensor_cleaned_stream argument
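
A small helper for the two HTTP endpoints above might look like this. It is a sketch only: readings_path and VALID_FORMATS are hypothetical names, and the base URL is left out because the document does not give one.

```python
from urllib.parse import quote, urlencode

# The two formats named in the table above.
VALID_FORMATS = ("cleaned", "raw")


def readings_path(sensor_id: str, data_format: str = "cleaned", latest: bool = False) -> str:
    """Build the path and query string for a sensor's readings."""
    if data_format not in VALID_FORMATS:
        raise ValueError(f"data_format must be one of {VALID_FORMATS}")
    # Percent-encode the ID so it cannot alter the path structure.
    path = f"/api/v1/sensors/{quote(sensor_id, safe='')}/readings"
    if latest:
        path += "/latest"
    return f"{path}?{urlencode({'data_format': data_format})}"
```

Rejecting unknown values on the client side avoids a round trip for a request the server would presumably refuse or ignore.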

Canonical Measurement Format

Cleaned measurements follow:

[
  { "parameter": "air_temperature_c", "value": 22.5, "unit": "celsius", "unit_original": "celsius" },
  { "parameter": "pm2_5_ug_per_m3", "value": 12.3, "unit": "µg/m³", "unit_original": "µg/m³" }
]

Legacy responses may still use a flat object: {"air_temperature_c": 22.5, "pm2_5_ug_per_m3": 12.3}.
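
Because a client may encounter either shape, it can help to index both into the same structure. The function below is a hypothetical sketch. It assumes the canonical array always carries parameter and value keys, as in the example above.

```python
def measurements_by_parameter(measurements) -> dict:
    """Map parameter ID to value for the canonical array or the legacy flat object."""
    if isinstance(measurements, dict):
        # Legacy flat object: keys are already parameter IDs.
        return dict(measurements)
    # Canonical array: one entry per parameter.
    return {m["parameter"]: m["value"] for m in measurements}
```

Note that the legacy shape carries no unit or unit_original, so this lookup deliberately keeps only the values.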


Parameter Filter

For time-series readings, use ?parameter=air_temperature_c to return only that parameter in each reading. To request several parameters, separate them with commas: ?parameter=air_temperature_c,pm2_5_ug_per_m3.
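
The effect of the filter on a single reading's measurements can be pictured with the client-side sketch below. filter_measurements is a hypothetical helper that assumes the canonical array format; the server's own handling is not shown in this document.

```python
def filter_measurements(measurements: list[dict], parameter: str) -> list[dict]:
    """Keep only the entries whose parameter appears in a comma-separated list."""
    # Ignore empty items and stray whitespace around each ID.
    wanted = {p.strip() for p in parameter.split(",") if p.strip()}
    return [m for m in measurements if m["parameter"] in wanted]
```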


Provenance

Readings include provenance for traceability:

{
  "provenance": {
    "sensor_id": "uuid",
    "service_id": "ndbc",
    "source_id": "46108"
  }
}
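
One possible use of provenance is grouping readings by their upstream service before comparing them. The sketch below assumes each reading is a dict with a provenance object shaped like the example above. group_by_service is a hypothetical name.

```python
from collections import defaultdict


def group_by_service(readings: list[dict]) -> dict[str, list[dict]]:
    """Group readings by the service_id recorded in their provenance."""
    groups = defaultdict(list)
    for reading in readings:
        groups[reading["provenance"]["service_id"]].append(reading)
    return dict(groups)
```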

Common Canonical Parameters

| Parameter | Unit | Description |
| --- | --- | --- |
| air_temperature_c | celsius | Air temperature |
| water_temperature_c | celsius | Water temperature |
| pm2_5_ug_per_m3 | µg/m³ | PM2.5 concentration |
| pm10_ug_per_m3 | µg/m³ | PM10 concentration |
| ozone_ppb | ppb | Ozone |
| aqi | dimensionless | Air Quality Index |
| wind_speed_ms | m/s | Wind speed |
| wind_direction_deg | degrees | Wind direction |
| air_pressure_hpa | hPa | Atmospheric pressure |
| relative_humidity_percent | percent | Relative humidity |
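
The table above can be mirrored as a lookup for sanity-checking cleaned measurements on the client. This is a sketch with hypothetical names. Because the table lists only common parameters (wave_height_m, for example, is not in it), an unlisted parameter is reported as such rather than treated as an error.

```python
# Parameter ID -> unit, copied from the table of common canonical parameters.
CANONICAL_UNITS = {
    "air_temperature_c": "celsius",
    "water_temperature_c": "celsius",
    "pm2_5_ug_per_m3": "µg/m³",
    "pm10_ug_per_m3": "µg/m³",
    "ozone_ppb": "ppb",
    "aqi": "dimensionless",
    "wind_speed_ms": "m/s",
    "wind_direction_deg": "degrees",
    "air_pressure_hpa": "hPa",
    "relative_humidity_percent": "percent",
}


def check_unit(measurement: dict) -> str:
    """Return "ok", "mismatch", or "unlisted" for one canonical measurement."""
    expected = CANONICAL_UNITS.get(measurement.get("parameter"))
    if expected is None:
        return "unlisted"
    return "ok" if measurement.get("unit") == expected else "mismatch"
```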