Bassam Ismail
Engineering

When Health Data Freshness Became a Pipeline Problem

Health data freshness is a contract, not a button

I pressed Sync Now on the watch, then on the phone, and the dashboard still insisted my step count was an hour old. Every layer could claim it had done its job: the watch had data, the phone had synced, Home Assistant had state, and the dashboard had refreshed. None of that explained why the panel was wrong.

The fix was to stop treating Home Assistant as the health source of truth. I kept it for smart-home telemetry, moved health metrics onto a direct API path, and made every panel display the captured timestamp instead of pretending the render time was the data time.

TL;DR

Health data freshness is not a UI refresh problem. It is a contract across the device that produced the metric, the app that captured it, the API that stored it, and the widget that displayed it. The practical fix was to route health metrics directly from the health sync service to the application API, keep Home Assistant for non-health telemetry, and expose both captured_at and received_at so stale data could not masquerade as current.

The operational symptom was worse than the visible one. Nothing was down. Nothing threw. You are surrounded by successful logs and wrong answers, which is a special kind of office lighting.

The first mistake was asking, "How often does the dashboard update?" That mattered, but it was too late in the chain. A display cannot invent freshness. It can only reveal, cache, or obscure whatever timestamp discipline exists upstream.

The useful mental model became a five-part contract:

| Layer | Question it must answer | Failure smell |
| --- | --- | --- |
| Producer | When was the measurement taken? | Step count changes, timestamp does not |
| Relay | When did the phone receive it? | Watch and phone disagree |
| Ingest | When did the server accept it? | API has older data than device |
| Cache | What does the UI read from? | Manual sync appears ignored |
| Display | Which timestamp is shown? | "27m ago" after a fresh send |

Once I wrote it that way, Home Assistant stopped looking like a harmless middle layer for health data. It was useful infrastructure, just in the wrong job.

Freshness bugs are miserable for a specific reason: every component has a defensible excuse, and the actual problem lives in the handoffs between them.

[Figure: Old path. watch -> phone -> syncer -> ha -> glance. Every hop can cache or delay.]

The mechanism that made stale data look fresh

There were three clocks in play, and the UI was treating them as one.

The measurement clock

This is when the health value was actually observed. For steps, stand minutes, sleep, heart rate, or device battery, the only honest freshness timestamp is close to the device's sample time. If the watch counted 8,234 steps at 10:02, that timestamp matters more than when a dashboard later painted the number.

HealthKit and watch-to-phone propagation do not behave like a synchronous database write. Some metrics arrive quickly. Some are batched. Some require the phone to be the relay point before watch data can be forwarded. Pressing a manual sync button can start a send, but it cannot force every upstream HealthKit sample to have already crossed from watch to phone.

One confusing report made sense once I understood this: the phone showed 99 percent battery, the watch showed 68 percent, and the app still had no watch battery reading. The phone app could send phone state immediately, but watch battery required a separate watch-side capture or a relay that had actually received it. "No option to get watch battery?" was not a UI gap. It was a producer gap.

The ingest clock

This is when the API receives a payload. It answers a different question: did the pipeline move recently?

I wanted both clocks in storage because they detect different failures. If received_at is new and captured_at is old, sync is working but the producer is stale. If captured_at is new and the UI still shows old data, the cache or display layer is the suspect.
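That two-clock comparison can be written down as a small check. This is a minimal sketch, not the code from my pipeline: the function name `diagnoseSnapshot` and both threshold defaults are assumptions chosen for illustration.

```javascript
// Hypothetical helper: classify one stored snapshot by comparing its two clocks.
// The threshold defaults are illustrative assumptions, not tuned values.
function diagnoseSnapshot(snapshot, now = new Date(), opts = {}) {
  const { maxCaptureAgeMs = 15 * 60 * 1000, maxIngestAgeMs = 5 * 60 * 1000 } = opts;
  const captureAgeMs = now.getTime() - new Date(snapshot.captured_at).getTime();
  const ingestAgeMs = now.getTime() - new Date(snapshot.received_at).getTime();

  // Nothing has reached the server recently: the send path is the suspect.
  if (ingestAgeMs > maxIngestAgeMs) return "pipeline_stale";
  // Payloads keep arriving, but they carry old measurements.
  if (captureAgeMs > maxCaptureAgeMs) return "producer_stale";
  // Both clocks are recent; if the UI still looks old, suspect cache or display.
  return "fresh";
}
```

A row that returns "fresh" while the panel still shows an old value points at the cache or display layer, not at sync.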

The ingestion payload ended up shaped like this:

{
  "source": "phone",
  "metrics": {
    "steps": 8234,
    "stand_minutes": 42,
    "phone_battery_pct": 99,
    "watch_battery_pct": 68
  },
  "captured_at": "2026-06-23T10:02:11Z",
  "sent_at": "2026-06-23T10:02:14Z"
}

The server adds received_at. The client does not get to choose it. That constraint matters: a misbehaving or clock-skewed client can supply a bad captured_at, but it cannot backdate ingestion. Validating that captured_at is within a reasonable window of received_at is worth adding once you have seen what a clock-skewed device sends.

create table health_snapshot (
  id bigserial primary key,
  source text not null check (source in ('phone', 'watch')),
  metrics jsonb not null,
  captured_at timestamptz not null,
  sent_at timestamptz,
  received_at timestamptz not null default now()
);
 
create index health_snapshot_latest_idx
  on health_snapshot (captured_at desc, received_at desc);

This is boring schema work. That is usually a sign the shape is correct.

Two other edge cases are worth naming here. First, the "latest sample wins" model works well for instantaneous metrics like steps or battery, but breaks down for sleep, where multiple samples can describe overlapping intervals and the most recently received record is not necessarily the most complete one. Sleep reconstruction belongs in a separate query or a separate table. Second, duplicate snapshots are possible if the client retries a send. Adding a unique constraint on (source, captured_at) or a deduplication key in the insert prevents the ingest layer from inflating row counts without adding information.
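For the retry case, one possible shape is an insert that skips rows it has already seen. This sketch assumes a unique index on (source, captured_at) has been added to health_snapshot, which the schema above does not include. The helper name `buildSnapshotInsert` is hypothetical.

```javascript
// Builds a node-postgres query config for an idempotent snapshot insert.
// Assumption: a unique index exists on health_snapshot (source, captured_at);
// without it, Postgres rejects the "on conflict" target.
function buildSnapshotInsert({ source, metrics, captured_at, sent_at }) {
  return {
    text: `insert into health_snapshot (source, metrics, captured_at, sent_at)
           values ($1, $2::jsonb, $3::timestamptz, $4::timestamptz)
           on conflict (source, captured_at) do nothing`,
    // sent_at is optional in the payload, so store null when it is missing.
    values: [source, JSON.stringify(metrics), captured_at, sent_at ?? null],
  };
}
```

A pg Pool accepts this object directly, as in `await db.query(buildSnapshotInsert(req.body))`. A retried send then becomes a no-op instead of a second row.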

The display clock

This is when the dashboard refreshed. A 30-second dashboard refresh does not mean 30-second health freshness. It means the dashboard asked its source again. If the source is Home Assistant and Home Assistant is holding a two-hour-old state, the panel can refresh faithfully every 30 seconds and still be faithfully wrong.

A fast dashboard over a stale store is just a more responsive lie.

The person who built the store path does not enjoy that line, but it holds.

Why Home Assistant stayed, but not for health

Home Assistant was not the villain. It was doing the thing it is good at: representing home state, entities, sensors, automations, and integrations with a consistent state model. For smart-home panels in the dashboard, that was still useful.

Health metrics had different pressure:

| Requirement | Home Assistant path | Direct API path |
| --- | --- | --- |
| Latest sample wins | Possible, but indirect | Natural query model |
| Manual sync feedback | Hard to trace | Direct response |
| Multiple timestamps | Awkward entity state | First-class fields |
| Health history | Needs extra modeling | Normal table/query |
| Smart-home telemetry | Strong fit | Not the point |

The rejected option was supporting both paths for health indefinitely. That sounded flexible until I wrote down the failure modes. Two sources meant two answers to "what is latest?" It also meant every stale-data ticket would begin with source arbitration instead of diagnosis.

I kept a compatibility read for a short transition window, but the rule was plain: the dashboard health panels read the application API. Home Assistant remains for lights, sensors, and other smart-home state.

[Figure: New split. Columns: data, route. Labels: health, home, battery, api, ha.]

The API became the freshness boundary

The endpoint needed to answer one question without theater: what is the latest health snapshot the application should show?

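import express from "express";
import { Pool } from "pg";

const app = express();
const db = new Pool({ connectionString: process.env.DATABASE_URL });

app.use(express.json());

// Ingest: store one snapshot. received_at comes from the column default,
// so the client cannot choose it.
app.post("/api/health/snapshots", async (req, res) => {
  const { source, metrics, captured_at, sent_at } = req.body ?? {};

  // Reject payloads that would violate the table's NOT NULL columns.
  if (!source || !metrics || !captured_at) {
    return res
      .status(400)
      .json({ ok: false, error: "source, metrics, and captured_at are required" });
  }

  try {
    await db.query(
      `insert into health_snapshot (source, metrics, captured_at, sent_at)
       values ($1, $2::jsonb, $3::timestamptz, $4::timestamptz)`,
      [source, JSON.stringify(metrics), captured_at, sent_at ?? null]
    );
    res.status(202).json({ ok: true });
  } catch (err) {
    // Express 4 does not forward rejected promises from async handlers,
    // so database errors are handled here.
    console.error("snapshot insert failed", err);
    res.status(500).json({ ok: false });
  }
});

// Read: the most recently captured snapshot, or null when the table is empty.
app.get("/api/health/latest", async (_req, res) => {
  try {
    const result = await db.query(
      `select source, metrics, captured_at, sent_at, received_at
       from health_snapshot
       order by captured_at desc, received_at desc
       limit 1`
    );
    res.json(result.rows[0] ?? null);
  } catch (err) {
    console.error("latest snapshot query failed", err);
    res.status(500).json({ ok: false });
  }
});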

The corresponding dashboard widget stopped reading health from a Home Assistant entity and started calling the API directly.

- type: custom-api
  title: Health
  cache: 30s
  url: https://health.example.com/api/health/latest
  template: |
    {{ $age := durationSince (.JSON.String "captured_at") }}
    Steps: {{ .JSON.Int "metrics.steps" }}
    Stand: {{ .JSON.Int "metrics.stand_minutes" }}m
    Watch: {{ .JSON.Int "metrics.watch_battery_pct" }}%
    Updated: {{ $age }} ago

A 30-second cache here means the panel asks the API at most about twice a minute. It does not promise that HealthKit will produce new values every 30 seconds. That distinction went directly into the UI copy by showing captured_at age, not widget refresh age.
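If the age label were computed in application code instead of the widget template, it could look roughly like this. The function name `formatCapturedAge` and the label wording are assumptions. The point is only that the input is captured_at, never the time of the last render.

```javascript
// Hypothetical formatter: turns a captured_at timestamp into a coarse age label.
function formatCapturedAge(capturedAt, now = new Date()) {
  const ageMs = now.getTime() - new Date(capturedAt).getTime();
  if (Number.isNaN(ageMs)) return "unknown"; // unparseable timestamp
  if (ageMs < 0) return "in the future (check device clock)";

  const minutes = Math.floor(ageMs / 60000);
  if (minutes < 1) return "just now";
  if (minutes < 60) return `${minutes}m ago`;

  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h ago`;
  return `${Math.floor(hours / 24)}d ago`;
}
```

A label that keeps growing while the panel refreshes on schedule is the behavior I wanted: the refresh is working and the measurement is old.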

For validation, I wanted commands that separated the layers:

cd ~/health-panel

# Read what the API currently considers the latest snapshot.
curl -s https://health.example.com/api/health/latest | jq '{source, metrics, captured_at, received_at}'

# Send a test snapshot straight to the ingest endpoint, bypassing the app.
curl -s -X POST https://health.example.com/api/health/snapshots \
  -H 'content-type: application/json' \
  -d '{"source":"phone","metrics":{"steps":8234,"stand_minutes":42,"phone_battery_pct":99,"watch_battery_pct":68},"captured_at":"2026-06-23T10:02:11Z","sent_at":"2026-06-23T10:02:14Z"}'

If the POST returned 202 and the GET immediately showed the sample, the API path was clean. If the dashboard stayed old after its cache window, the problem was dashboard configuration or network access. If the API stayed old after pressing Sync Now, the problem was the app-to-API send or the device producer.
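The same triage can be expressed as a comparison of three captured_at values: the one just sent, the one the API returns, and the one the dashboard shows. This is a sketch of the reasoning, not a tool I ran. The function name, field names, and return labels are all hypothetical.

```javascript
// Hypothetical triage helper: compare captured_at as seen at three layers.
// Inputs are ISO 8601 strings; a missing value is treated as stale.
function locateStaleLayer({ expectedCapturedAt, apiCapturedAt, dashboardCapturedAt }) {
  const expected = Date.parse(expectedCapturedAt);
  const api = Date.parse(apiCapturedAt);
  const dashboard = Date.parse(dashboardCapturedAt);

  // The API never saw the sample: app-to-API send or device producer.
  if (Number.isNaN(api) || api < expected) return "app_send_or_producer";
  // The API has it but the panel does not: dashboard configuration or network.
  if (Number.isNaN(dashboard) || dashboard < api) return "dashboard_config_or_network";
  return "api_path_clean";
}
```

The dashboard comparison only means something after the cache window has elapsed. Before that, an older value on the panel is expected.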

Obvious in print. Not obvious while staring at a panel that said "1h ago" after two manual syncs and a fresh build install.

Deep-dive: The cache rule I used

I used short cache windows only where the upstream query was cheap and the user expected recency. For health, 30 seconds was acceptable because /api/health/latest returns one indexed row. For heavier panels such as recent transactions, I would keep the fetch button but avoid pretending background refresh is free.

- type: custom-api
  title: Recent items
  cache: 2m
  url: https://health.example.com/api/items/recent

The important bit is not the number. It is that the panel shows the data timestamp and the cache policy is chosen from the cost of the source, not from hope.

What this did not solve

The direct API path did not make Apple Watch data instant. It made staleness attributable.

Sharp edges remain. Watch-to-phone propagation can lag. Background delivery can be constrained by OS policy. Some watch-only values require explicit watch app support. Sleep data involves merge logic because multiple samples can describe overlapping intervals. A manual sync button sends whatever the app currently has; it cannot manufacture samples the device has not exposed yet.
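To make the sleep point concrete, here is a minimal interval merge. It assumes each sample is a plain `{ start, end }` pair of ISO timestamps, which is a simplification. Real sleep samples also carry stages and source devices that this sketch ignores.

```javascript
// Merge overlapping or touching sleep intervals into non-overlapping spans.
// Assumed sample shape: { start: ISO string, end: ISO string }.
function mergeSleepIntervals(samples) {
  const sorted = samples
    .map((s) => ({ start: Date.parse(s.start), end: Date.parse(s.end) }))
    .sort((a, b) => a.start - b.start);

  const merged = [];
  for (const cur of sorted) {
    const last = merged[merged.length - 1];
    if (last && cur.start <= last.end) {
      // Overlaps the previous span: extend it instead of adding a new one.
      last.end = Math.max(last.end, cur.end);
    } else {
      merged.push({ ...cur });
    }
  }

  return merged.map((m) => ({
    start: new Date(m.start).toISOString(),
    end: new Date(m.end).toISOString(),
  }));
}
```

Two samples covering 23:00 to 02:00 and 01:30 to 06:00 collapse into one span from 23:00 to 06:00. Neither record alone would give that answer, which is why "latest sample wins" is the wrong rule for sleep.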

The auth surface on the ingest endpoint also deserves attention before this goes anywhere public. A health payload POST that accepts client-supplied captured_at needs at minimum a shared secret or token, plus server-side validation that the timestamp is not absurdly far in the past or future. Without that, a misconfigured or replayed request can quietly poison the latest snapshot without any error in the logs.
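The timestamp half of that validation might look like the sketch below. The window sizes (24 hours back, 5 minutes ahead) are placeholder assumptions to tune against real device skew, and `validateCapturedAt` is a hypothetical name. Token checking is a separate concern and is not shown.

```javascript
// Hypothetical server-side check: is captured_at plausible relative to
// the moment the server received the payload?
function validateCapturedAt(capturedAt, receivedAt = new Date(), opts = {}) {
  // Placeholder windows; adjust after observing real clock skew.
  const { maxPastMs = 24 * 60 * 60 * 1000, maxFutureMs = 5 * 60 * 1000 } = opts;

  const captured = Date.parse(capturedAt);
  if (Number.isNaN(captured)) return { ok: false, reason: "unparseable" };

  const skewMs = captured - receivedAt.getTime(); // positive means "from the future"
  if (skewMs > maxFutureMs) return { ok: false, reason: "too_far_in_future" };
  if (-skewMs > maxPastMs) return { ok: false, reason: "too_far_in_past" };
  return { ok: true };
}
```

Rejecting with a reason, instead of silently clamping the timestamp, keeps the failure visible in the logs where a poisoned snapshot would otherwise hide.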

All of this is why the UI needs to show the age of the actual measurement. If steps are wrong, the panel should make it possible to see whether the wrongness came from a stale sample, a failed send, or a stale read. Attribution, not elimination, is the only honest goal for a system with this many independent moving parts.

I also would not push every health event individually without restraint. A snapshot model is easier to reason about for a dashboard. For analytics or sleep reconstruction, event history matters. Those are different reads and should not be squeezed through the same endpoint just because both contain the word health.

FAQ

Why is health data freshness unreliable in dashboards?

Because the dashboard is usually several hops away from the device that measured the data. Watch capture, phone relay, API ingest, cache policy, and widget refresh can each be working while the final value is still old.

Where should I capture the freshness timestamp?

Capture captured_at on the device or app that observed the metric, then add received_at on the server. Display the age of captured_at, and use received_at for pipeline diagnostics.

Should Home Assistant store Apple Watch health data?

It can, but I would not make it the source of truth for a health dashboard that cares about freshness and history. Home Assistant remains a good fit for smart-home telemetry, while health metrics are cleaner through a direct API.

How often should the dashboard refresh health panels?

A 30-second dashboard cache is a reasonable polling interval for a cheap latest-snapshot endpoint. It only controls how often the dashboard asks, not how often HealthKit or the watch produce new samples.

Why did Sync Now still show old health data?

Sync Now can send the latest data available to the app at that moment. If the watch has not propagated a sample to the phone, or the watch app has not captured a watch-only metric, the send can succeed while the displayed measurement remains old.