Tunas Akara
Back to Blog

Grouping Multi-Timezone Data in PostgreSQL Without Getting It Wrong

by Rayhan2026-06-2411 min read
postgresqltimezonedashboarddata engineeringdstanalytics

Grouping Multi-Timezone Data in PostgreSQL Without Getting It Wrong

I build dashboards that aggregate events from places that do not share a clock: a venue in Adelaide, a site in Kathmandu, a server in New York, an office in Jakarta. The dashboard asks a simple-looking question — "how many events per day, per city?" — and the tricky word is day.

A "day" is not a property of an instant. It is a property of an instant as seen from a timezone. The same instant is Tuesday in one city and Wednesday in another. So before you write GROUP BY ... day, you have to answer: whose day?

This post is the design I use in production, with the SQL that proves it. Every result table below was produced on PostgreSQL 17 — copy the queries and reproduce them yourself. The same trap shows up whether you group by local day or by local hour, and both are covered here.

The one rule: a timestamp is an instant, not a calendar day

Store the moment something happened as timestamptz. Despite the name, timestamptz does not store a timezone — it stores an absolute instant (internally UTC) and renders it in whatever zone you ask for. That is exactly what you want for ordering, durations, and "what happened before what."

The local calendar day is a projection of that instant through a specific zone. The projection is where dashboards go wrong, because the obvious shortcuts for computing it are subtly incorrect.

The shortcuts that look right and aren't

Here is a small events table. Every row knows the instant (occurred_at) and the IANA zone where it happened (event_tz):

CREATE TABLE events (
  id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  city        text NOT NULL,
  event_tz    text NOT NULL,          -- IANA zone where it happened
  occurred_at timestamptz NOT NULL,   -- source of truth (an instant)
  local_date  date NOT NULL,          -- resolved ONCE at write time
  local_time  time NOT NULL
);

I insert by stating the local wall-clock time plus the zone — which is how an app actually knows the event — and let Postgres derive the UTC instant:

INSERT INTO events (city, event_tz, occurred_at, local_date, local_time)
SELECT city, tz,
       (wall AT TIME ZONE tz),   -- local wall clock -> UTC instant
       wall::date,               -- stored local_date (frozen, correct)
       wall::time
FROM (VALUES
  ('Adelaide','Australia/Adelaide', timestamp '2026-12-15 00:15'),  -- DST, +10:30
  ('Adelaide','Australia/Adelaide', timestamp '2026-12-15 23:50'),
  ('Adelaide','Australia/Adelaide', timestamp '2026-12-16 00:30'),
  ('Adelaide','Australia/Adelaide', timestamp '2026-06-15 00:15'),  -- standard, +9:30
  ('Kathmandu','Asia/Kathmandu',    timestamp '2026-12-15 00:20'),  -- +5:45
  ('New York','America/New_York',   timestamp '2026-07-04 23:30'),
  ('New York','America/New_York',   timestamp '2026-11-10 00:30'),
  ('Jakarta','Asia/Jakarta',        timestamp '2026-12-15 00:10')
) v(city, tz, wall);

Here's the trap. To bucket Adelaide events into local days, you might reach for one of these:

  • (a) the UTC dateoccurred_at::date
  • (b) a hardcoded numeric offsetoccurred_at AT TIME ZONE interval '9:30' (Adelaide is "UTC+9:30," right?)
  • (c) a timezone abbreviationAT TIME ZONE 'ACST' / 'EST'

Let's compute the day each strategy assigns and compare it to the local_date we stored at write time:

SELECT id,
  to_char(occurred_at,'YYYY-MM-DD HH24:MI')               AS occurred_utc,
  local_date                                              AS stored_local_date,
  (occurred_at AT TIME ZONE 'Australia/Adelaide')::date   AS iana_correct,
  (occurred_at AT TIME ZONE interval '9:30')::date        AS fixed_plus_0930,
  (occurred_at AT TIME ZONE 'UTC')::date                  AS bucket_utc,
  (occurred_at AT TIME ZONE interval '10:00')::date       AS whole_hour_plus10
FROM events WHERE city = 'Adelaide' ORDER BY id;
occurred_utcstored_local_dateiana_correctfixed_plus_0930bucket_utcwhole_hour_plus10
2026-12-14 13:452026-12-152026-12-15❌ 2026-12-14❌ 2026-12-14❌ 2026-12-14
2026-12-15 13:202026-12-152026-12-152026-12-152026-12-152026-12-15
2026-12-15 14:002026-12-162026-12-16❌ 2026-12-15❌ 2026-12-152026-12-16
2026-06-14 14:452026-06-152026-06-152026-06-15❌ 2026-06-142026-06-15

Read row 1 against row 4. The hardcoded +9:30 offset is correct in June and wrong in December for the same city. The UTC bucket is wrong whenever the local time is near midnight. Only AT TIME ZONE 'Australia/Adelaide' (the IANA name) and the stored local_date agree on every row.

Two separate things go wrong here. Let's take them one at a time.

Half an hour is enough to land on the wrong day

Adelaide is not UTC+10. In standard time it is +9:30, and during daylight saving it is +10:30. That extra half hour is not cosmetic. An event at 00:15 local time is only fifteen minutes into the day — push it by thirty minutes and it falls into yesterday.

That is exactly why whole_hour_plus10 got row 1 wrong above: rounding the offset to a whole hour moved a just-after-midnight event back across the date line.

Kathmandu makes the point unavoidable. It runs at +5:45 — a forty-five-minute offset, with no DST to blame:

WITH k(label, occurred_at) AS (VALUES
  ('local 2026-12-15 00:20', timestamp '2026-12-15 00:20' AT TIME ZONE 'Asia/Kathmandu'),
  ('local 2026-12-15 23:50', timestamp '2026-12-15 23:50' AT TIME ZONE 'Asia/Kathmandu')
)
SELECT label,
  to_char(occurred_at,'YYYY-MM-DD HH24:MI')         AS utc,
  (occurred_at AT TIME ZONE interval '5:00')::date  AS round_down_p5,
  (occurred_at AT TIME ZONE interval '6:00')::date  AS round_up_p6,
  (occurred_at AT TIME ZONE 'Asia/Kathmandu')::date AS iana_correct
FROM k;
labelutcround_down_p5round_up_p6iana_correct
local 2026-12-15 00:202026-12-14 18:35❌ 2026-12-142026-12-152026-12-15
local 2026-12-15 23:502026-12-15 18:052026-12-15❌ 2026-12-162026-12-15

Rounding down to +5:00 breaks the first row. Rounding up to +6:00 breaks the second. There is no whole-hour offset that buckets a +5:45 zone correctly, because the real offset lives between the hours. Any code that reduces a zone to an integer number of hours is wrong for roughly a sixth of the world's population.

IANA offsets can sit on any fifteen-minute boundary — Nepal at +5:45, the Chatham Islands at +12:45 — which is why Tinybird's database timestamp guide lands on the same number from the other direction: fifteen minutes is the coarsest bucket you can safely pre-aggregate before some zone needs a per-zone fixup at query time. If your rollups are hourly, half-hour and 45-minute zones are already silently misfiled.

DST makes any fixed number wrong half the year

Even if you store the exact half-hour offset, daylight saving moves it twice a year. So "the city's offset" is not a constant you can hardcode at all.

Watch it surface as the dashboard symptom — orders per local day for Adelaide — computed four ways:

-- (A) CORRECT: the stored local_date
SELECT local_date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
-- (B) WRONG: hardcoded +9:30 offset
SELECT (occurred_at AT TIME ZONE interval '9:30')::date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
-- (C) WRONG: bucket by UTC date
SELECT (occurred_at AT TIME ZONE 'UTC')::date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
-- (D) CORRECT: IANA zone name
SELECT (occurred_at AT TIME ZONE 'Australia/Adelaide')::date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
day(A) stored(B) fixed +9:30(C) UTC(D) IANA
2026-06-141
2026-06-15111
2026-12-1411
2026-12-152222
2026-12-1611

Look at what the hardcoded offset (B) does to the chart: it invents a December 14 that had zero local orders and deletes December 16 entirely. The UTC bucketing (C) is even further off. An operator comparing yesterday to today would be reading numbers attributed to the wrong days — during exactly the weeks (DST transitions) when on-call attention is highest.

The abbreviation route fails the same way, because abbreviations like EST are fixed offsets that ignore DST:

WITH n(label, occurred_at) AS (VALUES
  ('summer 2026-07-05 00:30 EDT', timestamp '2026-07-05 00:30' AT TIME ZONE 'America/New_York'),
  ('winter 2026-11-10 00:30 EST', timestamp '2026-11-10 00:30' AT TIME ZONE 'America/New_York')
)
SELECT label,
  to_char(occurred_at,'YYYY-MM-DD HH24:MI')           AS utc,
  (occurred_at AT TIME ZONE 'EST')::date              AS abbrev_est_fixed,
  (occurred_at AT TIME ZONE 'America/New_York')::date AS iana_correct
FROM n;
labelutcabbrev_est_fixediana_correct
summer 2026-07-05 00:30 EDT2026-07-05 04:30❌ 2026-07-042026-07-05
winter 2026-11-10 00:30 EST2026-11-10 05:302026-11-102026-11-10

In summer, New York runs on EDT (-4), but 'EST' is frozen at -5, so it pushes a just-past-midnight event back to the previous day. The lesson: never feed an abbreviation or a number to AT TIME ZONE. Feed it an IANA name.

The clock-hour is wrong for the same reasons — plus two more

Everything above was about the calendar day. Hourly views — "busiest hour," "events per hour of day," shift reports — inherit every one of those bugs and add two that only exist at the hour level.

A sub-hour offset splits one local hour into two

Group by the UTC hour (or any whole-hour offset) and a +5:45 or +9:30 zone never lines up with the bucket edges. Here are four Kathmandu events that all happened inside the local 18:00 hour:

WITH e(occurred_at) AS (VALUES
  (timestamp '2026-12-15 18:05' AT TIME ZONE 'Asia/Kathmandu'),
  (timestamp '2026-12-15 18:20' AT TIME ZONE 'Asia/Kathmandu'),
  (timestamp '2026-12-15 18:35' AT TIME ZONE 'Asia/Kathmandu'),
  (timestamp '2026-12-15 18:50' AT TIME ZONE 'Asia/Kathmandu')
)
SELECT extract(hour from occurred_at)::int                               AS utc_hour,
       extract(hour from occurred_at AT TIME ZONE 'Asia/Kathmandu')::int AS local_hour,
       count(*)
FROM e GROUP BY 1,2 ORDER BY 1;
utc_hourlocal_hourcount
12183
13181

One real peak hour becomes two shorter bars, at UTC 12 and 13. A "busiest hour" query reads the wrong answer — and a 30-minute zone like Adelaide does the same thing, because every local hour straddles two UTC hours.

DST fall-back: one local hour holds two hours of events

When clocks fall back, a local hour happens twice. Adelaide ends DST on 2026-04-05 local — the instant is 2026-04-04 16:30 UTC, where the clock runs 03:00 → 02:00. Drop an event every fifteen minutes across the transition and bucket by local hour:

WITH inst AS (
  SELECT g AS occurred_at
  FROM generate_series(timestamptz '2026-04-04 14:30+00',
                       timestamptz '2026-04-04 18:15+00', interval '15 min') g
)
SELECT extract(hour from occurred_at AT TIME ZONE 'Australia/Adelaide')::int AS local_hour,
       count(*) AS events
FROM inst GROUP BY 1 ORDER BY 1;
local_hourevents
14
28
34

Local hour 02 holds twice the events of its neighbours — two real hours collapsed into one bucket. The spike is genuine, but it appears in your chart every autumn and reads like a traffic surge. Only the timestamptz can tell the two 02:xx hours apart.

DST spring-forward: a local hour that cannot exist

The opposite transition deletes an hour. When Adelaide starts DST, the clock jumps 01:59 → 03:00, so local 02:00–02:59 never happens:

SELECT to_char(g,'HH24:MI') AS utc,
       to_char(g AT TIME ZONE 'Australia/Adelaide','HH24:MI') AS local
FROM generate_series(timestamptz '2026-10-03 16:00+00',
                     timestamptz '2026-10-03 17:00+00', interval '15 min') g;
 utc   | local
 16:00 | 01:30
 16:15 | 01:45
 16:30 | 03:00   <- 02:xx is skipped entirely
 16:45 | 03:15
 17:00 | 03:30

No event can carry local hour 02 that day. And a system that keeps only a local string can store the impossible 02:30 — Postgres silently springs it forward:

SELECT (timestamp '2026-10-04 02:30' AT TIME ZONE 'Australia/Adelaide');
-- 2026-10-03 17:00:00+00  -> that instant is 03:30 local: the event moved an hour

Hourly reporting needs the same fix as daily, with the instant doing extra work: resolve and store the local hour (it lives in local_time) at write time for the histogram, and keep the timestamptz so the duplicated and missing hours stay distinguishable.

The correct query-time way — and why I still don't lean on it

So the "right" query-time answer is to convert each row through its own IANA zone: occurred_at AT TIME ZONE event_tz. That is genuinely correct, and it is the baseline every other approach is measured against above.

A claim you'll see repeated is that this expression can't be indexed. It can — and the reason is worth understanding, because it comes down to the volatility of the underlying timezone() function:

SELECT pg_get_function_identity_arguments(oid) AS signature,
       pg_get_function_result(oid)             AS returns,
       provolatile
FROM pg_proc WHERE proname = 'timezone' ORDER BY provolatile, 1;
signaturereturnsvolatility
text, timestamp with time zonetimestamp without time zoneIMMUTABLE
interval, timestamp with time zonetimestamp without time zoneIMMUTABLE
timestamp with time zonetimestamp without time zoneSTABLE
time with time zonetime with time zoneSTABLE

The form called by occurred_at AT TIME ZONE event_tz is timezone(text, timestamptz) — marked IMMUTABLE. The STABLE variants are the ones that read the session's TimeZone setting instead of taking the zone as an argument. Because the explicit-zone form is immutable, you can build an expression index over it — even with a per-row zone column:

CREATE INDEX idx_expr_localdate
  ON events (((occurred_at AT TIME ZONE event_tz)::date));   -- succeeds

So why don't I rely on query-time conversion as the primary design? Three reasons, in increasing order of importance:

  1. It's easy to get wrong in a hurry. The correct expression and the three broken ones from the top of this post look almost identical in a code review. The next person to touch the query has four plausible ways to write it and only one is right.
  2. You must store the per-row zone anyway. The conversion is only correct if every row carries its true IANA zone. Once that column exists, computing the local day once costs nothing extra.
  3. IMMUTABLE here isn't strictly true. Timezone rules genuinely change — governments add, drop, and shift DST. Postgres marks the function immutable so it can be indexed, but a tzdata update can leave an expression index (or a STORED generated column) reflecting the old rules until you REINDEX. A value resolved and stored at write time, by contrast, freezes the civil date the user actually experienced — which is usually the answer you want.

What I actually store: the instant plus resolved local fields

The design is boring on purpose. Keep timestamptz as the source of truth, store the IANA zone, and resolve the local date and time once, at write time, when the originating zone is unambiguous:

-- written by the ingestion path, exactly once per event:
occurred_at  timestamptz   -- the instant; for ordering & duration math
event_tz     text          -- IANA zone, e.g. 'Australia/Adelaide'
local_date   date          -- (occurred_at AT TIME ZONE event_tz)::date
local_time   time          -- the wall-clock time the user saw

Why date and time, not a string? These types store no timezone. A date is just a calendar date and a time is a wall-clock time, so they keep exactly the value you write — in any session timezone (only timestamptz ever shifts). A string works too, especially ISO 2026-12-15, which sorts correctly. But a typed column is stronger: it rejects invalid dates, supports range scans and date math (day + 7, BETWEEN, weekday extraction), and is smaller and indexes cleanly. The timezone danger only ever lived in timestamptz and its conversions — never in date or time.

If you'd rather the database own the derivation than trust every code path, PostgreSQL 17 will materialize it for you with a generated column — and it works with a per-row zone column:

ALTER TABLE events
  ADD COLUMN gen_local_date date
  GENERATED ALWAYS AS ((occurred_at AT TIME ZONE event_tz)::date) STORED;
-- gen_local_date matched the app-written local_date on every row.

There's a subtlety worth flagging, because you'll find blog posts that say Postgres won't let you do this. They're describing the opposite direction: generating a timestamptz (a timestamp → timestamptz conversion leans on the session's TimeZone and is STABLE, so Postgres refuses it). Our column goes timestamptz → date through an explicit zone, which calls the IMMUTABLE overload, so it's allowed. The same tzdata caveat still applies: a STORED generated column is computed at write time and not retroactively recomputed when the rules change. For a "what day did the user see" column, that is a feature, not a bug.

Now grouping is simple and readable, and the practical payoff is that it uses an ordinary B-tree index. I loaded 500,000 events across four zones, indexed local_date, and compared it to the query-time conversion with no matching index:

CREATE INDEX idx_local_date ON ev_big (local_date);

-- (A) stored column
EXPLAIN ANALYZE
SELECT count(*) FROM ev_big WHERE local_date = DATE '2026-07-15';

-- (B) query-time conversion, no matching index
EXPLAIN ANALYZE
SELECT count(*) FROM ev_big WHERE (occurred_at AT TIME ZONE event_tz)::date = DATE '2026-07-15';
(A)  Index Only Scan using idx_local_date
       Index Cond: (local_date = '2026-07-15'::date)
       Buffers: shared hit=20 read=4
     Execution Time: 0.188 ms

(B)  Seq Scan on ev_big
       Filter: (((occurred_at AT TIME ZONE event_tz))::date = '2026-07-15'::date)
       Rows Removed by Filter: 498628
       Buffers: shared hit=4425
     Execution Time: 106.248 ms

Same answer, ~560× apart. To be fair to the query-time approach: adding the expression index from the previous section brings (B) back down to 0.169 ms. But that re-introduces the immutability caveat, and it is one more index a teammate has to know exists. The stored column is the version a new engineer will use by default.

The caveats you must respect

Storing the local fields is a denormalization, so treat it with the usual discipline:

  • timestamptz stays the source of truth. Order, diff, and do all "real time" math on the instant. Never sort across cities by local_date / local_time00:30 in Jakarta and 00:30 in New York are nowhere near each other in real time. The local fields are for grouping and display, not chronology.
  • Write it through one path. A single ingestion function (or the generated column) should own the derivation. If two code paths compute it differently, you get a denormalization bug that no GROUP BY will reveal.
  • The DST "fall-back" hour is genuinely ambiguous, and it's the reason you keep the instant. When Adelaide ends DST on 2026-04-05, the wall clock goes 03:00 → 02:00, so local 02:30 happens twice:
SELECT
  (timestamp '2026-04-05 02:30' AT TIME ZONE 'Australia/Adelaide') AS what_pg_picks,
  (timestamptz '2026-04-05 02:30+10:30')                           AS if_still_dst,
  (timestamptz '2026-04-05 02:30+09:30')                           AS if_standard;
 what_pg_picks          | if_still_dst           | if_standard
 2026-04-04 17:00:00+00  | 2026-04-04 16:00:00+00  | 2026-04-04 17:00:00+00

The two interpretations are an hour apart, and a bare local string cannot tell them apart — only the timestamptz can. (Postgres resolves the ambiguous literal to standard time here, but you should not depend on which one it picks; capture the instant at the source.)

Takeaways

  • A "day" — and an "hour" — belongs to a zone, not to an instant. Decide whose day before you GROUP BY.
  • For bucketing, never use the UTC date/hour, a hardcoded numeric offset, or a timezone abbreviation. They break on half-hour/45-minute zones and on every DST transition.
  • Hourly views have two extra traps: a sub-hour offset splits one local hour across two UTC buckets, and DST gives one day a doubled hour and another a missing one. The timestamptz is what keeps those hours distinguishable.
  • Query-time occurred_at AT TIME ZONE '<IANA>' with a per-row zone is correct, and — contrary to a common belief — it is indexable.
  • I still resolve and store local_date / local_time at write time: correct by construction, immune to later rule changes, sargable with a plain B-tree, and impossible for the next engineer to get subtly wrong.
  • Keep the timestamptz and the IANA zone as the source of truth; the local fields are a fast, honest projection for the dashboard.

How this lines up with the documentation

These references corroborate the approach and go deeper on the parts a single post can't:

  • PostgreSQL — Date/Time Types is the primary source for the abbreviation trap: EST "specifies UTC-5, regardless of whether daylight savings was nominally in effect," and constant numeric offsets can't follow DST. That's exactly the failure in the New York table.
  • Tinybird — 10 best practices for timestamps and time zones recommends storing UTC plus the offset, the IANA zone name, and the local timestamp as a string — the same instinct as keeping event_tz + local_date beside occurred_at. It's also where the fifteen-minute pre-aggregation limit comes from.
  • Crunchy Data — British Columbia, Time Zones, and Postgres is the real-world version of the tzdata caveat: when a region changes its rules, values derived through AT TIME ZONE shift, and the fix is a deliberate recompute of the stored local columns — not silently trusting an index.
  • Cybertec — Time zone management in PostgreSQL is a good companion on how timestamptz and the session TimeZone interact.
  • The Postgres mailing-list thread "shouldn't timezone(text, timestamp[tz]) be STABLE?" is where the "IMMUTABLE is a pragmatic fib" claim comes from — the function's volatility was questioned precisely because timezone rules change, yet the immutable marking stays for indexability.

It's the same idea behind keeping each source system's original reference visible when you build dashboards that connect different systems: store the precise truth, and keep the readable projection next to it, so queries don't have to recompute it.