Grouping Multi-Timezone Data in PostgreSQL Without Getting It Wrong
Grouping Multi-Timezone Data in PostgreSQL Without Getting It Wrong
I build dashboards that aggregate events from places that do not share a clock: a venue in Adelaide, a site in Kathmandu, a server in New York, an office in Jakarta. The dashboard asks a simple-looking question — "how many events per day, per city?" — and the tricky word is day.
A "day" is not a property of an instant. It is a property of an instant as seen
from a timezone. The same instant is Tuesday in one city and Wednesday in
another. So before you write GROUP BY ... day, you have to answer: whose day?
This post is the design I use in production, with the SQL that proves it. Every result table below was produced on PostgreSQL 17 — copy the queries and reproduce them yourself. The same trap shows up whether you group by local day or by local hour, and both are covered here.
The one rule: a timestamp is an instant, not a calendar day
Store the moment something happened as timestamptz. Despite the name,
timestamptz does not store a timezone — it stores an absolute instant
(internally UTC) and renders it in whatever zone you ask for. That is exactly
what you want for ordering, durations, and "what happened before what."
The local calendar day is a projection of that instant through a specific zone. The projection is where dashboards go wrong, because the obvious shortcuts for computing it are subtly incorrect.
The shortcuts that look right and aren't
Here is a small events table. Every row knows the instant (occurred_at) and the
IANA zone where it happened (event_tz):
CREATE TABLE events (
id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
city text NOT NULL,
event_tz text NOT NULL, -- IANA zone where it happened
occurred_at timestamptz NOT NULL, -- source of truth (an instant)
local_date date NOT NULL, -- resolved ONCE at write time
local_time time NOT NULL
);
I insert by stating the local wall-clock time plus the zone — which is how an app actually knows the event — and let Postgres derive the UTC instant:
INSERT INTO events (city, event_tz, occurred_at, local_date, local_time)
SELECT city, tz,
(wall AT TIME ZONE tz), -- local wall clock -> UTC instant
wall::date, -- stored local_date (frozen, correct)
wall::time
FROM (VALUES
('Adelaide','Australia/Adelaide', timestamp '2026-12-15 00:15'), -- DST, +10:30
('Adelaide','Australia/Adelaide', timestamp '2026-12-15 23:50'),
('Adelaide','Australia/Adelaide', timestamp '2026-12-16 00:30'),
('Adelaide','Australia/Adelaide', timestamp '2026-06-15 00:15'), -- standard, +9:30
('Kathmandu','Asia/Kathmandu', timestamp '2026-12-15 00:20'), -- +5:45
('New York','America/New_York', timestamp '2026-07-04 23:30'),
('New York','America/New_York', timestamp '2026-11-10 00:30'),
('Jakarta','Asia/Jakarta', timestamp '2026-12-15 00:10')
) v(city, tz, wall);
Here's the trap. To bucket Adelaide events into local days, you might reach for one of these:
- (a) the UTC date —
occurred_at::date - (b) a hardcoded numeric offset —
occurred_at AT TIME ZONE interval '9:30'(Adelaide is "UTC+9:30," right?) - (c) a timezone abbreviation —
AT TIME ZONE 'ACST'/'EST'
Let's compute the day each strategy assigns and compare it to the local_date we
stored at write time:
SELECT id,
to_char(occurred_at,'YYYY-MM-DD HH24:MI') AS occurred_utc,
local_date AS stored_local_date,
(occurred_at AT TIME ZONE 'Australia/Adelaide')::date AS iana_correct,
(occurred_at AT TIME ZONE interval '9:30')::date AS fixed_plus_0930,
(occurred_at AT TIME ZONE 'UTC')::date AS bucket_utc,
(occurred_at AT TIME ZONE interval '10:00')::date AS whole_hour_plus10
FROM events WHERE city = 'Adelaide' ORDER BY id;
| occurred_utc | stored_local_date | iana_correct | fixed_plus_0930 | bucket_utc | whole_hour_plus10 |
|---|---|---|---|---|---|
| 2026-12-14 13:45 | 2026-12-15 | 2026-12-15 | ❌ 2026-12-14 | ❌ 2026-12-14 | ❌ 2026-12-14 |
| 2026-12-15 13:20 | 2026-12-15 | 2026-12-15 | 2026-12-15 | 2026-12-15 | 2026-12-15 |
| 2026-12-15 14:00 | 2026-12-16 | 2026-12-16 | ❌ 2026-12-15 | ❌ 2026-12-15 | 2026-12-16 |
| 2026-06-14 14:45 | 2026-06-15 | 2026-06-15 | 2026-06-15 | ❌ 2026-06-14 | 2026-06-15 |
Read row 1 against row 4. The hardcoded +9:30 offset is correct in June and
wrong in December for the same city. The UTC bucket is wrong whenever the
local time is near midnight. Only AT TIME ZONE 'Australia/Adelaide' (the IANA
name) and the stored local_date agree on every row.
Two separate things go wrong here. Let's take them one at a time.
Half an hour is enough to land on the wrong day
Adelaide is not UTC+10. In standard time it is +9:30, and during daylight
saving it is +10:30. That extra half hour is not cosmetic. An event at
00:15 local time is only fifteen minutes into the day — push it by thirty
minutes and it falls into yesterday.
That is exactly why whole_hour_plus10 got row 1 wrong above: rounding the offset
to a whole hour moved a just-after-midnight event back across the date line.
Kathmandu makes the point unavoidable. It runs at +5:45 — a forty-five-minute offset, with no DST to blame:
WITH k(label, occurred_at) AS (VALUES
('local 2026-12-15 00:20', timestamp '2026-12-15 00:20' AT TIME ZONE 'Asia/Kathmandu'),
('local 2026-12-15 23:50', timestamp '2026-12-15 23:50' AT TIME ZONE 'Asia/Kathmandu')
)
SELECT label,
to_char(occurred_at,'YYYY-MM-DD HH24:MI') AS utc,
(occurred_at AT TIME ZONE interval '5:00')::date AS round_down_p5,
(occurred_at AT TIME ZONE interval '6:00')::date AS round_up_p6,
(occurred_at AT TIME ZONE 'Asia/Kathmandu')::date AS iana_correct
FROM k;
| label | utc | round_down_p5 | round_up_p6 | iana_correct |
|---|---|---|---|---|
| local 2026-12-15 00:20 | 2026-12-14 18:35 | ❌ 2026-12-14 | 2026-12-15 | 2026-12-15 |
| local 2026-12-15 23:50 | 2026-12-15 18:05 | 2026-12-15 | ❌ 2026-12-16 | 2026-12-15 |
Rounding down to +5:00 breaks the first row. Rounding up to +6:00
breaks the second. There is no whole-hour offset that buckets a +5:45 zone
correctly, because the real offset lives between the hours. Any code that
reduces a zone to an integer number of hours is wrong for roughly a sixth of the
world's population.
IANA offsets can sit on any fifteen-minute boundary — Nepal at +5:45, the
Chatham Islands at +12:45 — which is why Tinybird's database timestamp guide
lands on the same number from the other direction: fifteen minutes is the
coarsest bucket you can safely pre-aggregate before some zone needs a per-zone
fixup at query time. If your rollups are hourly, half-hour and 45-minute zones
are already silently misfiled.
DST makes any fixed number wrong half the year
Even if you store the exact half-hour offset, daylight saving moves it twice a year. So "the city's offset" is not a constant you can hardcode at all.
Watch it surface as the dashboard symptom — orders per local day for Adelaide — computed four ways:
-- (A) CORRECT: the stored local_date
SELECT local_date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
-- (B) WRONG: hardcoded +9:30 offset
SELECT (occurred_at AT TIME ZONE interval '9:30')::date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
-- (C) WRONG: bucket by UTC date
SELECT (occurred_at AT TIME ZONE 'UTC')::date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
-- (D) CORRECT: IANA zone name
SELECT (occurred_at AT TIME ZONE 'Australia/Adelaide')::date AS day, count(*) FROM events WHERE city='Adelaide' GROUP BY 1 ORDER BY 1;
| day | (A) stored | (B) fixed +9:30 | (C) UTC | (D) IANA |
|---|---|---|---|---|
| 2026-06-14 | – | – | 1 | – |
| 2026-06-15 | 1 | 1 | – | 1 |
| 2026-12-14 | – | 1 | 1 | – |
| 2026-12-15 | 2 | 2 | 2 | 2 |
| 2026-12-16 | 1 | – | – | 1 |
Look at what the hardcoded offset (B) does to the chart: it invents a December 14 that had zero local orders and deletes December 16 entirely. The UTC bucketing (C) is even further off. An operator comparing yesterday to today would be reading numbers attributed to the wrong days — during exactly the weeks (DST transitions) when on-call attention is highest.
The abbreviation route fails the same way, because abbreviations like EST are
fixed offsets that ignore DST:
WITH n(label, occurred_at) AS (VALUES
('summer 2026-07-05 00:30 EDT', timestamp '2026-07-05 00:30' AT TIME ZONE 'America/New_York'),
('winter 2026-11-10 00:30 EST', timestamp '2026-11-10 00:30' AT TIME ZONE 'America/New_York')
)
SELECT label,
to_char(occurred_at,'YYYY-MM-DD HH24:MI') AS utc,
(occurred_at AT TIME ZONE 'EST')::date AS abbrev_est_fixed,
(occurred_at AT TIME ZONE 'America/New_York')::date AS iana_correct
FROM n;
| label | utc | abbrev_est_fixed | iana_correct |
|---|---|---|---|
| summer 2026-07-05 00:30 EDT | 2026-07-05 04:30 | ❌ 2026-07-04 | 2026-07-05 |
| winter 2026-11-10 00:30 EST | 2026-11-10 05:30 | 2026-11-10 | 2026-11-10 |
In summer, New York runs on EDT (-4), but 'EST' is frozen at -5, so it
pushes a just-past-midnight event back to the previous day. The lesson: never
feed an abbreviation or a number to AT TIME ZONE. Feed it an IANA name.
The clock-hour is wrong for the same reasons — plus two more
Everything above was about the calendar day. Hourly views — "busiest hour," "events per hour of day," shift reports — inherit every one of those bugs and add two that only exist at the hour level.
A sub-hour offset splits one local hour into two
Group by the UTC hour (or any whole-hour offset) and a +5:45 or +9:30 zone
never lines up with the bucket edges. Here are four Kathmandu events that all
happened inside the local 18:00 hour:
WITH e(occurred_at) AS (VALUES
(timestamp '2026-12-15 18:05' AT TIME ZONE 'Asia/Kathmandu'),
(timestamp '2026-12-15 18:20' AT TIME ZONE 'Asia/Kathmandu'),
(timestamp '2026-12-15 18:35' AT TIME ZONE 'Asia/Kathmandu'),
(timestamp '2026-12-15 18:50' AT TIME ZONE 'Asia/Kathmandu')
)
SELECT extract(hour from occurred_at)::int AS utc_hour,
extract(hour from occurred_at AT TIME ZONE 'Asia/Kathmandu')::int AS local_hour,
count(*)
FROM e GROUP BY 1,2 ORDER BY 1;
| utc_hour | local_hour | count |
|---|---|---|
| 12 | 18 | 3 |
| 13 | 18 | 1 |
One real peak hour becomes two shorter bars, at UTC 12 and 13. A "busiest hour" query reads the wrong answer — and a 30-minute zone like Adelaide does the same thing, because every local hour straddles two UTC hours.
DST fall-back: one local hour holds two hours of events
When clocks fall back, a local hour happens twice. Adelaide ends DST on
2026-04-05 local — the instant is 2026-04-04 16:30 UTC, where the clock runs
03:00 → 02:00. Drop an event every fifteen minutes across the transition and
bucket by local hour:
WITH inst AS (
SELECT g AS occurred_at
FROM generate_series(timestamptz '2026-04-04 14:30+00',
timestamptz '2026-04-04 18:15+00', interval '15 min') g
)
SELECT extract(hour from occurred_at AT TIME ZONE 'Australia/Adelaide')::int AS local_hour,
count(*) AS events
FROM inst GROUP BY 1 ORDER BY 1;
| local_hour | events |
|---|---|
| 1 | 4 |
| 2 | 8 |
| 3 | 4 |
Local hour 02 holds twice the events of its neighbours — two real hours
collapsed into one bucket. The spike is genuine, but it appears in your chart
every autumn and reads like a traffic surge. Only the timestamptz can tell the
two 02:xx hours apart.
DST spring-forward: a local hour that cannot exist
The opposite transition deletes an hour. When Adelaide starts DST, the clock jumps
01:59 → 03:00, so local 02:00–02:59 never happens:
SELECT to_char(g,'HH24:MI') AS utc,
to_char(g AT TIME ZONE 'Australia/Adelaide','HH24:MI') AS local
FROM generate_series(timestamptz '2026-10-03 16:00+00',
timestamptz '2026-10-03 17:00+00', interval '15 min') g;
utc | local
16:00 | 01:30
16:15 | 01:45
16:30 | 03:00 <- 02:xx is skipped entirely
16:45 | 03:15
17:00 | 03:30
No event can carry local hour 02 that day. And a system that keeps only a local
string can store the impossible 02:30 — Postgres silently springs it forward:
SELECT (timestamp '2026-10-04 02:30' AT TIME ZONE 'Australia/Adelaide');
-- 2026-10-03 17:00:00+00 -> that instant is 03:30 local: the event moved an hour
Hourly reporting needs the same fix as daily, with the instant doing extra work:
resolve and store the local hour (it lives in local_time) at write time for the
histogram, and keep the timestamptz so the duplicated and missing hours stay
distinguishable.
The correct query-time way — and why I still don't lean on it
So the "right" query-time answer is to convert each row through its own IANA
zone: occurred_at AT TIME ZONE event_tz. That is genuinely correct, and it is
the baseline every other approach is measured against above.
A claim you'll see repeated is that this expression can't be indexed. It can —
and the reason is worth understanding, because it comes down to the volatility of
the underlying timezone() function:
SELECT pg_get_function_identity_arguments(oid) AS signature,
pg_get_function_result(oid) AS returns,
provolatile
FROM pg_proc WHERE proname = 'timezone' ORDER BY provolatile, 1;
| signature | returns | volatility |
|---|---|---|
| text, timestamp with time zone | timestamp without time zone | IMMUTABLE |
| interval, timestamp with time zone | timestamp without time zone | IMMUTABLE |
| timestamp with time zone | timestamp without time zone | STABLE |
| time with time zone | time with time zone | STABLE |
The form called by occurred_at AT TIME ZONE event_tz is
timezone(text, timestamptz) — marked IMMUTABLE. The STABLE variants are
the ones that read the session's TimeZone setting instead of taking the zone as
an argument. Because the explicit-zone form is immutable, you can build an
expression index over it — even with a per-row zone column:
CREATE INDEX idx_expr_localdate
ON events (((occurred_at AT TIME ZONE event_tz)::date)); -- succeeds
So why don't I rely on query-time conversion as the primary design? Three reasons, in increasing order of importance:
- It's easy to get wrong in a hurry. The correct expression and the three broken ones from the top of this post look almost identical in a code review. The next person to touch the query has four plausible ways to write it and only one is right.
- You must store the per-row zone anyway. The conversion is only correct if every row carries its true IANA zone. Once that column exists, computing the local day once costs nothing extra.
IMMUTABLEhere isn't strictly true. Timezone rules genuinely change — governments add, drop, and shift DST. Postgres marks the function immutable so it can be indexed, but atzdataupdate can leave an expression index (or aSTOREDgenerated column) reflecting the old rules until youREINDEX. A value resolved and stored at write time, by contrast, freezes the civil date the user actually experienced — which is usually the answer you want.
What I actually store: the instant plus resolved local fields
The design is boring on purpose. Keep timestamptz as the source of truth, store
the IANA zone, and resolve the local date and time once, at write time, when
the originating zone is unambiguous:
-- written by the ingestion path, exactly once per event:
occurred_at timestamptz -- the instant; for ordering & duration math
event_tz text -- IANA zone, e.g. 'Australia/Adelaide'
local_date date -- (occurred_at AT TIME ZONE event_tz)::date
local_time time -- the wall-clock time the user saw
Why date and time, not a string? These types store no timezone. A date
is just a calendar date and a time is a wall-clock time, so they keep exactly
the value you write — in any session timezone (only timestamptz ever shifts). A
string works too, especially ISO 2026-12-15, which sorts correctly. But a typed
column is stronger: it rejects invalid dates, supports range scans and date math
(day + 7, BETWEEN, weekday extraction), and is smaller and indexes cleanly.
The timezone danger only ever lived in timestamptz and its conversions — never
in date or time.
If you'd rather the database own the derivation than trust every code path, PostgreSQL 17 will materialize it for you with a generated column — and it works with a per-row zone column:
ALTER TABLE events
ADD COLUMN gen_local_date date
GENERATED ALWAYS AS ((occurred_at AT TIME ZONE event_tz)::date) STORED;
-- gen_local_date matched the app-written local_date on every row.
There's a subtlety worth flagging, because you'll find blog posts that say
Postgres won't let you do this. They're describing the opposite direction:
generating a timestamptz (a timestamp → timestamptz conversion leans on the
session's TimeZone and is STABLE, so Postgres refuses it). Our column goes
timestamptz → date through an explicit zone, which calls the IMMUTABLE
overload, so it's allowed. The same tzdata caveat still applies: a STORED
generated column is computed at write time and not retroactively recomputed when
the rules change. For a "what day did the user see" column, that is a feature, not
a bug.
Now grouping is simple and readable, and the practical payoff is that it uses
an ordinary B-tree index. I loaded 500,000 events across four zones, indexed
local_date, and compared it to the query-time conversion with no matching
index:
CREATE INDEX idx_local_date ON ev_big (local_date);
-- (A) stored column
EXPLAIN ANALYZE
SELECT count(*) FROM ev_big WHERE local_date = DATE '2026-07-15';
-- (B) query-time conversion, no matching index
EXPLAIN ANALYZE
SELECT count(*) FROM ev_big WHERE (occurred_at AT TIME ZONE event_tz)::date = DATE '2026-07-15';
(A) Index Only Scan using idx_local_date
Index Cond: (local_date = '2026-07-15'::date)
Buffers: shared hit=20 read=4
Execution Time: 0.188 ms
(B) Seq Scan on ev_big
Filter: (((occurred_at AT TIME ZONE event_tz))::date = '2026-07-15'::date)
Rows Removed by Filter: 498628
Buffers: shared hit=4425
Execution Time: 106.248 ms
Same answer, ~560× apart. To be fair to the query-time approach: adding the expression index from the previous section brings (B) back down to 0.169 ms. But that re-introduces the immutability caveat, and it is one more index a teammate has to know exists. The stored column is the version a new engineer will use by default.
The caveats you must respect
Storing the local fields is a denormalization, so treat it with the usual discipline:
timestamptzstays the source of truth. Order, diff, and do all "real time" math on the instant. Never sort across cities bylocal_date/local_time—00:30in Jakarta and00:30in New York are nowhere near each other in real time. The local fields are for grouping and display, not chronology.- Write it through one path. A single ingestion function (or the generated
column) should own the derivation. If two code paths compute it differently, you
get a denormalization bug that no
GROUP BYwill reveal. - The DST "fall-back" hour is genuinely ambiguous, and it's the reason you
keep the instant. When Adelaide ends DST on 2026-04-05, the wall clock goes
03:00 → 02:00, so local02:30happens twice:
SELECT
(timestamp '2026-04-05 02:30' AT TIME ZONE 'Australia/Adelaide') AS what_pg_picks,
(timestamptz '2026-04-05 02:30+10:30') AS if_still_dst,
(timestamptz '2026-04-05 02:30+09:30') AS if_standard;
what_pg_picks | if_still_dst | if_standard
2026-04-04 17:00:00+00 | 2026-04-04 16:00:00+00 | 2026-04-04 17:00:00+00
The two interpretations are an hour apart, and a bare local string cannot tell
them apart — only the timestamptz can. (Postgres resolves the ambiguous literal
to standard time here, but you should not depend on which one it picks; capture
the instant at the source.)
Takeaways
- A "day" — and an "hour" — belongs to a zone, not to an instant. Decide
whose day before you
GROUP BY. - For bucketing, never use the UTC date/hour, a hardcoded numeric offset, or a timezone abbreviation. They break on half-hour/45-minute zones and on every DST transition.
- Hourly views have two extra traps: a sub-hour offset splits one local hour
across two UTC buckets, and DST gives one day a doubled hour and another a
missing one. The
timestamptzis what keeps those hours distinguishable. - Query-time
occurred_at AT TIME ZONE '<IANA>'with a per-row zone is correct, and — contrary to a common belief — it is indexable. - I still resolve and store
local_date/local_timeat write time: correct by construction, immune to later rule changes, sargable with a plain B-tree, and impossible for the next engineer to get subtly wrong. - Keep the
timestamptzand the IANA zone as the source of truth; the local fields are a fast, honest projection for the dashboard.
How this lines up with the documentation
These references corroborate the approach and go deeper on the parts a single post can't:
- PostgreSQL — Date/Time Types
is the primary source for the abbreviation trap:
EST"specifies UTC-5, regardless of whether daylight savings was nominally in effect," and constant numeric offsets can't follow DST. That's exactly the failure in the New York table. - Tinybird — 10 best practices for timestamps and time zones
recommends storing UTC plus the offset, the IANA zone name, and the local
timestamp as a string — the same instinct as keeping
event_tz+local_datebesideoccurred_at. It's also where the fifteen-minute pre-aggregation limit comes from. - Crunchy Data — British Columbia, Time Zones, and Postgres
is the real-world version of the
tzdatacaveat: when a region changes its rules, values derived throughAT TIME ZONEshift, and the fix is a deliberate recompute of the stored local columns — not silently trusting an index. - Cybertec — Time zone management in PostgreSQL
is a good companion on how
timestamptzand the sessionTimeZoneinteract. - The Postgres mailing-list thread
"shouldn't timezone(text, timestamp[tz]) be STABLE?"
is where the "
IMMUTABLEis a pragmatic fib" claim comes from — the function's volatility was questioned precisely because timezone rules change, yet the immutable marking stays for indexability.
It's the same idea behind keeping each source system's original reference visible when you build dashboards that connect different systems: store the precise truth, and keep the readable projection next to it, so queries don't have to recompute it.