Firstly...

Separating Date/Time into a Date dimension and a Time dimension is definitely the way to go.
To manage multiple time zones you need to duplicate the DateKey and the TimeKey, so that you have the following:
LocalDateKey
LocalTimeKey
UtcDateKey
UtcTimeKey
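As a sketch, the fact table might carry those four keys like so (the table name, column names, and the smart-key format are illustrative assumptions, not requirements):

```sql
-- Illustrative fact table: one surrogate key per role (Local vs UTC).
-- Smart keys of the form yyyyMMdd / HHmm are a common convention, not required.
CREATE TABLE FactTableName (
    FactKey       int NOT NULL,
    LocalDateKey  int NOT NULL,  -- e.g. 20140101 -> joins to dbo.Date
    LocalTimeKey  int NOT NULL,  -- e.g. 0900     -> joins to dbo.Time
    UtcDateKey    int NOT NULL,  -- e.g. 20131231
    UtcTimeKey    int NOT NULL,  -- e.g. 2300
    -- ... measures ...
    Amount        decimal(18,2) NOT NULL
);
```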
You say...
The problem I am having with all that is that 11:00 PM on Tuesday,
December 31, 2013 in UTC is Wednesday, January 1st, 2014 in all time
zones that are after UTC+2.
By having the 4 columns I've listed above, you will be able to join the fact table to the Date and/or Time dimension using table aliases (in Kimball terminology these aliased dimension tables are known as "Role-Playing Dimensions"), so you would have something like the following:
/*
Assumes the following:
- [DateLongName] has the format of this example: "Tuesday, December 31, 2013"
- [TimeShortName] has the format of this example: "11:00 PM"
- Both [DateLongName] & [TimeShortName] are strings
*/
select
    -- Returns a string matching this example: "11:00 PM Tuesday, December 31, 2013"
    localTime.TimeShortName + ' ' + localDate.DateLongName
    ,utcTime.TimeShortName + ' ' + utcDate.DateLongName
    ,f.*
from
    FactTableName AS f
    -- Local Date and Local Time joins
    inner join dbo.Date AS localDate
        on localDate.DateKey = f.LocalDateKey
    inner join dbo.Time AS localTime
        on localTime.TimeKey = f.LocalTimeKey
    -- Utc Date and Utc Time joins
    inner join dbo.Date AS utcDate
        on utcDate.DateKey = f.UtcDateKey
    inner join dbo.Time AS utcTime
        on utcTime.TimeKey = f.UtcTimeKey
In closing...
As you're building a data mart, and not an OLTP database, the generation of the Local and UTC times should be performed in your ETL, NOT in any client-side applications, for the following reasons (apart from localisation of the UTC time to the report reader's perspective):
- Having the calculation reside in any queries places an extra performance burden on them, multiplied by the number of times you have to run said query for any reports you have (this matters when reading millions of rows)
- Extra burden of ensuring the calculation is maintained correctly in each query (especially when you take daylight savings time into account)
- Prevents range scanning of any indexes the column is part of: performing a calculation on the column forces queries to do index scans instead of seeks (scans are typically more expensive, as every data page must be read); this is known as being non-sargable.
- Edit due to comments: This applies if you push the conversion down into the actual query.
- Using the concept of having the additional UTC dates and times available, there is nothing stopping you from extending it by calling these keys StandardisedDateKey or CorporateHQDateKey, where instead of a UTC date table you standardise on some other business-agreed standard
- Having the two separate column types (Local and UTC) allows for side-by-side comparison across geographic distance. Think: someone in Australia enters a record that is timestamped with both Local and UTC; someone in New York reads the report with the Local (Australia) date and time and the New York representation of the UTC date and time, thereby seeing that something their Australian counterpart did during the middle of the day (Australia time) happened in the middle of the night their time (New York time). This comparison of time is indispensable in multi-national businesses.
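As a sketch of how the ETL might derive the four keys from a single UTC event timestamp (this assumes SQL Server 2016+ for AT TIME ZONE; the zone name and the yyyyMMdd / HHmm key format are illustrative assumptions):

```sql
-- Hypothetical ETL snippet: derive the four role-playing keys
-- from one UTC event timestamp.
DECLARE @UtcEvent datetime2 = '2013-12-31 23:00:00';
DECLARE @LocalEvent datetimeoffset =
    @UtcEvent AT TIME ZONE 'UTC' AT TIME ZONE 'AUS Eastern Standard Time';

SELECT
    CONVERT(int, FORMAT(@UtcEvent,   'yyyyMMdd')) AS UtcDateKey,
    CONVERT(int, FORMAT(@UtcEvent,   'HHmm'))     AS UtcTimeKey,
    CONVERT(int, FORMAT(@LocalEvent, 'yyyyMMdd')) AS LocalDateKey,
    CONVERT(int, FORMAT(@LocalEvent, 'HHmm'))     AS LocalTimeKey;
```

Doing this once in the ETL means every downstream query gets sargable integer joins instead of repeating the (daylight-saving-aware) conversion itself.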
I understand one could use something like EXTRACT on CURRENT_TIMESTAMP when selecting rows
You don't need extract in order to get the data:
select *
from your_table
where timestamp_column >= timestamp '2007-08-24 18:34:01';
will happily use an index on timestamp_column and will be very efficient. In fact, there will be no difference in performance compared to
select *
from your_table
where milliseconds_column >= 1187906400000;
Now look at the two statements and tell me for which one you immediately understand what the query will return.
If you do need those milliseconds in your application then you can always do something like this:
select some_col,
other_col,
extract(epoch from timestamp_column) * 1000 as millis
from your_table
where timestamp_column >= timestamp '2007-08-24 18:34:01';
And again, I'm almost 100% certain that you won't see any performance difference to:
select some_col,
other_col,
milliseconds_column
from your_table
where milliseconds_column >= 1187906400000;
Plus: if you do store the milliseconds and you want to display the real date, you always need to apply an additional conversion:
select some_col,
other_col,
to_char(timestamp_column, 'yyyy-mm-dd') as the_date
from your_table
where timestamp_column >= timestamp '2007-08-24 18:34:01';
vs.
select some_col,
other_col,
to_char(to_timestamp(milliseconds_column / 1000), 'yyyy-mm-dd') as the_date
from your_table
where milliseconds_column >= 1187906400000;
The second one won't be noticeably slower than the first one, but it's more typing and makes the code harder to read.
Edit:
how to create a stored procedure that will convert this timestamp into the milliseconds I want (and the reverse function too)?
You don't need to write your own function:
- convert a timestamp to milliseconds (the epoch value already includes fractional seconds, so multiplying by 1000 yields milliseconds directly):
extract(epoch from current_timestamp) * 1000
- convert milliseconds to timestamp:
to_timestamp(1187980443530 / 1000)
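Note that dividing by the integer 1000 truncates the milliseconds; a quick round-trip sketch (PostgreSQL) that preserves them:

```sql
-- Round trip: milliseconds -> timestamp -> milliseconds.
-- to_timestamp() expects seconds since the epoch, so divide by 1000.0
-- (dividing by the integer 1000 would silently drop the milliseconds).
SELECT
    to_timestamp(1187980443530 / 1000.0) AS as_timestamp,
    extract(epoch from to_timestamp(1187980443530 / 1000.0)) * 1000 AS back_to_millis;
```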
So to wrap it up: go with a timestamp column and don't worry about performance.
Best Answer
tl;dr

- Never use TIMESTAMP WITHOUT TIME ZONE; define your column with the data type TIMESTAMP WITH TIME ZONE.
- To adjust into a particular time zone, use AT TIME ZONE 'Asia/Kolkata'. (But better to leave that zone adjustment to your application code rather than your SQL.)

TIMESTAMP WITHOUT TIME ZONE
You are using the wrong data type when defining your column.
The TIMESTAMP WITHOUT TIME ZONE data type cannot represent a moment. This type purposely lacks the context of a time zone or offset-from-UTC. So you have a date and a time-of-day, such as noon on the 23rd of next January. But we have no way to know if you meant noon in Tokyo Japan, noon in Toulouse France, or noon in Toledo Ohio US, all very different moments several hours apart.
Any time zone or offset information passed along with an input is ignored when reaching a column of type TIMESTAMP WITHOUT TIME ZONE.

TIMESTAMP WITH TIME ZONE
The TIMESTAMP WITH TIME ZONE type can track a moment, a specific point on the timeline. Any time zone or offset information passed along with the date and time-of-day is used to adjust into UTC. The zone/offset is then discarded. So if you care about the original zone/offset, you need to explicitly store that in an extra column.
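A small sketch of the difference between the two types (PostgreSQL; the literals are chosen purely for illustration):

```sql
-- TIMESTAMP WITHOUT TIME ZONE ignores the offset in the input literal;
-- TIMESTAMP WITH TIME ZONE uses it to adjust the value into UTC.
SELECT
    TIMESTAMP WITHOUT TIME ZONE '2024-01-23 12:00:00+05:30' AS naive_ts,
    TIMESTAMP WITH TIME ZONE    '2024-01-23 12:00:00+05:30' AS moment_utc;
-- naive_ts   -> 2024-01-23 12:00:00     (offset silently dropped)
-- moment_utc -> 2024-01-23 06:30:00+00  (when the session zone is UTC)
```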
When retrieving a TIMESTAMP WITH TIME ZONE value, the value will always be in UTC (an offset of zero hours-minutes-seconds). However, beware of tools, middleware, or drivers carrying the anti-feature of injecting some default time zone or offset onto the retrieved value. This creates the illusion of that zone/offset having been stored, when in fact Postgres always stores TIMESTAMP WITH TIME ZONE values in UTC.

pgAdmin is one of the tools with this unfortunate anti-feature. I suggest always setting the default zone of the session to UTC to see your retrieved values as they were stored.
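For example, in a psql or pgAdmin session you can pin the session to UTC before querying (a sketch; the table and column names are made up):

```sql
-- Pin the current session to UTC so retrieved timestamptz values
-- display exactly as stored, with no client-side zone adjustment.
SET TIME ZONE 'UTC';

-- Hypothetical table/column names, for illustration only.
SELECT event_id, occurred_at
FROM events
ORDER BY occurred_at DESC;
```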
Adjusting to time zone
I suggest you learn to think and work as a programmer in UTC. Most of your business logic in programming should be in UTC. Adjust into a time zone only for presentation to the user or where business logic demands. So this means that in most cases you should adjust to time zone only in your app code rather than in your SQL and database admin tools.
AT TIME ZONE function

But if you insist on adjusting to time zone in your SQL, use the AT TIME ZONE function.

You mentioned the offset of five and a half hours ahead of UTC, +05:30. I assume you mean time in India. Always prefer a real time zone name rather than a particular offset. Politicians frequently change the offset used by their jurisdictions, so hard-coding an offset could lead to faulty results. Keep the tzdata file in your Postgres installation up-to-date, and then let Postgres determine the offset appropriate to the named time zone for that particular moment.
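A usage sketch (the table and column names are hypothetical; note that applying AT TIME ZONE to a TIMESTAMP WITH TIME ZONE value yields a zone-less timestamp expressed in that zone):

```sql
-- Adjust a stored UTC moment into India time for presentation.
-- 'Asia/Kolkata' is a named zone, so offset-rule changes are
-- handled by the tzdata file, not by hard-coded arithmetic.
SELECT
    occurred_at                             AS moment_utc,
    occurred_at AT TIME ZONE 'Asia/Kolkata' AS moment_india
FROM events;
```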
Various date-time data types
This chart I made for Java programmers may also be of use here, to better understand the various data types defined by the SQL standard.