MySQL Computed Columns – Avoiding Nested REPLACE() with ISO 8601 Timezone Offset

computed-columnMySQLmysql-5.7timestamptimezone

I am using MySQL 5.7 and have a table that has the following columns that are being used to store some data that is source from an Apache common log formatted access log; details extracted from a MySQL schema export:

`timestamp` timestamp NULL DEFAULT NULL,
`offset` varchar(5) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL
`date` date GENERATED ALWAYS AS ((`timestamp` + interval replace(replace(replace(`offset`,'-0','-'),'+0','+'),'00','') hour)) VIRTUAL
`time` time GENERATED ALWAYS AS ((`timestamp` + interval replace(replace(replace(`offset`,'-0','-'),'+0','+'),'00','') hour)) VIRTUAL
`hour` int(2) GENERATED ALWAYS AS (hour((`timestamp` + interval replace(replace(replace(`offset`,'-0','-'),'+0','+'),'00','') hour))) VIRTUAL

As you can see I am storing timestamp (0000-00-00 00:00:00) and offset (+00:00) and then using generated columns to calculate, date (0000-00-00), time (00:00:00) and hour (0) values. And this is working well: I get to store the timestamp as UTC and then—by storing the offset value—I can dynamically get other info in non-UTC form.

But I don’t think the chained/nested REPLACE items are too hot. The goal is to be able to take an ISO 8601 timezone offset value like -0400 or +1000 as part of an interval [some value] hour calculation.

So is there a better way to approach this? I might be able to adjust the way that the timezone offset is stored initially so it is—essentially—exactly what I need for column calculations, but that seems messy and non-intuitive so I would rather use that +/-[for digit] format if possible.

Best Answer

After reading up on CONVERT_TZ, it seems like it would be the cleaner method of handling a case like this since I can cut down three (!!!) REPLACE statements with one like this:

CONVERT_TZ(`timestamp`,'+00:00',REPLACE(`offset`,'00',':00'));

It still seems a bit sloppy—since I am still dealing with having to convert offset values like -0400 to -04:00—but does work and is cleaner/easier to read and understand.

That said, the above method will fail for offset values such as +0530, +1000 and +0000. So instead of that, using the following method—which uses the INSERT() string function where colon (:) insertion is based on an offset value from the right calculated by the length of the string itself—is cleaner and works for a variety of different offsets:

CONVERT_TZ(`timestamp`,'+00:00',INSERT(`offset`,LENGTH(`offset`)-1,0,':'));

And these are the results using that with the following sample timestamps and related offsets:

SELECT CONVERT_TZ('2018-05-28 02:34:58','+00:00',INSERT('+0300',LENGTH('+0300')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 07:50:12','+00:00',INSERT('+0400',LENGTH('+0400')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 09:23:34','+00:00',INSERT('+0530',LENGTH('+0530')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 12:16:56','+00:00',INSERT('+1000',LENGTH('+1000')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 16:07:17','+00:00',INSERT('-0200',LENGTH('-0200')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 20:02:05','+00:00',INSERT('-0700',LENGTH('-0700')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 23:33:03','+00:00',INSERT('-1000',LENGTH('-1000')-1,0,':'));
SELECT CONVERT_TZ('2018-05-28 23:33:03','+00:00',INSERT('-0000',LENGTH('-0000')-1,0,':'));

And here are the results:

  • 2018-05-28 02:34:58 with an offset of +0300 becomes: 2018-05-28 05:34:58
  • 2018-05-28 07:50:12 with an offset of +0400 becomes: 2018-05-28 11:50:12
  • 2018-05-28 09:23:34 with an offset of +0530 becomes: 2018-05-28 14:53:34
  • 2018-05-28 12:16:56 with an offset of +1000 becomes: 2018-05-28 22:16:56
  • 2018-05-28 16:07:17 with an offset of -0200 becomes: 2018-05-28 14:07:17
  • 2018-05-28 20:02:05 with an offset of -0700 becomes: 2018-05-28 13:02:05
  • 2018-05-28 23:33:03 with an offset of -1000 becomes: 2018-05-28 13:33:03
  • 2018-05-28 23:33:03 with an offset of -0000 becomes: 2018-05-28 23:33:03