PostgreSQL – Extract Week Number from Character Varying Column

Tags: postgresql, postgresql-9.3

I have a table in which closure_date is stored as a character varying column. Some of the dates are saved as "43684.5708564815" and others as "2019-05-24 18:51:17". I need to extract the week number from the column. I tried the following query:

 SELECT closure_date,
        extract('week' from timestamp '1899-12-30' + interval '1 day' * cast(closure_date as double precision)) as closure_week
 FROM <table_name>
 LIMIT 10000

This query runs fine for values like 43684.5708564815, but it fails when trying to extract the week number from "2019-05-24 18:51:17", with the error:

 ERROR:  invalid input syntax for type double precision: "2019-05-24 18:51:17"
 SQL state: 22P02

How can I handle this error? Any suggestions?

Best Answer

You can use a CASE expression with a regex to test if the string is formatted like an ISO date or not:

SELECT closure_date,
       case
         when closure_date ~ '[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8}'
           then extract(week from closure_date::timestamp)
         else
           extract('week' from timestamp '1899-12-30' + (interval '1 day' * cast(closure_date as double precision)))
       end as closure_week
FROM <table_name>

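The ~ operator performs a case-sensitive regular expression match, so the first branch is taken only when closure_date matches that pattern. The pattern is not anchored, so it can match anywhere in the string.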

Note that if you have different formats, this could still fail and you need to add more checks.
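One way to add such a check is to test the numeric format explicitly and fall back to NULL for anything else, so an unexpected value no longer aborts the query. This is only a sketch: the inline VALUES list (including the 'n/a' row) is made-up sample data standing in for <table_name>, and the anchored patterns are an assumption about which formats occur in the column.

```sql
-- Sample rows stand in for <table_name>; 'n/a' is a made-up unparseable value.
SELECT closure_date,
       CASE
         -- ISO-style timestamp, e.g. 2019-05-24 18:51:17
         WHEN closure_date ~ '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8}$'
           THEN extract(week FROM closure_date::timestamp)
         -- Serial day number counted from 1899-12-30, e.g. 43684.5708564815
         WHEN closure_date ~ '^[0-9]+(\.[0-9]+)?$'
           THEN extract(week FROM timestamp '1899-12-30'
                                  + interval '1 day' * closure_date::double precision)
         -- Anything else yields NULL instead of raising an error
         ELSE NULL
       END AS closure_week
FROM (VALUES ('43684.5708564815'),
             ('2019-05-24 18:51:17'),
             ('n/a')) AS t(closure_date);
```

Rows that come back with a NULL closure_week are the ones whose format still needs its own branch.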

If you have the chance, you should really change that column to be a timestamp rather than keeping this badly formatted string.
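A conversion along these lines could be done in one step with ALTER TABLE ... USING. The sketch below assumes a hypothetical table named closures and assumes that only the two formats from the question occur; any other value would become NULL, so check the data before running something like this on a real table.

```sql
-- "closures" is a hypothetical table name used only for this illustration.
CREATE TEMP TABLE closures (closure_date varchar);
INSERT INTO closures VALUES ('43684.5708564815'), ('2019-05-24 18:51:17');

-- Convert the column in place; USING tells PostgreSQL how to compute the new value.
ALTER TABLE closures
  ALTER COLUMN closure_date TYPE timestamp
  USING CASE
          WHEN closure_date ~ '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8}$'
            THEN closure_date::timestamp
          WHEN closure_date ~ '^[0-9]+(\.[0-9]+)?$'
            THEN timestamp '1899-12-30'
                 + interval '1 day' * closure_date::double precision
          -- no ELSE: unrecognised strings become NULL
        END;

-- Afterwards the week number needs no special handling.
SELECT closure_date, extract(week FROM closure_date) AS closure_week
FROM closures;
```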

PostgreSQL – Is There a Type-Safe first() Aggregate Function?

`DISTINCT ON()`

Just as a side note, this is precisely what DISTINCT ON() does (not to be confused with DISTINCT). From the documentation:

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

So if you were to write,

SELECT myFirstAgg(z)
FROM foo
GROUP BY x,y;

It's effectively

SELECT DISTINCT ON (x,y) z
FROM foo;
-- add ORDER BY x, y, z to control which z counts as "first"

In that it takes the first z. There are two important differences.

First, you can also select other columns at no cost of further aggregation.

SELECT DISTINCT ON (x,y) z, k, r, t, v
FROM foo;
-- add ORDER BY x, y, z, k, r, t, v to make the chosen row predictable

Second, because there is no GROUP BY, you cannot use (real) aggregates with it.

CREATE TABLE foo AS
SELECT * FROM ( VALUES
  (1,2,3),
  (1,2,4),
  (1,2,5)
) AS t(x,y,z);

-- fails, as you should expect.
SELECT DISTINCT ON (x,y) z, sum(z)
FROM foo;

-- would not otherwise fail.
SELECT myFirstAgg(z), sum(z)
FROM foo
GROUP BY x,y;

Don't forget `ORDER BY`

Also, while I didn't emphasize it above, I will now:

Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

Always use an ORDER BY with DISTINCT ON.
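As a small illustration, here is the same sample data with an explicit ordering. The inline VALUES list is used in place of the foo table so the snippet runs on its own. Note that the ORDER BY list has to start with the DISTINCT ON expressions; the columns after them decide which row of each group comes first.

```sql
-- Same rows as the foo table above, inlined so the query is self-contained.
SELECT DISTINCT ON (x, y) x, y, z
FROM (VALUES (1,2,3),
             (1,2,4),
             (1,2,5)) AS foo(x, y, z)
ORDER BY x, y, z;        -- smallest z per (x, y) group: returns the row (1, 2, 3)

-- Reversing the sort on z picks the largest z per group instead.
SELECT DISTINCT ON (x, y) x, y, z
FROM (VALUES (1,2,3),
             (1,2,4),
             (1,2,5)) AS foo(x, y, z)
ORDER BY x, y, z DESC;   -- returns the row (1, 2, 5)
```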

Using an Ordered-Set Aggregate Function

I imagine a lot of people are looking for a first_value ordered-set aggregate function. Just wanted to throw that out there. It would look like this, if the function existed:

SELECT a, b, first_value() WITHIN GROUP (ORDER BY z)    
FROM foo
GROUP BY a,b;

Alas, it does not exist, but you can do this instead:

SELECT a, b, percentile_disc(0) WITHIN GROUP (ORDER BY z)   
FROM foo
GROUP BY a,b;
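For a version that can be run directly, the sketch below applies the same idea to the sample rows used earlier (columns x, y, z, inlined as a VALUES list). Ordered-set aggregates such as percentile_disc are available from PostgreSQL 9.4 onward.

```sql
-- percentile_disc(0) returns the first z in the given ordering for each group.
SELECT x, y,
       percentile_disc(0) WITHIN GROUP (ORDER BY z) AS first_z
FROM (VALUES (1,2,3),
             (1,2,4),
             (1,2,5)) AS foo(x, y, z)
GROUP BY x, y;   -- one row: x = 1, y = 2, first_z = 3
```

Unlike DISTINCT ON, this form can sit next to ordinary aggregates such as sum(z) in the same GROUP BY query.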

PostgreSQL – Surprising results for data types with type modifier

This is due to relation attributes (defined in pg_class and pg_attribute, or defined dynamically from a select statement) supporting modifiers (via pg_attribute.atttypmod), whilst function parameters do not. Modifiers are lost when processed through functions, and since all operators are handled via functions, modifiers are lost when processed by operators as well.

Functions with output parameters, functions that return setof record, and the equivalent returns table(...) form are also unable to retain any modifiers included in the definition. However, functions that return setof <type> will retain (actually, probably typecast to) any modifiers defined for that type in pg_attribute.
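The loss of a modifier through an operator can be observed with a small experiment. This is a sketch using made-up temporary tables t and t2: one column is copied as-is, the other passes through the + operator, and the resulting column types are then read back from pg_attribute.

```sql
-- t and t2 are throwaway names used only for this illustration.
CREATE TEMP TABLE t (n numeric(10,2));

-- n is copied directly; m is the result of an operator (i.e. a function call).
CREATE TEMP TABLE t2 AS
SELECT n, n + 0 AS m
FROM t;

-- format_type() renders the type together with its modifier, if any.
-- n should still show its (10,2) modifier, while m should show none.
SELECT attname, format_type(atttypid, atttypmod) AS column_type
FROM pg_attribute
WHERE attrelid = 't2'::regclass
  AND attnum > 0
ORDER BY attnum;
```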
