Data type for a fuzzy date

datatypes

I want to store partial information related to a date. I might know the year and month but not the day. I might know the day and month but not the year. I might know the date lies in an open or closed interval. What are my options for modeling this type of data?

Best Answer

I've done this for attorneys in the past. I used an ISO-style date format (yyyy-mm-dd) stored as char(10). Any missing part used question marks.

2002-01-?? -- some day in January 2002
199?-??-?? -- some day in the 1990s
????-01-08 -- January 8th in an unknown year

Values like these are intuitive to people doing data entry, they sort fairly sensibly, and the format can be controlled with CHECK constraints. You lose date and time arithmetic, but when you have unknown dates that usually doesn't matter too much.
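As a rough illustration of the kind of rule a CHECK constraint could enforce on this format, here is a minimal Python sketch. The pattern and the `is_fuzzy_date` helper are assumptions made for this example, not the constraint used in the original system.

```python
import re

# Assumed rule: every position is either a digit or '?', in yyyy-mm-dd shape.
FUZZY_DATE = re.compile(r"[0-9?]{4}-[0-9?]{2}-[0-9?]{2}")

def is_fuzzy_date(value: str) -> bool:
    """Return True if value looks like yyyy-mm-dd with '?' for unknown digits."""
    if not FUZZY_DATE.fullmatch(value):
        return False
    _year, month, day = value.split("-")
    # Range-check only the parts that are fully known.
    if month.isdigit() and not 1 <= int(month) <= 12:
        return False
    if day.isdigit() and not 1 <= int(day) <= 31:
        return False
    return True

samples = ["2002-01-??", "????-01-08", "199?-??-??"]
print([is_fuzzy_date(s) for s in samples])  # [True, True, True]
print(sorted(samples))  # ['199?-??-??', '2002-01-??', '????-01-08']
```

Because '?' compares greater than any digit in ASCII, a plain string sort places unknown parts after known ones.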

You wrote: "I might know the date lies in an open or closed interval."

That's actually a different kind of information than, say, knowing that something happened on April 1, but not knowing which year. Off the top of my head, you could store four columns.

starting date
ending date
whether the starting date is or is not in the interval
whether the ending date is or is not in the interval
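The four columns above can be sketched as a small Python record. The class and field names here are hypothetical, chosen only to show how the two inclusion flags change a membership test.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class DateInterval:
    start: date            # starting date
    end: date              # ending date
    start_inclusive: bool  # is the starting date itself in the interval?
    end_inclusive: bool    # is the ending date itself in the interval?

    def contains(self, d: date) -> bool:
        """Test membership, honoring the open or closed bound at each end."""
        after_start = d >= self.start if self.start_inclusive else d > self.start
        before_end = d <= self.end if self.end_inclusive else d < self.end
        return after_start and before_end

# Closed at the start, open at the end: [2002-01-01, 2002-02-01)
january = DateInterval(date(2002, 1, 1), date(2002, 2, 1), True, False)
print(january.contains(date(2002, 1, 1)))  # True
print(january.contains(date(2002, 2, 1)))  # False
```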

I'd expect to have to store that kind of information separately from the kind of data I mentioned first. "January 8th in an unknown year" would be particularly troublesome to represent in either an open or closed interval.

PostgreSQL 9.2 and later include support for range types, which can represent an interval with inclusive or exclusive bounds in a single column.

Related Solutions

MySQL data type for 128 bit integers

I don't know what the best way to store it is, but there's at least a better option than varchar(39) (or varchar(40) if you need it signed): use decimal(39,0) instead. From the MySQL docs:

Fixed-Point (Exact-Value) Types

The DECIMAL and NUMERIC types store exact numeric data values. These types are used when it is important to preserve exact precision, for example with monetary data. In MySQL, NUMERIC is implemented as DECIMAL, so the following remarks about DECIMAL apply equally to NUMERIC.

MySQL 5.1 stores DECIMAL values in binary format. Before MySQL 5.0.3, they were stored as strings. See Section 11.18, “Precision Math”.

In a DECIMAL column declaration, the precision and scale can be (and usually is) specified; for example:
salary DECIMAL(5,2)
In this example, 5 is the precision and 2 is the scale. The precision represents the number of significant digits that are stored for values, and the scale represents the number of digits that can be stored following the decimal point.

Standard SQL requires that DECIMAL(5,2) be able to store any value with five digits and two decimals, so values that can be stored in the salary column range from -999.99 to 999.99.

In standard SQL, the syntax DECIMAL(M) is equivalent to DECIMAL(M,0). Similarly, the syntax DECIMAL is equivalent to DECIMAL(M,0), where the implementation is permitted to decide the value of M. MySQL supports both of these variant forms of DECIMAL syntax. The default value of M is 10.

If the scale is 0, DECIMAL values contain no decimal point or fractional part.

The maximum number of digits for DECIMAL is 65, but the actual range for a given DECIMAL column can be constrained by the precision or scale for a given column. When such a column is assigned a value with more digits following the decimal point than are permitted by the specified scale, the value is converted to that scale. (The precise behavior is operating system-specific, but generally the effect is truncation to the permissible number of digits.)

It's stored packed, so it takes up less space than the varchar (18 bytes, if I'm doing my math right). I'd hope you can do math on it directly, but I've never tried it with a number that large to see what happens.
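The 39-digit and 18-byte figures can be checked with a short Python sketch. It assumes the packing rule given in the MySQL documentation on DECIMAL storage (4 bytes per group of 9 digits, plus 0 to 4 bytes for the leftover digits, with the integer and fractional parts packed separately). The function name is made up for this example.

```python
def decimal_storage_bytes(precision: int, scale: int = 0) -> int:
    """Estimate the bytes MySQL uses for DECIMAL(precision, scale).

    Assumes the documented packing: 4 bytes per 9 digits, and
    0/1/1/2/2/3/3/4/4 bytes for 0..8 leftover digits.
    """
    leftover_bytes = [0, 1, 1, 2, 2, 3, 3, 4, 4]

    def part(digits: int) -> int:
        return (digits // 9) * 4 + leftover_bytes[digits % 9]

    # Integer and fractional parts are packed separately.
    return part(precision - scale) + part(scale)

max_u128 = 2**128 - 1
print(len(str(max_u128)))             # 39 decimal digits
print(decimal_storage_bytes(39, 0))   # 18
```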

PostgreSQL – Single data type for imprecise date values, as allowed by ISO 8601

No, the interval type supports reduced precision but none of the other date/time types do.

Postgres lets you roll your own type with create type, but unfortunately it won't allow constraints to be added to the type, which limits its usefulness in this scenario. The best I can come up with requires you to repeat the check constraints on every column where the fuzzy type is used:

create type preciseness as enum('day', 'month', 'year');
create type fuzzytimestamptz as (ts timestamptz, p preciseness);
create table t( id serial primary key,
                fuzzy fuzzytimestamptz
                    check( (fuzzy).ts is not null 
                           or ((fuzzy).ts is null and (fuzzy).p is not null) ),
                    check((fuzzy).ts=date_trunc('year', (fuzzy).ts) or (fuzzy).p<'year'),
                    check((fuzzy).ts=date_trunc('month', (fuzzy).ts) or (fuzzy).p<'month'),
                    check((fuzzy).ts=date_trunc('day', (fuzzy).ts) or (fuzzy).p<'day') );

insert into t(fuzzy) values (row(date_trunc('year', current_timestamp), 'year'));
insert into t(fuzzy) values (row(date_trunc('month', current_timestamp), 'month'));
insert into t(fuzzy) values (row(date_trunc('day', current_timestamp), 'day'));

select * from t;

 id |              fuzzy
----+----------------------------------
  1 | ("2011-01-01 00:00:00+00",year)
  2 | ("2011-09-01 00:00:00+01",month)
  3 | ("2011-09-23 00:00:00+01",day)

--edit - an example equality operator:

create function fuzzytimestamptz_equality(fuzzytimestamptz, fuzzytimestamptz)
                returns boolean language plpgsql immutable as $$
begin
  return ($1.ts, $1.ts+coalesce('1 '||$1.p, '0')::interval)
         overlaps ($2.ts, $2.ts+coalesce('1 '||$2.p, '0')::interval);
end;$$;
--
create operator = ( procedure=fuzzytimestamptz_equality, 
                    leftarg=fuzzytimestamptz, 
                    rightarg=fuzzytimestamptz );

sample query:

select *, fuzzy=row(statement_timestamp(), null)::fuzzytimestamptz as equals_now,
          fuzzy=row(statement_timestamp()+'1 day'::interval, null)::fuzzytimestamptz as equals_tomorrow,
          fuzzy=row(date_trunc('month', statement_timestamp()), 'month')::fuzzytimestamptz as equals_fuzzymonth,
          fuzzy=row(date_trunc('month', statement_timestamp()+'1 month'::interval), 'month')::fuzzytimestamptz as equals_fuzzynextmonth
from t;
 id |               fuzzy                | equals_now | equals_tomorrow | equals_fuzzymonth | equals_fuzzynextmonth
----+------------------------------------+------------+-----------------+-------------------+-----------------------
  1 | ("2011-01-01 00:00:00+00",year)    | t          | t               | t                 | t
  2 | ("2011-09-01 00:00:00+01",month)   | t          | t               | t                 | f
  3 | ("2011-09-24 00:00:00+01",day)     | t          | f               | t                 | f
  4 | ("2011-09-24 11:45:23.810589+01",) | f          | f               | t                 | f
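The overlap-based equality used by the operator above can be sketched outside the database as well. This Python version is only an approximation under stated assumptions: it uses naive datetimes (no time zones), the function names are hypothetical, and the overlap test is my reading of how OVERLAPS treats half-open periods and single instants.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

Fuzzy = Tuple[datetime, Optional[str]]  # (ts, precision); precision may be None

def period_end(ts: datetime, p: Optional[str]) -> datetime:
    """Return ts plus one unit of precision p; an exact value has zero length."""
    if p == "year":
        return ts.replace(year=ts.year + 1)   # safe: ts is truncated to Jan 1
    if p == "month":                          # safe: ts is truncated to day 1
        if ts.month == 12:
            return ts.replace(year=ts.year + 1, month=1)
        return ts.replace(month=ts.month + 1)
    if p == "day":
        return ts + timedelta(days=1)
    return ts

def fuzzy_equal(a: Fuzzy, b: Fuzzy) -> bool:
    """True if the periods [ts, ts + 1 unit) implied by a and b overlap."""
    s1, e1 = a[0], period_end(*a)
    s2, e2 = b[0], period_end(*b)
    if s1 == s2:
        return True       # same start always overlaps, even for two instants
    if s1 > s2:
        return e2 > s1    # a starts later: b must extend past a's start
    return e1 > s2        # b starts later: a must extend past b's start

september = (datetime(2011, 9, 1), "month")
october = (datetime(2011, 10, 1), "month")
instant = (datetime(2011, 9, 24, 11, 45, 23), None)
print(fuzzy_equal(september, instant))  # True
print(fuzzy_equal(september, october))  # False
```

Note that an "equality" defined this way is not transitive: a year can equal two different months that do not equal each other.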
