The implication is that you need ~1144 MB of extra storage to index the time column for ~100M rows...
Because the InnoDB engine stores the PRIMARY KEY (or UNIQUE) columns inside every secondary index, your secondary index will become larger than the indexed column alone.
How much larger? You can calculate it with this formula:
INT = 4 bytes
DATETIME = 8 bytes
100000000 records * (4 + 8) bytes =
100000000 * 12 bytes ~ 1200000000 bytes ( 1144.4 MB ) extra storage (note: index record/page overhead is not included in this calculation)
A larger index will slow down INSERTs and DELETEs, and UPDATEs only when you update an indexed value.
A larger index can in theory also slow down SELECTs, because of InnoDB's 16K index pages (read http://www.ovaistariq.net/733/).
But it still depends on the InnoDB configuration and on how much data is cached in the InnoDB buffer pool.
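If you want to check such an estimate against reality, you can read the actual on-disk sizes from information_schema (the table name below is illustrative):

```sql
-- Actual data vs. index size for one table, in MB.
-- DATA_LENGTH / INDEX_LENGTH are approximate for InnoDB, but good
-- enough to sanity-check the back-of-the-envelope calculation above.
SELECT TABLE_NAME,
       ROUND(DATA_LENGTH  / 1024 / 1024, 1) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 1) AS index_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME   = 'mytable';
```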
Or maybe you can take your approach of using a lookup table:
CREATE TABLE tracker_snapshot_lookup (
tracker_date DATE NOT NULL
, tracker_snapshot_start_id INT UNSIGNED NOT NULL
, tracker_snapshot_end_id INT UNSIGNED NOT NULL
, PRIMARY KEY(tracker_date)
-- Covering index below is overkill...
-- , PRIMARY KEY(tracker_date, tracker_snapshot_start_id, tracker_snapshot_end_id)
) ENGINE = InnoDB;
insert into tracker_snapshot_lookup values('2013-11-13', 1, 10000);
insert into tracker_snapshot_lookup values('2013-11-14', 10001, 20000);
If you use a JOIN or a derived table, in the worst case the MySQL optimizer needs:
1 index key lookup (random disk I/O) on tracker_snapshot_lookup.tracker_date (assuming WHERE tracker_date = '2013-11-13')
1 table record lookup (random disk I/O) for tracker_snapshot_start_id and tracker_snapshot_end_id (not necessary when you make it a covering index)
Based on tracker_snapshot_start_id and tracker_snapshot_end_id, MySQL will most likely choose a range scan (sequential disk I/O, which has low I/O wait time) on the tracker_snapshot table.
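A sketch of such a query, assuming the big table is named tracker_snapshot with an auto-increment id PRIMARY KEY (these names are illustrative):

```sql
-- One PK lookup on the lookup table resolves the date to an id range,
-- then InnoDB can range-scan the big table on its PRIMARY KEY.
SELECT s.*
FROM tracker_snapshot_lookup l
INNER JOIN tracker_snapshot s
        ON s.id BETWEEN l.tracker_snapshot_start_id
                    AND l.tracker_snapshot_end_id
WHERE l.tracker_date = '2013-11-13';
```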
Your savings
DATE 3 bytes
INT NOT NULL 4 bytes
So in one year the lookup table costs you in storage...
Table data
365 days * (3 + 4 + 4 bytes)
365 * 11 = 4015 bytes ( 0.004 MB )
Index data
365 days * 3 bytes = 1095 bytes ( 0.001 MB )
It's magic, because you can use that ~1144 MB of storage space for more important data.
Since the seconds are counted from 0 (1970-01-01 00:00:00 UTC), you should look for every multiple of 60:
SELECT * FROM mytable WHERE MOD(TimeStamp,60)=0;
or if TimeStamp is indexed, you can do
SELECT T.* FROM
(SELECT TimeStamp FROM mytable WHERE MOD(TimeStamp,60)=0) M
INNER JOIN mytable T USING (TimeStamp);
Give it a Try !!!
SUGGESTION #1
You should store the timestamp of the minute and index it
ALTER TABLE mytable ADD COLUMN MinuteTimeStamp INT UNSIGNED NOT NULL AFTER TimeStamp;
UPDATE mytable SET MinuteTimeStamp = TimeStamp - MOD(TimeStamp,60);
ALTER TABLE mytable ADD INDEX MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey);
Then, you can do MIN aggregation on MinuteTimeStamp.
SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp;
and use it to get those records:
SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);
It was tactfully pointed out that triggers would degrade performance.
Perhaps doing INSERTs like this may help:
INSERT INTO mytable (UniqueKey,TimeStamp,MinuteTimeStamp) VALUES
(
uniquevalue,
UNIX_TIMESTAMP(NOW()),
UNIX_TIMESTAMP(NOW() - INTERVAL SECOND(NOW()) SECOND)
);
SUGGESTION #2
Since you have over 1000 columns (Ugh), perhaps a table of those minute timestamps would be better.
CREATE TABLE MinuteKeys
(
MinuteTimeStamp INT UNSIGNED NOT NULL,
UniqueKey INT UNSIGNED NOT NULL,
PRIMARY KEY (UniqueKey),
KEY MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey)
) ENGINE=MyISAM;
ALTER TABLE MinuteKeys DISABLE KEYS;
INSERT INTO MinuteKeys SELECT TimeStamp - MOD(TimeStamp,60),UniqueKey FROM mytable;
ALTER TABLE MinuteKeys ENABLE KEYS;
Then, you could use that table for the aggregation
SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM MinuteKeys GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);
EPILOGUE
Other suggestions are possible, but you should really consider normalizing the table.
See my post Too many columns in MySQL as to why.
Best Answer
TIMESTAMP, DATETIME, INT, FLOAT, DECIMAL, etc. are effectively "numbers". Numbers can be ordered/searched/etc. with ordinary BTREE indexes. There is no need for SPATIAL for such one-dimensional quantities.
For date-like quantities, I like to do:
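A sketch of that pattern: a half-open range using inequality operators on the raw column (assuming a DATETIME/TIMESTAMP column ts; the date is illustrative):

```sql
-- Half-open interval: >= start of the day, < start of the next day.
SELECT *
FROM mytable
WHERE ts >= '2013-11-13'
  AND ts <  '2013-11-13' + INTERVAL 1 DAY;
```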
This avoids dealing with leap years, kludges like '23:59:59', inclusive/exclusive confusion (note the careful choice of inequality operators), etc.