You can definitely keep all your dimensions and measures in one fact table and not use any dimension tables. Make sure your OLAP tool supports this, though.
Normalizing out your dimensions into other tables is done mostly to minimize the size of the fact table, which can get large fast.
With no dimension tables you're looking at about 336 MB per year (not counting indexes), which isn't so bad.
With dimension tables, you're looking at about 34 MB per year, plus a couple dozen MB for storing dimension details. Indexes will be smaller too.
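As a rough sketch of what the normalized layout could look like (all the names here are made up for illustration), the fact table keeps only small surrogate keys plus the measures:

CREATE TABLE dim_sensor
(
SensorId SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
SensorName VARCHAR(50) NOT NULL,
PRIMARY KEY (SensorId)
);

CREATE TABLE fact_readings
(
DateId INT UNSIGNED NOT NULL,        -- points at a date dimension
SensorId SMALLINT UNSIGNED NOT NULL, -- points at dim_sensor
Reading DECIMAL(10,2) NOT NULL,      -- the measure itself
KEY DateId_ndx (DateId),
KEY SensorId_ndx (SensorId)
);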
You'll want to expand your date column into something more analyzable (year, month, quarter, etc.), which will add to the size.
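A sketch of such a date dimension (columns are illustrative; add whatever your reports slice by):

CREATE TABLE dim_date
(
DateId INT UNSIGNED NOT NULL,  -- e.g. 20240131 for 2024-01-31
FullDate DATE NOT NULL,
Year SMALLINT NOT NULL,
Quarter TINYINT NOT NULL,
Month TINYINT NOT NULL,
DayOfWeek TINYINT NOT NULL,
PRIMARY KEY (DateId)
);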
You'll want to index all fields. Drop the indexes before bulk inserts and add them back afterwards.
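For example (table and index names are just illustrative):

ALTER TABLE fact_readings DROP INDEX DateId_ndx;
-- ... run the bulk INSERTs here ...
ALTER TABLE fact_readings ADD INDEX DateId_ndx (DateId);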
You can use a tool like Pentaho Aggregation Designer to find useful aggregates and generate them for you.
Since the timestamps count seconds from 0 (1970-01-01 00:00:00 UTC), you should look for every multiple of 60:
SELECT * FROM mytable WHERE MOD(TimeStamp,60)=0;
or, if TimeStamp is indexed, you can let the subquery scan just the index and then join back for the full rows:
SELECT T.* FROM
(SELECT TimeStamp FROM mytable WHERE MOD(TimeStamp,60)=0) M
INNER JOIN mytable T USING (TimeStamp);
Give it a try!!!
SUGGESTION #1
You should store the timestamp of the minute and index it:
ALTER TABLE mytable ADD COLUMN MinuteTimeStamp INT UNSIGNED NOT NULL AFTER TimeStamp;
UPDATE mytable SET MinuteTimeStamp = TimeStamp - MOD(TimeStamp,60);
ALTER TABLE mytable ADD INDEX MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey);
Then, you can take the MIN(UniqueKey) for each MinuteTimeStamp:
SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp;
and use it to get those records:
SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM mytable GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);
It was tactfully pointed out that triggers would degrade performance. Instead, doing the INSERTs like this may help (both NOW() calls evaluate to the statement's start time, so the two values stay consistent):
INSERT INTO mytable (UniqueKey,TimeStamp,MinuteTimeStamp) VALUES
(
uniquevalue,
UNIX_TIMESTAMP(NOW()),
UNIX_TIMESTAMP(NOW() - INTERVAL SECOND(NOW()) SECOND)
);
SUGGESTION #2
Since you have over 1000 columns (Ugh), perhaps a table of those minute timestamps would be better.
CREATE TABLE MinuteKeys
(
MinuteTimeStamp INT UNSIGNED NOT NULL,
UniqueKey INT UNSIGNED NOT NULL,
PRIMARY KEY (UniqueKey),
KEY MinuteTimeStamp_UniqueKey_ndx (MinuteTimeStamp,UniqueKey)
) ENGINE=MyISAM;
ALTER TABLE MinuteKeys DISABLE KEYS;
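-- Note: DISABLE KEYS defers only the non-unique MinuteTimeStamp index;
-- the PRIMARY KEY is still maintained during the load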
INSERT INTO MinuteKeys SELECT TimeStamp - MOD(TimeStamp,60),UniqueKey FROM mytable;
ALTER TABLE MinuteKeys ENABLE KEYS;
Then, you could use that table for the aggregation:
SELECT B.* FROM
(SELECT MinuteTimeStamp,MIN(UniqueKey) UniqueKey
FROM MinuteKeys GROUP BY MinuteTimeStamp) A
INNER JOIN mytable B USING (UniqueKey);
EPILOGUE
Other suggestions are possible, but you should really consider normalizing the table.
See my post Too many columns in MySQL as to why.
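As a sketch of the idea (table names invented for illustration), you would split the wide row into a narrow hot table plus one-to-one side tables that share UniqueKey:

CREATE TABLE mytable_core
(
UniqueKey INT UNSIGNED NOT NULL,
TimeStamp INT UNSIGNED NOT NULL,
PRIMARY KEY (UniqueKey)
) ENGINE=MyISAM;

CREATE TABLE mytable_details
(
UniqueKey INT UNSIGNED NOT NULL,
-- the hundreds of rarely-read columns would move here
PRIMARY KEY (UniqueKey)
) ENGINE=MyISAM;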
Best Answer
Here are some general recommendations for keeping the table size small:
Use ROW_FORMAT=COMPRESSED when creating InnoDB tables; there will be an impact on writes.
Run OPTIMIZE TABLE often if the table gets many DELETEs/UPDATEs.
Normalizing the tables would actually result in a bigger overall size, not a smaller one. It MAY help with caching, though.
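For the first two, something like this (names are illustrative; ROW_FORMAT=COMPRESSED typically needs innodb_file_per_table enabled):

CREATE TABLE mytable_small
(
UniqueKey INT UNSIGNED NOT NULL,
TimeStamp INT UNSIGNED NOT NULL,
PRIMARY KEY (UniqueKey)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;

OPTIMIZE TABLE mytable_small;  -- reclaims space after heavy DELETE/UPDATE churn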