The following is just insane ranting and raving...
If you leave all data in one table (no partitioning), you get O(log n) search times using a key. Take the worst index in the world, the binary tree, where each tree node holds exactly one key. A perfectly balanced binary tree with 268,435,455 (2^28 - 1) nodes has a height of 28. If you split this tree into 16 separate trees, you get 16 binary trees, each with 16,777,215 (2^24 - 1) nodes, for a height of 24. The search path is reduced by 4 nodes, a 14.2857% height reduction. If the search time is in microseconds, a 14.2857% reduction in search time is nil-to-negligible.
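The height arithmetic above can be checked with a short script (a sketch; it assumes perfectly balanced trees with one key per node):

```python
import math

def tree_height(n_keys: int) -> int:
    """Height of a perfectly balanced binary tree holding n_keys keys
    (one key per node): ceil(log2(n_keys + 1))."""
    return math.ceil(math.log2(n_keys + 1))

total = 2**28 - 1                   # 268,435,455 keys in one tree
h_single = tree_height(total)       # 28 levels
h_split = tree_height(total // 16)  # 16 trees of 2^24 - 1 keys -> 24 levels
reduction = (h_single - h_split) / h_single

print(h_single, h_split, f"{reduction:.4%}")  # 28 24 14.2857%
```

Cutting the data sixteen ways only shaves 4 levels off a 28-level path, which is the whole point of the argument.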
Now, in the real world, a BTREE index has tree nodes with multiple keys. Each BTREE search performs a binary search within the page, with a possible descent into another page. For example, if each BTREE page contained 1024 keys, a tree height of 3 or 4 would be the norm, a short tree height indeed.
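To see why the BTREE height stays so small, note that the height grows with the log base fanout of the row count. A rough sketch (assuming every page is completely full, which real B-trees are not, so add a level in practice):

```python
import math

def btree_height(n_rows: int, keys_per_page: int = 1024) -> int:
    """Approximate height of a fully packed B-tree: ceil(log_fanout(n))."""
    return max(1, math.ceil(math.log(n_rows, keys_per_page)))

for n in (1_000_000, 260_000_000, 1_000_000_000):
    print(n, btree_height(n))
```

With a fanout of 1024, even a billion rows fit within three fully packed levels; real-world slack in the pages is what pushes typical heights to 3 or 4.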
Notice that partitioning a table does not reduce the height of the BTREE, which is already small. Given a partitioning of 260 million rows, there is a strong likelihood of ending up with multiple BTREEs of the same height. Searching for a key may pass through every root BTREE page every time, yet only one will fulfill the path of the needed search range.
Now expand on this: all the partitions exist on the same machine. If you do not have a separate disk for each partition, disk I/O and spindle rotations become an automatic bottleneck, independent of partition search performance.
In this case, partitioning by database does not buy you anything either if id is the only search key being utilized.
Partitioning of data should serve to group data that logically and cohesively belong in the same class. Search performance within each partition need not be the main consideration, as long as the data is correctly grouped. Once you have achieved the logical partitioning, then concentrate on search time. If you are separating data by id alone, it is possible that many rows may never be accessed for reads or writes at all. That should be a major consideration: locate the ids that are most frequently accessed and partition by those. All less frequently accessed ids should reside in one big archive table that is still accessible by index lookup for that 'once in a blue moon' query.
The overall impact should be at least two partitions: one for the frequently accessed ids, and another partition for the rest. If the set of frequently accessed ids is fairly large, you could optionally partition that further.
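The hot/archive split described above can be sketched as follows (illustrative only; `access_counts` and `HOT_THRESHOLD` are hypothetical stand-ins for whatever access statistics your database actually collects):

```python
# Split ids into a "hot" partition and an archive partition based on
# how often each id is accessed. The threshold is workload-specific.
access_counts = {101: 5_400, 102: 3, 103: 12_000, 104: 1, 105: 870}
HOT_THRESHOLD = 100  # accesses per period; a hypothetical cutoff

hot_ids = {i for i, c in access_counts.items() if c >= HOT_THRESHOLD}
archive_ids = set(access_counts) - hot_ids

print(sorted(hot_ids))      # frequently accessed -> small, fast partition
print(sorted(archive_ids))  # "once in a blue moon" -> big archive table
```

The design choice here is that the hot partition stays small enough to cache well, while the archive remains reachable through an ordinary index lookup.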
Partitioning should work better than an index when the number of rows is large. If a query needs to scan only one partition, that is more efficient than finding the current rows through an index.
If SRC_CURRENT is updated to 0 when newer rows are loaded, you need to enable row movement for the table. This allows rows to move from the current partition to the historical partition.
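The row-movement behavior can be modeled as follows: when a row's SRC_CURRENT flag flips to 0, the row physically moves from the current partition to the historical one (a toy sketch only; a real database does this internally once row movement is enabled):

```python
# Toy model of two partitions defined by the SRC_CURRENT flag.
current, historical = [], []

def load_row(row: dict) -> None:
    """New rows arrive with SRC_CURRENT = 1, landing in the current partition."""
    current.append(row)

def expire(row_id: int) -> None:
    """Flip SRC_CURRENT to 0 and relocate the row, mimicking row movement."""
    for row in list(current):
        if row["id"] == row_id:
            row["SRC_CURRENT"] = 0
            current.remove(row)
            historical.append(row)

load_row({"id": 1, "SRC_CURRENT": 1})
load_row({"id": 2, "SRC_CURRENT": 1})
expire(1)
print([r["id"] for r in current], [r["id"] for r in historical])  # [2] [1]
```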
Alternatively, if the current rows are always the most recent rows, you could range-partition the table by date. This lets you clear expired rows very quickly by running drop partition for the oldest partition. It also benefits queries against historical data that access a range of dates but not the whole history.
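A toy model of the drop-oldest-partition pattern (a sketch; a real database expresses this as range partitioning plus a drop-partition operation, here mimicked with an ordered dict keyed by month):

```python
from collections import OrderedDict

# One "partition" per month, oldest first; values are the rows it holds.
partitions = OrderedDict([
    ("2023-01", ["row-a", "row-b"]),
    ("2023-02", ["row-c"]),
    ("2023-03", ["row-d", "row-e"]),
])

def drop_oldest(parts: OrderedDict) -> str:
    """Drop the oldest partition: the whole chunk is discarded at once
    instead of deleting expired rows one by one."""
    month, _rows = parts.popitem(last=False)
    return month

dropped = drop_oldest(partitions)
print(dropped, list(partitions))  # 2023-01 ['2023-02', '2023-03']
```

This is why dropping a partition is so fast compared to a mass DELETE: it is a metadata operation over the chunk, not row-by-row work.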
Best Answer
Partitioning is helpful with large tables and for your DBA's management purposes. If you have 50GB of data, you should start looking into partitioning for performance alone, as Kim Tripp stated.
However, as a DBA I find partitioning great for managing large tables and reporting data that comes from an OLTP source, which I think is what you are describing. Partitioning also lets you create different indexes and different fill factors per partition, and layer indexed views on top of them to gain massive performance boosts for aggregation-type queries, utilizing "Indexed Views over Partitioned Tables".
My recommendation: get familiar with partitioning. It can save you from disasters. I once inherited a table that was 6 billion rows, growing at 2-3GB a day, with only 20GB free on the drive. I was able to create a new partition for new data, copy all the old data out to different databases on RAID 5, and update the code to refer to views, all without client downtime in a 24/7 shop. Partitioning is essential as data sets grow larger.