I am not a pro in the database sector so pardon me if i am asking a basic question.
Scenario
I have a table in which all the recorded are stored chronologically.
Number of rows of that table is 1,445,248 (1.4 Million).
Desire
I want to optimize my query which looks like following: –
Select * from SalesInvoice where S_Date Between 20190401 And 20190501
Note
- Yes, i need all the columns in the table and instead of typing out all the columns, i just used the all symbol (astrix).
- S_Date is stored as numeric instead of DateTime (because it was developed a long time ago in this format)
- The total time taken to execute this query is only 4 seconds but because of some calculations performed it is looping through ~80-90 time.
Question
I want to know, is there any way through which i can start the searching of rows from Nth position.
My Approach
Edit – Currently i am using SQL Server 2008 R2 but i will switch to MySQL that is why the query is written in SQL Server 2008 R2 format.
Step 1: Get first index of data by following query: –
Select Top 1 Id from SalesInvoice where S_Date Between 20190401 And
20190501 order by S_Date ASC
Step 2: Start Searching from the returned Id lets say 500,000
My Assumption if this technique exists
According to me, this technique might save table scanning time by starting the search from that ID which removes scanning of 500,000 rows.
BUT
I don't know how to start the lookup from a particular ID.
If anyone can provide me with a code snippet that will be very helpful
Best Answer
As for "all the recorded are stored chronologically", the only control you have over this is the
PRIMARY KEY
in InnoDB.A simple rearrangement will provide the optimal access:
(
TOP 1
is not a MySQL construct; seeLIMIT
. But it is not needed with the above change.)(The technique above is even better than INDEX(S_date).)
Explanation
In MySQL, the data rows are ordered by the
PRIMARY KEY
. Said another way: ThePRIMARY KEY
is "clustered" with the data. This implies that you you fetch the rows in PK order, you will be fetching the data in the most optimal order.With the change I proposed, the data will be ordered by
S_Date
(with dups ordered byid
). This is exactly the order you want.To change the current table definition, do
The auto_increment column must occur first in some index. This allows the 'next' value can be efficiently found after a restart.
Stop after
This performs such, but not efficiently:
will skip (that is, read and discard) 20K rows, then deliver 1000.
It is more efficient to "remember where you left off". With
PRIMARY KEY(id)
, it is straight-forward and efficient:With a composite index it gets messier, but still possible. More discussion: http://mysql.rjweb.org/doc.php/pagination