Currently i have this dataset, i need to return grouped ids that are within the range of 60 seconds and have more than 3.
CREATE TABLE test ( `id` bigint NOT NULL AUTO_INCREMENT, created_date TIMESTAMP(1) NOT NULL, origin_url VARCHAR (200) NOT NULL, client_session_id VARCHAR (50) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `UC_PRE_CHAT_TALKID_COL` (`id`) );
✓
INSERT INTO test VALUES (1,'2021-01-18 11:02:24.0', 'https://zendes.com/', 'znkjoc3gfth2c3m0t1klii'), (2,'2021-01-18 11:02:35.0', 'https://zendes.com/', 'znkjoc3gfth2c3m0t1klii'), (3,'2021-01-18 11:02:03.0', 'https://zendes.com/', 'znkjoc3gfth2c3m0t1klii'), (4,'2021-01-18 11:11:28.0', 'https://rarara.com/', 'znkjoc3gfth2c3m0t1klii'), (5,'2021-01-18 11:11:36.0', 'https://rarara.com/', 'znkjoc3gfth2c3m0t1klii'), (6,'2021-01-18 11:11:05.0', 'https://rarara.com/', 'znkjoc3gfth2c3m0t1klii');
✓
db<>fiddle here
something like this:
ids origin_url client_session_id
1,2,3 https://testett.com/ znkjoc3gfth2c3m0t1klii
4,5,6 https://rarara.com/ znkjoc3gfth2c3m0t1klii
Edit some context:
I currently developed a cron that runs every 1 minute to analyze the last 60 seconds of the bot records in database, I need to group the conversation ID's that have more than 3 records within 60 seconds in the same url and client_session_id.
follows the SQL I'm running:
select
count(session_id),
client_session_id,
GROUP_CONCAT(id) as talkIds,
origin_url
from
bot_talk
where
created_date > now() - interval 60 second
group by
client_session_id, origin_url
having
count(session_id) >= 3
This query works as I expect, but sometimes my cron service is sometimes down, and I lose those repeated records.
I thought about making an SQL(Cron) at the end of the day to analyze the last 24 hours, and look for the records that are repeated according to the rule I mentioned above?
Best Answer
Here is the answer - see the fiddle. Another answer I wrote to a similar question may provide some clearer background and is a bit simpler - see here.
All I will say is that it gives some idea of the power of window functions.
I noticed in the comments that there was some debate about what constituted a group - in this example, I have constructed the SQL such that it picks out as a group where all subsequent sessions started within 180 seconds (i.e. 3 mins) of the first - you can change the 180 to 60 (or whatever) yourself.
I've added in some records for the purpose of testing and also added
CONSTRAINT
s to the table definition. It's always best to put as much as possible into the DDL - your database is your final bastion of defence for your data!Always use named constraints - these provide more meaningful messages than:
... failed... CONSTRAINT xyz_000_43abc has been violated...
.I populated it as follows:
It's always worth checking out for edge cases - single records at the beginning/end of your dataset + groups that you want to capture at the beginning and end also! I leave it to you to do more exhaustive testing!
I'll give the results first:
There's a bonus - you've got the starts and ends of the multiple close-together sessions thrown in for free!
I do have one word of advice - you really shouldn't be using or gathering data as comma-separated lists - SQL wasn't designed for string manipulation and extracting meaningful information and knowledge from such lists is painful - better to have a single atomic datum in a single field - see
1st Normal Form
!I've left in the various "sub-fiddles" that I used to arrive at the final result - hopefully they'll help you to learn about window functions &c... My own preference for the results would be in this format (see the fiddle - with one record/session - you can prune as you see fit):
Or another way might be (again, see the fiddle):
You have the starts and ends and the number of sessions... anyway, that's up to you. Take a look at this bit (easily missed):
This allows you to have an
ASC
ending and aDESC
edning list.This means that you can then do this:
And the records where this corresponds must be the ones of interest for the bunched sessions.
So now, the SQL, here it is (drum roll, trumpets sound in the distance...)! Go get a cup of coffee to read it - it's a monster (needs pruning - but that's an exercise for the OP and anybody else who's got this far):