Best Way to Save Mouse Clicks in MySQL

database-designMySQLnormalization

I don't have any experience with large amount of data and I'm trying to figure out the best way to save the mouse clicks (navigation) of web visitors into a MySQL database. The DB will be used by multiple websites at the same time with a lot of visitors. Currently I have two options..

Alt 1:

-----------------------------
| visit_id | timestamp | ip |
-----------------------------
|    1     |   2016    | 00 |
-----------------------------

------------------------------------
| visit_id | timestamp |  x  |  y  |
------------------------------------
|    1     |  14:01:00 | 165 | 365 |
------------------------------------
|    1     |  14:01:01 | 166 | 367 |
------------------------------------

Alt 1 is best but it will be huge amount of rows and I don't know if MySQL will be able to handle all this.

Alt 2:

----------------------------------------------------------------------------
| visit_id | timestamp |  ip  |                 cordinates                 |
----------------------------------------------------------------------------
|    1     |    2016   |  00  | [14:01:00, 165, 365], [14:01:01, 166, 366] |
---------------------------------------------------------------------------

Then use PHP to split the column cordinates into an array.

Anybody have any tips on how to do save mouse clicks in the best way with MySQL?

Best Answer

Go for the normalized structure. In your case, that's Alt 1.

Have your tables defined more or less like these:

CREATE TABLE visits
(
    visit_id INTEGER AUTO_INCREMENT PRIMARY KEY,
    visit_datetime DATETIME NOT NULL DEFAULT now(),
    ip VARCHAR(45),     -- This can probably be encoded more efficiently, it is actually a 32bit integer for IPv4, and 128 for IPv6
    cookie_id INTEGER   -- Do we have a cookie for this visitor? If yes, keep it! 
    -- Whatever else you need
) ;

CREATE TABLE visits_clicks
(
    visit_id INTEGER NOT NULL REFERENCES visits(visit_id),
    click_datetime DATETIME NOT NULL DEFAULT now(),
    x INTEGER NOT NULL,
    y INTEGER NOT NULL,

    PRIMARY KEY (visit_id, click_datetime)
) ;

You'll end up having data like these:

INSERT INTO visits (ip, visit_datetime)
VALUES 
   ('127.0.0.1',   now() - interval 90 second),
   ('192.168.1.1', now() - interval 38 second),
   ('192.168.1.2', now() - interval 37 second),
   ('192.168.1.1', now() - interval 26 second) ;  

INSERT INTO visits_clicks
   (visit_id, click_datetime, x, y)
VALUES
   (1, now() - interval 88 second, 1010,  970),
   (1, now() - interval 87 second, 1201,  894),
   (1, now() - interval 86 second, 1123, 1098),
   (1, now() - interval 85 second, 1241, 1020),
   (1, now() - interval 84 second, 1234, 1034),
   (2, now() - interval 36 second, 1000, 1234)
   ;

With this structure, you will be able to perform queries like the following one, which might be very difficult with an unnormalized structure:

-- Find average number of clicks per visit, count, min and max
SELECT
   count(visit_id)  AS visit_count,
   avg(click_count) AS average_clicks,
   min(click_count) AS min_click_count,
   max(click_count) AS max_click_count
FROM
    (SELECT
        visit_id, count(*) AS click_count
    FROM
        visits_clicks
    GROUP BY
        visit_id
    ) AS q
visit_count | average_clicks | min_click_count | max_click_count
----------: | -------------: | --------------: | --------------:
          2 |         3.0000 |               1 |               5

You will find that, except for very specific questions, this structure will let you query virtually anything you can think of, in an easier fashion than with the unnormalized design. And you can do so within the database, without having to move information in and out of it, which makes the process much faster.

When in doubt: normalize. You'll be able later to aggregate or denormalize if you need to. Some very specific scenarios work better with denormalized data. I don't think that's your case

Check the setup at dbfiddle here