MySQL Correlated Subquery (with multiple columns) per row

MySQLsubquery

MySQL documentation states that:

Subqueries in the FROM clause cannot be correlated subqueries. They
are materialized in whole (evaluated to produce a result set) during
query execution, so they cannot be evaluated per row of the outer
query.

Okay, fine, I understand why this is. However, what if you want to repeat a specific query on multiple values?

SELECT COUNT( returning_visitors.per_ip ) AS count, AVG( returning_visitors.per_ip ) AS num_visits
FROM (
    SELECT COUNT( * ) AS per_ip
    FROM site_hits_unique
    WHERE site_hits_unique.site_id = ___INPUT___
    AND site_hits_unique.date >= CURDATE( ) - INTERVAL 30 DAY
    GROUP BY site_hits_unique.site_id, site_hits_unique.ip
    HAVING per_ip >1
) AS returning_visitors

Let's say I want to retrieve the result of this query on multiple values in the form of a result table, using the values from SELECT id FROM sites. Is there a way to do this in a single query without using a stored procedure? Something like:

SELECT id, COUNT( returning_visitors.per_ip ) as readers, AVG( returning_visitors.per_ip ) as avg_visits_pr
FROM sites
SUBQUERY-PER-ROW (
    SELECT COUNT( * ) AS per_ip
    FROM site_hits_unique
    WHERE site_hits_unique.site_id = sites.id
    AND site_hits_unique.date >= CURDATE( ) - INTERVAL 30 DAY
    GROUP BY site_hits_unique.site_id, site_hits_unique.ip
    HAVING per_ip > 1
) AS returning_visitors
WHERE sites.id IN (162888, 42705, 11412)

SUBQUERY-PER-ROW here is pseudo-code. JOIN will not work because of the above quoted restriction.

EDIT:

Another attempt at pseudo-code using info from http://dev.mysql.com/doc/refman/5.7/en/row-subqueries.html:

SELECT id, (
    SELECT COUNT( per_ip ) AS count, AVG( per_ip ) AS num_visits
    FROM (
        SELECT COUNT( * ) AS per_ip
        FROM site_hits_unique
        WHERE site_hits_unique.site_id = sites.id
        AND site_hits_unique.date >= CURDATE( ) - INTERVAL 30 DAY
        GROUP BY site_hits_unique.site_id, site_hits_unique.ip
        HAVING per_ip >1
    ) AS returning_visitors
) as (readers, avg_visits_pr)
FROM sites
WHERE sites.id IN (162888, 42705, 11412)

(it doesn't work, this isn't actual valid MySQL syntax, just a pseudo-code example)

Best Answer

Is this what you are looking for?

SELECT  s.id,
        COUNT( rv.per_ip ) as readers,
        AVG( rv.per_ip ) as avg_visits_pr
    FROM  sites AS s
    JOIN  
    (
        SELECT  u.site_id,
                COUNT( * ) AS per_ip
            FROM  site_hits_unique AS u
            WHERE  u.date >= CURDATE( ) - INTERVAL 30 DAY
            GROUP BY  u.site_id, u.ip
            HAVING  per_ip > 1 
    )  AS rv ON rv.site_id = s.id
    WHERE  s.id IN (162888, 42705, 11412)

Indexes:

site_hits_unique: INDEX(site_id, date, ip)

Or maybe this:

SELECT  site_id,
        COUNT(DISTINCT ip) as readers,
        COUNT(*) / COUNT(DISTINCT ip) as avg_visits_pr
    FROM  
    (
        SELECT  site_id, ip, COUNT(*) AS per_ip
            FROM  site_hits_unique
            WHERE  date >= CURDATE( ) - INTERVAL 30 DAY
              AND  site_id IN (162888, 42705, 11412)
            GROUP BY  site_id, ip
            HAVING  per_ip > 1 
    ) AS rv
    GROUP BY  site_id;