MySQL – Counting Relationship Rows Between Three Tables Without Duplicates

I have a (local) MySQL database, which is a Drupal 6 export. It has these relevant tables:

node – Contains articles, as well as the department and section nodes themselves. I'm interested in the nid and title columns.
content_field_department – Holds the relationships between article nodes and department nodes. I'm interested in the nid (the article node) and field_department_nid (the department node) columns.
content_field_section – Holds the relationships between article nodes and section nodes. I'm interested in the nid (the article node) and field_section_nid (the section node) columns.

I'm trying to get an accurate COUNT of how many article nodes are related to each section or department node. Performance is not relevant for me at this time: the database is local, so the query can take as long as it needs to without causing problems. That is, I'd like data like this:

+-----------------+-------------------+
| Section Name    | Count of Articles |
+-----------------+-------------------+
| Department Name | Count of Articles |
+-----------------+-------------------+

The issue

One problem is that when the interface was originally built, the department choices for an article also included all of the sections. The section choices, though, never included the departments. So in cases where an article node is related to another node twice, both as a section and as a department, I would like to COUNT that only once.

My current try

My current attempt is like this:

To get a department COUNT:

SELECT DISTINCT d.field_department_nid as tid, n.title as name, (
        SELECT COUNT(DISTINCT nid, field_department_nid)
        FROM content_field_department d2
        WHERE d2.field_department_nid = d.field_department_nid
        AND nid NOT IN (
            SELECT s.nid from content_field_section s
            LEFT OUTER JOIN node n2 ON s.nid = n2.nid
            WHERE field_section_nid IS NOT NULL
        )
    ) as drupal_department_count
FROM content_field_department d
LEFT OUTER JOIN node n ON d.field_department_nid = n.nid
ORDER BY drupal_department_count DESC

To get a section COUNT:

SELECT DISTINCT s.field_section_nid as tid, n.title as name, (
        SELECT COUNT(DISTINCT nid, field_section_nid)
        FROM content_field_section s2
        WHERE s2.field_section_nid = s.field_section_nid
        AND nid NOT IN (
            SELECT d.nid from content_field_department d
            LEFT OUTER JOIN node n2 ON d.nid = n2.nid
            WHERE field_department_nid IS NOT NULL
        )
    ) as drupal_section_count
FROM content_field_section s
LEFT OUTER JOIN node n ON s.field_section_nid = n.nid
ORDER BY drupal_section_count DESC

I also tried it without the NOT IN subqueries, but then the COUNT is too high, I think because the same node combination can be counted in both tables.

What direction do I need to go with this?

Best Answer

I would need more information about your business restrictions to write an ideal query, but here's my advice given my interpretation of what you've said so far.

The first thing you need to do is define more clearly what you are trying to achieve. You wrote:

"So in cases where an article node is related to another node twice, both as a section and as a department, I would like to COUNT that only once."

Since your final result set will be grouped by departments and sections, "counting it only once" means you need to pick where they should be counted. For example:

   content_field_department         content_field_section
+-----+----------------------+   +-----+-------------------+
| nid | field_department_nid |   | nid | field_section_nid |
+-----+----------------------+   +-----+-------------------+
|  1  |          2           |   |  1  |         3         |
+-----+----------------------+   +-----+-------------------+

Should nid 1 be counted in the list for department nid 2, or for section nid 3? Based on your question, you may want to count it only for departments. In your current query drafts, you exclude the problem case from both counts, so you are effectively counting it 0 times instead of 1 (your goal) or 2 (what you are trying to avoid).
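The zero-count effect can be reproduced on the two-row example above. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the two queries are simplified versions of the ones in the question (the join to node is dropped, and COUNT(DISTINCT nid) replaces the two-column COUNT(DISTINCT ...), which SQLite does not accept).

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows mirror the example tables above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content_field_department (nid INTEGER, field_department_nid INTEGER);
    CREATE TABLE content_field_section (nid INTEGER, field_section_nid INTEGER);
    INSERT INTO content_field_department VALUES (1, 2);
    INSERT INTO content_field_section VALUES (1, 3);
""")

# Department count in the style of the question:
# articles that have any section row are skipped.
dept_count = conn.execute("""
    SELECT COUNT(DISTINCT d.nid)
    FROM content_field_department d
    WHERE d.field_department_nid = 2
      AND d.nid NOT IN (SELECT s.nid
                        FROM content_field_section s
                        WHERE s.field_section_nid IS NOT NULL)
""").fetchone()[0]

# Section count in the style of the question:
# articles that have any department row are skipped.
sect_count = conn.execute("""
    SELECT COUNT(DISTINCT s.nid)
    FROM content_field_section s
    WHERE s.field_section_nid = 3
      AND s.nid NOT IN (SELECT d.nid
                        FROM content_field_department d
                        WHERE d.field_department_nid IS NOT NULL)
""").fetchone()[0]

print(dept_count, sect_count)  # prints "0 0": article 1 is counted nowhere
```

Because each query filters on the other table, article 1 drops out of both results, which is why only one of the two counts should carry the exclusion.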

Here's a draft query as a starting point:

select
    DeptOrSect,
    nid,
    title,
    count(1) as article_count
from (
    select
        'DEPT' as DeptOrSect,
        dept.nid,
        dept.title,
        link.nid as article_nid
    from node dept
    join content_field_department link on link.field_department_nid = dept.nid
    union all
    select
        'SECT' as DeptOrSect,
        sect.nid,
        sect.title,
        link.nid as article_nid
    from node sect
    join content_field_section link on link.field_section_nid = sect.nid
    where not exists (
        select 1
        from content_field_department dept
        where dept.nid = link.nid
    )
) as x
group by DeptOrSect,
    nid,
    title;
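To check how the draft behaves, it can be run against a few made-up rows. The sketch below again assumes Python's sqlite3 module in place of MySQL; the node titles and the article nids 10 to 12 are hypothetical sample data, not taken from the real database, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; every row below is invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE content_field_department (nid INTEGER, field_department_nid INTEGER);
    CREATE TABLE content_field_section (nid INTEGER, field_section_nid INTEGER);
    -- node 2 is a department, node 3 is a section, nodes 10-12 are articles
    INSERT INTO node VALUES (2, 'News'), (3, 'Sports'),
                            (10, 'Article A'), (11, 'Article B'), (12, 'Article C');
    -- article 10 is linked both ways (the problem case),
    -- article 11 only to the section, article 12 only to the department
    INSERT INTO content_field_department VALUES (10, 2), (12, 2);
    INSERT INTO content_field_section VALUES (10, 3), (11, 3);
""")

rows = conn.execute("""
    select DeptOrSect, nid, title, count(1) as article_count
    from (
        select 'DEPT' as DeptOrSect, dept.nid, dept.title, link.nid as article_nid
        from node dept
        join content_field_department link on link.field_department_nid = dept.nid
        union all
        select 'SECT' as DeptOrSect, sect.nid, sect.title, link.nid as article_nid
        from node sect
        join content_field_section link on link.field_section_nid = sect.nid
        where not exists (
            select 1
            from content_field_department dept
            where dept.nid = link.nid
        )
    ) as x
    group by DeptOrSect, nid, title
    order by DeptOrSect, nid
""").fetchall()

for row in rows:
    print(row)
# ('DEPT', 2, 'News', 2)
# ('SECT', 3, 'Sports', 1)
```

Article 10 is counted once, under the department, and is left out of the section total. If the real tables contain rows with a NULL field_department_nid, as the IS NOT NULL checks in the question suggest they might, the NOT EXISTS subquery would probably need the same check.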

SUGGESTION

Perhaps you should start with a GROUP BY ... HAVING query like this:

SELECT table_name,COUNT(1) table_count
FROM information_schema.tables
WHERE table_schema NOT IN
('information_schema','performance_schema','mysql')
GROUP BY table_name HAVING COUNT(1) > 1;

This returns every table name that appears in more than one database.

Use that as a subquery and join it back to information_schema.tables to gather all the databases each table appears in:

SELECT
    A.table_name,
    GROUP_CONCAT(B.table_schema) TheTableAppearsInTheseDatabases
FROM
(
    SELECT table_name,COUNT(1) table_count
    FROM information_schema.tables
    WHERE table_schema NOT IN
    ('information_schema','performance_schema','mysql')
    GROUP BY table_name HAVING COUNT(1) > 1
) A INNER JOIN information_schema.tables B
USING (table_name) GROUP BY A.table_name;
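The same two-step pattern can be tried without a MySQL server. In this sketch a hypothetical table named tables_stub plays the role of information_schema.tables, Python's sqlite3 module stands in for MySQL, and the schema and table names are sample values only.

```python
import sqlite3

# Assumption: tables_stub is a hypothetical stand-in for information_schema.tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tables_stub (table_schema TEXT, table_name TEXT);
    INSERT INTO tables_stub VALUES
        ('junk', 'a'), ('test', 'a'),
        ('junk', 'change_log'), ('test', 'change_log'),
        ('test', 'only_in_test'),   -- appears once, so it must not be reported
        ('mysql', 'user');          -- system schema, filtered out by the subquery
""")

rows = conn.execute("""
    SELECT A.table_name, GROUP_CONCAT(B.table_schema) AS schemas
    FROM (
        SELECT table_name, COUNT(1) AS table_count
        FROM tables_stub
        WHERE table_schema NOT IN ('information_schema', 'performance_schema', 'mysql')
        GROUP BY table_name HAVING COUNT(1) > 1
    ) A INNER JOIN tables_stub B USING (table_name)
    GROUP BY A.table_name
    ORDER BY A.table_name
""").fetchall()

# GROUP_CONCAT does not promise an order, so compare the schemas as sets.
found = {name: set(schemas.split(",")) for name, schemas in rows}
print(found)
```

Only a and change_log are reported, each with the schemas junk and test.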

Please notice that I use INNER JOIN rather than LEFT JOIN because it will really form a Cartesian product (2704 X 2704) and then perform comparisons.

I know this works because I tried it out in MySQL 5.5.12 on my Windows 7 machine:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 33
Server version: 5.5.12-log MySQL Community Server (GPL)

Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>     SELECT
    ->         A.table_name,
    ->         GROUP_CONCAT(B.table_schema) TheTableAppearsInTheseDatabases
    ->     FROM
    ->     (
    ->         SELECT table_name,COUNT(1) table_count
    ->         FROM information_schema.tables
    ->         WHERE table_schema NOT IN
    ->         ('information_schema','performance_schema','mysql')
    ->         GROUP BY table_name HAVING COUNT(1) > 1
    ->     ) A INNER JOIN information_schema.tables B
    ->     USING (table_name) GROUP BY A.table_name;
+--------------------------+-------------------------------------------------------------------------------------------------------------------+
| table_name               | TheTableAppearsInTheseDatabases                                                                                   |
+--------------------------+-------------------------------------------------------------------------------------------------------------------+
| a                        | junk,test,robottinosino                                                                                           |
| acl                      | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive                                     |
| blocks                   | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| blog                     | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive                                     |
| blog_category            | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| blog_entrycat            | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive                                     |
| blog_meta                | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_archive,weisci_jaws_staging2                                     |
| blog_trackback           | weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging                                     |
| calendar_events          | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2                                     |
| calendar_meta            | weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging                                     |
| calendar_questions       | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2                                     |
| calendar_tickets         | weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging                                     |
| calendar_transactions    | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging,weisci_jaws_archive                                     |
| captcha_complex          | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2                                     |
| change_log               | test,junk                                                                                                         |
| chat_staff               | test,junk                                                                                                         |
| comments                 | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging,weisci_jaws_archive                                     |
| donations_charities      | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2                                     |
| donations_charities_meta | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| donations_donations      | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| foo_reference1           | timpost1,timpost2                                                                                                 |
| foo_reference2           | timpost1,timpost2                                                                                                 |
| foo_reference3           | timpost1,timpost2                                                                                                 |
| groups                   | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2                                     |
| ipvisitor                | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| job_post                 | giannosfor,test                                                                                                   |
| layout                   | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| listeners                | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| mediamanager_files       | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging                                                         |
| mediamanager_group       | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| mediamanager_photos      | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| mediamanager_video       | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| menus                    | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| menus_groups             | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| mytable                  | ryanzec,user1267617,cabita,dotancohen,johnlocke,neeraj,test,user391986,cool_cs,javier,mathieu                     |
| mytext                   | jakobud,newstuff                                                                                                  |
| occupation_field         | giannosfor,test                                                                                                   |
| policy_agentblock        | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| policy_ipblock           | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| prova                    | veto,vito                                                                                                         |
| registry                 | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| registry_bk              | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                                         |
| session                  | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| static_pages             | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| static_pages_translation | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| t                        | preeti,rollup_test                                                                                                |
| t1                       | abidibo,test                                                                                                      |
| t2                       | test,abidibo                                                                                                      |
| t3                       | test,abidibo                                                                                                      |
| table1                   | table_test,supercoolville,test                                                                                    |
| table2                   | supercoolville,test,table_test                                                                                    |
| tags                     | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| tags_content             | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| tbl_banner_position      | weisci_jaws_staging2,weisci_jaws_live                                                                             |
| tbl_banner_upload        | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                                         |
| tbl_global_banner        | weisci_jaws_staging2,weisci_jaws_live                                                                             |
| tms_authors              | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| tms_repositories         | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| tms_themes               | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging                                     |
| updates                  | test,junk                                                                                                         |
| url_aliases              | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive                                     |
| url_maps                 | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| users                    | weisci_jaws_archive,weisci_jaws_staging,giannosfor,veto,weisci_jaws_live,weisci_jaws_staging2,friends,sample,vito |
| users_groups             | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2                                     |
| users_meta               | weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live                                     |
+--------------------------+-------------------------------------------------------------------------------------------------------------------+
65 rows in set (0.91 sec)

mysql>

Give it a try!
