If you're testing with 50,000 rows in data_dynamic, I'd expect your inner join to return 50,000 rows. It seems to be returning about 10,000. There's a foreign key constraint—this seems unusual to me.
I loaded a bunch of random data, and ran this.
analyze data_dynamic;
analyze data_static;
explain analyze
select count(d.id) from data_static s inner join data_dynamic d on s.id = d.id;
And here's what my box returned. (PostgreSQL 9.0.2)
Aggregate (cost=14941.95..14941.96 rows=1 width=4) (actual time=477.633..477.633 rows=1 loops=1)
-> Merge Join (cost=8409.72..14816.94 rows=50000 width=4) (actual time=237.807..456.682 rows=50000 loops=1)
Merge Cond: (s.id = d.id)
-> Index Scan using data_static_pkey on data_static s (cost=0.00..91190.58 rows=3000000 width=4) (actual time=0.016..195.279 rows=350001 loops=1)
-> Index Scan using data_dynamic_pkey on data_dynamic d (cost=0.00..2342.23 rows=50000 width=4) (actual time=0.010..60.933 rows=50000 loops=1)
Total runtime: 477.706 ms
So it looks like the query planner is capable of coming up with a good plan for these tables and this quantity of data.
Does this mean your problem is either with hardware or with PostgreSQL configuration? I'm not sure it's that simple.
Later . . .
I was right; it's both simpler and not that simple.
Based on these snippets from PostgreSQL mailing lists, I'm going to say that the query planner simply doesn't yet support removing inner joins.
From the pgsql-performance mailing list, 02-Mar-2011
The planner doesn't currently make any deductions whatsoever from the
presence of a foreign key constraint; and even if it did, I'm not sure
that this would help it decide that a join order constraint could
safely be dropped.
And from pgsql-hackers, 12-Dec-2010
I'm not going to argue that careful analysis isn't needed before doing
something like this - and, in particular, if we ever get inner-join
removal, which I'm still hoping to do at some point, . . .
Your last query
mysql> select table_name from tables as a left join (select distinct table_name from tables as b) on a.table_name!=b.table_name;
ERROR 1248 (42000): Every derived table must have its own alias
fails because the as b
must be outside the parentheses like this
mysql> select table_name from tables as a left join (select distinct table_name from tables) as b on a.table_name!=b.table_name;
Your method 2 failed
mysql> select count(table_name) as num, table_name from tables group by table_name where num > 1;
because the group by table_name
must be last like this
mysql> select count(table_name) as num, table_name from tables where num > 1 group by table_name;
SUGGESTION
Perhaps you should start with a GROUP BY ... HAVING
query like this
SELECT table_name,COUNT(1) table_count
FROM information_schema.tables
WHERE table_schema NOT IN
('information_schema','performance_schema','mysql')
GROUP BY table_name HAVING COUNT(1) > 1;
This will definitely give all tables whose name appears in multiple databases
Form that as a subquery and join it to gather all databases the table appears in
SELECT
A.table_name,
GROUP_CONCAT(B.table_schema) TheTableAppearsInTheseDatabases
FROM
(
SELECT table_name,COUNT(1) table_count
FROM information_schema.tables
WHERE table_schema NOT IN
('information_schema','performance_schema','mysql')
GROUP BY table_name HAVING COUNT(1) > 1
) A INNER JOIN information_schema.tables B
USING (table_name) GROUP BY A.table_name;
Please notice that I use INNER JOIN
rather than LEFT JOIN
because it will really form a Cartesian product (2704 X 2704) and then perform comparisons.
I know this works because I tried it out in MySQL 5.5.12 on my Windows7 machine
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 33
Server version: 5.5.12-log MySQL Community Server (GPL)
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> SELECT
-> A.table_name,
-> GROUP_CONCAT(B.table_schema) TheTableAppearsInTheseDatabases
-> FROM
-> (
-> SELECT table_name,COUNT(1) table_count
-> FROM information_schema.tables
-> WHERE table_schema NOT IN
-> ('information_schema','performance_schema','mysql')
-> GROUP BY table_name HAVING COUNT(1) > 1
-> ) A INNER JOIN information_schema.tables B
-> USING (table_name) GROUP BY A.table_name;
+--------------------------+-------------------------------------------------------------------------------------------------------------------+
| table_name | TheTableAppearsInTheseDatabases |
+--------------------------+-------------------------------------------------------------------------------------------------------------------+
| a | junk,test,robottinosino |
| acl | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive |
| blocks | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| blog | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive |
| blog_category | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| blog_entrycat | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive |
| blog_meta | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_archive,weisci_jaws_staging2 |
| blog_trackback | weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging |
| calendar_events | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2 |
| calendar_meta | weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging |
| calendar_questions | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2 |
| calendar_tickets | weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging |
| calendar_transactions | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging,weisci_jaws_archive |
| captcha_complex | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2 |
| change_log | test,junk |
| chat_staff | test,junk |
| comments | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging,weisci_jaws_archive |
| donations_charities | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2 |
| donations_charities_meta | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| donations_donations | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| foo_reference1 | timpost1,timpost2 |
| foo_reference2 | timpost1,timpost2 |
| foo_reference3 | timpost1,timpost2 |
| groups | weisci_jaws_staging,weisci_jaws_archive,weisci_jaws_live,weisci_jaws_staging2 |
| ipvisitor | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| job_post | giannosfor,test |
| layout | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| listeners | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| mediamanager_files | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_staging |
| mediamanager_group | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| mediamanager_photos | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| mediamanager_video | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| menus | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| menus_groups | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| mytable | ryanzec,user1267617,cabita,dotancohen,johnlocke,neeraj,test,user391986,cool_cs,javier,mathieu |
| mytext | jakobud,newstuff |
| occupation_field | giannosfor,test |
| policy_agentblock | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| policy_ipblock | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| prova | veto,vito |
| registry | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| registry_bk | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| session | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| static_pages | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| static_pages_translation | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| t | preeti,rollup_test |
| t1 | abidibo,test |
| t2 | test,abidibo |
| t3 | test,abidibo |
| table1 | table_test,supercoolville,test |
| table2 | supercoolville,test,table_test |
| tags | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| tags_content | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| tbl_banner_position | weisci_jaws_staging2,weisci_jaws_live |
| tbl_banner_upload | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| tbl_global_banner | weisci_jaws_staging2,weisci_jaws_live |
| tms_authors | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| tms_repositories | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| tms_themes | weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging |
| updates | test,junk |
| url_aliases | weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2,weisci_jaws_archive |
| url_maps | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| users | weisci_jaws_archive,weisci_jaws_staging,giannosfor,veto,weisci_jaws_live,weisci_jaws_staging2,friends,sample,vito |
| users_groups | weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live,weisci_jaws_staging2 |
| users_meta | weisci_jaws_staging2,weisci_jaws_archive,weisci_jaws_staging,weisci_jaws_live |
+--------------------------+-------------------------------------------------------------------------------------------------------------------+
65 rows in set (0.91 sec)
mysql>
Give it a Try !!!
Best Answer
The first query will be faster this way:
It needs an index (
PRIMARY KEY
?) on users.id. It is faster because the subquery quits when finds one row, rather than counting all of them.The query part cannot use that technique. But I think it is much simpler than what you have:
Notice that
users
does not seem to be of any use.entries
will needINDEX(profile_id)
.I assume
tag_user
is a many-to-many mapping table? Have you followed all the advice in this? The advice there may provide further performance boost.If you need to discuss this further, please provide
SHOW CREATE TABLE
.