I've got this query:
SELECT m.vlanID, v.vlan_name, m.interfaceName, count(m.macAddress) as mac, n.nd_name, p.interfaceDescription, p.interfacePortType
FROM macs m, network_devices n, vlans v, ports p
WHERE n.nd_ip_address = m.ipAddr
AND m.vlanID = v.vlan_id
AND p.vlanID = m.vlanID
GROUP BY m.vlanID, v.vlan_name, m.interfaceName, n.nd_name, p.interfaceDescription, p.interfacePortType
ORDER BY m.vlanID
Each of these tables have approximately 50.000 records. The query is taking north of 30min. Can I improve it to make it more efficient?
EDIT: Table definitions:
macs table
ipAddr, vlanID, macAddress, interfaceName
network_devices table
network_device_id, nd_location, nd_name, nd_ip_address, etc
vlans table
vlan_id, vlan_name
ports table
interfaceName, interfacePortType, nodeIP, interfaceDescription
I've limited to mostly what's required from the query.
There's no indexes configured atm nor execution plan as this is not an application per se. This is just a basic script to be run occasionally for internal purposes only and the data will become obsolete within a few days, so we just need to query temporarily to get some answer while we upgrade our network.
Thanks
Best Answer
You can try this:
A few things about the changes:
The cartesian join to inner join switch is purely cosmetic, unless the mysql implementation you're using doesn't support the cartesian inner join syntax, which would be very odd.
Removing the Order By clause should provide a performance bump, but depending on the way you're using the results may not be useful.
If
m.macAddress
is sometimes null,count(1)
!=count(m.macAddress)
These small optimizations stated, there is not a whole lot to optimize about this query itself. There is something about the context of the query that is causing the exceptional runtime.
Indexes on
n.nd_ip_address
,m.ipAddr
,m.vlanID
,v.vlan_id
, andp.vlanID
would certainly help substantially. If you have to run the query multiple times, it would be well worth the time invested in creating the indexes even on temporary data.It's even likely that re-creating the indexes every time the query runs would cause the overall runtime to be much lower.