Mysql – What could be causing strange query timeouts between PHP and MySQL

MySQLperformance

I am the senior developer on a Software-as-a-Service application used by many different customers. Our software runs on a cluster of Apache / PHP application servers, powered by a MySQL backend. On one particular instance of the software, the PHP code to query the list of category names is timing out when the customer has more than 29 categories. I know this makes no sense; there is nothing special about the number 30 that would break this and other customers have a lot more than 30 categories, however, the problem is 100% reproducible when this one installation has 30 or more categories and goes away when there's less than 30 categories.

The table in question is:

CREATE TABLE IF NOT EXISTS `categories` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `name` varchar(64) NOT NULL,
  `title` varchar(128) NOT NULL,
  `parent` int(10) unsigned NOT NULL,
  `keywords` varchar(255) NOT NULL,
  `description` text NOT NULL,
  `status` enum('Active','Inactive','_Deleted','_New') NOT NULL default 'Active',
  `style` enum('_Unknown') default NULL COMMENT 'Autoenum;',
  `order` smallint(5) unsigned NOT NULL,
  `created_at` datetime NOT NULL,
  `modified_at` datetime default NULL,
  PRIMARY KEY  (`id`),
  KEY `name` (`name`),
  KEY `parent` (`parent`),
  KEY `created_at` (`created_at`),
  KEY `modified_at` (`modified_at`),
  KEY `status` (`status`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 COMMENT='R2' AUTO_INCREMENT=33 ;

The code in question recursively queries the table to fetch all the categories. It issues a

SELECT * FROM `categories` WHERE `parent`=0 ORDER BY `order`,`name`

And then repeats this query for each row returned, but using WHERE parent=$category_id each time. (I am sure this procedure could be improved, but that's probably another question)

As far as I can tell, the following query is hanging forever:

SELECT * FROM `categories` WHERE `parent`=22 ORDER BY `order`,`name`

I can execute this query in the mysql client on the server perfectly fine, and I can execute it in PHPMyAdmin without problems as well.

Note that it's not that specific query which is the problem. If I DELETE FROM categories WHERE id=22 then a different query similar to the one above will then hang. Also, the query above returns zero rows when I run it manually.

I suspected that the table may be corrupt, and I tried REPAIR TABLE and OPTIMIZE TABLE but nether of these reported problems nor solved the issue. I dropped the table and recreated, but the problem returned. This is exactly the same table structure and PHP code other customers are using with no issues for anyone else, including customers who have far more than 30 categories.

The PHP code is not recursing forever. (This is not and infinite loop)

MySQL server is running CentOS linux with mysqld Ver 5.0.92-community for pc-linux-gnu on i686 (MySQL Community Edition (GPL))

Load on the MySQL server is low: load average: 0.58, 0.75, 0.73, Cpu(s): 4.6%us, 2.9%sy, 0.0%ni, 92.2%id, 0.0%wa, 0.0%hi, 0.3%si, 0.0%st. Negligible swap being used (448k)

How can I troubleshoot this issue? Any suggestions as to what might be going on?

UPDATE: I TRUNCEed the table and inserted 30 rows of dummy data:

INSERT INTO `categories` (`id`, `name`, `title`, `parent`, `keywords`, `description`, `status`, `style`, `order`, `created_at`, `modified_at`) VALUES
(1, 'New Category', '', 0, '', '', 'Inactive', NULL, 1, '2011-10-25 12:06:30', '2011-10-25 12:06:34'),
(2, 'New Category', '', 0, '', '', 'Inactive', NULL, 2, '2011-10-25 12:06:39', '2011-10-25 12:06:40'),
(3, 'New Category', '', 0, '', '', 'Inactive', NULL, 3, '2011-10-25 12:06:41', '2011-10-25 12:06:42'),
(4, 'New Category', '', 0, '', '', 'Inactive', NULL, 4, '2011-10-25 12:06:46', '2011-10-25 12:06:47'),
(5, 'New Category', '', 0, '', '', 'Inactive', NULL, 5, '2011-10-25 12:06:49', NULL),
(6, 'New Category', '', 0, '', '', 'Inactive', NULL, 6, '2011-10-25 12:06:51', '2011-10-25 12:06:52'),
(7, 'New Category', '', 0, '', '', 'Inactive', NULL, 7, '2011-10-25 12:06:53', '2011-10-25 12:06:54'),
(8, 'New Category', '', 0, '', '', 'Inactive', NULL, 8, '2011-10-25 12:06:56', '2011-10-25 12:06:57'),
(9, 'New Category', '', 0, '', '', 'Inactive', NULL, 9, '2011-10-25 12:06:59', '2011-10-25 12:06:59'),
(10, 'New Category', '', 0, '', '', 'Inactive', NULL, 10, '2011-10-25 12:07:01', '2011-10-25 12:07:01'),
(11, 'New Category', '', 0, '', '', 'Inactive', NULL, 11, '2011-10-25 12:07:03', '2011-10-25 12:07:03'),
(12, 'New Category', '', 0, '', '', 'Inactive', NULL, 12, '2011-10-25 12:07:05', '2011-10-25 12:07:05'),
(13, 'New Category', '', 0, '', '', 'Inactive', NULL, 13, '2011-10-25 12:07:06', '2011-10-25 12:07:07'),
(14, 'New Category', '', 0, '', '', 'Inactive', NULL, 14, '2011-10-25 12:07:08', '2011-10-25 12:07:09'),
(15, 'New Category', '', 0, '', '', 'Inactive', NULL, 15, '2011-10-25 12:07:11', '2011-10-25 12:07:12'),
(16, 'New Category', '', 0, '', '', 'Inactive', NULL, 16, '2011-10-25 12:07:13', '2011-10-25 12:07:14'),
(17, 'New Category', '', 0, '', '', 'Inactive', NULL, 17, '2011-10-25 12:09:41', '2011-10-25 12:09:42'),
(18, 'New Category', '', 0, '', '', 'Inactive', NULL, 18, '2011-10-25 12:09:47', NULL),
(19, 'New Category', '', 0, '', '', 'Inactive', NULL, 19, '2011-10-25 12:09:48', NULL),
(20, 'New Category', '', 0, '', '', 'Inactive', NULL, 20, '2011-10-25 12:09:48', NULL),
(21, 'New Category', '', 0, '', '', 'Inactive', NULL, 21, '2011-10-25 12:09:49', NULL),
(22, 'New Category', '', 0, '', '', 'Inactive', NULL, 22, '2011-10-25 12:09:50', NULL),
(23, 'New Category', '', 0, '', '', 'Inactive', NULL, 23, '2011-10-25 12:09:51', NULL),
(24, 'New Category', '', 0, '', '', 'Inactive', NULL, 24, '2011-10-25 12:09:51', NULL),
(25, 'New Category', '', 0, '', '', 'Inactive', NULL, 25, '2011-10-25 12:09:52', NULL),
(26, 'New Category', '', 0, '', '', 'Inactive', NULL, 26, '2011-10-25 12:09:53', NULL),
(27, 'New Category', '', 0, '', '', 'Inactive', NULL, 27, '2011-10-25 12:09:54', NULL),
(28, 'New Category', '', 0, '', '', 'Inactive', NULL, 28, '2011-10-25 12:09:55', NULL),
(29, 'New Category', '', 0, '', '', 'Inactive', NULL, 29, '2011-10-25 12:09:56', NULL),
(30, 'New Category', '', 0, '', '', 'Inactive', NULL, 30, '2011-10-25 12:09:57', NULL);

No parents at all, all categories are on the top level. problem is still there. The following query, executed by PHP, fails:

SELECT * FROM `categories` WHERE `parent`=22 ORDER BY `order`,`name`

Here's the EXPLAIN:

mysql> EXPLAIN SELECT * FROM `categories` WHERE `parent`=22 ORDER BY `order`,`name`;
+----+-------------+------------+------+---------------+--------+---------+-------+------+-----------------------------+
| id | select_type | table      | type | possible_keys | key    | key_len | ref   | rows | Extra                       |
+----+-------------+------------+------+---------------+--------+---------+-------+------+-----------------------------+
|  1 | SIMPLE      | categories | ref  | parent        | parent | 4       | const |    1 | Using where; Using filesort | 
+----+-------------+------------+------+---------------+--------+---------+-------+------+-----------------------------+
1 row in set (0.00 sec)

UPDATE #2: I have now tried all of the following:

  1. I copied this table and data to a different site with the same software. The problem did not follow the table. It seems to be confined to this one database.
  2. I changed the index as gbn's answer suggested. The problem remained.
  3. I dropped the table and recreated as an InnoDB table and inserted the same 30 test rows above. The problem remained.

I suspect it must be something with this database…

UPDATE #3: I completely dropped the database and recreated it under a new name, importing her data. The problem remains.

I have found that the actual PHP statement which hangs is a call to mysql_query(). Statements after this never get executed.

While that call hangs, MySQL lists the thread as sleeping!

mysql> show full processlist;
+-------+------------------+-----------------------------+----------------------+---------+------+-------+-----------------------+
| Id    | User             | Host                        | db                   | Command | Time | State | Info                  |
+-------+------------------+-----------------------------+----------------------+---------+------+-------+-----------------------+
|  5560 | root             | localhost                   | problem_db           | Query   |    0 | NULL  | show full processlist |  
                          ----- many rows which have no relevancy; only rows from this customer's app are shown ------
| 16341 | shared_db        | oak01.sitepalette.com:53237 | shared_db            | Sleep   |  308 |       | NULL                  | 
| 16342 | problem_db       | oak01.sitepalette.com:60716 | problem_db           | Sleep   |  307 |       | NULL                  | 
| 16344 | shared_db        | oak01.sitepalette.com:53241 | shared_db            | Sleep   |  308 |       | NULL                  | 
| 16346 | problem_db       | oak01.sitepalette.com:60720 | problem_db           | Sleep   |  308 |       | NULL                  |  
+-------+------------------+-----------------------------+----------------------+---------+------+-------+-----------------------+

UPDATE #4: I have narrowed it down to the combination of two tables, the categories table detailed above and a media_images table with 556 rows. If the media_images table contains less than 556 rows, or the categories table contains less than 30 rows, the problem goes away. It's like it's some sort of MySQL limit I'm hitting here…

UPDATE #5: I just tried moving the database to a different MySQL server altogether and the problem went away… So it's related to my production database server…

UPDATE #6: Here's the relevant PHP code which hangs each time:

    public function find($type,$conditions='',$order='',$limit='')
    {
            if($this->_link == self::AUTO_LINK)
                    $this->_link = DFStdLib::database_connect();

            if(is_resource($this->_link))
            {
                    $q = "SELECT ".($type==_COUNT?'COUNT(*)':'*')." FROM `{$this->_table}`";
                    if($conditions)
                    {
                            $q .= " WHERE $conditions";
                    }
                    if($order)
                    {
                            $q .= " ORDER BY $order";
                    }
                    if($limit)
                    {
                            $q .= " LIMIT $limit";
                    }

                    switch($type)
                    {
                            case _ALL:
                                    DFSkel::log(DFSkel::LOG_DEBUG,"mysql_query($q,$this->_link);");
                                    $res = @mysql_query($q,$this->_link);
                                    DFSkel::log(DFSkel::LOG_DEBUG,"res = $res");

This code's in production and works fine on all other installs. Just on one install, it hangs at $res = @mysql_query($q,$this->_link);. I know because I see the mysql_query in the debug log, and not the res =, and when I strace the PHP process, it's hung at read(

UPDATE #whatever-it-is-I-hate-this-&(#^&-issue! This has now started happening to two customers of mine. I just fired up tcpdump and it looks like the response from MySQL is never sent completely. The TCP stream just seems to hang before the full MySQL response can be sent. (I am still investigating however)

UPDATE #I-have-gone-completly-crazy-but-it-works-now-kinda: Ok, this makes no sense, but I have found a solution. If I assign a second IP address to the MySQL server's eth2 interface, and use one IP for NFS traffic and the second IP for MySQL, then the problem goes away. It's like I'm somehow… overloading the Ip address if both NFS+MySQL traffic both go to that IP. But that makes zero sense because you can't "overload" an IP address. Saturating an interface sure, but it's the same interface.

Any idea what in the heck is going on here? This is probably a unix.SE or ServerFault question at this point… (At least it works now…)

UPDATE #why-oh-why: This issue is still occurring. It's started happening even using two different IPs. I can keep creating new private IPs, but clearly something's wrong.

Best Answer

For general profiling of what exactly is going on in the query plan, you can try PROFILING

It will basically help you determine where the hangup is.

Of course, it only works if you've compiled MySQL with enable-profiling.