MySQL – Query to Get Only One Result per Field

MySQL

I'm developing with CodeIgniter and I have the following problem:

Let's suposse that I have the following table:

| id |   employee   |     job     |    date    |
|----+--------------+-------------+------------|
|  1 |  John        | sysadmin    | 2017-12-01 |
|  2 |  Charles     | programmer  | 2017-11-24 |
|  3 |  Peter       | programmer  | 2017-11-29 |
|  4 |  Chris       | g.designer  | 2017-06-30 |
|  5 |  Paul        | web devel.  | 2017-10-08 |
|  6 |  Maria       | g.designer  | 2017-11-05 |
|  7 |  Abigail     | programmer  | 2017-09-24 |

I want to make a query that selects all fields, showing only 3 rows where the employees won't repeat the same job and the results must be ordered by date desc, so it would be:

| id |   employee   |     job     |    date    |
|----+--------------+-------------+------------|
|  1 |  John        | sysadmin    | 2017-12-01 |
|  3 |  Peter       | programmer  | 2017-11-29 |
|  6 |  Maria       | g.designer  | 2017-11-05 |

I have found the word DISTINCT, but after many tries, I couldn't make it to work as expected, so I made the following workaround:

for($i=0; $i<$limit; $i++) 
{
    $this->db->where('employee NOT IN("'. implode($employees, '", "') .'")');
    $this->db->order_by('date DESC');
    $q = $this->db->get('employees', 1);
    $r = !empty($q) ? $q->result() : 0;
    $employees[] = $r[0]->employee;
    $result[] = $r[0];
}

Which makes the following query and stack the result into an array

SELECT * FROM employees WHERE employee NOT IN (implode($employees)) ORDER BY date DESC LIMIT 1

it works perfectly, but quite dirty, since if I want 3 rows, It will make 3 queries, that is not so bad, but when I change it to 18 rows, it takes too long to load the webpage.

I have tried the functions group_by() and distinct() from CodeIgniter, but it neither worked as I expected.
Any way to achieve it with only 1 query?

Best Answer

Assumptions:

  • if multiple employees have the same job and date, display the one with the lowest id
  • an entry for web devel. was accidentally left out of the sample desired output

Sample data:

create table employees
(id          int
,employee    varchar(30)
,job         varchar(30)
,`date`      date);

insert into employees values
(1,'John',   'sysadmin',  '2017-12-01'),
(2,'Charles','programmer','2017-11-24'),
(3,'Peter',  'programmer','2017-11-29'),
(4,'Chris',  'g.designer','2017-06-30'),
(5,'Paul',   'web devel.','2017-10-08'),
(6,'Maria',  'g.designer','2017-11-05'),  <=== same job/date as 8/Bob
(7,'Abigail','programmer','2017-09-24'),
(8,'Bob'    ,'g.designer','2017-11-05');  <=== same job/date as 6/Maria

We'll use a derived table to order the data by job, date and id, and then with the help of some variables (@jname, @jcount) we'll find a single employee for each unique job who has the max(date):

select e.id,
       e.employee,
       e.job,
       e.`date`

      /* initialize our variables

          @jname  = keep track of previous job name
          @jcount = keep count of # records with same job name */

from  (select @jname  := '', 
              @jcount := 0 ) v

join  (select /* if we've got a new job name, set @jcount = 1, else increment by 1,
                 and since we're ordering by 'date desc', @jcount = 1 = max(date)  */

              @jcount := if(job != @jname, 1, @jcount + 1) as jcount,

              /* save current job name */
              @jname  := job,

              id,
              employee,
              job,
              `date`

       from   employees

       order by job,
                `date` desc,
                id) e           /* determine which employee to display if tie for job/max(date) */

where  e.jcount = 1  /* only display our records where jcount = 1 = max(date) for a given job name */

order by id;

id employee job        date
-- -------- ---------- ----------
1  John     sysadmin   2017-12-01
3  Peter    programmer 2017-11-29
5  Paul     web devel. 2017-10-08
6  Maria    g.designer 2017-11-05

NOTE: While Bob and Maria have the same job/date values (g.designer/2017-11-05), Maria is displayed because she has the lowest id value.

Here's a db-fiddle