I'm looking for some tips to find a nice solution for my problem:
Let's say I have a table, where days of vacation of employees are stored. e.g.
CREATE TABLE test_dates (
id NUMBER NOT NULL,
holidaydate DATE NOT NULL,
holidaycomment VARCHAR2(255) NULL
);
INSERT INTO test_dates VALUES (1, To_Date('17.08.2015','dd.mm.rrrr'), 'Trip to Berlin');
INSERT INTO test_dates VALUES (2, To_Date('18.08.2015','dd.mm.rrrr'), 'Trip to Berlin');
INSERT INTO test_dates VALUES (3, To_Date('19.08.2015','dd.mm.rrrr'), 'Trip to Berlin');
INSERT INTO test_dates VALUES (4, To_Date('20.08.2015','dd.mm.rrrr'), 'Trip to Berlin');
INSERT INTO test_dates VALUES (5, To_Date('21.08.2015','dd.mm.rrrr'), 'Stay at home');
INSERT INTO test_dates VALUES (6, To_Date('03.09.2015','dd.mm.rrrr'), 'Trip to London');
INSERT INTO test_dates VALUES (7, To_Date('04.09.2015','dd.mm.rrrr'), 'Trip to London');
INSERT INTO test_dates VALUES (8, To_Date('07.09.2015','dd.mm.rrrr'), 'Trip to London');
INSERT INTO test_dates VALUES (9, To_Date('08.09.2015','dd.mm.rrrr'), 'Trip to London');
INSERT INTO test_dates VALUES (10, To_Date('17.09.2015','dd.mm.rrrr'), NULL);
SELECT * FROM test_dates ORDER BY holidaydate;
Fine. Now I want to query this table to get the whole period of their holidays.
SELECT
*
FROM (
SELECT
connect_by_root(id) AS id,
connect_by_root(holidaycomment) AS holidaycomment,
connect_by_root(holidaydate) AS date_from,
Decode(connect_by_root(holidaydate), holidaydate, NULL, holidaydate) AS date_to,
CONNECT_BY_ISLEAF AS is_leaf,
LEVEL AS days
FROM test_dates
CONNECT BY PRIOR holidaydate = holidaydate - 1 AND (PRIOR holidaycomment = holidaycomment OR (PRIOR holidaycomment IS NULL AND holidaycomment IS NULL))
)
WHERE is_leaf = 1
ORDER BY date_from;
That query gave me 10 rows, a way too much (August 17th to 20th, August 18th to 20th and so on). So I'm trying to select the starting dates:
SELECT
id, holidaycomment, date_from, date_to, days
FROM (
SELECT
connect_by_root(id) AS id,
connect_by_root(holidaycomment) AS holidaycomment,
connect_by_root(holidaydate) AS date_from,
Decode(connect_by_root(holidaydate), holidaydate, NULL, holidaydate) AS date_to,
CONNECT_BY_ISLEAF AS is_leaf,
LEVEL AS days
FROM test_dates
-- NEW Code --
START WITH id IN (
SELECT t1.id
FROM test_dates t1
LEFT JOIN test_dates t2 ON t1.holidaydate - 1 = t2.holidaydate AND (t1.holidaycomment = t2.holidaycomment OR (t1.holidaycomment IS NULL AND t2.holidaycomment IS NULL))
WHERE t2.id IS NULL
)
-- NEW Code --
CONNECT BY PRIOR holidaydate = holidaydate - 1 AND (PRIOR holidaycomment = holidaycomment OR (PRIOR holidaycomment IS NULL AND holidaycomment IS NULL))
)
WHERE is_leaf = 1
ORDER BY date_from;
ID HOLIDAYCOMMENT DATE_FROM DATE_TO DAYS ------------------------------------------------------ 1 Trip to Berlin 17.08.2015 20.08.2015 4 5 Stay at home 21.08.2015 NULL 1 6 Trip to London 03.09.2015 04.09.2015 2 8 Trip to London 07.09.2015 08.09.2015 2 10 NULL 17.09.2015 NULL 1
It works, but it seems to be too complicated for me. Is there a better way for select the date_from?
And my last problem: My/The Application stores no information on weekend (See Ids 6 and 8). So I get 2 rows for one holiday.
Anyone an idea to join these rows with a query and no stored function? I'm looking for this result:
ID HOLIDAYCOMMENT DATE_FROM DATE_TO DAYS ------------------------------------------------------ 1 Trip to Berlin 17.08.2015 20.08.2015 4 5 Stay at home 21.08.2015 NULL 1 6 Trip to London 03.09.2015 08.09.2015 6 10 NULL 17.09.2015 NULL 1
Best Answer
This is my attempt at solving this "groups and islands" problem, with comments in-line:
SQLFiddle
I wrote this as my first Oracle query for about 20 years. The translation above is from my native SQL Server.