I am using PostgreSQL 9.6 and have following schema
CREATE TABLE activitylog (
sessionid text NOT NULL,
studentid integer NOT NULL,
activity text NOT NULL,
created timestamp without time zone DEFAULT now() NOT NULL
);
The schema is used to log visitors activity. Each visit is assigned a session id which is unique for the visit. Sessions ends when user logs out. Every user can log in and log out multiple times.
INSERT INTO activitylog VALUES
( 'abc', 1, 'logged in' , '2017-08-12T02:39:05.753088Z' ),
( 'abc', 1, 'page visited', '2017-08-12T02:49:05.753088Z' ),
( 'abc', 1, 'logged out' , '2017-08-12T02:59:05.753088Z' ),
( 'def', 1, 'logged in' , '2017-08-12T03:09:05.753088Z' ),
( 'def', 1, 'page visited', '2017-08-12T03:19:05.753088Z' ),
( 'def', 1, 'logged out' , '2017-08-12T03:29:05.753088Z' ),
( 'ghi', 2, 'logged in' , '2017-08-12T03:39:05.753088Z' ),
( 'ghi', 2, 'page visited', '2017-08-12T03:59:05.753088Z' ),
( 'ghi', 2, 'logged out' , '2017-08-12T03:59:05.753088Z' );
Required Result
I want to select data from the table such that I get result in following format
| sessionid | studentid | loggedin | loggedout | interval |
| ----------- | ----------- | -------------- | ----------- | ------------------ |
| abc | 1 | 2017-08-12T02:39:05.753088Z | 2017-08-12T02:59:05.753088Z | 20min |
| def | 1 | 2017-08-12T03:09:05.753088Z | 2017-08-12T03:29:05.753088Z | 20min |
| ghi | 2 | 2017-08-12T03:39:05.753088Z | 2017-08-12T03:59:05.753088Z | 20min |
Basically,
- Pivot table
- Add studentid column
- Calculate interval
What I have tried so far
I have successfuly piovoted the table
SELECT * FROM crosstab(
$$
select sessionid, activity, created from activitylog
where activity = 'logged in' OR activity = 'logged out'
ORDER BY 1, 2
$$
)
AS sessioninfo (sessionid text, loggedin timestamp, loggedout timestamp);
I tried to add studentid extra column using this query but it didn't give me desired result
SELECT * FROM crosstab(
$$
select sessionid, studentid, activity, created from
activitylog
where activity = 'logged in' OR activity = 'logged out'
ORDER BY 1, 2
$$,
$$
VALUES (1), (0)
$$
)
AS sessioninfo (sessionid text, studentid INTEGER,
loggedin timestamp, loggedout timestamp);
But this places null in logged in and logged out timestamps.
To sum it up
- What is the purpose of
VALUES
withcrosstab()
? To be honest I am not completely sure what purpose second parameter ofcrosstab()
serves and why in almost all examples it isVALUES
. - How do I add
studentid
to result? - How do I calculate interval between two columns that are result of
crosstab()
?
Best Answer
What you want is something like this,
Output:
It specifies the categories. It can also be a query but here there is no need for that and being explicit was even more terse. You're interested in two categories those are loggedin, and loggedout. Age being a function of those two categories is calculated with
CROSS JOIN LATERAL
.VALUES
itself is a SQL expression. We could have writtenSELECT * FROM VALUES () AS t(category)
but that's just more work.Just like in my demonstration.
We used
age
. However one caveat about this we should hit right now..Your schema is specified as
timestamp without time zone
. That's certainly a mistake. When that crosses DST, if ever, you're going to be in a world of pain. It'll show the time elapsed in the rollover.