I have a table like this one, that shows payments from each client, for different dates:
| stream_datetime | customer_id | order_status | rn |
|:----------------:|:-----------:|:------------:|:--:|
| 04/06/20 11:19AM | 1 | completed | 1 |
| 05/06/20 10:54AM | 1 | completed | 2 |
| 06/06/20 10:59AM | 1 | completed | 3 |
| 08/06/20 09:27AM | 2 | failed | 1 |
| 09/06/20 11:02AM | 2 | failed | 2 |
| 01/11/20 05:59PM | 3 | completed | 1 |
| 02/11/20 05:59PM | 3 | completed | 2 |
| 03/11/20 10:01AM | 3 | cancelled | 3 |
| 04/11/20 09:20AM | 3 | completed | 4 |
| 05/11/20 10:25AM | 3 | completed | 5 |
| 01/13/20 03:29PM | 4 | completed | 1 |
| 02/13/20 03:29PM | 4 | completed | 2 |
| 03/13/20 03:29PM | 4 | cancelled | 3 |
| 04/13/20 03:29PM | 4 | completed | 4 |
| 05/13/20 03:29PM | 4 | completed | 5 |
| 06/13/20 03:29PM | 4 | completed | 6 |
| 07/13/20 03:29PM | 4 | completed | 7 |
| 08/13/20 03:29PM | 4 | cancelled | 8 |
| 06/20/20 03:29PM | 5 | failed | 1 |
| 07/20/20 03:29PM | 5 | completed | 2 |
| 08/20/20 03:29PM | 5 | completed | 3 |
| 09/20/20 03:29PM | 5 | failed | 4 |
| 10/20/20 03:29PM | 5 | completed | 5 |
I want to calculate the difference of days a client canceled
his plan.
The challenge here is the fact that the client can cancel more than one time, so client 4
must
be counted as a churned
client two times, but client 3
would be counted as a churned
client
only one time.
I want to consider only clients that has an order_status = completed
followed by (not necessarily in the next month) an order_status = cancelled
.
I want too to create a column called purchase_day
that keeps the day of the payment.
Obs.: the column rn
means row number for the specific client.
EDITED:
Sorry. I made some mistakes and writing the question.
Maybe the order_status = cancelled
come first than order_status = completed
. This is due a mistake in the business, but it can happen. If this happen, so we cannot consider as a churned client.
So this is my expected result (now this is ok):
| purchase_day | customer_id | lifetime|
|:----------------:|:-----------:|:-------:|
| 01/11/20 05:59PM | 3 | 60 |
| 01/13/20 03:29PM | 4 | 60 |
| 04/13/20 03:29PM | 4 | 122 |
As you can see:
- client
1
never cancelled (so he doesn't need to be in the result) - client
2
never cancelled (so he doesn't need to be in the result) - client
3
cancelled one time (his lifetime is equal 60 days) - client
4
cancelled two times (one lifetime of 60 days and another lifetime of 122 days) - client
5
never cancelled (so he doesn't need to be in the result)
Best Answer
Okay, had to do this in three steps:
stream_datetime
and calculate the the difference betweenorder_status
was 'cancelled'.Results in:
You can test with this Fiddle.
I'd recommend running the subqueries by themselves to get a feel for how each step works and so you can be confident in the results.
Ok, this is easy enough to fix given how things are set up. If the cancel has been entered erroneously,
purchase_day
will be NULL (we already ignore instances with two consecutive statuses), so we can just change the finalWHERE
to:Updated fiddle for you to test.