I'm "about" to implement a feature on one of my apps which requires storing a lot of data (PlayLists).
Assume the following scenario:
- A user can have multiple playlists
- These playlists have up to 500k entries (maybe even more)
- I must be able to get, remove, re-sort or query the whole playlist or by using indexes
- Currently a playlist for each user; but
- I'm about to implement a feature that will allow users to have more than 1 playlist.
- About 3500 entries per playlist. Some can go as high as 590k.
Playlist entries are represented by the following class:
public class Item {
private Long owner; // User
private String data;
private String name;
private String author;
private int index; // Index of this item on the list
}
TL;DR: List< Item >
for every user (more than 1 per user) where this list can extend to 500k+ entries.
Currently, I have SQL-based storage for this, however, it's slow as heck, specially when I need to shift indexes – slow up to the point of my db driver thinking that the connection failed and leaked.
To explain "shift indexes": Let's say I'm removing an item at the 3rd index (i = 3
), this means that all other items after it need to shift one index down, so I can later on request index 3 and get the item that was previously index 4.
Currently, I decrement the column value Order
for that playlist for all elements after the one I deleted.
Current PlayList Items Table design
Id (Ordering/Index) | PlayList (Id) | Owner (User) | Name | Author | Data
Question
Which would be a better/more efficient way of storing this kind of "bulk" data, while still being able to query and modify it?
I'm ok with changing/implementing a new database software for this kind of data, since it's really big (I currently use Redis and MariaDB). However, I would love to avoid having to.
Best Answer
Hmmm... 1 person can have 590k tracks? Difficult to imagine, however, I have come up with a schema which might make a good starting point. You may want to have a playlist table for each user depending on how many users you have.
I have used PostgreSQL - you should be able to readily translate to another RDBMS - but, if you are going the Open Source route, I would recommend PostgreSQL.
I have also provided a minimal amount of DML (
INSERT
statements) which should let you see where I'm coming from.The playlist table is an example of an Associative Entity - otherwise known as a joining table or a many-to-many table - they are very useful.
Obviously, I don't have a real dataset, but I believe that the SQL updating a user's order by 1
should not be too onerous. Get back to us if you're still having problems. Note, that it should be within a single transaction!
Note also that you can put a lot of your application logic into the database creation script itself - various rules can be enforced without having to write a word of application code!