MongoDB – Which MongoDB database layout would fit better?

mongodb nosql performance

I'm currently writing an application using the MEAN stack, pretty much only for fun and because MEAN is "cool" at the moment.

Now I'm thinking about the database layout. The data to be stored represents a music collection, and I have two ideas. Because I'm quite new to NoSQL, I don't know which one I should use, or whether both of them are rubbish.

First idea:

One big collection "artists", where each artist document has an array "albums" and each album in turn has an array "songs". With this layout I don't know what to do about playlists!

Second idea:

Four collections, "artists", "albums", "songs" and "playlists", all linked using the _id field – this is how I would do it in a traditional database like MySQL.

Which one would in your opinion fit better and – that's what I'm even more interested in – why?

Best Answer

I think your first idea is better. The second way, as you state, is how you would model the data in an RDBMS. If you're going to use MongoDB for fun, you might as well take advantage of the fact that it has a different data model, and structure your collections accordingly. While I'm sure performance is not going to be an issue at the scale of this project, keeping the data in a single document avoids the join-like extra lookups that separate collections require, and those can be expensive. It's commonly held that, within MongoDB environments, denormalization is faster (see here).
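As a rough sketch of what the single-document layout could look like, here is one artist with embedded albums and songs. Plain JavaScript objects stand in for MongoDB documents, and every field name and value (name, year, track, and so on) is made up for illustration rather than taken from your application.

```javascript
// Hypothetical document from an "artists" collection (first idea):
// albums are embedded in the artist, songs are embedded in each album.
const artist = {
  _id: 1,
  name: "Example Artist",
  albums: [
    {
      title: "First Album",
      year: 2001,
      songs: [
        { track: 1, title: "Opening Song" },
        { track: 2, title: "Second Song" }
      ]
    },
    {
      title: "Second Album",
      year: 2004,
      songs: [{ track: 1, title: "Another Song" }]
    }
  ]
};

// Everything is already inside the one document, so listing every song
// is a plain traversal; no second collection has to be consulted.
function songTitles(artistDoc) {
  return artistDoc.albums.flatMap(album =>
    album.songs.map(song => `${album.title}: ${song.title}`)
  );
}

console.log(songTitles(artist));
```

With four separate collections, the same listing would need the artist, its albums, and their songs to be fetched and matched up by _id.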

In the schema described in your first idea, playlists could all be subdocuments in a kind of catch-all artist document called "various". I imagine the same would be done in a relational implementation, with the artist table having a record for "various" to cover compilation albums and soundtracks, for example.
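A minimal sketch of such a catch-all document might look like the following. The "kind" field and the per-song "artist" field are my own assumptions, added so that playlists can be told apart from compilations and so that each track still records who performed it.

```javascript
// Hypothetical catch-all document in the "artists" collection.
// Playlists and compilations sit in the same "albums" array.
const various = {
  _id: "various",
  name: "various",
  albums: [
    {
      title: "Road Trip Playlist",
      kind: "playlist", // assumed marker field, not part of the question
      songs: [
        { track: 1, title: "Opening Song", artist: "Example Artist" },
        { track: 2, title: "Some Other Song", artist: "Another Artist" },
        { track: 3, title: "Second Song", artist: "Example Artist" }
      ]
    },
    {
      title: "Example Soundtrack",
      kind: "compilation",
      songs: [{ track: 1, title: "Main Theme", artist: "Another Artist" }]
    }
  ]
};

// Pick out only the playlists from the catch-all document.
function playlistsOf(doc) {
  return doc.albums.filter(album => album.kind === "playlist");
}

// Distinct artists appearing in one playlist, in order of first appearance.
function artistsIn(playlist) {
  return [...new Set(playlist.songs.map(song => song.artist))];
}

const [roadTrip] = playlistsOf(various);
console.log(roadTrip.title, artistsIn(roadTrip));
```

Each playlist entry repeats the song's title and artist instead of pointing at the original, which is the denormalization trade-off: reads are simple, but a renamed song has to be updated in more than one place.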

I have a JSFiddle here with a JSON representation of what a music collection might look like implemented as your first idea.

Of course, there is yet another way: a collection "albums" in which most documents have a key "artist". That key could either be a reference to the _id of a document in a collection "artists" or simply hold the name of the artist itself. In this implementation, playlists wouldn't have this key; alternatively, it could be an array of all the artists with tracks in the playlist.
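To make the variants of the "artist" key concrete, here is a small sketch with one album per variant. Again these are plain objects with invented values, and the helper artistNames is hypothetical; in a real schema you would probably settle on one variant rather than mixing them in a single collection.

```javascript
// Hypothetical "artists" and "albums" collections as in-memory arrays.
const artists = [{ _id: 1, name: "Example Artist" }];

const albums = [
  // Variant 1: "artist" references the _id of a document in "artists".
  { _id: 10, title: "First Album", artist: 1 },
  // Variant 2: "artist" just holds the artist's name.
  { _id: 11, title: "Other Album", artist: "Another Artist" },
  // Playlist, variant A: no "artist" key at all.
  { _id: 12, title: "Quiet Playlist" },
  // Playlist, variant B: an array of every artist with a track in it.
  { _id: 13, title: "Road Trip Playlist", artist: ["Example Artist", "Another Artist"] }
];

// Resolve whatever is stored under "artist" into a list of names.
function artistNames(album, artistsColl) {
  if (album.artist === undefined) return [];
  const values = Array.isArray(album.artist) ? album.artist : [album.artist];
  return values.map(value => {
    if (typeof value === "string") return value; // the name is stored directly
    const found = artistsColl.find(a => a._id === value); // follow the reference
    return found ? found.name : null;
  });
}

for (const album of albums) {
  console.log(album.title, artistNames(album, artists));
}
```

Only the reference variant needs a second lookup. Storing the name directly keeps reads to a single document, at the cost of repeating the name on every album.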