NoSQL stands for "Not only SQL" and usually means that the database is not a relational database, the kind that has been dominant for the last few decades.
The main reason NoSQL has become so popular in the last few years is that a relational database is no longer easy to use once it outgrows a single server. In other words, relational databases don't scale out very well in a distributed system. All of the big sites that you mentioned (Google, Yahoo, Facebook and Amazon; I don't know much about Digg) have lots of data and store it in distributed systems for several reasons. It could be that the data doesn't fit on one server, or that there are requirements for high availability.
CAP Theorem
The properties of a distributed system can be described by the CAP Theorem. You can have at most two of the following three properties:
- Consistency
- Availability
- tolerance to network Partitioning
Amazon Dynamo uses Eventual Consistency to come close to getting all three properties. The paper "Dynamo: Amazon’s Highly Available Key-value Store" is worth reading when learning about NoSQL databases and distributed systems. Amazon Dynamo has the A and P properties.
Google takes a different approach with BigTable, which favors strong consistency over availability and is usually classified as having the C and P properties.
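To make the idea of eventual consistency a bit more concrete, here is a minimal toy sketch. It is not how Dynamo is implemented: the Dynamo paper describes vector clocks and quorums, whereas this sketch assumes a much simpler "last write wins" rule based on timestamps. The `Replica` class and the `sync` function are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica that stores key -> (timestamp, value)."""
    name: str
    store: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Accept the write locally; keep it only if it is newer than what we have.
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Anti-entropy pass: both replicas end up with the newest entry per key."""
    for key in set(a.store) | set(b.store):
        candidates = [r.store[key] for r in (a, b) if key in r.store]
        timestamp, value = max(candidates, key=lambda entry: entry[0])
        a.write(key, value, timestamp)
        b.write(key, value, timestamp)


r1, r2 = Replica("r1"), Replica("r2")
r1.write("cart", ["book"], timestamp=1)
r2.write("cart", ["book", "pen"], timestamp=2)  # r1 has not seen this yet

stale = r1.read("cart")          # r1 still answers, but with old data
sync(r1, r2)                     # replicas exchange state later
converged = (r1.read("cart"), r2.read("cart"))
```

Both replicas stay available for reads and writes while they disagree, and they agree again only after `sync` has run. That temporary disagreement is the price paid for keeping the A and P properties.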
Other NoSQL databases
As I wrote in the beginning, there are many other kinds of NoSQL databases that are designed for different requirements, e.g. graph databases like Neo4j, document databases like CouchDB and multi-model / object databases like OrientDB.
Finally, I would like to say that relational databases will remain popular. They are very flexible and maintainable. But they are not always the best choice.
A primary key is simply the value that we have determined to be of utmost importance in a record. Whether that key is a signed int, an unsigned int, a string, a blob (within limits) or a UUID (or whatever name it goes by today), the fact still stands that it is a key, and that it is the thing of utmost importance.
Since we're not constrained to positive numbers for our keys, consider that a signed 32-bit int only goes up to ~2 billion, whereas an unsigned 32-bit int goes up to ~4 billion. But there's nothing wrong with using a signed int, setting the initial value to ~ -2 billion and using an increment of one. After ~2 billion records you'll hit zero, and then you'll continue up to ~2 billion.
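The arithmetic behind those numbers can be checked with a quick sketch. It assumes 32-bit integers, which is where the ~2 billion and ~4 billion figures come from; the variable names are only for illustration.

```python
BITS = 32  # assumption: a 32-bit integer column

signed_min = -(2 ** (BITS - 1))      # -2,147,483,648
signed_max = 2 ** (BITS - 1) - 1     #  2,147,483,647
unsigned_max = 2 ** BITS - 1         #  4,294,967,295

# Number of distinct keys available under each scheme.
positive_only_keys = signed_max                  # keys 1 .. signed_max
full_range_keys = signed_max - signed_min + 1    # keys signed_min .. signed_max
unsigned_keys = unsigned_max + 1                 # keys 0 .. unsigned_max
```

Starting a signed key at its minimum value gives exactly as many distinct keys as an unsigned column of the same width, roughly twice what you get if you only use the positive half.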
As to why it would be helpful to have "negative keys" in a table, that's the same question as "why is it helpful to have keys in a table". The "value" of a key has no impact on its status as a key. A key is a key is a key.
What is important is whether the key is valid.
As to why it would be useful to allow negative keys, I can suggest some reasons:
- What if you wanted to indicate returns in a sales system as negative sales order numbers that matched the positive sales order numbers, thus making correlation easy? (This is naive and poorly designed, but it would work in a "spreadsheet" sense.)
- What if you wanted to have a users table and indicate that the users with negative numbers were system controlled? (SO does this very thing for chat feed users.)
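As a rough illustration of the users-table idea, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The `users` table, its columns and the sample names are hypothetical and only serve to show that negative primary keys are accepted and can carry whatever meaning you assign to them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Negative ids are reserved (by our own convention) for system-controlled users.
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(-2, "feed-bot"), (-1, "system"), (1, "alice"), (2, "bob")],
)


def system_users(connection):
    """Return names of users whose key is negative."""
    rows = connection.execute("SELECT name FROM users WHERE id < 0 ORDER BY id")
    return [name for (name,) in rows]


def regular_users(connection):
    """Return names of users whose key is positive."""
    rows = connection.execute("SELECT name FROM users WHERE id > 0 ORDER BY id")
    return [name for (name,) in rows]
```

The database treats all four rows the same way; the split between "system" and "regular" exists only because the queries give the sign a meaning.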
I could go on, but really the only reason why the number being negative is of importance is if you or I assign importance to it. Aside from that, there is no great reason for the value of a key to have any bearing on the key itself.
You haven't really given us much info about what this data is going to be used for. I mean, you have said what data is going to be stored, but what are you going to do with it?
If your purpose is storing the data and then reporting on it, then I think you're looking in the wrong place. A simple MySQL or other SQL database would do just fine, and the reporting tools are readily available.
However, if you're going to be linking to something such as a web or mobile application where the data is constantly being changed by multiple users (all accessing the same database stored in the cloud), then Firebase is the way to go.
So, your pros and cons:
Pros:
Cons:
Note: I included "Who hosts the data" in both the pros and cons. That's because you never told us how much data you were storing and who was going to be accessing it.