MySQL – NoSQL or MySQL for storing API usage

Tags: elasticsearch, mongodb, mysql, nosql

I need to be able to track API usage in order to generate reports or charts later. I've been struggling to decide whether to use plain MySQL or something like MongoDB or Elasticsearch, which I have no experience with at all.

The records I will insert will be:

  • user_id (int)
  • request_time (datetime)
  • ip_address (varchar)
  • I will need to count total records per user_id for a given time period
  • I will need to grab all IPs for a given user_id for a given time period
  • I will be inserting around 400k records per day
  • I need to be sure this will be able to handle at least 1 million inserts per day

Inserts are sent to a queue server (Redis) that will do single inserts to whatever system we decide to use.
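This queue-then-insert flow can be sketched with Python's stdlib `queue` standing in for Redis (the names `track` and `drain` are illustrative, not from the original; a real worker would pop from a Redis list and run a parameterized INSERT against whichever store is chosen):

```python
import queue
from datetime import datetime, timezone

# Stand-in for the Redis list; a real worker would BRPOP from Redis instead.
jobs = queue.Queue()

def track(user_id, ip_address):
    """Producer side: the API layer pushes one usage record per request."""
    jobs.put({
        "user_id": user_id,
        "request_time": datetime.now(timezone.utc).isoformat(),
        "ip_address": ip_address,
    })

def drain(execute):
    """Consumer side: pop queued records and hand each one to an insert
    callback (in production, a parameterized INSERT). Returns the count."""
    inserted = 0
    while not jobs.empty():
        execute(jobs.get())
        inserted += 1
    return inserted

track(42, "203.0.113.7")
track(42, "203.0.113.8")
rows = []
n = drain(rows.append)
print(n)  # 2
```

Decoupling the API request path from the database write this way means a slow insert backend only grows the queue rather than slowing down API responses.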

What do you think will be the best solution for this scenario? It's worth mentioning that we are trying to find the least expensive way to do this.

Update: At a million records per day, using BIGINT for IPs, the data will take approx. 45 MB per day, making a total of about 1,350 MB monthly.
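The update's arithmetic can be checked directly (the ~45 bytes per row is an assumption implied by the stated 45 MB per million rows, presumably including row and index overhead):

```python
rows_per_day = 1_000_000
bytes_per_row = 45  # implied by ~45 MB per million rows (data + overhead)

daily_mb = rows_per_day * bytes_per_row / 1_000_000
monthly_mb = daily_mb * 30

print(daily_mb)    # 45.0
print(monthly_mb)  # 1350.0
```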

Best Answer

Two things.

  1. Use the tool you already know. This simplifies development, and reduces the desire to prematurely optimize.

  2. Fit your environment to the requirements. The hardware required to store 400,000 × 10-byte rows per day is vastly different from that for 400,000 × 10,000-byte rows.

Define your requirements by mocking up some sample data in MySQL, along with a sample of the queries you'll run. See how much space this sample takes up, and do the math to determine the requirements for your data over one month, a year, and so forth. This will give you a much better idea of the kind of hardware you'll need.

My answer talks about the requirements for storing billions of rows in SQL Server and shows some of the things to consider. The other answers on that question are also quite instructive. As I note in my answer, don't store an IP address as a varchar. Use an integer.
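The IP-as-integer advice is straightforward to apply: MySQL's INET_ATON()/INET_NTOA() do the conversion server-side, and application code can do the same with Python's stdlib `ipaddress` (the helper names below are illustrative):

```python
import ipaddress

def ip_to_int(ip: str) -> int:
    """Pack a dotted-quad IPv4 address into a 4-byte unsigned integer,
    the same mapping MySQL's INET_ATON uses."""
    return int(ipaddress.IPv4Address(ip))

def int_to_ip(n: int) -> str:
    """Inverse mapping, like MySQL's INET_NTOA."""
    return str(ipaddress.IPv4Address(n))

print(ip_to_int("192.168.0.1"))  # 3232235521
print(int_to_ip(3232235521))     # 192.168.0.1
```

An unsigned INT (4 bytes) is enough for IPv4, which is half the size of the BIGINT assumed in the question's update; if IPv6 addresses are possible, a VARBINARY(16) column is the usual choice instead.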