Querying a big database

database-design

I am PHP programmer with basic-intermediate knowledge on databases. I am working on a project where I need to verify a given phone number's presence in India's Do not call database. Following is the structure of the table,

Phone Number | Flag 1 | Flag 2 | Flag 3 | Flag 4 | Flag 5 | Flag 6 | Flag 7

here each Flag represents a category.

I would be provided a phone number and a category, or maybe phone number and multiple categories.

What would be the best way to optimize the table structure? Which will be best to use? Mysql or Oracle?

Should the table structure be normalized? How should the query be designed?

Response time is a key factor for the solution. The database has about 150 Million records.

Best Answer

That is a very denormalized design, as you have pointed out yourself. Especially if there isn't always a 1 : 7 ratio of Phone Numbers to Flags. Depending on your business data, you could potentially have a vast amount of sparse data.

I'd recommend a thin, normalized structure like this:

create table Flag
(
    pkFlag int identity(1, 1) not null primary key clustered,
    FValue varchar(100) not null
)
go

create table PhoneNumber
(
    pkPhoneNumber int identity(1, 1) not null primary key clustered,
    PnValue varchar(20) not null
)
go

create table PhoneNumberFlag
(
    fkFlag int not null foreign key references Flag(pkFlag),
    fkPhoneNumber int not null foreign key references PhoneNumber(pkPhoneNumber)
)
go

This allows you to create a one-to-many relationship (or possibly a many-to-many relationship, I can't say for sure as you haven't explained the meaning of Flags or any part of the domain).

A simple query to get all flags for a particular phone number would be:

select 
    f.fValue
from PhoneNumber pn
inner join PhoneNumberFlag pnf
on pn.pkPhoneNumber = pnf.fkPhoneNumber
inner join Flag f
on f.pkFlag = pnf.fkFlag
where pn.PnValue = '555-555-5555'

As for the RDBMS you are looking to go with, that is very dependent on MANY factors. What's your environment? What do you already have? MySql is a very different type, as it is open source. Oracle, SQL Server, these are not. I'm not sure about Oracle, but SQL Server has a free edition (SQL Server Express) that you could leverage, but it is limited. RDBMS licensing can get costly very fast. There are too many questions that need answering before a sound recommendation on that front can be made.