MySQL – Efficient workflow/queries to save unique information

database-design, MySQL, optimization, PHP

I am saving company information in a database. There is a single table with an auto-incrementing key that is unique to each company. There are separate tables for emails, websites, phone numbers, and company names. When a new company is added to the database, a check should be performed across the various tables to see if unique data already exists.

For instance, if a new phone number, email, and website are added, a check should be performed to see whether they are already present in the database. If the phone number and email are present (a unique combination), then I want to grab the unique company id for those two entries and add a row to the website table indicating that the website is now associated with that company id.

However, if the phone number and email exist for one company id while the email and website exist for a different company id, I want to merge the two ids into one.
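The merge rule described here can be viewed as a connected-components problem: any two company ids that share an identifier belong to the same company. Below is a minimal Python sketch of that idea using a union-find structure. The class and method names (`CompanyMerger`, `link`, `canonical`) are hypothetical, the data lives in memory only, and a real system would still have to rewrite the affected rows in MySQL.

```python
class CompanyMerger:
    """Union-find over company ids (hypothetical helper, for illustration)."""

    def __init__(self):
        self.parent = {}  # company id -> parent id in the merge tree
        self.owner = {}   # (kind, value) -> company id that first used it

    def canonical(self, cid):
        """Return the surviving id for cid, compressing the path on the way."""
        self.parent.setdefault(cid, cid)
        root = cid
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[cid] != root:
            self.parent[cid], cid = root, self.parent[cid]
        return root

    def link(self, cid, kind, value):
        """Record that cid uses (kind, value); merge with any earlier owner."""
        root = self.canonical(cid)
        key = (kind, value)
        if key not in self.owner:
            self.owner[key] = root
            return root
        other = self.canonical(self.owner[key])
        if other != root:
            # Assumption: the smaller id survives, so results are deterministic.
            keep, drop = min(root, other), max(root, other)
            self.parent[drop] = keep
            root = keep
        return root
```

For example, after linking a phone and an email to id 1 and a website to id 2, linking that same email to id 2 makes `canonical(2)` return 1, which is the id under which all rows would then be stored.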

I personally hate this setup, but I can't think of a cleaner way to save all this data. For instance, it's not guaranteed a phone number, email, or website will always be passed. Moreover, some of the data is fairly subjective. I can easily query email addresses and phone numbers from multiple tables:

SELECT id FROM companies LEFT JOIN emails using(id) LEFT JOIN phones using(id) WHERE phone = "123" AND email = "john@smith.com"

However, when comparing organization names, a simple WHERE clause won't return correct results. I wrote a function in PHP to compare business names using similar_text and soundex, but those can't easily be incorporated into a MySQL query. Similarly, one check I have in place to differentiate businesses is a longitude/latitude distance check. Consequently, my envisioned workflow (as of now) is:
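To make the two fuzzy checks concrete, here is a rough Python sketch of a name comparison combined with a distance check. It is not the PHP function mentioned above: `difflib.SequenceMatcher` stands in for similar_text/soundex, the haversine formula is one common way to compute the distance, and the function names and thresholds (`min_ratio`, `max_km`) are illustrative assumptions.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.casefold().strip(), b.casefold().strip()).ratio()


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))


def probably_same_business(a, b, min_ratio=0.85, max_km=0.5):
    """a and b are dicts with 'name', 'lat', 'lon'; thresholds are guesses."""
    close_name = name_similarity(a["name"], b["name"]) >= min_ratio
    close_place = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
    return close_name and close_place
```

Because these checks are comparatively expensive, they are best applied only to a short list of candidates rather than to every saved company.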

1. Loop through each unsaved company profile in PHP.
2. Get all saved, unique information from the database:

   SELECT id, email, position, address, zip, organization_name, phone, website FROM companies left join emails using(id) left join locations using(id) left join organization_names using(id) left join phones using(id) left join websites using(id)

3. In PHP, loop through each row, attempting to find a match between the saved values and unsaved value.
4. If a unique entry exists, update the database. If a unique entry does not exist, add the appropriate rows to the database.
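One way to avoid loading every saved row for each new profile is to ask the database only for the company ids that already own one of the incoming values. The sketch below illustrates this in Python, with the standard-library `sqlite3` module standing in for MySQL. The table and column names follow the queries above, while `make_lookup_db` and `candidate_ids` are hypothetical helpers and the sample rows are invented.

```python
import sqlite3


def make_lookup_db():
    """Build a tiny in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE emails   (id INTEGER, email   TEXT);
        CREATE TABLE phones   (id INTEGER, phone   TEXT);
        CREATE TABLE websites (id INTEGER, website TEXT);
        CREATE INDEX idx_email   ON emails (email);
        CREATE INDEX idx_phone   ON phones (phone);
        CREATE INDEX idx_website ON websites (website);
        INSERT INTO emails   VALUES (1, 'john@smith.com');
        INSERT INTO phones   VALUES (1, '123');
        INSERT INTO websites VALUES (2, 'smith.com');
        """
    )
    return conn


def candidate_ids(conn, email=None, phone=None, website=None):
    """Return the sorted company ids that already own any supplied value.

    A value left as None never matches, because `column = NULL` is not true.
    """
    rows = conn.execute(
        "SELECT id FROM emails WHERE email = ? "
        "UNION SELECT id FROM phones WHERE phone = ? "
        "UNION SELECT id FROM websites WHERE website = ?",
        (email, phone, website),
    ).fetchall()
    return sorted(row[0] for row in rows)
```

An empty result would mean a new company, a single id means the new values can be attached to that company, and several ids mark companies that are candidates for merging. Each lookup can use an index, so the cost no longer grows with a full scan of every saved company.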

I can see this being incredibly inefficient given the number of companies saved in the database, which is why I am here asking for help. Database design is not my forte, so any advice is appreciated.

Best Answer

My solution would be to use relation tables instead of merging ids, put UNIQUE indexes on the phone number, email, and website tables, and insert with the IGNORE keyword, like this:

INSERT IGNORE INTO emails VALUES (5, 'a.a@a.com');

If you use the IGNORE keyword, errors that occur while executing the INSERT statement are ignored. For example, without IGNORE, a row that duplicates an existing UNIQUE index or PRIMARY KEY value in the table causes a duplicate-key error and the statement is aborted. With IGNORE, the row is discarded and no error occurs. Ignored errors may generate warnings instead, although duplicate-key errors do not.

Or, if you want to update on duplicates, use REPLACE instead of INSERT IGNORE:

REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.

This way you don't need to check for duplicates. If you want more complicated duplicate checks, you need to do them in the code, not in the database.
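As a small runnable illustration of the unique-index approach, the Python sketch below uses the standard-library `sqlite3` module as a stand-in for MySQL. SQLite spells the statement `INSERT OR IGNORE`, whereas MySQL uses `INSERT IGNORE`. The helper names (`make_emails_db`, `add_email`, `owner_of`) and the column layout of `emails` are assumptions made for the example.

```python
import sqlite3


def make_emails_db():
    """Create an in-memory emails table whose email column must be unique."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emails (id INTEGER NOT NULL, email TEXT NOT NULL UNIQUE)")
    return conn


def add_email(conn, company_id, email):
    """Insert the row unless the email already exists; True if a row was added."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO emails (id, email) VALUES (?, ?)",
        (company_id, email),
    )
    return cur.rowcount == 1


def owner_of(conn, email):
    """Return the company id that holds this email, or None if it is unknown."""
    row = conn.execute("SELECT id FROM emails WHERE email = ?", (email,)).fetchone()
    return row[0] if row else None
```

When `add_email` returns False, the email is already attached to some company, and `owner_of` reports which one, so the calling code can decide whether to reuse that id.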

Related Solutions

MySQL – Help with modeling this table

One question I have is how the company relates to a phone number. Is there just a list of internal numbers, and will all of these phone numbers be associated with an employee? If they are, you can simply make your phone number table relate to an employee. For example:

CREATE TABLE company
(
    company_id INT AUTO_INCREMENT PRIMARY KEY,
    company_name VARCHAR(100)
);

CREATE TABLE employee
(
    employee_id INT AUTO_INCREMENT PRIMARY KEY,
    employee_name VARCHAR(100),
    company_id INT NOT NULL,
    FOREIGN KEY (company_id) REFERENCES company (company_id)
);

CREATE TABLE phone
(
    phone_id INT AUTO_INCREMENT PRIMARY KEY,
    phone_number VARCHAR(15), -- 555-555-5555
    phone_type VARCHAR(10),   -- Home | Cell | Work
    employee_id INT NOT NULL,
    FOREIGN KEY (employee_id) REFERENCES employee (employee_id)
);

employee_id being your foreign key to your employees table. If each employee only works for one company, then to get all of the phone numbers available for that company, you could do something like this

select * from phone as p join employee as e on e.employee_id = p.employee_id where e.company_id = 1

I hope this helps :)

SQL Server – Unique column values between tables

If I understand what you are asking for, a given person can have multiple roles in a team but can only be on one team. I would do it like this:

[Diagram: People, Roles, and PeopleRole tables]

By putting the TeamId in the People table you enforce the fact that a person can only be on one team. You then have a many-to-many relationship between People and Roles using a junction table, PeopleRole. This allows a person to have multiple roles. By putting the email in the PeopleRole table you enforce that a person has to have a separate email for each role. You would also want to put a unique index on the PeopleRole.Email column to enforce that all email addresses are different.

The only thing I believe I'm missing is enforcing that a team can only have one leader and one contact. You might be able to do that by adding TeamId to the PeopleRole and then creating a filtered unique index on TeamId & RoleId WHERE the RoleIds are those that can only have one per team. I haven't tried it though.
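The filtered unique index idea can be tried out quickly. The Python sketch below uses the standard-library `sqlite3` module, whose partial indexes stand in for SQL Server's filtered indexes, so it shows the concept rather than the exact SQL Server behavior. The table layout, the role ids 1 and 2 (meant as "leader" and "contact"), and the helper names are all assumptions.

```python
import sqlite3


def make_roles_db():
    """Create an in-memory people_role table with a partial unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people_role ("
        " person_id INTEGER NOT NULL,"
        " role_id   INTEGER NOT NULL,"
        " team_id   INTEGER NOT NULL,"
        " email     TEXT NOT NULL UNIQUE)"
    )
    # Only rows with role 1 or 2 take part in the uniqueness rule,
    # so each team can hold at most one of each; other roles are free.
    conn.execute(
        "CREATE UNIQUE INDEX one_holder_per_team"
        " ON people_role (team_id, role_id)"
        " WHERE role_id IN (1, 2)"
    )
    return conn


def assign(conn, person_id, role_id, team_id, email):
    """Return True if the row was accepted, False if a unique rule rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO people_role (person_id, role_id, team_id, email)"
                " VALUES (?, ?, ?, ?)",
                (person_id, role_id, team_id, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch a second person with role 1 on the same team is rejected, while any number of people may share an unrestricted role such as 3.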
