Presumably, you want to see a single entry for each unique person/address/email/website combination. If so, try this:
SELECT (person.FirstName + ' ' + person.LastName) as FullName
, ISNULL(Person.isClient, '')
, ISNULL(Person.UDF1, '')
, ISNULL([Address].City, '')
, ISNULL([Address].[state], '')
, PersonAddress.Person
, PersonAddress.[Address]
, ISNULL(Phone.PhoneNumber, 'N/A')
, Email.Email
, Person.Website
FROM dbo.Person
LEFT JOIN dbo.PersonAddress ON Person.ID = PersonAddress.Person
LEFT JOIN dbo.[Address] ON PersonAddress.[Address] = [Address].ID
LEFT JOIN dbo.PersonPhone ON Person.ID = PersonPhone.Person
LEFT JOIN dbo.Phone ON PersonPhone.Person = Phone.ID
LEFT JOIN dbo.Email WITH (NOLOCK) ON Person.ID = Email.Person
WHERE (
isclient = 'prospect'
or isclient = 'client'
)
and [Address] is not null
and name like '%Mike%'
GROUP BY (person.FirstName + ' ' + person.LastName)
, ISNULL(Person.isClient, '')
, ISNULL(Person.UDF1, '')
, ISNULL([Address].City, '')
, ISNULL([Address].state, '')
, PersonAddress.Person
, PersonAddress.[Address]
, ISNULL(Phone.PhoneNumber, 'N/A')
, Email.Email
, Person.Website
ORDER BY isClient asc;
The GROUP BY
clause at the end ensures only a single row is returned for each unique combination of columns in the GROUP BY
clause. This should prevent duplicate rows being displayed in your results.
A couple of things to note:
Always use the schema qualifier on the FROM
clause. FROM Person
should be FROM dbo.Person
-> this eliminates any confusion if you introduce new schemas in the future, and prevents the query optimizer from having to look for the default schema for your user.
For maintainability in the future, you probably want to name columns the same regardless of which table they are in. So for instance, instead of the ID
column in the People
table being named ID
, and it being named Person
in the Address
table, I'd name it PersonID
in both tables. This prevents confusion (read bugs) in joins such as dbo.Person LEFT JOIN dbo.Address ON Person.ID = Address.Person
.
Instead of naming tables like Person
, they should be named after the collection of items they contain, in plural. So, Person
becomes People
, and Address
becomes Addresses
. This eliminates confusion -> does the Address
table actually contain a single address or multiple addresses?
WITH (NOLOCK)
should be avoided like the plague it is unless you fully understand the consequences of reading rows that have been modified by other transactions but not yet committed. From MSDN:
Transactions running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. READ UNCOMMITTED transactions are also not blocked by exclusive locks that would prevent the current transaction from reading rows that have been modified but not committed by other transactions. When this option is set, it is possible to read uncommitted modifications, which are called dirty reads. Values in the data can be changed and rows can appear or disappear in the data set before the end of the transaction. This option has the same effect as setting NOLOCK on all tables in all SELECT statements in a transaction. This is the least restrictive of the isolation levels.
In SQL Server, you can also minimize locking contention while protecting transactions from dirty reads of uncommitted data modifications using either:
The READ COMMITTED isolation level with the READ_COMMITTED_SNAPSHOT database option set to ON.
The SNAPSHOT isolation level.
The question can be split into 3 main operations:
- Partition rows by
IndividualNumber
(using variables)
- Pivot Partitioned rows to columns (using CASE)
- Add Name and other information and remove
NULL
You can look at the sample here: SQL Fiddle
Partition rows by IndividualNumber
:
The behavior of this query is similar to the ROW_NUMBER() Window Function available in Oracle (>= 10g), PostgreSQL (>= 8.4) and SQL Server (>= 2012).
MySQL does not implement it and it must be done using variables and a CASE
statement:
SELECT @row := CASE WHEN inf.IndividualNumber = @id
THEN @row + 1 ELSE 1 END as row
, @id := inf.IndividualNumber as IndividualNumber
, inf.IndividualAddressStreet
, inf.IndividualCity
, inf.IndividualState
, inf.IndividualZip
FROM (SELECT @row := 0, @id := 0) v
, InfoaboutThemTable as inf
ORDER BY inf.IndividualNumber
It returns a unique row
value from 1 to n for each IndividualNumber
(partition) in InfoaboutThemTable
:
row IndividualNumber IndividualAddressStreet IndividualCity IndividualState IndividualZip
1 1 100 Pine Street New York NY 10019
1 2 200 Pine Street New York NY 10018
2 2 201 Pine Street New York NY 10018
3 2 230 Golden Street New York NY 10018
4 2 456 Golden Street New York NY 10018
1 3 310 Oak Street New York NY 10018
1 7 100 Apple Street New York NY 10018
2 7 200 Hickory Street New York NY 10018
3 7 1234 Pineapple Street New York NY 10018
Because you don't know yet how to order Address1
, Address2
, ... it only uses ORDER BY inf.IndividualNumber
to partition by IndividualNumber
in no specific order.
It could be replaced by ORDER BY inf.IndividualNumber, inf.IndividualAddressStreet
if you were to partition by IndividualNumber
and give a number to partition' members ordered by IndividualAddressStreet
.
Pivot
Once each row for each partition in IndividualNumber
has a unique row
value, it can be used to transpose/pivot the rows to columns.
MySQL does not implement the PIVOT
operator. Data can be move and pivoted to column X using a GROUP BY d.IndividualNumber
and for each transposed columns, a CASE WHEN row = X THEN ... END
is used with an aggregate (MAX
):
MAX(CASE WHEN row = 1 THEN d.IndividualAddressStreet END) AS Street1
MAX(CASE WHEN row = 1 THEN d.IndividualCity END) AS City1
...
MAX(CASE WHEN row = 2 THEN d.IndividualAddressStreet END) AS Street2
...
It only contains 3 groups but you can easily expand it to 6 or N groups.
Add Name info:
Finaly IndividualsTable
is LEFT JOIN
to the pivoted subquery in order to add Names and Colors to the desired output.
NULL
values are replaced by an empty string using COALESCE
.
Full Query:
SELECT idt.IndividualNumber
, idt.Name
, idt.IndividualEyeColor
, idt.IndividualHairColor
, COALESCE(grp.Street1, '') as Street1
, COALESCE(grp.City1, '') as City1
, COALESCE(grp.State1, '') as State1
, COALESCE(grp.Zip1, '') as Zip1
, COALESCE(grp.Street2, '') as Street2
, COALESCE(grp.City2, '') as City2
, COALESCE(grp.State2, '') as State2
, COALESCE(grp.Zip2, '') as Zip2
, COALESCE(grp.Street3, '') as Street3
, COALESCE(grp.City3, '') as City3
, COALESCE(grp.State3, '') as State3
, COALESCE(grp.Zip3, '') as Zip3
FROM IndividualsTable idt
LEFT JOIN (
SELECT d.IndividualNumber as IndividualNumber
, MAX(CASE WHEN row = 1 THEN d.IndividualAddressStreet END) AS Street1
, MAX(CASE WHEN row = 1 THEN d.IndividualCity END) AS City1
, MAX(CASE WHEN row = 1 THEN d.IndividualState END) AS State1
, MAX(CASE WHEN row = 1 THEN d.IndividualZip END) AS Zip1
, MAX(CASE WHEN row = 2 THEN d.IndividualAddressStreet END) AS Street2
, MAX(CASE WHEN row = 2 THEN d.IndividualCity END) AS City2
, MAX(CASE WHEN row = 2 THEN d.IndividualState END) AS State2
, MAX(CASE WHEN row = 2 THEN d.IndividualZip END) AS Zip2
, MAX(CASE WHEN row = 3 THEN d.IndividualAddressStreet END) AS Street3
, MAX(CASE WHEN row = 3 THEN d.IndividualCity END) AS City3
, MAX(CASE WHEN row = 3 THEN d.IndividualState END) AS State3
, MAX(CASE WHEN row = 3 THEN d.IndividualZip END) AS Zip3
FROM
(
SELECT @row := CASE WHEN inf.IndividualNumber = @id
THEN @row + 1 ELSE 1 END as row
, @id := inf.IndividualNumber as IndividualNumber
, inf.IndividualAddressStreet
, inf.IndividualCity
, inf.IndividualState
, inf.IndividualZip
FROM (SELECT @row := 0, @id := 0) v
, InfoaboutThemTable as inf
ORDER BY inf.IndividualNumber
) d
GROUP BY d.IndividualNumber
) grp
ON grp.IndividualNumber = idt.IndividualNumber
;
Best Answer
PS.
AND tb2.value != ''
condition is excess. If false, thenAND tb2.Value LIKE '%@%'
is false too.