Sql-server – Combining data from multiple large tables

join;sql servert-sql

I have a customer survey dataset split across 25 different tables. I need to combine all the tables and produce one file with all the columns from all tables. All the tables have the SurveyID key in common.
For example

table1: SurveyID, Region, Gender

table2: SurveyID, AnswerNumber, Rating

table3: SurveyID, AnswerNumber, Expediture.

I want the output like this and don't want any duplicate rows:

Final table: SurveyID, Region, Gender, Rating, Expenditure…and so on.

One issue is that not all tables contain data for the same SurveyID. For example table 1 has 114000 rows whereas table 2 only has 50000 rows, indicating not everyone answered those survey questions.

I've looked through the site and tried using left joins and inner joins as per the suggestions but not getting the result I want.

UPDATE – Here is the code I tried based on the response:

;with surveys as
(
select * from dbo.vw_IVSSurveyMainHeader s
),
responses as
(
select s.SurveyID, t1.["AnswerNumber"], t1.["AccommTypeUsed"],
t2.["ActivityCategory"], t2.["Activity"], t2.["ActivitySubType"],
t3.["AirlineUsed"], t3.["AirlineRating"],
t4.["AppealReason"],
t5.["CycleOwnership"],
t6.["InformationSourceUsed"],
t7.["WebsiteForPlanning"], t7.["WebsiteForBooking"], t7.["UsefulnessRating"],
t8.["AmenityType"], t8.["EnvPracticeRating"],
t9.["ItemisedCostAmt"], t9.["ExpenditureType"], t9.["ItemisedCostCurr"],
t10.["WhereStayed"], t10.["NoNights"],
t11.["SatisfactionRatingReason"],
sat.["SatisfactionRating"],
t12.["SocialMediaUsage"], t12.["SocialMediaType"], t12.["TripStage"],
t13.["SocialMediaUsed"],
t14.["TechnologyType"],
t15.["TechnologyUses"],
t16.["TransportMethod"],
t17.["TravelledWith"]
from dbo.vw_IVSSurveyMainHeader s 
left join dbo.vw_IVSAccommodationUsed t1 on s.SurveyID = t1.SurveyID
left join dbo.vw_IVSActivities t2        on s.SurveyID = t2.SurveyID
left join dbo.vw_IVSAirlineRating t3     on s.SurveyID = t3.SurveyID
left join dbo.vw_IVSAppealReasons t4     on s.SurveyID = t4.SurveyID
left join dbo.vw_IVSCycleOwnership t5    on s.SurveyID = t5.SurveyID
left join dbo.[vw_IVSDecisionMakingProcess - After arriving] t6 on s.SurveyID = t6.SurveyID
left join dbo.[vw_IVSDecisionMakingProcess - Planning and booking] t7 on s.SurveyID = t7.SurveyID
left join dbo.vw_IVSEnvironmentPracticeRating t8 on s.SurveyID = t8.SurveyID
left join dbo.vw_IVSExpenditure t9 on s.SurveyID = t9.SurveyID
left join dbo.vw_IVSItineraryPlaces t10 on s.SurveyID = t10.SurveyID
left join dbo.vw_IVSSatisfactionRatingReasons t11 on s.SurveyID = t11.SurveyID
left join dbo.vw_IVSSatisfactionRatings sat on s.SurveyID = sat.SurveyID
left join dbo.vw_IVSSocialMediaUsage t12 on s.SurveyID = t12.SurveyID
left join dbo.vw_IVSSocialMediaUsed t13 on s.SurveyID = t13.SurveyID
left join dbo.vw_IVSTechnologyTypes t14 on s.SurveyID = t14.SurveyID
left join dbo.vw_IVSTechnologyUses t15 on s.SurveyID = t15.SurveyID
left join dbo.vw_IVSTransport t16 on s.SurveyID = t16.SurveyID
left join dbo.vw_IVSTravelParty t17 on s.SurveyID = t17.SurveyID
)
select surveys.*, responses.*
from surveys su left join responses r
on su.SurveyID = r.SurveyID

This throws the error that the column prefix for surveys and responses does not match with a table name or alias name used.

Best Answer

You need to have information for all surveys, even the ones that have no answers, right?

;with surveys as 
    (
        select SurveyID, Region, Gender from Table1 
    )
, responses as 
    (
        select t1.SurveyID, t2.AnswerNumber, t2.Rating, t3.AnswerNumber, t3.Expediture -- add other columns from the other tables
        from Table1 t1 left join Table2 t2 on t1.SurveyId = t2.SurveyID
                   left join Table3 t3 on t1.SurveyId = t3.SurveyID
                   -- add other tables up to the 25th
    )
select s.SurveyId, s.Region, s.Gender, r.*
from surveys s left join responses r on s.SurveyId = r.SurveyID