Sql-server – SQL Preserve the Order in group by

Tags: group by, order-by, query, sql server

I have records like below in my table.

I want distinct records, but when I use GROUP BY the original order is lost, and I want to preserve it. I wrote the query below to get the desired result, but it's not working:

 select route_id,fixcode,fixdescription from 
 route_fixcodes group by route_id,fixcode,fixdescription
 having route_id = 12345 Order by fixcode

I want a result like the one below:

Table DDL:

CREATE TABLE [dbo].[route_fixcodes](
  [id] [int] IDENTITY(1,1) NOT NULL,
  [route_id] [int] NOT NULL,
  [fixcode] [varchar](4) NOT NULL,
  [fixdescription] [varchar](32) NOT NULL,
  CONSTRAINT [PK__route_fi__3213E83FD7609D27] PRIMARY KEY CLUSTERED ([id] ASC)
)

Best Answer

If you do not specify an ORDER BY for the result set of your query, the order of the returned rows is not guaranteed.

It appears that you want to order by the smallest id in each group, judging by the PRIMARY KEY CLUSTERED ([id] ASC) on your table.

One way would be to use a CTE and MIN(id):

WITH CTE AS
(
    SELECT route_id, fixcode, fixdescription, MIN(id) AS minid
    FROM route_fixcodes
    WHERE route_id = 995063
    GROUP BY route_id, fixcode, fixdescription
)
SELECT route_id, fixcode, fixdescription
FROM CTE
ORDER BY minid;
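
If the CTE is not needed for anything else, the same ordering can probably be had by sorting on the aggregate directly, since SQL Server accepts aggregate expressions in the ORDER BY of a grouped query. A minimal sketch against the same table, using the same example route_id as the query above:

```sql
-- Sketch: same idea as the CTE version, without the CTE.
-- Each group is ordered by its lowest id, i.e. its first occurrence.
SELECT route_id, fixcode, fixdescription
FROM route_fixcodes
WHERE route_id = 995063            -- example route id, as in the query above
GROUP BY route_id, fixcode, fixdescription
ORDER BY MIN(id);                  -- aggregates are allowed here because of GROUP BY
```

Both forms assume that id reflects the order you care about, which is usually but not strictly the case for an IDENTITY column.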

Test data

CREATE TABLE #route_fixcodes( [id] [int] IDENTITY(1,1) PRIMARY KEY NOT NULL ,route_id int,fixcode int,fixdescription nvarchar(255));

INSERT INTO #route_fixcodes(route_id,fixcode,fixdescription)
VALUES
 (995063,100,'Issue_Observed'),(995063,100,'Issue_Observed')
,(995063,137,'Swap Altice One Pack'),(995063,137,'Swap Altice One Pack')
,(995063,247,'Defective CPE Equip.'),(995063,247,'Defective CPE Equip.')
,(995063,112,'outside coax repair'),(995063,112,'outside coax repair');

Result

route_id  fixcode  fixdescription
995063    100      Issue_Observed
995063    137      Swap Altice One Pack
995063    247      Defective CPE Equip.
995063    112      outside coax repair

Related Solutions

Sql-server – Indexing – Uniqueidentifier Foreign Key or Intermediary mapping table

OK, I am making a lot of assumptions with this answer (INT instead of VARCHAR(50) being one of them), so feel free to correct me if needed. The problem with Option B is that it introduces a new join to relate Users to Alerts without any real added benefit. If you join on UserID, it is best to index UserID so that the joins can use index seeks.

For Option A, UserID will be the clustering key (index key for the clustered index) on the Users table. UserID will be a nonclustered index key on the Alerts table. This will cost 16 bytes per Alert.

For Option B, UserID will be the clustering key on the Users table. UserID will probably be the clustering key in UserMap too, to make joining more efficient. UserKey (assuming this is an INT) would then be a nonclustered index key on the Alerts table. This will cost 4 bytes per Alert and 20 bytes per UserMap row.

Looking at the big picture, one relationship under Option A costs 16 bytes of storage and involves 1 join operation, whereas one relationship under Option B costs 24 bytes of storage (4 + 20) and involves 2 join operations.

Furthermore, there are 340,282,366,920,938,000,000,000,000,000,000,000,000 possible uniqueidentifiers and only 4,294,967,296 INTs. Implementing a uniqueidentifier-to-INT map for this type of relationship could cause unexpected results when you start reusing INTs.

The only reason to create this type of map table is if you plan to create a many-to-many relationship between Users and Alerts.

Taking all of this into consideration, I would recommend Option A.

I hope this helps,

Matt

Sql-server – Alternative query to this (avoid DISTINCT)

The two scripts in RThomas' answer are both useful. You could also use GROUP BY, which gives a similar advantage to RThomas' methods while keeping a form similar to your original query.

select country 
from Users inner join
countries on users.CountryID=countries.CountryID
GROUP BY countries.CountryID, countries.country;

The reason why you group by CountryID is that it's the primary key of your countries table, giving the Query Optimizer some better options.

...except that it's not in your scripts.

Put PKs (with Clustered Indexes) on your tables, and a FK relationship between them. Index CountryID in the Users table, and put a Unique Index on the Country field.

Once you've done all that, using DISTINCT the way you have will actually give you the ideal execution plan.
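
The indexing advice above could be sketched roughly as follows. The table and column names Users, Countries, CountryID and Country come from the query; the UserID column, all data types, and every constraint and index name are assumptions made only for illustration:

```sql
-- Sketch only: UserID, the column types, and the constraint/index names are hypothetical.
CREATE TABLE Countries (
    CountryID int NOT NULL
        CONSTRAINT PK_Countries PRIMARY KEY CLUSTERED,      -- PK with clustered index
    Country nvarchar(100) NOT NULL
        CONSTRAINT UQ_Countries_Country UNIQUE              -- unique index on Country
);

CREATE TABLE Users (
    UserID int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Users PRIMARY KEY CLUSTERED,          -- PK with clustered index
    CountryID int NOT NULL
        CONSTRAINT FK_Users_Countries
        FOREIGN KEY REFERENCES Countries (CountryID)        -- FK relationship
);

-- Index the foreign key column used in the join.
CREATE NONCLUSTERED INDEX IX_Users_CountryID ON Users (CountryID);
```

On existing tables the same constraints would be added with ALTER TABLE rather than at creation time.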
