Result of a query in SQL

distinctsql-standard

Let us say that we have to write a query for:

Find the distinct names of all students with percent>90 in courseno=107

given three relations:

Students (rolno:integer,sname:string)
Courses (courseno:integer,cname:string)
Registration (rollno:integer,courseno:integer,percent:real)

According to an exam solution,the query solution is:

SELECT DISTINCT S.name 
FROM Students as S,
     Registration as R 
WHERE R.rollno=s.rollno 
  AND R.courseno=107 
  AND R.percent>90

I think that if two students(say there are two students having same name so its name would be eliminated) so I am worried about use of DISTINCT here.

Best Answer

I see the point you are making: if there is more than 1 student with the same name, enrolled in course 107, with a percentage greater than 90%, then your query will not distinguish between them.

Given the limited amount we know about the dataset, the solution provided does do the job, but I would have added the caveat about duplicate student names, or proposed an alternative solution:

SELECT S.name,
       count(*)
FROM   Students as S,
       Registration as R 
WHERE  R.rollno=s.rollno 
  AND  R.courseno=107 
  AND  R.percent>90
GROUP  BY S.name

and in this way, we list distinct names, but are aware when there are multiple records selected for the same name.

Related Solutions

Obtaining gratis copy of the SQL specification

Here they are:

But I'd rather read read a book:

Aggregate SQL Standard – Correct Result for Query

Per the standard:

SELECT 1 FROM r HAVING 1=1

means

SELECT 1 FROM r GROUP BY () HAVING 1=1

Citation ISO/IEC 9075-2:2011 7.10 Syntax Rule 1 (Part of the definition of the HAVING clause):

Let HC be the <having clause>. Let TE be the <table expression> that immediately contains HC. If TE does not immediately contain a <group by clause>, then “GROUP BY ()” is implicit. Let T be the descriptor of the table defined by the <group by clause> GBC immediately contained in TE and let R be the result of GBC.

Ok so that much is pretty clear.

Assertion: 1=1 is true search condition. I will provide no citation for this.

Now

SELECT 1 FROM r GROUP BY () HAVING 1=1

is equivlent to

SELECT 1 FROM r GROUP BY ()

Citation ISO/IEC 9075-2:2011 7.10 General Rule 1:

The <search condition> is evaluated for each group of R. The result of the <having clause> is a grouped table of those groups of R for which the result of the <search condition> is True.

Logic: Since the search condition is always true, the result is R, which is the result of the group by expression.

The following is an excerpt from the General Rules of 7.9 (the definition of the GROUP BY CLAUSE)

1) If no <where clause> is specified, then let T be the result of the preceding <from clause>; otherwise, let T be the result of the preceding <where clause>.

2) Case:

a) If there are no grouping columns, then the result of the <group by clause> is the grouped table consisting of T as its only group.

Thus we can conclude that

FROM r GROUP BY ()

results in a grouped table, consisting of one group, with zero rows (since R is empty).

An excerpt from the General Rules of 7.12, which defines a Query Specification (a.k.a a SELECT statement):

1) Case:

a) If T is not a grouped table, then [...]

b) If T is a grouped table, then

Case:

i) If T has 0 (zero) groups, then let TEMP be an empty table.

ii) If T has one or more groups, then each <value expression> is applied to each group of T yielding a table TEMP of M rows, where M is the number of groups in T. The i-th column of TEMP contains the values derived by the evaluation of the i-th <value expression>. [...]

2) Case:

a) If the <set quantifier> DISTINCT is not specified, then the result of the <query specification> is TEMP.

Therefore since the table has one group, it must have one result row.

Thus

SELECT 1 FROM r HAVING 1=1

should return a 1 row result set.

Q.E.D.

Best Answer

Related Solutions

Obtaining gratis copy of the SQL specification

Aggregate SQL Standard – Correct Result for Query

Related Question