Summary
There's no logical reason it couldn't be done, but the benefit is small and there are some pitfalls that may not be immediately apparent.
Research Results
I did some research and found some good information. The following is a direct quote from a reliable primary source (that wishes to remain anonymous) at 2012-08-09 17:49 GMT:
When SQL was first invented, it had no aliases in the SELECT clause.
This was a serious shortcoming that was corrected when the language
was standardized by ANSI in about 1986.
The language was intended to be "non-procedural"--in other words, to
describe the data that you want without specifying how to find it. So,
as far as I know, there's no reason why an SQL implementation couldn't
parse the whole query before processing it, and allow aliases to be
defined anywhere and used everywhere. For example, I don't see any
reason why the following query shouldn't be valid:
select name, salary + bonus as pay
from employee
where pay > 100000
Although I think this is a reasonable query, some SQL-based systems
may introduce restrictions on the use of aliases for some
implementation-related reason. I'm not surprised to hear that SQL
Server does this.
I am interested in further research into the SQL-86 standard and why modern DBMSes don't support alias reuse, but haven't had the time to get very far with it yet. For starters, I don't know where to get the documentation or how to find out who exactly made up the committee. Can anyone help out? I also would like to know more about the original Sybase product that SQL Server came from.
From this research and some further thought, I have come to suspect that using aliases in other clauses, while quite possible, simply has never been that high a priority for DBMS manufacturers compared to other language features. Since it is not that much of an obstacle, being easily worked around by the query writer, putting effort into it over other advancements is not optimal. Additionally, it would be proprietary as it is obviously not part of the SQL standard (though I'm waiting to find out more on that for sure) and thus would be a minor improvement, breaking SQL compatibility between DBMSes. By comparison, CROSS APPLY
(which is really nothing more than a derived table allowing outer references) is a huge change, that while proprietary offers incredible expressive power not easily performed in other ways.
Problems With Using Aliases Everywhere
If you allow SELECT items to be put in the WHERE clause, you can not only explode the complexity of the query (and thus the complexity of finding a good execution plan) it is possible to come up with completely illogical stuff. Try:
SELECT X + 5 Y FROM MyTable WHERE Y = X
What if MyTable already has a column Y, which one is the WHERE clause referring to? The solution is to use a CTE or a derived table, which in most cases should cost no extra but achieves the same final end result. CTEs and derived tables at least enforce the resolution of ambiguity by allowing an alias to be used only once.
Also, not using aliases in the FROM clause makes eminent sense. You can't do this:
SELECT
T3.ID + (SELECT Min(Interval) FROM Intervals WHERE IntName = 'T') CalcID
FROM
Table1 T
INNER JOIN Table2 T2
ON T2.ID = CalcID
INNER JOIN Table3 T3
ON T2.ID = T3.ID
That's a circular reference (in the sense that T2 is secretly referring to a value from T3, before that table has been presented in the JOIN list), and darn hard to see. How about this one:
INSERT dbo.FinalTransaction
SELECT
newid() FinalTransactionGUID,
'GUID is: ' + Convert(varchar(50), FinalTransactionGUID) TextGUID,
T.*
FROM
dbo.MyTable T
How much do you want to bet that the newid() function is going to be put into the execution plan twice, completely unexpectedly making the two columns show different values? What about when the above query is used N levels deep in CTEs or derived tables. I guarantee that the problem is worse than you can imagine. There are already serious inconsistency problems about when things are evaluated only once or at what point in a query plan, and Microsoft has said it will not fix some of them because they are expressing query algebra properly--if one gets unexpected results, break the query up into parts. Allowing chained references, detecting circular references through potentially very long such chains–these are quite tricky problems. Introduce parallelism and you've got a nightmare in the making.
Note: Using the alias in WHERE or GROUP BY isn't going to make a difference to the problems with functions like newid() or rand().
A SQL Server way to create reusable expressions
CROSS APPLY/OUTER APPLY is one way in SQL Server to create expressions that can be used anywhere else in the query (just not earlier in the FROM clause):
SELECT
X.CalcID
FROM
Table1 T
INNER JOIN Table3 T3
ON T.ID = T3.ID
CROSS APPLY (
SELECT
T3.ID + (SELECT Min(Interval) FROM Intervals WHERE IntName = 'T') CalcID
) X
INNER JOIN Table2 T2
ON T2.ID = X.CalcID
This does two things:
- Makes all expressions in the CROSS APPLY get a "namespace" (a table alias, here, X) and be unique within that namespace.
- Makes it obvious everywhere not only that CalcID is coming from X, but also makes it obvious why you can't use anything from X when joining table T1 and T3, because X hasn't been introduced yet.
I'm actually quite fond of CROSS APPLY. It has become my faithful friend, and I use it all the time. Need a partial UNPIVOT (which would require a PIVOT/UNPIVOT or UNPIVOT/PIVOT using native syntax)? Done with CROSS APPLY. Need a calculated value that will be reused many times? Done. Need to rigidly enforce execution order for calls over a linked server? Done-with a screaming improvement in speed. Need only one type of row split to 2 rows or with extra conditions? Done.
So at the very least, in DBMS SQL Server 2005 and up, you have no further cause for complaint: CROSS APPLY is how you DRY in the way you are wanting.
Another way, similar to what I proposed to your previous question: Return a set of well known type. Since your column list is dynamic, create a temporary table for the purpose. This announces the type to the system. As a side-effect you get a temp table to keep results for the duration of the session - like you needed in your last question.
CREATE OR REPLACE FUNCTION select_prices(_tbl anyelement, _cols text)
RETURNS SETOF anyelement AS
$func$
BEGIN
RETURN QUERY EXECUTE
'SELECT ' || colList || '
FROM prices
WHERE ...
ORDER BY ...';
END
$func$ LANGUAGE plpgsql;
Call:
CREATE TEMP TABLE t (col1, int, col2, date);
SELECT * FROM select_prices(NULL::t, 'col1, col2')
Or, to keep results in the temp table:
INSERT INTO t
SELECT * FROM select_prices(NULL::t, 'col1, col2')
If you need multiple tables in the same sessions, employ a sequence to get unique names. Related answer on SO:
Create a temporary table from a selection or insert if table already exist
However, this method (just like the other two in your question) are susceptible to SQL injection. You need to make sure it can't be abused.
SQL injection in Postgres functions vs prepared queries
Again, I would try to use this simple statement instead:
CREATE TEMP TABLE t AS
SELECT col1, col2 FROM prices;
Best Answer
If the columns don't exist (VALUES clause)
PostgreSQL gives them default names, but that's besides the point. You can't alias them in a COLUMN list that doesn't exist.
That would be a great example of the VALUES LIST syntax which requires you to alias in
FROM
list.From there you can even use the alias in the select list.
And, that brings us to a complex example like this one found on page 190 of PostGIS in Action, Second Edition.
If that's hard to swallow there is one other awkward construct there and that's a SELECT as a COLUMN.
lo_create
returns an oid. That's beyond the scope of this answer though.For another real world example, see this question I just answered.
In a SELF JOIN
This may be worth mentioning since it is in the docs,
And you self-join you'll get two
bar
columns.But, with FROM aliasing you can label them separately,
However, that's not really better than the COLUMN-aliasing method,
The only difference is that uniquely in the FROM-aliasing method you refer to all instance of
t2.bar
asbar2
.