You are too modest - your SQL is well and concisely written given the task you are undertaking. A few pointers:
- t1.name <> t2.name is always true if t1.name = REPLACE(t2.name, 'DUP_', ''), so you can drop the former
- usually you want union all. union means union all then drop duplicates. It might make no difference in this case, but always using union all is a good habit unless you explicitly want to drop any duplicates.
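To illustrate the difference, a throwaway example against dual:

```sql
-- union de-duplicates the combined result set
select 1 as x from dual
union
select 1 as x from dual;     -- one row

-- union all keeps every row (and skips the de-duplication work)
select 1 as x from dual
union all
select 1 as x from dual;     -- two rows
```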
If you are willing for the numeric comparisons to happen after casting to varchar, the following might be worth considering:
create view test_attribs_cast as
select id, name, attr1, attr2, cast(attr3 as varchar(2000)) as attr3,
cast(attr4 as varchar(2000)) as attr4, attr5
from test_attribs;
create view test_attribs_unpivot as
select id, name, 1 as attr#, attr1 as attr from test_attribs_cast union all
select id, name, 2, attr2 from test_attribs_cast union all
select id, name, 3, attr3 from test_attribs_cast union all
select id, name, 4, attr4 from test_attribs_cast union all
select id, name, 5, attr5 from test_attribs_cast;
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from test_attribs_unpivot t1 join test_attribs_unpivot t2 on(
t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);
- The second view is a kind of unpivot operation - if you are on at least 11g you can do this more concisely with the unpivot clause.
- I'd say don't go down the procedural route if you can do it in SQL, but...
- Dynamic SQL is probably worth considering despite the problems you mention with testing and maintenance.
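On 11g the second view could be written with the unpivot clause, something like this sketch (include nulls keeps rows where an attribute is null, matching the null handling in the final where clause):

```sql
create view test_attribs_unpivot as
select id, name, attr#, attr
from test_attribs_cast
unpivot include nulls
  ( attr for attr# in (attr1 as 1, attr2 as 2, attr3 as 3,
                       attr4 as 4, attr5 as 5) );
```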
--EDIT--
To answer the more general side of the question, there are techniques to reduce repetition in SQL - views, CTEs, unpivoting and (with care) dynamic SQL among them.
But you can't bring OO ideas into the SQL world directly - in many cases repetition is fine if the query is readable and well-written, and it would be unwise to resort to dynamic SQL (for example) just to avoid repetition.
The final query including Leigh's suggested change and a CTE instead of a view could look something like this:
with t as ( select id, name, attr#,
decode(attr#,1,attr1,2,attr2,3,attr3,4,attr4,attr5) attr
from test_attribs
cross join (select rownum attr# from dual connect by rownum<=5))
select 'attr'||t1.attr# as different, t1.id as id_1, t2.id as id_2, t1.name,
t2.name as name_dup, t1.attr as val1, t2.attr as val2
from t t1 join t t2
on( t1.id<>t2.id and
t1.name = replace(t2.name, 'DUP_', '') and
t1.attr#=t2.attr# )
where t1.attr<>t2.attr or (t1.attr is null and t2.attr is not null)
or (t1.attr is not null and t2.attr is null);
It is possible to store strings with different collations in the same column using SQL_VARIANT:
CREATE TABLE dbo.Localized
(
    text_id INTEGER NOT NULL,
    lang_id INTEGER NOT NULL,
    text_body SQL_VARIANT NOT NULL,

    CONSTRAINT [PK dbo.Localized text_id, lang_id]
        PRIMARY KEY CLUSTERED (text_id, lang_id)
)
GO
INSERT dbo.Localized
(text_id, lang_id, text_body)
VALUES
(1001, 2057, N'Database problems' COLLATE Latin1_General_CI_AS);
GO
INSERT dbo.Localized
(text_id, lang_id, text_body)
VALUES
(1001, 1025, N'قاعدة بيانات المشاكل' COLLATE Arabic_CI_AS)
This design has several drawbacks (including being limited to 8000 bytes), not least in the search area: SQL_VARIANT cannot be full-text indexed, and some string comparison features (e.g. LIKE) cannot be used directly either. On the other hand, it is possible to create a regular index on SQL_VARIANT and perform the more basic comparisons (e.g. <, =, >) in a collation-aware fashion:
CREATE UNIQUE INDEX uq1 ON dbo.Localized (text_body)
GO
-- One row
SELECT
l.*
FROM dbo.Localized AS l
WHERE
l.text_body = CONVERT(SQL_VARIANT, N'Database problems' COLLATE Latin1_General_CI_AS)
-- No rows (and no collation error!)
SELECT
l.*
FROM dbo.Localized AS l
WHERE
l.text_body = CONVERT(SQL_VARIANT, N'Database problems' COLLATE Arabic_CI_AS)
-- One row, index seek, manual version of "LIKE 'D%'"
SELECT
l.*
FROM dbo.Localized AS l
WHERE
l.text_body >= CONVERT(SQL_VARIANT, N'D' COLLATE Latin1_General_CI_AS)
AND l.text_body < CONVERT(SQL_VARIANT, N'E' COLLATE Latin1_General_CI_AS)
We can also write the usual sort of procedures:
CREATE PROCEDURE dbo.GetLocalizedString
@text_id INTEGER,
@lang_id INTEGER,
@text_body SQL_VARIANT OUTPUT
AS
BEGIN
SELECT
@text_body = l.text_body
FROM dbo.Localized AS l
WHERE
l.text_id = @text_id
AND l.lang_id = @lang_id
END
GO
DECLARE @text SQL_VARIANT
EXECUTE dbo.GetLocalizedString
@text_id = 1001,
@lang_id = 1025,
@text_body = @text OUTPUT
SELECT @text
Of course, full-text indexing is also problematic in the "single table for all translations" design, since full-text indexing (all but) requires a language id setting per column. The multiple table design described by Joop Eggen could be full-text indexed (though it would naturally require one index per table).
The other main option is to have one column per locale in the base table:
CREATE TABLE dbo.Example
(
text_id INTEGER NOT NULL,
text_2057 NVARCHAR(MAX) COLLATE Latin1_General_CI_AS NULL,
text_1025 NVARCHAR(MAX) COLLATE Arabic_CI_AS NULL,
CONSTRAINT [PK dbo.Example text_id]
PRIMARY KEY CLUSTERED (text_id)
)
This arrangement does have a certain simplicity to it, and works well with full-text indexing, though it does require a new column to be added with each new language, and many developers find this sort of structure inelegant and unsatisfactory to work with.
Each of the alternatives has advantages and disadvantages, and will require indirection at some level or another, so it might depend on where the developers concerned feel happiest locating that indirection. I imagine most people will prefer the multiple-table design for most purposes.
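For reference, a minimal sketch of that multiple-table design (table names here are illustrative only): one narrow table per language, each column carrying the collation appropriate to its locale, so each table can be full-text indexed independently.

```sql
-- English (locale 2057) strings
CREATE TABLE dbo.LocalizedText_2057
(
    text_id   INTEGER NOT NULL PRIMARY KEY,
    text_body NVARCHAR(MAX) COLLATE Latin1_General_CI_AS NOT NULL
);

-- Arabic (locale 1025) strings
CREATE TABLE dbo.LocalizedText_1025
(
    text_id   INTEGER NOT NULL PRIMARY KEY,
    text_body NVARCHAR(MAX) COLLATE Arabic_CI_AS NOT NULL
);
```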
Best Answer
Your code looks mature, mostly, but see below! Once you have fixed that, I don't see much that could go wrong with it. I share the doubts that it will improve performance much, though, and there may be better alternatives, depending on exact requirements.
General advice
For simple functions as displayed:
- Use a simple SQL function (LANGUAGE sql), which can be inlined in the context of an outer query. SQL is preferable for simple functions that are typically nested.
- Remove the STRICT modifier. Contrary to what you might expect, this is probably not going to help with simple SQL functions, because it can prevent said inlining. (STRICT may help with expensive functions, though.)

Multiple users
To accommodate multiple users with different but more or less stable preferences, you could improve your design by building on the typical default search_path setting: search_path = "$user",public.
Create a separate version of the function with the same name in each user's private schema. Unless it is schema-qualified, each user will see his own version of fnc_common_format_money(), and they can all use the same queries.

Switch schema
Or, if we are dealing with changing requirements for a single user, you could switch the search_path manually (and an arsenal of functions with it), verbose_format being the name of a schema with specialized functions. It could also come before "$user".
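A sketch of both ideas (schema names and the function body are illustrative stand-ins for the real formatting logic):

```sql
-- Per-user version: each user gets the function in a private schema.
CREATE SCHEMA bob;
CREATE FUNCTION bob.fnc_common_format_money(numeric)
  RETURNS text
  LANGUAGE sql AS
$$SELECT to_char($1, 'FM999,999,990.00')$$;

-- Schema switching: list a schema of specialized functions first,
-- so its versions win the search_path lookup.
SET search_path = verbose_format, "$user", public;
```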
You may not need a regime for dynamic settings any more. If you still do, extend your table test_config with a column username (or schemaname) to store a separate row for each user (schema), and adapt your code accordingly. Make it a bit simpler and safer while you are at it.

Code suggestions
For the "multiple users" scenario.
Major points
- Build one function per user setting. (Optional, but the rest of the advice applies in any case.)
- Build conditions into the trigger directly and call the trigger function only after relevant columns have been changed or on INSERT. (This way you do not have a reference to OLD in the trigger function, which would not work with the additional call ON INSERT.) The simple trigger only creates required functions and does not take care of orphaned versions.
- Parameter names inside the body of an SQL function require pg 9.2+.
- Use format() to inject (sanitized!) parameters into the CREATE FUNCTION statement, closing a possible security issue with SQL injection.
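A sketch of the format() approach inside a DO block (the schema name and format picture are illustrative; %I quotes identifiers and %L quotes literals safely):

```sql
DO
$do$
BEGIN
   EXECUTE format(
      $f$
      CREATE OR REPLACE FUNCTION %I.fnc_common_format_money(numeric)
        RETURNS text
        LANGUAGE sql AS
      $body$SELECT to_char($1, %L)$body$
      $f$
    , 'my_schema'            -- identifier, safely quoted by %I
    , 'FM999,999,990.00');   -- literal, safely quoted by %L
END
$do$;
```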