Use a SQLCLR UDT. This might could work, though it is unclear if it presents a net-gain as compared to the approach described above.
Yes, a SQLCLR UDT can have its comparison operators overridden with custom algorithms. This handles situations where the value is being compared to either another value that is already the same custom type, or one that needs to be implicitly converted. This should handle the range filter in a WHERE
condition.
With regards to sorting the UDT as a regular column type (not a computed column), this is only possible if the UDT is "byte ordered". Being "byte ordered" means that the binary representation of the UDT (which can be defined in the UDT) naturally sorts in the appropriate order. Assuming that the binary representation is handled similarly to the approach described above for the VARCHAR(50) column that has fixed-length segments that are padded, that would qualify. Or, if it was not easy to ensure that the binary representation would naturally be ordered in the proper way, you could expose a method or property of the UDT that outputs a value that would be properly ordered, and then create a PERSISTED
computed column on that method or property. The method needs to be deterministic and marked as IsDeterministic = true
.
Benefits of this approach are:
- No need for an "original value" field.
- No need to call a UDF to insert the data or to compare values. Assuming that the
Parse
method of the UDT takes in the P7B18
value and converts it, then you should be able to simply insert the values naturally as P7B18
. And with the implicit conversion method set in the UDT, the WHERE condition would also allow for using simply P7B18`.
Consequences of this approach are:
- Simply selecting the field will return the binary representation, if using the byte ordered UDT as the column datatype. Or if using a
PERSISTED
computed column on a property or method of the UDT, then you would get the representation returned by the property or method. If you want the original P7B18
value, then you need to call a method or property of the UDT that is coded to return that representation. Since you have to override the ToString
method anyway, that is a good candidate for providing this.
It is unclear (at least to me right now since I have not tested this part) how easy/difficult it would be to make any changes to the binary representation. Changing the stored, sortable representation might require dropping and re-adding the field. Also, dropping the Assembly containing the UDT would fail if used in either manner, so you would want to make sure that there was nothing else in the Assembly besides this UDT. You can ALTER ASSEMBLY
to replace the definition, but there are some restrictions on that.
On the other hand, the VARCHAR()
field is data that is disconnected from the algorithm so it would only require updating the column. And if there are tens of millions of rows (or more) then that can be done in a batched approach.
Implement the ICU library which actually allows for doing this alphanumeric sorting. While highly functional, the library only comes in two languages: C/C++ and Java. Which means you might need to either do some tweaks to get it to work in Visual C++, or there is the off chance that the Java code can be converted to MSIL using IKVM. There are one or two .NET side projects linked on that site that provide a COM interface that can be accessed in managed code, but I believe they have not been updated in a while and I have not tried them. The best-bet here would be to handle this in the app layer with the goal of generating sort keys. The sort keys would then be saved into a new sort column.
This might not be the most practical approach. However, it is still very cool that such an ability exists. I provided a more detailed walk-through of an example of this in the following answer:
Is there a collation to sort the following strings in the following order 1,2,3,6,10,10A,10B,11?
But the pattern being dealt with in that question is a bit simpler. For an example showing that the type of pattern being dealt with in this Question also works, please go to the following page:
ICU Collation Demo
Under "Settings", set the "numeric" option to "on" and all of the others should be set to "default". Next, to the right of the "sort" button, uncheck the option for "diff strengths" and check the option for "sort keys". Then replace the list of items in the "Input" text area with the following list:
P12B22
P7B18
P12B3
as456456hgjg6786867
P7Bb19
P7BA19
P7BB19
P007B18
P7Bb20
P7Bb19z23
Click the "sort" button. The "Output" text area should display the following:
as456456hgjg6786867
29 4D 0F 7A EA C8 37 35 3B 35 0F 84 17 A7 0F 93 90 , 0D , , 0D .
P7B18
47 0F 09 2B 0F 14 , 08 , FD F1 , DC C5 DC 05 .
P007B18
47 0F 09 2B 0F 14 , 08 , FD F1 , DC C5 DC 05 .
P7BA19
47 0F 09 2B 29 0F 15 , 09 , FD FF 10 , DC C5 DC DC 05 .
P7Bb19
47 0F 09 2B 2B 0F 15 , 09 , FD F2 , DC C5 DC 06 .
P7BB19
47 0F 09 2B 2B 0F 15 , 09 , FD FF 10 , DC C5 DC DC 05 .
P7Bb19z23
47 0F 09 2B 2B 0F 15 5B 0F 19 , 0B , FD F4 , DC C5 DC 08 .
P7Bb20
47 0F 09 2B 2B 0F 16 , 09 , FD F2 , DC C5 DC 06 .
P12B3
47 0F 0E 2B 0F 05 , 08 , FD F1 , DC C5 DC 05 .
P12B22
47 0F 0E 2B 0F 18 , 08 , FD F1 , DC C5 DC 05 .
Please note that the sort keys are structure in multiple fields, separated by commas. Each field needs to be sorted independently, so that presents another small problem to solve if needing to implement this in SQL Server.
Best Answer
(If you are using SQL Server 2012 or newer, please see @wBob's answer for a cleaner approach. The approach outlined in my answer below is only required if you are using SQL Server 2008 R2 or older.)
You don't need (or want) the thousands' separator when converting to
NUMERIC
, regardless if it is comma, period, or space, so just get rid of them first. Then convert the comma into a period / decimal and you are done:Returns:
For the sake of completeness, I should mention that I also tried:
SET LANGUAGE Greek;
Looking at various format styles for CONVERT, but nothing applies here.
The FORMAT function, but the input type must be a numeric or date/time/datetime value (that and it was introduced in SQL Server 2012, so not applicable to SQL Server 2008 R2 or older).
And nothing else seemed to work. I was hoping to find something more elegant than two
REPLACE
calls, but so far no such luck.Also, just to mention, while not a pure T-SQL solution, this can also be accomplished via SQLCLR. And, there is a pre-done function that does this in the SQL# library (that I wrote) named String_TryParseToDecimal. This function is available in the Free version, and works in every version of SQL Server starting with SQL Server 2005:
Returns: