In Arabic we have characters like ا (alef) and أ (alef with hamza).
Users write them interchangeably and we want to search them interchangeably. SQL Server treats them as separate characters. How can I make SQL treat them as the same character?
I thought to replace any أ (alef with hamza) with ا (alef) at insertion but we have a lot of alternatives in Arabic language not just ا (alef) and أ (alef with hamza).
I tried Arabic_CI_AS
and Arabic_CI_AI
but that doesn't solve the problem.
Here is a script to regenerate the issue:
CREATE TABLE [dbo].[TestTable] (
[ArabicChars] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
(
[ArabicChars] ASC
)
) ON [PRIMARY];
INSERT INTO TestTable values (N'احمد');
INSERT INTO TestTable values (N'أحمد');
SELECT *
FROM TestTable
WHERE ArabicChars like N'ا%';
The result is:
ArabicChars
احمد
(1 row(s) affected)
The desired result would be both of the rows we inserted.
Best Answer
i did few tests and i guess it is a work around but can get your job done, since SQL it self isn't helping much.
if you notice that the unicodes of these characters are close to each other
so between أ and ا , its from 1571 to 1575 or if you want to make sure you get every thing in between
make sure you include from 1569 to 1575
which are
So to make sure that you include every thing similar in your search you can use regular expressions
so in this case you get all characters between ء and ا which include all those between 1569 to 1575
so in this case if your table has
the query above will get them all.
but you will notice something funny
if you have your column as a primary key
you wont be able to insert these 2 records
because the ء,أ,إ are all to SQL are part of hamza which is ء
So if you run the query
it will show you
so to get the long story short
to SQL أ is not = to ا because its 2 different letters hamza and alefp
but ء = آ = أ = ؤ = إ = ئ
they are all Hamza ء