I have to develop a CMS which will support two Language English, Arabic. This CMS will be a sort of Article Publishing site. While designing & analysis i found that some articles are more than 8000 characters in length. My table has some column as
PageID int,
PageTitleEnglish nvarchar(200),
PageTitleArabic nvarchar(200),
PageDescEnglish nvarchar(500),
PageDescArabic nvarchar(500),
PageBodyEnglish nvarchar(max)
PageBodyArabic nvarchar(max)
If i keep PageBody as nvarchar(4000) then i a limited to 4000 characters and if i have to store Arabic version then i need 16000 bytes (As Arabic is Unicode and take 3 time more space then ASCII).
So i am only left with option of defining PageBody as nVarchar(max), This will have it downside from performance point of view. My actual question is if some data in PageBody column is less than 4000 characters will it MS SQL Store than data in inline column or separately in the database.
I looked for this on Google also but didn't find any relevant answer and how i can improve performance in such scenario.
Any suggestions for best practice for such design of multilingual CMS are welcome.
I need to Support Only two languages Arabic & English
Best Answer
An
nvarchar(max)
value will be stored "in-row" if it is short enough.The default behaviour can be modified using sp_tableoption, "large value types out of row" option. I wouldn't bother. The DB engine will manage this efficiently by itself.
As for design, there are several ways of doing this based on your model:
1. Separate tables
That is, you can split off the separate languages into different tables.
This allows table level collations rather than column level ones
It allows allows more rows per page and more chance of in-row LOB storage
PageParent
PageEnglish (note varchar may be OK here)
PageArabic
2. Separate rows
Or have a languageID column to support several languages.
This has the drawback that collation will be fixed for all languages which means poor sorting/filtering
PageParent
Page