By default, as documented in MSDN, if no length is specified for varchar it will default to 30 when using CAST or CONVERT and will default to 1 when declared as a variable.
To demonstrate, try this :
DECLARE @WithLength varchar(3),@WithoutLength varchar;
SET @WithLength = '123';
SET @WithoutLength = '123';
SELECT @WithLength,@WithoutLength
This is very dangerous, as SQL Server quietly truncates the value, without even a warning and can lead to unexpected bugs.
However, coming to the topic in question, in the given scenario, with or without length does not make any difference between the two statements and I am unable to see the 2 trailing spaces that you are talking about. Try this:
SELECT CONVERT(varchar(8), GETDATE(), 112)
UNION ALL
SELECT DATALENGTH(CONVERT(varchar(8), GETDATE(), 112))
UNION ALL
SELECT CONVERT(varchar, GETDATE(), 112)
UNION ALL
SELECT DATALENGTH(CONVERT(varchar, GETDATE(), 112))
You will notice that the DATALENGTH()
function returns 8 in both cases.
Raj
Randomly, one of these tables' identity values will fall behind, stopping any inserts from happening and we have no idea why.
Inserts are probably stopping because of an attempt to reuse an already existing unique value in the PRIMARY KEY, thus triggering the error like:
Msg 2627, Level 14, State 1, Line 27
Violation of PRIMARY KEY constraint 'PK__ID__1234'. Cannot insert duplicate key in object 'dbo.MyTable'. The duplicate key value is (8).
The statement has been terminated.
Note that if your IDENTITY
column does not have a UNIQUE
index or constraint, it is possible to reseed repeatedly and have many identical ID
values. You do not want to do that, of course.
I have not personally found an error that, in itself, would reseed the IDENTITY
value. Of course, it is possible to reset the SEED
to a range where there will soon be a conflict by running a reseed that is lower than the current seed:
DBCC CHECKIDENT( MyTable,RESEED, 7) WITH NO_INFOMSGS
It could be that code somewhere in one of the processes actually does a RESEED
on the table under some unusual circumstances.
(For example, this could be from a merge of two data sets, where the code reads the high value from one data set and after the import RESEEDs to the lower of the two high values that were merged.)
You should also read Martin Smith's post at: https://stackoverflow.com/questions/14146148/identity-increment-is-jumping-in-sql-server-database
Best Answer
The
date
type can't store partial dates, like only year, or only day and month.In one project I had a need to store incomplete dates and I used three separate nullable columns for
day (tinyint)
,month (tinyint)
andyear (smallint)
. I don't see in what cases you'd need to usevarchar
instead of integer types.In my case it was information about customer birth dates and quite often salespeople didn't know the year, but knew month or day. With this approach it is very flexible and allows the user to enter any bit of information he has. It was also important to know that certain part of the date is missing, such as "we don't know the exact birth date, but we know that it is in June".
Obviously, there were a bunch of custom checks in the application (not at the database level) trying to prevent entering things like 31st of June. In that particular project it was OK to perform the checks in the application. Most likely it would have been possible to implement them in T-SQL as well. In general, it is better to have constraints implemented at the database level, because applications come and go, but data remains.
It was more difficult to use these incomplete dates in queries as well. Different queries/reports required different approaches to dealing with the missing bits.
30K rows is not much at all and performance should not be a problem. So, I'd focus on smart parsing and validation of your Excel data and storing all information that you can extract.