> SQL Server 2008 R2 installed on a virtual server
IMO, the SQL Server VM should live with the rest of the VMs. If this device goes kaput (the Cybernetics website says nothing about the number of controllers in this model range, so presumably there's only one), you don't want all your eggs in one basket. Hopefully there are other, better protected devices available.
> The SQL Server is more read intensive, with data bulk loaded nightly like a data warehouse and the web app used for looking up and reporting on the data.
You haven't said how big the data warehouse is or what the load time SLA is, but I strongly suspect RAID 5/6 is not going to be appropriate for that, particularly since 7.2k RPM drives are involved. High drive latency combined with RAID 5/6 is two feet in the grave for write performance.
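To put rough numbers on it: a 7.2k RPM spindle is good for about 75-80 random IOPS, so 8 of them offer roughly 600 raw IOPS. A random write costs about 6 back-end I/Os under RAID 6 versus 2 under RAID 10, so the same 8 drives deliver on the order of 100 random write IOPS under RAID 6 against roughly 300 under RAID 10 (rule-of-thumb figures; controller cache will shift them, but not by an order of magnitude).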
The client can also look at making the bulk loading process more efficient; it's likely there are inefficiencies in this area (a sketch of common load options follows below).
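For instance, assuming the nightly load uses BULK INSERT (or bcp) - the table and file names here are hypothetical:

```
-- Hypothetical nightly load. TABLOCK enables minimal logging when the
-- database is in the SIMPLE or BULK_LOGGED recovery model and the
-- target is a heap, which cuts write volume considerably.
BULK INSERT dbo.FactStaging
FROM 'F:\Loads\fact_extract.txt'
WITH (
    TABLOCK,                -- table lock: prerequisite for minimal logging
    BATCHSIZE = 100000,     -- commit in chunks to keep the log manageable
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
```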
> - Move backups to a different SAN or network location, in order to free up the F drive for SQL use as well as for better DR.
> - Move log files and tempdb data files to the F drive.
Move the backups for DR: Yes, definitely. If there's space available on another device, use it.
Since the storage is all a big pool of RAID 6, the performance characteristics will be the same regardless of which logical volume is used, so I don't see any point in moving things around unless more space is needed.
> I am wanting to suggest the client reconfigure their SAN as follows:
Aside from moving the backups off this SAN, this configuration will probably make things worse:
Using RAID 1 for a single database log file is fine, but as soon as you have more than one log file, that's random access, and performance will nosedive.
Reads from and writes to data files were spread over 8 drives before; now the same load will land on 3 drives.
I would recommend using all 8 drives in one RAID 10 pool - the only RAID level that makes sense given the workload and the available hardware - which is only 1 drive less usable space than your proposed configuration. There's just not enough wiggle room to micromanage the physical pools here: the whole thing will need to be redone if the initial sizing is wrong, or if the sizing changes drastically over time. One pool gives far more flexibility.
Now having said all of this, depending on how big the data set is, it's usually a really good ROI to buy more RAM so that SQL Server needs to do fewer physical reads in the first place, thus increasing performance. But that won't help with the write workload, of course.
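If it helps to quantify the read/write balance before committing to a layout, a standard DMV query (nothing here is specific to your setup) shows the split and the IO stalls per file since the last restart:

```
-- Read/write counts and stall times per database file since startup.
SELECT DB_NAME(vfs.database_id)      AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms,
       vfs.io_stall_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY vfs.io_stall_write_ms DESC;
```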
On SQL Server 2005 and SQL Server 2012, I did the following:
```
USE tempdb;

-- Minimal repro table: nullable INT plus nullable DATETIME.
CREATE TABLE ImpTest
(
      ImpTestID   INT      NULL
    , ImpTestDate DATETIME NULL
);
GO

BULK INSERT ImpTest FROM 'C:\SQLServer\ImportTest.txt' WITH (FIELDTERMINATOR = ',');

SELECT * FROM ImpTest;

DROP TABLE ImpTest;
```
With the following data:
```
1,
2,20130406
3,NOT_DATE
4,
5,20130409
```
On both servers, I received the following:
```
Msg 4864, Level 16, State 1, Line 1
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 3, column 2 (ImpTestDate).
```
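Incidentally, if the goal were to load the clean rows and quarantine the offenders rather than fail the batch, BULK INSERT can be told to tolerate errors; a sketch reusing the repro above (the error-file path is a placeholder):

```
-- Load what converts cleanly; divert failing rows to an error file.
-- MAXERRORS raises the abort threshold (the default is 10).
BULK INSERT ImpTest
FROM 'C:\SQLServer\ImportTest.txt'
WITH (
    FIELDTERMINATOR = ',',
    MAXERRORS = 100,
    ERRORFILE = 'C:\SQLServer\ImportTest.err'  -- must not already exist
);
```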
Perhaps, as Aaron noted, this is a bug that needs to be addressed through http://connect.microsoft.com.
I'm wondering about some of the details of your system: locale settings, collations, the physical format of the import file (is it from a Unix system?), and so on.
Best Answer
The Circuitous Path to Learning (i.e. a semi-sequential log of what happened)
Sending a `BULK INSERT` query - referencing the z:\ path of my workstation - via the Python script returns an error. The query is a variation of:

The error was:

Switching to the S:\-based path gives me:
Hmm, okay. I'll follow my own advice here and enable read and execute permission for the service accounts on the path specified.
Then if I specify an error file via the `ERRORFILE = '{}'` parameter, I may get:

I cannot specify an "append" mode. As the MS documentation points out (if I bother to read it), it simply returns an error if the file already exists. That's frustrating. Okay, I say, I'll just delete the file (and the file.err.Error.txt that is also created) and re-run.
So...write access too then? Silly me. Do this again for write access.
About now I hit an error suggesting that I didn't have access for `BULK INSERT` operations, so I had to enable that feature.
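What that "feature" amounts to, as far as I can tell, is the server-level bulk load permission; a minimal sketch, with the login name as a placeholder:

```
-- Server-level permission needed to run BULK INSERT
-- (equivalent to membership in the bulkadmin fixed server role).
GRANT ADMINISTER BULK OPERATIONS TO [DOMAIN\svc_app];

-- Older-style alternative:
EXEC sp_addsrvrolemember @loginame = 'DOMAIN\svc_app', @rolename = 'bulkadmin';
```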
Solving that, I still get:

eID is an auto-increment primary key that I'm trying to add to my data set in a staging table. I tried dropping the table and re-creating it with the eID column last. That doesn't appear to help.
I suspect that something is wrong with my row terminator. Expanding the field size of the 12th column, immediately before eID, complains about
Playing around with the `ROWTERMINATOR` option, I settle on, in Python, `ROWTERMINATOR = '{}'.format(chr(10))`.
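As an aside, `ROWTERMINATOR` also accepts hex notation (version permitting), which avoids embedding a raw control character in the query text; a sketch, with the table and path as hypothetical stand-ins:

```
-- '0x0a' is a bare line feed (Unix-style line endings),
-- expressed in hex so the query text stays printable.
BULK INSERT dbo.StagingLoad
FROM 'S:\data\extract.txt'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '0x0a'
);
```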
Still problems. Maybe if I add the `KEEPIDENTITY` keyword...

That's weird, I say to myself - why would you need `ALTER TABLE` for `KEEPIDENTITY`? Then I look up `KEEPIDENTITY` again, and realize it's basically the opposite of what I want. I want a "`GENERATEIDENTITY`" instead, but that's not an option. So one option is to go with an XML format file where I specify that the eID column is not coming from the imported data. Nah, I'll try a view instead.

Once the view was written, with all columns - minus eID - in the same order as the underlying table, I could execute the following:
which generates, e.g.:
I have to police the directory for error dump files and remove them before the SQL call, or I need to name my error files `"{}.err".format(gmtime())` and trash them manually, but frequently. I had intended to use the system temp directory via `ERRORFILE = '%TEMP%/file.err'`. But that gives:

So despite SQL Server resolving Linux-style path slashes, it can't handle a simple environment variable expansion. Maybe I'll hard-code my server's temp directory path into my Python code so I don't have to do any manual purges of my data directory. An alternative would be to create a cross-drive junction/symlink that points to the system temp directory, but that seems to be asking for trouble.
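For reference, a minimal sketch of the view-over-staging-table pattern described above - all table, view, and column names are hypothetical stand-ins, not the actual elided statements:

```
-- Staging table with the auto-increment key the flat file doesn't supply.
CREATE TABLE dbo.Staging
(
    eID  INT IDENTITY(1,1) PRIMARY KEY,
    Col1 VARCHAR(50) NULL,
    Col2 VARCHAR(50) NULL   -- ...remaining file columns, in file order
);
GO

-- View exposing every column except eID, in the same order as the file.
CREATE VIEW dbo.StagingLoad
AS
SELECT Col1, Col2
FROM dbo.Staging;
GO

-- Bulk load targets the view, so eID is generated for each row.
BULK INSERT dbo.StagingLoad
FROM 'S:\data\extract.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a');
```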