I'm going to highlight one of the products listed in the wiki link provided by Jack Douglas.
Galera is a product I've been hearing a lot about recently, but I haven't yet had cause to implement it in a production environment.
Back in July, Percona wrote a brief blog about implementing it, but mentioned there was no 'production' release at that time.
They've since released the first major version, v1.0, and Vladimir has been shedding more light on it this month, along with some benchmarks.
Given your write-heavy environment, I'll highlight this use case:
Distributing writes across the cluster will harness the CPU power in slave nodes for better use to process client write transactions. Due to the row based replication method, only changes made during a client transaction will be replicated and applying such a transaction in slave applier is much faster than the processing of the original transaction. Therefore the cluster can distribute the heavy client transaction processing across many master nodes and this yields in better write transaction throughput overall.
Disclaimer: I'm not affiliated with Galera or Percona, and have not used it personally...yet.
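For a sense of what setting it up involves, here's a minimal configuration sketch (assuming MySQL patched with wsrep; the cluster name, node name, and addresses are hypothetical placeholders):

```ini
# my.cnf fragment -- node names and addresses are invented for illustration
[mysqld]
binlog_format=ROW                        # Galera requires row-based replication
wsrep_provider=/usr/lib/libgalera_smm.so # path to the Galera library
wsrep_cluster_name=my_cluster
wsrep_cluster_address=gcomm://192.168.0.1,192.168.0.2,192.168.0.3
wsrep_node_name=node1
innodb_autoinc_lock_mode=2               # needed for parallel transaction applying
```

Each node lists its peers in `wsrep_cluster_address`; writes to any node are replicated to the rest.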
If your server has only 4GB of RAM, AWE isn't going to help you much. AWE is designed for servers with more than 4GB of RAM that you need to address.
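For reference, this is roughly what enabling it looks like (a sketch, not a recommendation for a 4GB box; the option applies to SQL Server 2005/2008 and was removed in SQL Server 2012, and the service account also needs the "Lock Pages in Memory" right):

```sql
-- 'awe enabled' is an advanced option, so expose those first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;
RECONFIGURE;
-- A restart of the SQL Server service is required for AWE to take effect
```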
Timeouts are harder to define. The timeout period itself is likely defined in your application, so it's not inherently SQL Server that's causing the timeouts. However, what you're running into are queries that take too long to complete. To troubleshoot this, I would first start by reviewing the wait stats to identify specifically what is causing things to drag in your SQL Server. Some waits to look for:
- BACKUPTHREAD means your queries are waiting on backup operations to complete. Other backup-related waits could also indicate problems.
- Memory pressure could show up as any number of waits, but look for things like RESOURCE_SEMAPHORE or IO waits. Also, check the trend on your Page Life Expectancy in Perfmon (SQL Server: Buffer Manager -> Page Life Expectancy). If this continually trends low, your pages are cycling out of memory too quickly.
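To see what the server is actually waiting on, you can query `sys.dm_os_wait_stats`; something like this surfaces the top waits by total wait time (the exclusion list of benign idle waits here is just a starting point, not exhaustive):

```sql
SELECT TOP 10
    wait_type
    ,wait_time_ms
    ,waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (
    'SLEEP_TASK', 'LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH',
    'WAITFOR', 'BROKER_TASK_STOP', 'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT')
ORDER BY wait_time_ms DESC;
```

Note these counters are cumulative since the last service restart, so look at trends rather than one snapshot.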
Another thing you can do is check how long your log shipping backups are taking to complete:
SELECT
    database_name
    ,backup_finish_date
    ,DATEDIFF(ss, backup_start_date, backup_finish_date) AS SecondsToBackup
FROM
    msdb.dbo.backupset
WHERE
    type = 'L'
ORDER BY
    backup_finish_date DESC;
As for resolution, if your log shipping is all happening at the same time, you might want to stagger it. If it isn't necessary to log ship every 5 minutes, I'd go with your 15-minute interval and offset each database by 5 minutes so that they're running separately.
The other thing to consider is a hardware upgrade. The box you're describing sounds fairly out of date, considering there are laptops you can purchase now with more horsepower than what you're describing. If you can move the server to a virtual machine, you'll be able to immediately upgrade to better resources, which could resolve these issues for you.
Best Answer
As always, it depends on a lot of factors, but I'll answer like this:
If the SSIS package takes only the deltas from the last run and performs well, I'd keep using SSIS, since it's something that is already working and that I assume you're familiar with.
Log shipping and replication will work, but they have drawbacks. For instance, with log shipping you will have to disconnect all of your users when restoring the log backups on the target server.
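To illustrate that drawback: each scheduled restore on the secondary requires exclusive access to the database, so connected users are disconnected before it runs. A hypothetical restore step (database name and file paths are invented for illustration):

```sql
-- Any sessions in MyDatabase must be dropped before this can run.
-- WITH STANDBY leaves the database read-only between restores,
-- so users can query it again until the next restore kicks them off.
RESTORE LOG MyDatabase
    FROM DISK = N'\\backupshare\logship\MyDatabase.trn'
    WITH STANDBY = N'C:\standby\MyDatabase_undo.dat';
```

If the target only needs to be readable, WITH STANDBY is what makes that possible; WITH NORECOVERY keeps it inaccessible between restores.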