What is the best practice in order to have this scenario done?
It depends on how much data you are replicating, i.e.:
- Are you replicating selected columns/tables, or the entire database (all tables)?
- Are both nodes located in the same domain or in different domains?
- Are you replicating within the same region or across regions (US-UK, etc.)?
I have implemented transactional replication (T-Rep) both ways. Where the amount of data to replicate was small, I used the same server as both publisher and distributor. Where we had massive amounts of data to push down to subscribers, I put the distribution database on a separate server that does all the heavy lifting of delivering the data to subscribers.
You have to consider factors like:
- Time taken to generate the snapshot and apply it to the subscribers.
- Feasibility of re-initializing the articles when a major change (e.g. a schema change) is made to the tables involved in replication.
Should I create 2 distributors?
You can use the same distribution database. However, for ease of maintenance and better performance (less contention from both writing to and reading from the distribution database), I would highly recommend using separate distribution databases.
Remember that the distribution database is the heart of replication, so it requires proper maintenance, backups, etc. If you have just one distribution database supporting multiple publishers and a disaster happens, restoring it from a previous backup will impact ALL publishers.
From BOL:
In many cases, a single distribution database is sufficient. However, if multiple Publishers use a single Distributor, consider creating a distribution database for each Publisher. Doing so ensures that the data flowing through each distribution database is distinct.
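To make the restore impact concrete, here is a minimal sketch of the "blast radius" of restoring one distribution database. The publisher and database names (`PUB_A`, `dist_pub_a`, etc.) are made up for illustration, and this is plain Python bookkeeping, not anything SQL Server itself runs.

```python
# Hypothetical publisher -> distribution database assignments (names are invented).
shared = {"PUB_A": "distribution", "PUB_B": "distribution", "PUB_C": "distribution"}
separate = {"PUB_A": "dist_pub_a", "PUB_B": "dist_pub_b", "PUB_C": "dist_pub_c"}


def publishers_affected_by_restore(assignments, restored_db):
    """Return the publishers whose replication metadata lives in restored_db."""
    return sorted(pub for pub, db in assignments.items() if db == restored_db)


# One shared distribution database: restoring it touches every publisher.
print(publishers_affected_by_restore(shared, "distribution"))  # ['PUB_A', 'PUB_B', 'PUB_C']

# One distribution database per publisher: only that publisher is touched.
print(publishers_affected_by_restore(separate, "dist_pub_a"))  # ['PUB_A']
```

With separate distribution databases, a restore only sets back the one publisher that uses the restored database.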
Lastly, some good references that will help you:
Deep Dive on Initialize from Backup for Transactional Replication
Replicating Non-Clustered Indexes Improves Subscriber Query Performance
Follow the Data in Transactional Replication - Whitepaper
Troubleshooting Transactional Replication
Scaling Out the Distribution Database
There are locks taken on the subscriber during data transfer which can cause blocking on the subscriber, so you can definitely encounter issues there (there are options to modify this behaviour, at the risk of not being able to roll back failed transactions).
However, there aren't any other magical locks or connections between the servers.
Also, when the publisher or distributor goes down, I've never seen the insert fail to stop as well and hold on to its locks (aside from standard rollback time). I'd hesitate to say it's impossible, but it would be abnormal and I haven't seen it.
The much more common scenario is that the subscriber goes down, log records accumulate on the publisher and fill the disk, and your publisher goes down. A large log file and monitoring of backups, space, and the replication itself are essential.
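As a rough way to size that risk, the following sketch estimates how long you have before the log drive fills. It assumes a steady log growth rate and that nothing gets truncated in the meantime. The function name and the numbers are only illustrative, so plug in figures from your own monitoring.

```python
def hours_until_log_disk_full(free_gb, log_growth_gb_per_hour):
    """Hours before the log drive fills, assuming steady growth and no truncation."""
    if log_growth_gb_per_hour <= 0:
        raise ValueError("log growth rate must be positive")
    return free_gb / log_growth_gb_per_hour


# Hypothetical numbers: 200 GB free on the log drive, log growing by 8 GB per hour.
print(hours_until_log_disk_full(200, 8))  # 25.0
```

If that number is shorter than the time it realistically takes you to notice and fix a dead subscriber, you need more log space, tighter alerting, or both.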
Best Answer
As long as you can access that folder from all the nodes of the cluster, there shouldn't be any issue.
Just make sure you have a decent network connection between the cluster nodes and the replication box, otherwise the generation of the snapshots will take a long time.
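For a back-of-the-envelope feel for how much the link matters, here is a small sketch that estimates how long it takes to move snapshot files across the network. The `efficiency` factor (share of nominal bandwidth you actually get) and the example sizes are assumptions of mine, and it ignores the time the snapshot agent spends bulk-copying data out of the tables.

```python
def snapshot_transfer_minutes(snapshot_gb, link_mbps, efficiency=0.7):
    """Rough minutes needed to move snapshot files over a network link.

    efficiency is an assumed fraction of the nominal bandwidth actually achieved.
    """
    if snapshot_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, bandwidth, or efficiency")
    megabits = snapshot_gb * 8 * 1000  # GB -> megabits, decimal units
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 60


# Hypothetical 50 GB snapshot over a 1 Gbps link vs. a 100 Mbps link.
for mbps in (1000, 100):
    print(mbps, "Mbps:", round(snapshot_transfer_minutes(50, mbps), 1), "minutes")
```

The estimate scales linearly, so a link that is ten times slower means a snapshot that takes roughly ten times longer to land in the folder.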