I do not understand why the failover to DC2 has to be done manually (even if other parts have to be handled manually: one item less on your to-do list in case of a major failure is always a good thing!).
In general, my feeling is that there are conceptual flaws in your setup.
Here is how I would do it and why.
- I would not have manual failover. It is better to have slow access than none. What will happen in the current configuration is that if the primary fails, there will be a tie and therefore the whole set will enter secondary state, effectively turning the cluster into read-only mode. So even when everything else is fine in DC1 and there is no need to fail over to DC2, a failing primary will be a show stopper. With this setup, you are artificially creating a single point of failure, effectively going against the whole idea of a replica set, let alone a multi-DC setup. Sounds like a Very Bad Idea™ to me. Automatic failover, even to DC2, sounds like a better idea. Slower reads and (depending on your write concern) slower writes are still better than read-only mode.
- I would have a third datacenter with only one instance: an arbiter. An arbiter can easily run on a micro machine, as it will only be called on in case of an election, and an election is a cheap task in terms of RAM and computation power. The arbiter helps the set to always have a majority: if one DC gets disconnected for whatever reason, the other DC and the arbiter will still form a majority. So if one DC goes down, you only have to worry about the other parts of your application, and you don't have to wait for a manual failover (see the sketch after this list).
- I am pretty sure that automatic failover for the other parts of your application can be achieved with some time and effort. Especially if you store all data in MongoDB and have some sort of session replication available, it should be quite easy. Whether implementing automatic failover is worth the effort is pretty easy to calculate: take your average downtime and find out how big the losses caused by this downtime are in terms of money and customer satisfaction (if applicable). If the cost of implementing automatic failover is below or equal to that, go for automatic failover. I can help you with that if needed.
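To make this concrete, here is a minimal sketch in Python with pymongo. The hostnames, set name and priorities are made up for illustration and are assumptions, not your actual environment. The first part initiates a set with one data-bearing member per DC plus an arbiter in the third DC; the second part shows a client connection where the driver discovers the topology and switches to a new primary on its own after an election, with a majority write concern as discussed above.

```python
from pymongo import MongoClient

# --- one-time setup: initiate the replica set --------------------------------
# Connect directly to the not-yet-initiated member in DC1 (hypothetical hosts).
setup = MongoClient("mongodb://db1.dc1.example:27017", directConnection=True)

rs_config = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db1.dc1.example:27017", "priority": 2},  # preferred primary, DC1
        {"_id": 1, "host": "db2.dc2.example:27017", "priority": 1},  # failover target, DC2
        {"_id": 2, "host": "arbiter.dc3.example:27017",              # tie-breaker only, DC3
         "arbiterOnly": True},
    ],
}
setup.admin.command("replSetInitiate", rs_config)

# --- application side: let the driver handle failover ------------------------
# Listing both data-bearing members plus the set name lets the driver follow
# the topology and reconnect to whichever member is primary after an election.
client = MongoClient(
    "mongodb://db1.dc1.example:27017,db2.dc2.example:27017/"
    "?replicaSet=rs0&w=majority&readPreference=primaryPreferred"
)
client.mydb.orders.insert_one({"status": "ok"})  # acknowledged by a majority
```

With a layout like this, a failed primary only means slower writes until the election finishes; nobody has to log in and promote anything by hand.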
Your setup is plainly wrong.
First, what sense does it make to have automatic failover when your storage system or the connection to it creates a single point of failure? And if you have a storage system which eliminates every single point of failure (redundancy in power, network interfaces plus network infrastructure, RAID controllers, main boards and the corresponding RAM), it would be much more expensive than setting up two simple boxes (plus a virtualized arbiter) in a replica set. And you would still have only the same advantage as the much simpler and easier solution.
Next, the backup issue. Granted, SAN snapshots are probably the easiest way to create backups, and they are pretty fast. However, there will be an interruption in the service, however short it may be, when you snapshot the only instance of the data.
Third, a situation dreaded in all high availability scenarios: the split-brain situation. How would you deal with that when both data-bearing nodes try to access the same data set? One node would hold the lock on the data but be demoted to secondary by an election, while the newly elected primary could not get hold of the data set because the other instance still holds the lock, so it would have to step down again. Let me note that this scenario could only very theoretically take place, namely if you started a second node by means of HALinux or something similar with a very strange setup. You would have to start dealing with something like STONITH, which comes with its very own and sometimes rather delicate problems, further increasing the most expensive resource you have: continuous administrative costs. A classical HA setup requires 24/7 monitoring and response times measured in seconds rather than minutes. MongoDB's failover capabilities are, when set up correctly, reliable and usually work without the need for manual intervention, except in rare edge cases. Edge cases of a failover, that is. So the edge cases of already rare edge cases.
Having said that: no, what you want is not possible with MongoDB out of the box. The reason is that a lock file is created in the data directory, preventing a second server from accessing the same data. So what you would have to do is set up some HALinux, remove that lock file before firing up the second server and then bring up the IP address. This comes with some other serious drawbacks on the MongoDB side, however. The data may be inconsistent, as the first server might not have been able to flush its data to the data files. Thinking about it twice, it may well be that the lock is only checked during startup, which could actually lead to the two servers flushing their data onto the same data set, resulting in data FUBAR.
TL;DR: Use MongoDB (or any other tool, for that matter) within its intended environment and for its intended purpose. Don't do it as you originally planned. It is neither worth the effort nor does it give you any advantage beyond saving relatively cheap disk space, which is nothing compared to the costs your planned approach would create. Don't take chances with your production data.
Best Answer
So, assuming that node(n) is a physical box and host(n) is a mongod process... if the link between the nodes fails, nothing cool will happen :).
node1 will continue operating as normal, and node2 will just sit there asking wtf. The regular secondary will only have 2/4 votes, which is not a strict majority.
If node1 fails, then you will have to manually reconfigure the secondary on node2 to become the primary.
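For illustration, here is a minimal sketch of that manual step with pymongo, assuming the surviving data-bearing secondary is host3 and an arbiter host4 also lives on node2 (both hostnames are hypothetical); in the mongo shell the equivalent is rs.reconfig(cfg, { force: true }). A forced reconfig is a last resort, precisely because of the rollback scenario described next.

```python
from pymongo import MongoClient

# Connect directly to the surviving secondary on node2 (hypothetical hostname).
client = MongoClient("mongodb://host3.node2.example:27017", directConnection=True)

# Fetch the current replica set configuration.
cfg = client.admin.command("replSetGetConfig")["config"]

# Drop the unreachable members on node1, keeping only what still runs on node2.
cfg["members"] = [m for m in cfg["members"] if ".node2." in m["host"]]
cfg["version"] += 1

# force=True lets a non-primary apply the new config even though the old
# majority is gone; the remaining members can then elect a primary.
client.admin.command("replSetReconfig", cfg, force=True)
```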
When node1 comes back online, the previous primary will rejoin as a secondary; it will detect any writes that never replicated and perform a rollback, which will require manual intervention.
Keeping in mind that you need a strict majority to elect a new primary, I would try putting the arbiters on their own system that is independent of the two nodes.
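To see why that helps, you can check how the votes add up: with the arbiter(s) on an independent box, losing either of the two nodes still leaves a strict majority of the votes, so an election can succeed without anyone stepping in. A small sketch (hypothetical hostname):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://host1.node1.example:27017", directConnection=True)
cfg = client.admin.command("replSetGetConfig")["config"]

# Each member has one vote unless configured otherwise.
votes = sum(m.get("votes", 1) for m in cfg["members"])
majority = votes // 2 + 1
print(f"{votes} votes in total, {majority} needed to elect a primary")
```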