We have a four-node cluster (SQL Server 2014 on Windows Server 2012), and we recently set up a heartbeat network on the nodes, following a best-practices source, after experiencing some cluster instability issues.
Health checking goes across any available interface. Some interfaces have lower or higher metrics than others, but the loss of a single interface should be handled as long as other interfaces are available. Just for future reference, there is no dedicated heartbeat network anymore; I believe that has been the case since 2008 came out.
Ever since, a couple of the secondary nodes have been returning multiple IPs in DNS.
You've found the issue, duplicate IPs on the network. Look at all of the adaptors and figure out which are duplicated... or better yet, run the cluster validation wizard and have it do the heavy lifting for you!
We have removed the entries from the DNS server for both nodes multiple times, but the IPs of the heartbeat network (which are supposed to be non-routable) keep popping back up. DNS ultimately returns two IPs for the same node, and this is causing issues with our backups.
This doesn't make much sense to me. If the underlying issue is duplicated IP addresses (assuming IPv4), then the fix shouldn't be deleting records from DNS. That just delays the inevitable, because the adapters will register with DNS again at some point in the future. The fix would be to identify which adaptors and interfaces are improperly configured.
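As a rough illustration of what "identify which adaptors are duplicated" means in practice, here is a minimal Python sketch. It assumes you have already collected each node's adapter addresses by hand (for example from `ipconfig /all` on every node) into a dictionary. The node names, adapter names, and addresses used with it are hypothetical, and this is not a substitute for the cluster validation wizard.

```python
from collections import defaultdict
import ipaddress


def find_duplicate_ips(inventory):
    """Return {ip: [(node, adapter), ...]} for every IP bound more than once.

    `inventory` is a hand-built mapping (an assumption of this sketch):
        {node_name: {adapter_name: [ip_string, ...]}}
    """
    seen = defaultdict(list)
    for node, adapters in inventory.items():
        for adapter, ips in adapters.items():
            for ip in ips:
                # Parsing validates the address and gives a canonical string.
                canonical = str(ipaddress.ip_address(ip))
                seen[canonical].append((node, adapter))
    # Keep only addresses that show up on more than one adapter.
    return {ip: owners for ip, owners in seen.items() if len(owners) > 1}
```

Calling `find_duplicate_ips` on an inventory in which two nodes share the same heartbeat address returns that address together with the `(node, adapter)` pairs that carry it, which tells you where to start looking.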
Non-routable doesn't mean it can't register or otherwise talk with anything else. It obviously is getting registered in DNS and can talk to other servers.
So, yes, this seems to be working as configured, though it has been configured incorrectly.
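To see for yourself which addresses a node name currently resolves to, something like the following Python sketch can help. It is only an illustration: the host name and the heartbeat subnet you pass in are your own values, and it should be run from another machine (such as the backup server), since a node resolving its own name may answer from local configuration rather than from DNS.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def split_by_subnet(addresses, heartbeat_subnet):
    """Split addresses into (inside heartbeat_subnet, outside heartbeat_subnet)."""
    net = ipaddress.ip_network(heartbeat_subnet)
    inside = [a for a in addresses if ipaddress.ip_address(a) in net]
    outside = [a for a in addresses if ipaddress.ip_address(a) not in net]
    return inside, outside
```

If `split_by_subnet(resolve_ipv4(name), subnet)` returns a non-empty first list for a node, the heartbeat adapter on that node is still registering itself, whatever was deleted from the DNS server by hand.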
I have checked the settings on all four nodes and they are the same in the network adapter properties. What am I doing wrong?
My guess is there is a duplicate IP address (tongue-in-cheek). Give the cluster validation wizard a run and choose the networking tests; that should give some actionable output.
PS: In addition, I have also changed some settings in the TCP/IPv4 properties, such as removing the DNS server addresses and unchecking the "Register this connection's addresses in DNS" option.
That's probably not a good thing to do and could have undesirable side effects... like the cluster not working at all. Again, you know the root issue, so instead of managing the symptoms, fix the root cause.
Best Answer
I have come across this issue in my dev environment, usually when I skip the network binding warning during cluster setup. When you run a cluster validation, do you receive any warnings or errors on the network? Anyway, this link fixed my issue. @Amr provided the solution to the issue: https://social.technet.microsoft.com/Forums/ie/en-US/c77c0b69-1f9d-4467-a0dd-6844e87e2d13/cluster-name-failed-to-update-the-dns-record?forum=exchange2010
Cause:
The cluster name resource record, which was added to DNS before the active/passive cluster (or any other type) was set up, needs to be updated by the physical nodes on behalf of the resource record itself. When the active node owns the resources, it wants to update the A record in the DNS database, but the DNS record that was created does not allow any authenticated user to update the DNS record with the same owner name.
Solution:
Delete the existing A record for the cluster name and re-create it, making sure to select the box that says “Allow any authenticated user to update DNS record with the same owner name”. Don’t worry about breaking anything; this has “ZERO” impact on the cluster. Simply delete the A record and re-create it as suggested here.
http://amradmin.wordpress.com/2011/01/27/event-id-1196-1119-dns-operation-refused-cluster-servers/