I am configuring Always-on for SQL Server 2017 RC1 (14.0.80.90, dated 2017-7-18) for Linux as per https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-availability-group-configure-ha. This install is using docker images, all on the same physical host. All the steps are working until I get to the step:
CREATE AVAILABILITY GROUP [ag1]
WITH (DB_FAILOVER = ON, CLUSTER_TYPE = EXTERNAL)
FOR REPLICA ON
N'always-onA'
WITH (
ENDPOINT_URL = N'tcp://always-onA:5022',
AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
FAILOVER_MODE = EXTERNAL,
SEEDING_MODE = AUTOMATIC
),
N'always-onB'
WITH (
ENDPOINT_URL = N'tcp://always-onB:5022',
AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
FAILOVER_MODE = EXTERNAL,
SEEDING_MODE = AUTOMATIC
),
N'always-onC'
WITH(
ENDPOINT_URL = N'tcp://always-onC:5022',
AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
FAILOVER_MODE = EXTERNAL,
SEEDING_MODE = AUTOMATIC
);
And I get the error message:
Msg 35237, Level 16, State 1, Line 2
None of the specified replicas for availability group ag1 maps to the instance of SQL Server to which you are connected. Reenter the command, specifying this server instance to host one of the replicas. This replica will be the initial primary replica.
I have reviewed the release notes at https://docs.microsoft.com/en-us/sql/linux/sql-server-linux-release-notes to insure that there are no known issues, and that this feature should be supported at this time.
Within each docker container, the names "always-onX" resolve:
> root@5194403487fe:/# ping always-onA
> PING always-onA (172.17.0.10): 56 data bytes
> 64 bytes from 172.17.0.10: icmp_seq=0 ttl=64 time=0.125 ms
> ^C--- always-onA ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 0.125/0.125/0.125/0.000 ms
> root@5194403487fe:/# ping always-onB
> PING always-onB (172.17.0.11): 56 data bytes
> 64 bytes from 172.17.0.11: icmp_seq=0 ttl=64 time=0.156 ms
> ^C--- always-onB ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 0.156/0.156/0.156/0.000 ms
> root@5194403487fe:/# ping always-onC
> PING always-onC (172.17.0.12): 56 data bytes
> 64 bytes from 172.17.0.12: icmp_seq=0 ttl=64 time=0.308 ms
> ^C--- always-onC ping statistics ---
> 1 packets transmitted, 1 packets received, 0% packet loss
> round-trip min/avg/max/stddev = 0.308/0.308/0.308/0.000 ms
SQL Server is also listening to port 5022 on each instance (this shows the same for all instances):
# netstat -alnp | grep 5022
tcp 0 0 0.0.0.0:5022 0.0.0.0:* LISTEN 9/sqlservr
Further, no additional IP's exist on the system that could be causing confusion:
# ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:ac:11:00:0a
inet addr:172.17.0.10 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::42:acff:fe11:a/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:4058 errors:0 dropped:0 overruns:0 frame:0
TX packets:2016 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2771303 (2.7 MB) TX bytes:241070 (241.0 KB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:470 errors:0 dropped:0 overruns:0 frame:0
TX packets:470 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1870657 (1.8 MB) TX bytes:1870657 (1.8 MB)
The names always-onA/B/C resolve both in the host file for the docker images, AND via DNS, so there shouldn't be any issue there as well:
# nslookup always-onA
Server: 192.168.1.1
Address: 192.168.1.1#53
Name: always-onA
Address: 172.17.0.10
And finally, the SQL Server logs show that the mirroring endpoint is listening for connections:
# docker logs always-onA | tail -2
2017-07-23 16:53:27.76 spid56 Server is listening on [ 0.0.0.0 <ipv4> 5022].
2017-07-23 16:53:27.76 spid56 The Database Mirroring endpoint is now listening for connections.
It seems that SQL server isn't able to cleanly identify that one of the configuration nodes in the availability group is in fact itself, and thus gets the error, but I don't see any way to debug this. I've tried using the following instead of the name "always-onA":
always-ona
127.0.0.1
localhost
0.0.0.0
The error is always the same. Looking for any thoughts on how to resolve this issue.
Best Answer
Resolved: The issue is that when creating the availability group, the node specifier, i.e.
N'name'
needs to use the EXACT name that is returned as per the"hostname"
command. Using any other identifier, even if it resolves to the correct IP, does not appear to work.