Ubuntu – Juju bootstrap fails with connection refused errors

14.04, cloud, ipmi, juju, maas

We have the following MAAS setup:
One node is set up as the server, with both the MAAS cluster and region controllers running on it. We added two nodes that are in a private virtual LAN with the server node. We brought the nodes into the 'Ready' state and installed juju on the server. Now when we run juju bootstrap, it says Attempting to connect to 10.10.10.104 and fails after 10 minutes with a connection refused error. 10.10.10.104 is one of our nodes in the private VLAN, and it was already in MAAS.

My suspicion is that the node is in the 'Ready' state, so no OS is installed on it yet, while juju is attempting to connect to it. Naturally it cannot connect, since MAAS collects all the information it needs from the nodes during PXE boot and then shuts the machines down.

juju wants to install an OS on the nodes, but the machines are not up.

PS: Our power type is IPMI.

EDIT: On running juju bootstrap --debug, we see a slew of these messages:

2014-10-12 02:50:58 DEBUG juju.utils.ssh ssh_openssh.go:122 running: ssh -o "StrictHostKeyChecking no" -o "PasswordAuthentication no" -i /root/.juju/ssh/juju_id_rsa -i /root/.ssh/id_rsa ubuntu@slot13.maas /bin/bash

And after 10 minutes, it now fails with:

waited for 10m0s without being able to connect: /var/lib/juju/nonce.txt does not exist

Best Answer

In my case the power type was wakeonlan, and I had to power the box up manually so that it PXE-booted after I issued the juju bootstrap command. On bootstrap, Juju acquires the node in MAAS and deploys the OS using whichever installer you have configured for the node (Debian or Fast). This installation takes a long time and does not finish within 10 minutes. So add a bootstrap-timeout: 1800 entry to your environments.yaml file (or any value greater than 600 seconds that suits your system).
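As a rough sketch, the entry could sit in the MAAS environment section of environments.yaml in the Juju home directory (apparently /root/.juju here, going by the key paths in the debug output). The environment name maas, the server URL and the API key below are placeholders I am assuming for illustration; only bootstrap-timeout comes from the advice above.

```yaml
# environments.yaml (Juju 1.x). Everything except bootstrap-timeout
# is a placeholder; substitute the values from your own setup.
default: maas                 # hypothetical environment name
environments:
  maas:
    type: maas
    maas-server: 'http://<your-maas-server>/MAAS/'   # placeholder URL
    maas-oauth: '<your-MAAS-API-key>'                # placeholder key
    # Seconds to wait for the bootstrap node to finish installing and
    # accept SSH. 1800 = 30 minutes; pick whatever your hardware needs.
    bootstrap-timeout: 1800
```

After saving the file, run juju bootstrap --debug again. The repeated ssh attempts in the debug output are expected while the OS is still being installed; they should stop once the node is up and reachable.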
