Sunday, April 16, 2017

OpenShift Origin installation issues and fixes


# Add an entry for the master node and each slave node in /etc/hosts (hostnames here: m1, c1, c2)
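Before running the playbook it can help to confirm that every hostname really has an entry. The following is a minimal sketch, not part of the installer: `missing_hosts` is a hypothetical helper name, and it only assumes the usual hosts-file layout of an address followed by one or more names.

```shell
# Print every hostname that has no entry in the given hosts file.
# missing_hosts is a hypothetical helper, not an OpenShift or Ansible command.
missing_hosts() {
    hosts_file="$1"; shift
    for h in "$@"; do
        # Drop comments, then look for the name in any field after the address.
        awk -v name="$h" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file" || echo "$h"
    done
}

# Lists whichever of m1, c1, c2 still need an entry; prints nothing if all are present.
missing_hosts /etc/hosts m1 c1 c2
```

Any name it prints should be added to /etc/hosts on every machine in the cluster before continuing.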

# setup OpenShift cluster
ansible-playbook byo/config.yml 

# the playbook failed with this error
fatal: [m1]: FAILED! => {"changed": false, "cmd": ["oc", "create", "-n", "openshift", "-f", "/usr/share/openshift/examples/image-streams/image-streams-centos7.json"], "delta": "0:00:00.181546", "end": "2017-04-16 23:59:08.477861", "failed": true, "failed_when_result": true, "rc": 1, "start": "2017-04-16 23:59:08.296315", "stderr": "Unable to connect to the server: x509: certificate signed by unknown authority", "stdout": "", "stdout_lines": [], "warnings": []}

# compare these two files - there are a few differences, most notably the master's IP address vs. its hostname

vimdiff /etc/origin/master/admin.kubeconfig  /root/.kube/config
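If a full vimdiff is more than needed, the master address recorded in each file can be pulled out directly. This is a rough sketch that assumes the kubeconfig files use the standard `server:` key; `kubeconfig_servers` is a hypothetical helper name.

```shell
# Print the API server URL(s) recorded in a kubeconfig file.
# kubeconfig_servers is a hypothetical helper, not an oc subcommand.
kubeconfig_servers() {
    grep -E '^[[:space:]]*server:' "$1" | awk '{print $2}'
}

# Show the server entries of both files so an IP vs. hostname mismatch stands out.
for f in /etc/origin/master/admin.kubeconfig /root/.kube/config; do
    if [ -r "$f" ]; then
        echo "$f:"
        kubeconfig_servers "$f"
    fi
done
```

If the two files list different addresses for the master, the copy under /root/.kube is probably left over from an earlier setup.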

# move the kube config file out of the way (a copy is kept in /tmp)
mv  /root/.kube/config  /tmp/

# setup OpenShift cluster again
ansible-playbook byo/config.yml 

Another error I ran into:

TASK [openshift_master : Start and enable master]
FAILED - RETRYING: TASK: openshift_master : Start and enable master (1 retries left).
fatal: [m1]: FAILED! => {"attempts": 1, "changed": false, "failed": true, "msg": "Unable to start service origin-master: Job for origin-master.service failed because the control process exited with error code. See \"systemctl status origin-master.service\" and \"journalctl -xe\" for details.\n"}

Checking with journalctl -xe shows the following error:
 http: TLS handshake error from read tcp4> read: connection reset by peer

Both issues can be resolved by removing /root/.kube/config and rebuilding the cluster by running ansible-playbook again.
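The fix can be wrapped in a small helper so the old file is kept as a timestamped backup rather than overwritten in /tmp. This is only a sketch: `backup_kubeconfig` is a hypothetical name, and the backup file naming is my own choice.

```shell
# Move a kubeconfig aside (if it exists) and print the backup location.
# backup_kubeconfig is a hypothetical helper; the backup directory defaults to /tmp.
backup_kubeconfig() {
    cfg="$1"
    backup_dir="${2:-/tmp}"
    [ -f "$cfg" ] || return 0          # nothing to do when the file is absent
    dest="$backup_dir/kube-config.$(date +%Y%m%d%H%M%S)"
    mv "$cfg" "$dest" && echo "$dest"
}

# Usage, as root on the master, from the directory that contains byo/config.yml:
#   backup_kubeconfig /root/.kube/config && ansible-playbook byo/config.yml
```

Keeping the backup makes it easy to diff the old file against the regenerated one afterwards.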