In case you have the questionable idea of renaming a hypervisor in your Proxmox cluster, you are going to feel some pain. (It won't work, and you will get scared about whether you fucked up your system landscape or not. Been there, done that.)
The only viable and reproducible approach I found was removing all cluster configuration from all HVs, rebooting them, then recreating the cluster on one HV and re-adding all the others.
read this, or continue at your own peril
To sum it up again, some notes before:
- tested with proxmox 4.4
- do all the next steps on all hosts
- rebooting all HVs afterwards is necessary. Maybe not strictly for every one, but at least the first node had to be rebooted
- the sqlite output, if any is shown, should only appear once
- working ssh between your HVs is necessary; errors or warnings will prevent you from re-adding nodes after the cluster recreation
- back up /etc/pve before doing anything, as you will lose all your VM configurations in the process, but these can be copied back afterwards.
- no guarantee that this post will cover everything
howto remove all clusterconfigs
Let's go: (this is completely paste-able)
# backup
cp -va /etc/pve /root
systemctl stop pvestatd.service
systemctl stop pvedaemon.service
systemctl stop pve-cluster.service
systemctl stop corosync
systemctl stop pve-cluster
pmxcfs -l
rm /etc/pve/corosync.conf
rm /etc/corosync/*
rm /var/lib/corosync/*
rm -rf /etc/pve/nodes/*
sqlite3 /var/lib/pve-cluster/config.db "select * from tree where name='corosync.conf'"
sqlite3 /var/lib/pve-cluster/config.db "delete from tree where name='corosync.conf'"
sqlite3 /var/lib/pve-cluster/config.db "select * from tree where name='corosync.conf'"
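Since the rm lines above are not reversible, it might be worth checking that the backup really matches /etc/pve before getting that far. The following is only a rough sketch: backup_matches is a made-up helper name, not a Proxmox tool, and it assumes the backup ended up in /root/pve, which is where the cp line above puts it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): succeeds only if the two
# directory trees contain the same files with the same contents.
backup_matches() {
    src=$1
    dst=$2

    # Both sides must exist as directories, otherwise there is nothing to compare.
    [ -d "$src" ] && [ -d "$dst" ] || return 1

    # diff -r walks both trees, -q only reports whether files differ.
    # Exit status 0 means no differences were found.
    diff -rq "$src" "$dst" >/dev/null 2>&1
}
```

Something like `backup_matches /etc/pve /root/pve && echo "backup looks complete"` right after the cp line would then tell you whether it is safe to carry on.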
Check for error messages, then:
Recreate the cluster on the first HV: (or whichever one you see fit)
pvecm create CLUSTER-NAME
Then re-add all other HVs to your newly created cluster. From each of them, do:
# test ssh
ssh IP-OF-FIRST-HV
# if that works, add; else see below how to troubleshoot
pvecm add IP-OF-FIRST-HV
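The "test ssh first, then add" routine above could be wrapped up roughly like this. It is a sketch, not something from the Proxmox tooling: join_cluster is a hypothetical name, and the check is deliberately strict, so anything ssh prints (even a harmless login banner on stderr) counts as a reason to stop.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): join the cluster only if
# ssh to the first HV works without a password prompt, error or warning.
join_cluster() {
    first_hv=$1

    # BatchMode=yes makes ssh fail instead of asking for a password;
    # 'true' is a no-op on the remote side, we only care about the login.
    if ! ssh_out=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$first_hv" true 2>&1); then
        echo "ssh to $first_hv failed, not adding node:" >&2
        echo "$ssh_out" >&2
        return 1
    fi

    # Treat any output (for example a host key warning) as a reason to stop.
    if [ -n "$ssh_out" ]; then
        echo "ssh to $first_hv printed warnings, not adding node:" >&2
        echo "$ssh_out" >&2
        return 1
    fi

    pvecm add "$first_hv"
}
```

On each remaining HV, `join_cluster IP-OF-FIRST-HV` would then either add the node or show you the ssh message you need to deal with first.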
troubleshooting SSH issues
Adding nodes works best with key authentication. (I don't know whether I ever tried it without, to be honest, but I doubt it works.) In case you have reinstalled a node or something, try connecting via ssh from the host in question to your 'first' HV.
Read the error message closely, as known hosts are stored in
# in case you have trouble on a certain host
> /root/.ssh/known_hosts
> /etc/ssh/ssh_known_hosts
ssh-copy-id FIRST_HV
As said before, ssh errors or warnings won't let you add nodes to a cluster.
browser not working
Once you have completed the stuff above, close all browser tabs you had opened to access your cluster. Simply refreshing them does not seem to work.
finishing touches (fix your vms before you become stressed out)
When looking at the web GUI, you might become scared, as all your virtual hosts seem to be missing. This happens with VMs, and I guess the same happens with containers, too.
In fact, we worked on the Proxmox cluster filesystem, where Proxmox stores a lot of its settings, and which gets mounted at
Which happens to be stored completely under
/var/lib/pve-cluster/config.db as a sqlite3 database.
All file contents (the actual characters that get written into the config files), the inode of each file that will be created, and the folder structure are stored there.
Once your cluster is running, try colordiff to spot the exact differences (colordiff /root/pve /etc/pve to see the file contents).
Or a simple find /root/pve -iname "*conf" might also do.
Copy the configs back to their original locations, and everything should be fine.
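If the list of configs is long, something along these lines could show which files from the backup have not made it back yet. This is a sketch under assumptions: list_missing_confs is a hypothetical helper, and it only compares paths, so a renamed node directory will show up as "missing" even though the file may simply belong under the new name.

```shell
#!/bin/sh
# Hypothetical helper: print every *conf file that exists in the backup
# tree but not at the same relative path in the live tree.
# Pass both directories without a trailing slash.
list_missing_confs() (
    backup=$1
    live=$2

    find "$backup" -type f -iname '*conf' | sort | while IFS= read -r path; do
        # Strip the backup prefix to get the path relative to the tree root.
        rel=${path#"$backup"/}
        [ -e "$live/$rel" ] || printf '%s\n' "$rel"
    done
)
```

Called as `list_missing_confs /root/pve /etc/pve`, an empty result would suggest that everything is back in place.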
- change drbd on current node to 'active', if needed
- service pve-cluster stop
- pmxcfs -l
- start vm again (qm start <vm-id>)
To rebuild the cluster again:
- service pve-cluster start