Occasionally, internal problems in the self-hosted management VM cause the system that monitors its health to shut the VM down. The hard part is that the monitor shuts the VM down again immediately after it is started, which makes it impossible to fix the problem. To break this loop, put the system into global maintenance mode, which disables state tracking for this VM:
sudo hosted-engine --set-maintenance --mode=global
Then start the VM:
sudo hosted-engine --vm-start
Then connect to the VM to find and resolve the problem.
Do not forget to turn maintenance mode off afterwards:
sudo hosted-engine --set-maintenance --mode=none
You can check the system state with the following command (run it after each step to be sure the system is in the expected state):
sudo hosted-engine --vm-status
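The advice to check the status after each step can be sketched as a small wrapper. This is only an illustration: the he_step function and the HE_CMD variable are hypothetical names, not part of hosted-engine; only the hosted-engine options already shown above are used.

```shell
#!/bin/sh
# HE_CMD is an assumption of this sketch: set HE_CMD=echo to preview
# the commands instead of running them.
HE_CMD="${HE_CMD:-sudo hosted-engine}"

# Run one hosted-engine action, then print the VM status.
he_step() {
    $HE_CMD "$@" || return 1
    $HE_CMD --vm-status
}

# Example sequence (run manually, fixing the VM between steps):
#   he_step --set-maintenance --mode=global
#   he_step --vm-start
#   he_step --set-maintenance --mode=none
```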
ovirt-shell fails to start with a “No module named kitchen.text.converters” error:
# ovirt-shell
Traceback (most recent call last):
File "/usr/bin/ovirt-shell", line 9, in <module>
load_entry_point('ovirt-shell==3.1.0.7-SNAPSHOT', 'console_scripts', 'ovirt-shell')()
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 299, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 2229, in load_entry_point
return ep.load()
File "/usr/lib/python2.6/site-packages/pkg_resources.py", line 1948, in load
entry = __import__(self.module_name, globals(),globals(), ['__name__'])
File "/usr/lib/python2.6/site-packages/ovirtcli/main.py", line 20, in <module>
from ovirtcli.context import OvirtCliExecutionContext
File "/usr/lib/python2.6/site-packages/ovirtcli/context.py", line 18, in <module>
from cli.command import *
File "/usr/lib/python2.6/site-packages/cli/__init__.py", line 3, in <module>
from cli.context import ExecutionContext
File "/usr/lib/python2.6/site-packages/cli/context.py", line 27, in <module>
from cli.settings import Settings
File "/usr/lib/python2.6/site-packages/cli/settings.py", line 23, in <module>
from cli import platform
File "/usr/lib/python2.6/site-packages/cli/platform/__init__.py", line 5, in <module>
from cli.platform.posix.terminal import PosixTerminal as Terminal
File "/usr/lib/python2.6/site-packages/cli/platform/posix/terminal.py", line 24, in <module>
from cli.terminal import Terminal
File "/usr/lib/python2.6/site-packages/cli/terminal.py", line 17, in <module>
from kitchen.text.converters import getwriter
ImportError: No module named kitchen.text.converters
Reason: python-kitchen is not installed.
Solution:
Install python-kitchen from the EPEL repository:
yum install python-kitchen
The VM fails to start, and you see an error like this:
VM testVm is down. Exit message: internal error Failed to open socket to sanlock daemon: permission denied.
Possible reason: SELinux configuration problem.
Check the SELinux boolean values:
getsebool -a | grep virt
virt_use_comm --> off
virt_use_fusefs --> off
virt_use_nfs --> on
virt_use_samba --> off
virt_use_sanlock --> on
virt_use_sysfs --> on
virt_use_usb --> on
virt_use_xserver --> off
virt_use_sanlock and virt_use_nfs must both be on. If they are not, set them:
setsebool -P virt_use_sanlock=on
setsebool -P virt_use_nfs=on
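As a rough sketch, the check above can be automated by filtering the getsebool output for the two required booleans. The function name sebools_needing_fix is hypothetical, and the sketch assumes the "name --> value" output format shown above.

```shell
#!/bin/sh
# Read `getsebool -a` style output on stdin and print each required
# boolean (virt_use_sanlock, virt_use_nfs) whose value is not "on".
sebools_needing_fix() {
    awk '($1 == "virt_use_sanlock" || $1 == "virt_use_nfs") && $3 != "on" { print $1 }'
}

# Possible usage:
#   getsebool -a | sebools_needing_fix | while read -r b; do
#       setsebool -P "$b" on
#   done
```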
The VM fails to start, and you see an error like this:
VM testVm is down. Exit message: internal error Failed to open socket to sanlock daemon: No such file or directory.
Possible reason: the softdog module is not loaded.
Solution:
modprobe softdog
service wdmd start
service sanlock start
To load the softdog module automatically at boot:
echo modprobe softdog >> /etc/rc.modules
chmod +x /etc/rc.modules
Or:
echo -e '#!/bin/sh\nmodprobe softdog\nexit 0' > /etc/sysconfig/modules/softdog.modules
chmod +x /etc/sysconfig/modules/softdog.modules
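To confirm whether the module is actually loaded before or after applying the fix, one option is to look it up in /proc/modules. The helper below is a minimal sketch; module_loaded is a hypothetical name, and the optional second argument exists only so the function can be tried against a sample file.

```shell
#!/bin/sh
# Succeed if the named kernel module appears in the first column of
# /proc/modules (or of the file given as the second argument).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Possible usage:
#   module_loaded softdog || modprobe softdog
```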
Ian Levesque reported on the users@ovirt.org mailing list:
New engine install on remote DB fails “uuid-ossp extension is not loaded”
Alex Lourie posted this recommendation/solution:
The solution we’ve come up with is this:
Use (or tell remote DB admin to do so) the psql command to load the extension functions to template1 DB on remote DB server: psql -U postgres -d template1 -f /usr/share/pgsql/contrib/uuid-ossp.sql
Now, all newly created databases will include extension functions.
template1 is a special DB in postgres. In fact, when you create a new DB, it is actually copied from template1 with a new name.
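One way to check whether the extension functions are present in template1 is to look for one of them in the pg_proc catalog. This is a sketch under assumptions: uuid_ossp_loaded is a hypothetical helper name, it assumes the postgres user can connect without a password prompt, and it checks only for uuid_generate_v1, one of the functions that uuid-ossp provides.

```shell
#!/bin/sh
# Succeed if the given database (default: template1) contains a
# function named uuid_generate_v1. Returns 2 if psql itself fails.
uuid_ossp_loaded() {
    n=$(psql -U postgres -d "${1:-template1}" -Atc \
        "SELECT count(*) FROM pg_proc WHERE proname = 'uuid_generate_v1'") || return 2
    [ "$n" -gt 0 ]
}

# Possible usage on the remote DB server:
#   uuid_ossp_loaded template1 && echo "uuid-ossp functions present"
```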
Ricky Schneberger reported on the users@ovirt.org mailing list:
After an normal “yum update” i am unable to get one of the storage domains “UP”.
Maor Lipchuk posted a solution:
Go to the metadata of the data storage domain (on the storage server, go to {storage_domain_name}/######..../dom_md/metadata).
Delete the checksum line _SHA_CKSUM=################
Try to activate the storage domain again in the DC (it should fail again).
vdsm.log should print the computed checksum of the storage domain (there should be an error that says "Meta Data seal is broken (checksum mismatch).... computed_cksum = ").
Copy the computed checksum into the metadata (_SHA_CKSUM={new chksum number}).
Try to activate it again.
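The step that writes the computed checksum back can be sketched as a small helper that keeps a backup of the metadata file first. The function name set_sha_cksum is hypothetical, and the sketch assumes the metadata is a plain file, as in the path described above; the checksum value itself must still be taken from vdsm.log.

```shell
#!/bin/sh
# Replace (or add) the _SHA_CKSUM line in a storage domain metadata
# file, saving the previous version as <file>.bak.
set_sha_cksum() {
    md="$1"
    sum="$2"
    cp -p "$md" "$md.bak" || return 1
    {
        sed '/^_SHA_CKSUM=/d' "$md.bak"
        printf '_SHA_CKSUM=%s\n' "$sum"
    } > "$md"
}

# Possible usage (path and checksum are placeholders):
#   set_sha_cksum /path/to/dom_md/metadata <computed_cksum from vdsm.log>
```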
Force NFS version 3 in the file /etc/nfsmount.conf:
[ NFSMount_Global_Options ]
Defaultvers=3
Nfsvers=3
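After remounting, it may help to confirm which NFS version each mount actually uses. The sketch below reads the vers= option from /proc/mounts; nfs_mount_versions is a hypothetical helper name, and the optional file argument is there only to allow trying it on sample data.

```shell
#!/bin/sh
# Print "<mount point> <version>" for every NFS mount listed in
# /proc/mounts (or in the file given as the first argument).
nfs_mount_versions() {
    awk '$3 ~ /^nfs4?$/ {
        v = "unknown"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^vers=/) v = substr(opts[i], 6)
        print $2, v
    }' "${1:-/proc/mounts}"
}
```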
If the management bridge was not created during the host setup procedure, remove the host from the engine management console. Also remove vdsm and libvirt from the host machine:
service vdsmd stop
service libvirtd stop
yum -y remove *vdsm* *libvirt* *qemu* *sanlock* jpackage*
rm -rf /etc/libvirt/
rm -rf /var/lib/libvirt/
yum clean all
yum makecache
Then try to reinstall the host. If that does not help, you can try to add the oVirt management bridge manually.
First disable NetworkManager, then correct /etc/resolv.conf:
service NetworkManager stop
chkconfig NetworkManager off
Here are examples of the ifcfg files, which reside in /etc/sysconfig/network-scripts:
vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
NM_CONTROLLED=no
ONBOOT=yes
BRIDGE=ovirtmgmt
vim /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
DEVICE=ovirtmgmt
BOOTPROTO=static
GATEWAY=xxx.xxx.xxx.xxx
IPADDR=xxx.xxx.xxx.xxx
NETMASK=255.255.255.0
NM_CONTROLLED=no
ONBOOT=yes
TYPE=Bridge
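Before restarting the network, the two files can be sanity-checked for the settings that tie the NIC to the bridge. This is a minimal sketch: check_bridge_cfg is a hypothetical name, and it assumes the values are written unquoted, exactly as in the examples above.

```shell
#!/bin/sh
# Succeed if ifcfg-<nic> attaches the NIC to ovirtmgmt and
# ifcfg-ovirtmgmt declares itself a bridge.
check_bridge_cfg() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    nic="${2:-eth0}"
    grep -qx 'BRIDGE=ovirtmgmt' "$dir/ifcfg-$nic" &&
        grep -qx 'TYPE=Bridge' "$dir/ifcfg-ovirtmgmt"
}

# Possible usage:
#   check_bridge_cfg && service network restart
```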
How to prevent a possible VM startup failure, for example when you see a message like this in the vdsm log:
qemuProcessReadLogOutput:1005 : internal error Process exited while reading console log output: Supported machines are:
pc RHEL 6.2.0 PC (alias of rhel6.2.0)
rhel6.2.0 RHEL 6.2.0 PC (default)
rhel6.1.0 RHEL 6.1.0 PC
rhel6.0.0 RHEL 6.0.0 PC
rhel5.5.0 RHEL 5.5.0 PC
rhel5.4.4 RHEL 5.4.4 PC
rhel5.4.0 RHEL 5.4.0 PC
Try running this command on the oVirt management node (hack from Jerome Deliege):
psql -U postgres engine -c "update vdc_options set option_value='rhel6.3.0' where option_name LIKE 'EmulatedMachine';"
or this:
psql -U postgres engine -c "update vdc_options set option_value='pc' where option_name LIKE 'EmulatedMachine';"
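Before choosing a value for EmulatedMachine, it can be useful to check that the host's qemu actually lists it. The sketch below scans a "Supported machines" listing like the one above; machine_supported is a hypothetical name, and the usage line assumes the RHEL 6 location of qemu-kvm and its older "-M ?" option.

```shell
#!/bin/sh
# Read a machine-type listing on stdin and succeed if the given type
# appears in its first column.
machine_supported() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Possible usage on a host (assumed path and option):
#   /usr/libexec/qemu-kvm -M ? | machine_supported rhel6.3.0 || echo "not supported"
```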
How to disable SSL support:
psql -U postgres engine -c "update vdc_options set option_value='false' where option_name='UseSecureConnectionWithServers' and version='general';"
psql -U postgres engine -c "update vdc_options set option_value='' where option_name = 'SpiceSecureChannels';"
Then restart oVirt
For version 3.0
service jboss-as stop
service jboss-as start
For version 3.1 and greater
service ovirt-engine stop
service ovirt-engine start
If you disable SSL, you must stop the firewalls on the engine and on the hosts:
service iptables stop
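The version-dependent restart described above can be sketched as a helper that picks the service name from the oVirt version. engine_service_name is a hypothetical name; the mapping itself (jboss-as for 3.0, ovirt-engine for 3.1 and later) is the one given in the text.

```shell
#!/bin/sh
# Print the name of the service to restart for a given oVirt version.
engine_service_name() {
    case "$1" in
        3.0|3.0.*) echo jboss-as ;;
        *)         echo ovirt-engine ;;
    esac
}

# Possible usage:
#   svc=$(engine_service_name 3.1)
#   service "$svc" stop && service "$svc" start
```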
After virt-v2v you get the error: Failed to import Vm
You can also see the error in /var/log/ovirt-engine/engine.log:
2012-08-16 16:39:30,090 ERROR [org.ovirt.engine.core.bll.ImportVmCommand] (pool-3-thread-50) [2781049c] Command
org.ovirt.engine.core.bll.ImportVmCommand throw exception: org.springframework.dao.DataIntegrityViolationException:
CallableStatementCallback; SQL [{call insertsnapshot(?, ?, ?, ?, ?, ?, ?, ?)}]; ERROR: duplicate key value violates
unique constraint "pk_snapshots"
Where: SQL statement "INSERT INTO snapshots( snapshot_id, status, vm_id, snapshot_type, description, creation_date,
app_list, vm_configuration) VALUES( $1 , $2 , $3 , $4 , $5 , $6 , $7 , $8 )"
Solution:
cd export
vim `grep -Ri <vm name> * | cut -d : -f 1`
If a VM or template remains in the Image Locked state for longer than a reasonable time, check whether the operation (template creation, in my case) is really still running. If it is not, you can reset this state:
psql -U engine -d engine -c "SELECT vm_guid,template_status,vm_name from vm_static where vm_name like '%<vm or template name>%'";
The output looks like:
vm_guid | template_status | vm_name
--------------------------------------+-----------------+----------------------
61eedf77-de4e-42c2-8870-420372b44501 | | <VmName>
6b807ca8-3bbb-4339-bafa-f6a67893b3bb | 0 | <TemplateName>
If template_status is not NULL, the row contains the vm_guid of the locked template; otherwise the row contains the vm_guid of your locked VM. For a locked VM, reset its status in vm_dynamic; for a locked template, reset template_status in vm_static, using the matching vm_guid in each case:
psql -U engine -d engine -c "update vm_dynamic set status = 0 where vm_guid='61eedf77-de4e-42c2-8870-420372b44501';"
psql -U engine -d engine -c "update vm_static set template_status=0 where vm_guid='6b807ca8-3bbb-4339-bafa-f6a67893b3bb';"
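To reduce the risk of pasting a wrong identifier into an update statement, the SQL can be generated from the object kind and its vm_guid. This is a hedged sketch: unlock_sql is a hypothetical helper that only prints the statement, reusing the table and column names from the commands above, and rejects anything that does not look like a GUID.

```shell
#!/bin/sh
# Print the SQL that resets the locked state of a VM or a template.
# Usage: unlock_sql vm|template <vm_guid>
unlock_sql() {
    kind="$1"
    guid="$2"
    case "$guid" in
        ""|*[!0-9a-fA-F-]*) echo "invalid guid: $guid" >&2; return 1 ;;
    esac
    case "$kind" in
        vm)       printf "update vm_dynamic set status = 0 where vm_guid='%s';\n" "$guid" ;;
        template) printf "update vm_static set template_status=0 where vm_guid='%s';\n" "$guid" ;;
        *)        echo "kind must be vm or template" >&2; return 1 ;;
    esac
}

# Possible usage:
#   psql -U engine -d engine -c "$(unlock_sql vm 61eedf77-de4e-42c2-8870-420372b44501)"
```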