At Edinburgh our storage test server (SL6) had just updated its kernel and had to reboot. Unfortunately it did not come back up: it suffered a kernel segfault during the reboot.
This was traced to the filesystem mounting stage of the init scripts. Specifically, it was caused by modprobe-ing the zfs module, which had just been built by dkms.
The newer SL6 Red Hat kernels (2.6.32-754....) appear to have broken part of the ABI used by the ZFS modules built by dkms.
The fix we found was:
- Reboot into the old kernel (anything with a version 2.6.32-696... or older)
- check dkms for builds of the zfs/spl modules:
dkms status
- run:
dkms uninstall zfs/0.7.9; dkms uninstall spl/0.7.9
- make sure dkms removed this for ALL kernel versions. If needed, to remove it for a specific kernel, run:
dkms uninstall zfs/0.7.9 -k 2.6.32-754
- remove all traces of these modules:
for i in /lib/modules/*; do
  for j in extra weak-updates; do
    for k in avl icp nvpair spl splat unicode zcommon zfs zpios; do
      # -f: do not complain about directories that do not exist
      rm -rf "${i}/${j}/${k}"
    done
  done
done
- reboot back into the new kernel and reinstall zfs:
dkms install spl/0.7.9; dkms install zfs/0.7.9
- Check that you've saved everything important.
- Now load the new modules:
modprobe zfs
- re-import your pools:
zpool import -a
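Before loading the module in the new kernel, it may be worth checking that the zfs module dkms installed was really built for that kernel. The sketch below is one heuristic for this, not something from the steps above: it compares the kernel release with the first field of the module's vermagic string. The helper name vermagic_matches is hypothetical; uname -r and modinfo -F vermagic are the standard tools it assumes you feed it with.

```shell
#!/bin/sh
# Sketch: does a module's vermagic string name the given kernel release?
# vermagic_matches is a hypothetical helper, not part of dkms or ZFS.
vermagic_matches() {
    kernel="$1"      # e.g. the output of: uname -r
    vermagic="$2"    # e.g. the output of: modinfo -F vermagic zfs
    case "$vermagic" in
        "$kernel "*|"$kernel") return 0 ;;   # release is the first field
        *) return 1 ;;                       # built for some other kernel
    esac
}

# Possible usage (assumes the zfs module is already installed):
# vermagic_matches "$(uname -r)" "$(modinfo -F vermagic zfs)" \
#     || echo "zfs module was built for a different kernel"
```

A mismatch only suggests that a stale build (for example one linked in through weak-updates) is being picked up; it does not prove the module is unsafe to load.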
Alternatively: after installing the new kernel but before you reboot, remove all of the zfs modules (steps 3 and 5 above, i.e. the dkms uninstall and the removal of the module directories). dkms will then re-install everything on the next reboot.
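Since the removal loop deletes directories under /lib/modules, a dry run first can be reassuring. The following is a minimal sketch that walks the same directory and module names as the loop above but only prints the paths that exist, deleting nothing. The function name list_stale_zfs_dirs and its optional root argument are assumptions added for illustration.

```shell
#!/bin/sh
# Dry run: list leftover ZFS/SPL module directories without deleting them.
# list_stale_zfs_dirs is a hypothetical helper; the optional first argument
# overrides the module root (defaults to /lib/modules).
list_stale_zfs_dirs() {
    modroot="${1:-/lib/modules}"
    for i in "$modroot"/*; do
        for j in extra weak-updates; do
            for k in avl icp nvpair spl splat unicode zcommon zfs zpios; do
                # Print only the paths that are actually present
                if [ -e "$i/$j/$k" ]; then
                    printf '%s\n' "$i/$j/$k"
                fi
            done
        done
    done
}
```

Calling list_stale_zfs_dirs with no argument scans /lib/modules. If it prints nothing after the clean-up, no traces of the old modules are left in those locations.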
For more info: https://github.com/zfsonlinux/zfs/issues/7704
TL;DR: When building modules for a new kernel, dkms doesn't always rebuild external modules safely. Make sure you remove these modules when you perform a kernel update so that everything is rebuilt safely.