Proxmox 4.1 kernel panic: downgrade DRBD resources from DRBD 9 to 8.4
I've upgraded my Proxmox environment to the latest 4.1, featuring the brand new (and still in preview state) DRBD 9.
The Proxmox documentation is not very clear about its preview state, and users have no choice of DRBD version: Proxmox 4.1 ships only DRBD 9.
That said, once the upgrade was complete I found that DRBD 9 was causing host node crashes (the whole host crashed with a kernel panic, together with all hosted VMs).
The issue has been reported by me and another user here:
https://forum.proxmox.com/threads/kernel-panic-with-proxmox-4-1-13-drbd-...
http://www.gossamer-threads.com/lists/drbd/users/27685
It seems that the only way out is to downgrade DRBD resources to version 8.4.
I've done it and have had no issues since (my environment has been running flawlessly for a week).
NOTE: since Proxmox doesn't support 8.4 anymore, you have to build it from source and replace the version installed from the Proxmox repositories.
System configuration: two nodes Proxmox 4.1, LVM on DRBD resource r0, DRBD 9.0, kernel pve-kernel-4.2.8-1-pve.
Downgrade running kernel from DRBD 9.x to DRBD 8.4
This part describes the procedure to downgrade from a running DRBD 9.x to DRBD 8.4 on the same kernel version.
If you're going to upgrade your kernel (already running a DRBD 8.4 module) to a newer version, please follow the "Kernel updates" chapter below.
We'll have to downgrade one node at a time; let's start with NodeA.
Initialization
Define a variable containing the kernel version you're compiling for. If this is the first downgrade (still running the bundled DRBD 9.0), KVER must be set to the current kernel version:
export KVER=`uname -r`
Install build tools, DRBD sources and kernel headers
apt-get install build-essential flex
apt-get install pve-headers-$KVER
cd /usr/src
wget http://oss.linbit.com/drbd/8.4/drbd-8.4.7-1.tar.gz
wget http://oss.linbit.com/drbd/drbd-utils-8.9.6.tar.gz
tar zxvf drbd-8.4.7-1.tar.gz
tar zxvf drbd-utils-8.9.6.tar.gz
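Before building, it can be worth confirming that the headers for $KVER really are on disk. The following is only a sketch: it assumes the headers are unpacked under /usr/src/linux-headers-$KVER (the path this post uses for KDIR in the kernel update chapter), and have_headers is a hypothetical helper name, not part of any Proxmox or DRBD tool.

```shell
# Hypothetical helper: succeeds only if the headers directory for the
# given kernel version exists. The second argument (default /usr/src)
# is an assumption, added so the check can be pointed elsewhere.
have_headers() {
    [ -d "${2:-/usr/src}/linux-headers-$1" ]
}

# Falls back to the running kernel if KVER is not set.
if have_headers "${KVER:-$(uname -r)}"; then
    echo "headers found"
else
    echo "headers missing: install pve-headers-${KVER:-$(uname -r)} first" >&2
fi
```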
Build DRBD module and utils
NOTE: configure scripts will automatically use the $KVER variable defined above.
cd /usr/src/drbd-8.4.7-1
make clean
cd drbd
make
you can now (optionally) strip the binaries and make them smaller
strip --strip-unneeded drbd.ko
now build the userland utils
cd /usr/src/drbd-utils-8.9.6
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --without-83support --with-84support --without-manual --with-distro=debian
make clean
make
and (optionally) strip the binaries to make them smaller
strip --strip-unneeded user/v84/drbdadm-84
strip --strip-unneeded user/v84/drbdsetup-84
strip --strip-unneeded user/v9/drbdmeta
Move all the VMs to NodeB and shutdown resource
Move all of the VMs using the resource we're going to downgrade to the other node, then demote and deactivate the resource.
drbdadm secondary r0
drbdadm down r0
Replace the bundled DRBD 9.x module with our own 8.4 version
rmmod drbd_transport_tcp
rmmod drbd
cd /lib/modules/$KVER/kernel/drivers/block/drbd
mv drbd.ko drbd.ko-9.0.0
mv drbd_transport_tcp.ko drbd_transport_tcp.ko-9.0.0
cp /usr/src/drbd-8.4.7-1/drbd/drbd.ko .
modprobe drbd
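After swapping the module file, a quick look at its version field can confirm that modprobe will now pick up an 8.4 build. This is a minimal sketch: is_drbd84 is a hypothetical helper, and modinfo here inspects the module file on disk for the running kernel, not necessarily the module currently loaded.

```shell
# Hypothetical helper: true if a DRBD version string belongs to the 8.4 series.
is_drbd84() {
    case "$1" in
        8.4.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# modinfo -F prints a single field of the module's metadata;
# errors are silenced so the check degrades to "none" if nothing is found.
ver="$(modinfo -F version drbd 2>/dev/null || true)"
if is_drbd84 "$ver"; then
    echo "drbd module is $ver"
else
    echo "unexpected drbd module version: '${ver:-none}'" >&2
fi
```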
Replace DRBD 9.x tools with version 8.4
cd /usr/sbin
mv drbdadm drbdadm-9.0.0
mv drbdmeta drbdmeta-9.0.0
mv drbd-overview drbd-overview-9.0.0
mv drbdsetup drbdsetup-9.0.0
cp /usr/src/drbd-utils-8.9.6/user/v84/drbdadm-84 .
ln -s drbdadm-84 drbdadm
cp /usr/src/drbd-utils-8.9.6/user/v84/drbdsetup-84 .
ln -s drbdsetup-84 drbdsetup
cp /usr/src/drbd-utils-8.9.6/user/v9/drbdmeta .
cp /usr/src/drbd-utils-8.9.6/scripts/drbd-overview.pl drbd-overview
NOTE: in my setup I also had to edit the resource configuration file and comment out the lines with "node-id: xxx;" parameters.
DRBD utils warning
DRBD utils now emit a warning like this:
DRBD module version: 8.4.7
userland version: 8.9.6
please don't mix different DRBD series.
That's because drbd-utils-8.9.6 targets the 9.x series, so the tools warn when they are used with an older module.
It is safe to ignore it but, if you don't feel comfortable, you can suppress the message by defining an environment variable:
export DRBD_DONT_WARN_ON_VERSION_MISMATCH=1
The relevant check is in drbdadm_main.c, line 3597:
if (!getenv("DRBD_DONT_WARN_ON_VERSION_MISMATCH"))
    warn_on_version_mismatch();
Then each time you run drbdadm, the version mismatch warning won't be shown anymore.
You can add it to /etc/environment to have it defined for each new shell.
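One way to add the variable without ending up with duplicate entries after repeated runs is sketched below. append_once is a hypothetical helper name; the sketch assumes /etc/environment holds plain KEY=VALUE lines (no export keyword), which is the usual format for that file.

```shell
# Hypothetical helper: append a line to a file only if an identical
# line is not already there.
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Real use, as root:
#   append_once /etc/environment 'DRBD_DONT_WARN_ON_VERSION_MISMATCH=1'
```

Running it twice leaves a single entry, so it is safe to repeat on every kernel update.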
Downgrade resource metadata
drbdadm create-md r0
It will ask for confirmation before downgrading the metadata from v90 to v84 format.
Restart DRBD service
/etc/init.d/drbd restart
Check resource status
The resource should be up and running (and resyncing), otherwise bring it up.
drbdsetup status --verbose
drbdadm up r0
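Before moving on to the other node it seems prudent to wait until the resync has finished. A minimal sketch, assuming the 8.4 tools report the disk state of r0 as a local/peer pair such as UpToDate/UpToDate; is_synced is a hypothetical helper name.

```shell
# Hypothetical helper: true when both local and peer disks report UpToDate.
is_synced() {
    [ "$1" = "UpToDate/UpToDate" ]
}

# Real use on a node with resource r0 (polls every 10 seconds):
#   until is_synced "$(drbdadm dstate r0)"; do sleep 10; done
```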
Downgrade the other node
You can follow the same guide to downgrade the other host or, if both nodes share the same kernel and hardware, you could simply copy the compiled binaries over to it and install them there, with no need to install the build tools.
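If you go the copy route, the files to transfer are the ones this guide copies into place. The sketch below just lists them; drbd84_artifacts is a hypothetical helper, the paths assume the source versions used above, and NodeB in the usage comment is a placeholder host name.

```shell
# Hypothetical helper: print the build artifacts used by this guide,
# relative to a source root (default /usr/src).
drbd84_artifacts() {
    src="${1:-/usr/src}"
    printf '%s\n' \
        "$src/drbd-8.4.7-1/drbd/drbd.ko" \
        "$src/drbd-utils-8.9.6/user/v84/drbdadm-84" \
        "$src/drbd-utils-8.9.6/user/v84/drbdsetup-84" \
        "$src/drbd-utils-8.9.6/user/v9/drbdmeta" \
        "$src/drbd-utils-8.9.6/scripts/drbd-overview.pl"
}

# Possible use (recreates the same /usr/src layout on the other node):
#   tar czf - $(drbd84_artifacts) | ssh root@NodeB 'tar xzf - -C /'
```

With the files in the same place on the other node, the module and tools replacement steps can be run there unchanged.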
Kernel updates
Each new kernel you'll install in the future needs a downgraded DRBD 8.4 module to be built for it.
The procedure is almost identical with some small changes.
Install new kernel & headers
Install the new kernel and its headers (e.g. version 4.99.99)
apt-get install pve-kernel-4.99.99-pve pve-headers-4.99.99-pve
Define a variable containing the target kernel version.
Since we're compiling for a kernel not currently running, KVER must be set manually:
export KVER=4.99.99-pve
If there's a newer DRBD version available, you could update its sources too:
cd /usr/src
wget http://oss.linbit.com/drbd/8.4/drbd-X.X.X.tar.gz
wget http://oss.linbit.com/drbd/drbd-utils-Y.Y.Y.tar.gz
tar zxvf drbd-X.X.X.tar.gz
tar zxvf drbd-utils-Y.Y.Y.tar.gz
Build DRBD module and utils
NOTE: configure scripts will automatically use the $KVER variable defined above, but building the module requires specifying the KDIR parameter manually.
Build drbd module:
cd /usr/src/drbd-8.4.7-1
make clean
cd drbd
make KDIR=/usr/src/linux-headers-$KVER
you can now (optionally) strip the binaries to make them smaller
strip --strip-unneeded drbd.ko
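Since the target kernel is not the running one, it may help to confirm that the freshly built drbd.ko was compiled against $KVER and not against the running kernel. A minimal sketch, assuming the module's vermagic field starts with the kernel version it was built for; vermagic_matches is a hypothetical helper name.

```shell
# Hypothetical helper: true if the first word of a vermagic string
# equals the expected kernel version.
vermagic_matches() {
    [ "${1%% *}" = "$2" ]
}

# Real use after the build:
#   vermagic_matches "$(modinfo -F vermagic /usr/src/drbd-8.4.7-1/drbd/drbd.ko)" "$KVER" \
#       || echo "drbd.ko was built for another kernel" >&2
```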
NOTE: building drbd-8.4.7-1 on some 4.4 kernels could fail with the error:
drbd_bitmap.c:1033:60: error: ‘__GFP_WAIT’ undeclared (first use in this function)
In that case you need to apply a small patch (see here: http://www.engisoftcloud.com/2016/04/07/instalacion-drbd-en-amazon-linux...):
find /usr/src/drbd-8.4.7-1 -type f -exec sed -i s/__GFP_WAIT/__GFP_RECLAIM/g {} \;
UPDATE: kernel 4.4.13-2-pve compiled successfully without this patch.
Then build drbd-tools:
NOTE: this step is optional if DRBD tools sources were not updated.
cd /usr/src/drbd-utils-8.9.6
./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc --without-83support --with-84support --without-manual --with-distro=debian
make clean
make
and (optionally) strip the binaries to make them smaller
strip --strip-unneeded user/v84/drbdadm-84
strip --strip-unneeded user/v84/drbdsetup-84
strip --strip-unneeded user/v9/drbdmeta
Replace the bundled DRBD 9.x module with our own 8.4 version
cd /lib/modules/$KVER/kernel/drivers/block/drbd
mv drbd.ko drbd.ko-9.0.0
mv drbd_transport_tcp.ko drbd_transport_tcp.ko-9.0.0
cp /usr/src/drbd-8.4.7-1/drbd/drbd.ko .
Replace DRBD 9.x tools with version 8.4
cd /usr/sbin
mv drbdadm drbdadm-9.0.0
mv drbdmeta drbdmeta-9.0.0
mv drbd-overview drbd-overview-9.0.0
mv drbdsetup drbdsetup-9.0.0
cp /usr/src/drbd-utils-8.9.6/user/v84/drbdadm-84 .
ln -s drbdadm-84 drbdadm
cp /usr/src/drbd-utils-8.9.6/user/v84/drbdsetup-84 .
ln -s drbdsetup-84 drbdsetup
cp /usr/src/drbd-utils-8.9.6/user/v9/drbdmeta .
cp /usr/src/drbd-utils-8.9.6/scripts/drbd-overview.pl drbd-overview
Reboot the node and check resource status
Move all the virtual machines to the other node(s) and reboot this node.
When it comes back up, check that the DRBD resources are up and running on the newer kernel.
uname -r    # must print the new kernel version!
drbdsetup status --verbose
Downgrade the other node
Follow this same guide to upgrade the other host kernel.
If both nodes have the same kernel and hardware, you could simply copy the compiled binaries over to it and install them there, with no need to install the build tools.
Beware: copy the *-84 binaries then recreate the links as above.
I hope these instructions will help other users experiencing the same issue...
03 Aug 2016
- just updated to kernel 4.4.13-2-pve without issues, using these same instructions
20 May 2016
- added instructions to disable drbd-utils warning message
12 Apr 2016
- updated procedure (thanks to Jean-Laurent Ivars' suggestions)
- added the kernel update procedure
02 May 2016
- added workarounds to build modules for kernel 4.4
05 May 2016
- added optional binary strip commands
06 May 2016
- removed sudo usage (not installed on proxmox by default)
- utils binary update is not optional but mandatory on kernel update
Comments
Upgrading kernel part
Hello :)
After having successfully followed your procedure for the initial downgrade, I followed your procedure for kernel upgrades:
- I ran apt-get dist-upgrade, which installed the latest kernel version: 4.4.6-1-pve
- I did not forget to install pve-headers-4.4.6-1-pve too
- I used your tip: export KVER=4.4.6-1-pve
But I can't get make to use the right kernel version. When I launch the compilation, I clearly see it's using the currently running kernel version (4.2.8-1-pve). I opened the makefile and saw the line KVER = `uname -r`, so I commented it out, but that changed nothing. I even tried putting the right value directly in the makefile, but it ignores these values and uses the running kernel's:
root@virt1 /usr/src/drbd-8.4.7-1/drbd # make
Calling toplevel makefile of kernel source tree, which I believe is in
KDIR=/lib/modules/4.2.8-1-pve/build
make -C /lib/modules/4.2.8-1-pve/build SUBDIRS=/usr/src/drbd-8.4.7-1/drbd modules
CHK /usr/src/drbd-8.4.7-1/drbd/.compat_test.have_bdev_discard_alignment.result
UPD /usr/src/drbd-8.4.7-1/drbd/.compat_test.have_bdev_discard_alignment.result
I don't know what more to do. I don't dare reboot to compile it in the right environment; I would really prefer to compile everything correctly before rebooting…
Thanks in advance for your answer! (I'm going to post this as a comment on your website as it may help other people)
Best regards
Thanks for your comment, I've updated the post to let the build complete successfully for 4.4 kernels.
module and userland version are different
Hello,
I did a fresh install of Proxmox 4.2 and tested downgrading DRBD to 8.4.
I ran all the commands, except for moving the VMs.
I get the following message when I restart drbd:
[email protected]:/usr/sbin# /etc/init.d/drbd restart
DRBD module version: 8.4.7
userland version: 8.9.6
please don't mix different DRBD series.
Is this normal, or did I miss something?
Thank you, best regards.
That message is normal: drbd
I thought about that but I wondered whether this would interfere with the proper functioning of DRBD 8.4. Thank you for this very clear and useful tutorial :)
upgrade same kernel version
Hello,
I recently (just a few days ago) upgraded the kernel to 4.4.8-1-pve following the upgrade procedure and, thanks to it, everything went perfectly fine. But today when I run apt-get dist-upgrade, I can see in the list that the system wants to upgrade my kernel again, even though it has the same version number!
uname -r
4.4.8-1-pve
apt-get dist-upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following NEW packages will be installed:
pve-docs
The following packages will be upgraded:
libexpat1 libpve-common-perl libpve-storage-perl proxmox-ve pve-cluster pve-container
pve-firewall pve-headers-4.4.8-1-pve pve-kernel-4.4.8-1-pve pve-manager pve-qemu-kvm qemu-server
12 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 62.9 MB of archives.
After this operation, 5,823 kB of additional disk space will be used.
Do you want to continue? [Y/n]
I don't really understand how a kernel with the same version number can appear in this list, but the command "apt-cache show pve-kernel-4.4.8-1-pve" seems to give more information: there are more version numbers than the one shown (-49, -50, -51...). So I suppose it's not the same version and the procedure has to be followed one more time...?
It's not that it's long or complicated, but it's always a big moment of stress for me because it's on a production cluster and I don't feel comfortable having my production rely on a compilation going well :(
Thanks in advance for your answer
That should be the version of the package, and it's up to the package maintainer.
You could create a package containing kernel 4.4.8, and name it pve-kernel-4.4.8-pve.
Then apply a patch to the kernel and bring it to version 4.4.8-1, then build a package pve-kernel-4.4.8-1-pve.
Now you find a minor error/bug/discrepancy inside your package or in its install/post-install script; this is completely unrelated with the contained kernel, but you need to rebuild a new package and let your users install it.
That's what the last part of the version number is usually used for.
IMHO this should also be included in the package name, like pve-kernel-4.4.8-1-51-pve (as the Ubuntu team does); anyway, it's up to the package maintainer...
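To make the distinction concrete, the sketch below splits a Debian-style package version at its last hyphen into the upstream part and the packaging revision. The helper names are hypothetical and 4.4.8-51 is only an illustrative value modeled on the revision numbers mentioned above, not read from a real package.

```shell
# Hypothetical helpers: split a Debian-style version string at its last hyphen.
upstream_part() { printf '%s\n' "${1%-*}"; }
revision_part() { printf '%s\n' "${1##*-}"; }

# 4.4.8-51 is an illustrative value, not taken from a real package.
upstream_part 4.4.8-51   # prints 4.4.8
revision_part 4.4.8-51   # prints 51

# On a real host the installed version could be read with:
#   dpkg-query -W -f='${Version}\n' pve-kernel-4.4.8-1-pve
```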
I understand, thank you for the clarification :)
regards,
did you give a new try to drbd9
Hello Claudio,
How are you? Well, I hope :)
There is a new kernel again!
I'm tired of holding my breath every time there is an upgrade, and I was wondering if you have recently given version 9 another try?
A few weeks ago I was in contact with someone from Linbit and he told me that the bug we hit had been clearly identified and even fixed in the latest dev versions of DRBD 9. I don't know exactly which DRBD 9 version is shipped with Proxmox, but it is possible that the kernel panic bug is gone (which wouldn't necessarily mean there are no other issues, though).
I just installed PVE in a VM to see the info of the DRBD module in use:
I don't know if the kernel hang bug is fixed in this version… but it seems to be the latest dev version according to this page: http://www.drbd.org/en/community/download
I don't have time to test it right now, but as soon as I do I'll rent two servers for one week (you can do this with OVH) and give it a try, because I'm tired of freaking out at every update on my production system…
Best regards.
Hi jeanlau,
no, I haven't upgraded to the new kernel and, most of all, I haven't gone back to DRBD 9.x.
I think I won't do it until forced to (i.e. when the 8.4 series no longer compiles with newer kernels).
I'm completely satisfied with my current setup and I'm not going to change it much.
My hosts are isolated from the outside world, so kernel upgrades are not my priority.
Anyway I'm interested in your test results: if the bug we're experiencing is fixed in current DRBD 9 versions, well, I'll schedule an upgrade as soon as possible.
Cheers
Claudio
Hi Claudio,
I followed the tutorial but I did not have the file
/etc/init.d/drbd
so, to have drbd started at every reboot, I did this:
apt-get install drbd8-utils
before compiling the module 8.4
update-rc.d drbd defaults
Now I have the following message:
Maybe there is a better solution to have drbd start at every reboot, or something to modify in a config file?
thank you
best regards
It seems you're missing the DRBD configuration, and it warns about that at start.
Please note that this tutorial is for downgrading an already existing and well configured DRBD installation on a ProxmoxVE host.
It is not enough for using DRBD on a clean system.
It's better you read the DRBD9 page at Proxmox wiki and check your config before trying to downgrade (if needed).
update drbd to 8.4.8-1
Hi,
for information, there is an updated DRBD version, 8.4.8-1.
I compiled it with the latest kernel, 4.4.15-1-pve, without the patch. No problem, it works :)
There seem to be optimisations for 4.x kernels in the changelog.
Thanks for your comment, will update the post...
Drbd utils
Hi Claudio
Thanks a lot for this tutorial. I spent several days trying to set up DRBD 9 before finding your tutorial, which solves all the problems with this downgrade.
Just a question about the drbd-utils install: why do you copy the new binaries over the original drbd-utils install instead of running make install? I'm not familiar with Debian and its packaging (I use RPM-based Linux distros) but running:
seems to do the job.
Any subtle difference ?
Well, your list should be the one to follow BUT it could have some drawbacks:
1) This could remove other dependent packages, or at least mark them as "removable".
6) Make install does not only install binaries; it could do anything related to the software being installed: patching config files, updating library configuration, cleaning something up.
This is why I prefer a less invasive approach: replacing only the needed binaries and keeping a copy of the old ones.
PS: the best solution would be to have backported packages like drbd-84 and drbd-utils-84 coming from the official Proxmox repos...
Maybe the bug is corrected ?
Hi everyone !
I'm tired of going through all these steps every time a kernel upgrade comes... (and feeling so uncomfortable, as I'm on a production cluster...)
It would give me so much peace of mind if I could update my system without all of this...
Has anyone given the version included in the latest Proxmox another try? Apparently there's a new release of DRBD (9.0.5) that could have fixed the issue...
Thank you very much for your feedback and, until that is the case, thank you so much Claudio for your help!
regards,
I agree, but I don't have any experience with newer versions because I'm afraid of getting back to the kernel panic nightmare...
I hope someone will jump in here and post some positive feedback ;)
bug corrected
I was thinking about renting 2 or 3 servers at OVH for a week to give this a new try, but I remember that I already did that a few months ago and it was a complete waste of time and money... two things I don't have much of ;) I'm still hesitating...