Lots of BUG: soft lockup emails
I have a Dell R730 running ESXi 6.7 with only three EFA 4 virtual machines on it, each handling email for different domains. I keep getting emails with the subject "[abrt] kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [kswapd0:36]" from just one of the instances and never from the other two. The CPU, the time stuck, and the last part vary; sometimes it's "[abrt] kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 40s! [mysqld:2415]". I'll go days without any, and then on days like today I've already gotten more than 10 of them. Are the details in these emails worth posting here for diagnosis?
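Before posting every email, it may help to tally which CPU, which task, and how long each stall was. The following is only a minimal sketch that assumes the subjects follow the exact format quoted above; the function name `summarize` and the sample list are illustrative, not part of EFA or abrt.

```python
import re
from collections import Counter

# Matches the subject format quoted in the post above.
# The task name may itself contain a colon (e.g. kworker/0:1),
# so the PID is taken as the digits after the last colon.
SUBJECT_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+):(?P<pid>\d+)\]"
)

def summarize(subjects):
    """Count lockups per CPU and per task, and track the longest stall."""
    by_cpu, by_task, longest = Counter(), Counter(), 0
    for subject in subjects:
        m = SUBJECT_RE.search(subject)
        if m is None:
            continue  # not a soft lockup report
        by_cpu[int(m["cpu"])] += 1
        by_task[m["task"]] += 1
        longest = max(longest, int(m["secs"]))
    return by_cpu, by_task, longest

if __name__ == "__main__":
    sample = [
        "[abrt] kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [kswapd0:36]",
        "[abrt] kernel: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 40s! [mysqld:2415]",
    ]
    by_cpu, by_task, longest = summarize(sample)
    print("per CPU:", dict(by_cpu))
    print("per task:", dict(by_task))
    print("longest stall (s):", longest)
```

If the stalls cluster on one vCPU or around memory-related tasks such as kswapd0, that is a hint worth mentioning when asking for help, though it does not prove the cause.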
Re: Lots of BUG: soft lockup emails
No. It's not related to eFa in any way. Search for the message and it will clear things up.
Generally, it's a lack of resources on the physical host (the VM is on an overcommitted host with insufficient RAM, CPU, or disk throughput to support the guests).
To monitor your VM environment and resources, take a look at (free) RVTools https://www.robware.net/rvtools/
No issues seen on a Dell T6xx/T7xx with the DellEMC-ESXi-6.7Ux versions (assuming you checked your CPU compatibility).
“We are stuck with technology when what we really want is just stuff that works.” -Douglas Adams
Re: Lots of BUG: soft lockup emails
What's weird is that the VM with the least use is the only one doing this. This EFA VM had 37 emails through it yesterday and I got 10-12 of these lockup messages, while the other two EFA VMs had 1936 and 765 emails through them yesterday and have never produced any of these messages.
Re: Lots of BUG: soft lockup emails
So, what system resources have you allotted to each VM?
Re: Lots of BUG: soft lockup emails
I'm running EFA v4 on ESXi 6.5 with almost 20 other VMs. I've never had such an issue.
This problem is beyond the scope of the eFa forums, as it is a problem with your ESXi system.
Re: Lots of BUG: soft lockup emails
I agree that it isn't EFA related. I have the same issue on an ownCloud server on the same host as the EFA device. The EFA device is fine, but the ownCloud device was getting lockups. I adjusted the CPU allocation and the lockups have gone away. I will also be adding RAM to better serve the 3 VMware guests that I am running. The ownCloud box is Ubuntu; the EFA and the third machine are CentOS. The fact that one locks up and the others don't probably has something to do with which guest gets priority over the others. I'm not sure how to diagnose that, but it might be in the VMware documentation somewhere. Best of luck.
Re: Lots of BUG: soft lockup emails
How much RAM is in your ESXi server? My ESXi servers are never over 70% committed RAM (host has 256 GB and 12 CPU cores).
Also, for those running ESXi, have you patched your system recently?
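The committed-RAM percentage mentioned above is simple arithmetic, and the same goes for the vCPU-to-core ratio. Here is a rough sketch: the host figures (256 GB, 12 cores) come from the post, but the guest names and sizes are hypothetical placeholders, and `commitment` is just an illustrative helper, not a VMware tool.

```python
def commitment(host_ram_gb, host_cores, vms):
    """Return (RAM committed as % of host RAM, vCPUs per physical core)."""
    ram = sum(vm["ram_gb"] for vm in vms)
    vcpus = sum(vm["vcpus"] for vm in vms)
    return 100.0 * ram / host_ram_gb, vcpus / host_cores

if __name__ == "__main__":
    # Hypothetical guest allocations; replace with the real values
    # from the vSphere client or an RVTools export.
    vms = [
        {"name": "efa-1", "ram_gb": 8, "vcpus": 2},
        {"name": "efa-2", "ram_gb": 8, "vcpus": 2},
        {"name": "efa-3", "ram_gb": 8, "vcpus": 2},
    ]
    ram_pct, vcpu_ratio = commitment(256, 12, vms)
    print(f"RAM committed: {ram_pct:.1f}% of host")
    print(f"vCPUs per physical core: {vcpu_ratio:.2f}")
```

This only covers what has been allotted on paper. Actual contention (CPU ready time, ballooning, swapping, storage latency) has to be read from the host's own performance counters.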