soft lockup

Latest recommended article published 2023-04-09 15:59:10

剥丝机器人

Categories: linux watch dog, linux

Article link: https://blog.csdn.net/jk198310/article/details/8963376

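static void dump_softlock_debug(unsigned long data);

/*
 * DEFINE_TIMER initializes the timer statically, so the original
 * init_timer(&softlock_timer) call is redundant. If it is kept, it has to
 * be called from an init function, not at file scope.
 */
DEFINE_TIMER(softlock_timer, dump_softlock_debug, 0, 0);

static void dump_softlock_debug(unsigned long data)
{
	int i, reboot = 0;	/* must start at 0; the original left it uninitialized */
	u64 system[NR_CPUS], num_jifs;

	num_jifs = jiffies - beattime;	/* jiffies elapsed since the last heartbeat */
	for_each_possible_cpu(i) {
		/* system-mode time CPU i has accumulated since the last heartbeat */
		system[i] = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM] - heartbeats[i];
	}

	for_each_possible_cpu(i) {
		/*
		 * If the elapsed time minus the system-mode time is under 10 ms,
		 * the CPU spent almost the whole interval in the kernel: treat it
		 * as wedged.
		 */
		if ((num_jifs - cputime_to_jiffies(system[i])) < msecs_to_jiffies(10)) {
			WARN(1, "cpu %d wedged\n", i);
			smp_call_function_single(i, smp_dumpstack, NULL, 1);
			reboot = 1;
		}
	}

	if (reboot) {
		panic_timeout = 10;	/* reboot 10 seconds after the panic */
		trigger_all_cpu_backtrace();
		panic("Soft lock on CPUs\n");
	}
}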

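Inside some tasklet function, record a heartbeat and re-arm the timer:

{
	/* remember when this heartbeat was taken */
	beattime = jiffies;

	/* snapshot each CPU's accumulated system-mode time */
	for_each_possible_cpu(i) {
		heartbeats[i] = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];
	}

	/* run dump_softlock_debug() after SOFT_LOCK_TIME seconds */
	mod_timer(&softlock_timer, jiffies + SOFT_LOCK_TIME * HZ);
}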
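The check performed by the timer callback can be tried outside the kernel. The following is a minimal userspace sketch of the same arithmetic, not the kernel code itself: SIM_HZ, sim_msecs_to_ticks() and sim_cpu_wedged() are hypothetical names introduced for illustration, and the tick rate of 1000 is an assumption. Unlike the original, the sketch clamps the case where system time exceeds elapsed time, because the unsigned subtraction would otherwise wrap around and hide the lockup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tick rate standing in for the kernel's HZ (assumed 1000). */
#define SIM_HZ 1000u

/* Convert milliseconds to ticks, rounding up; a simplified stand-in for
 * the kernel's msecs_to_jiffies(). */
static uint64_t sim_msecs_to_ticks(uint64_t ms)
{
    return (ms * SIM_HZ + 999u) / 1000u;
}

/*
 * Return true when a CPU spent all but less than 10 ms of the elapsed
 * interval in system mode, which is the condition the timer callback
 * treats as "wedged".
 */
static bool sim_cpu_wedged(uint64_t elapsed_ticks, uint64_t system_ticks)
{
    /* A zero-length interval carries no information. */
    assert(elapsed_ticks > 0);

    /* Clamp instead of letting the unsigned subtraction wrap around. */
    if (system_ticks >= elapsed_ticks)
        return true;

    return (elapsed_ticks - system_ticks) < sim_msecs_to_ticks(10);
}
```

With these assumptions, a CPU that accumulated 4995 system ticks over a 5000-tick interval is reported as wedged, while one that accumulated 1000 ticks is not.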

================================================

How to Deliberately Crash a System when Soft Lockup Occurs

Information

When the system experiences soft lockups, e.g. "BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0] Pid: 0", a vmcore captured at the moment of the soft lockup is needed to investigate the issue further.

Details

Starting with Red Hat Enterprise Linux 5.3, a vmcore can be generated automatically at the time of a soft lockup.

To implement this, first set up and test kdump.

Then run the command below so that the system panics when a soft lockup occurs. Note that sysctl -w changes only the running value; to keep the setting across reboots, also add kernel.softlockup_panic=1 to /etc/sysctl.conf.

# sysctl -w kernel.softlockup_panic=1

This should now result in the system deliberately crashing and generating a vmcore at the time of a soft-lockup.

Soft lockups are situations in which the kernel's scheduler subsystem has not been given a chance to perform its job for more than 10 seconds.

They can be caused by defects in the kernel, by hardware issues or by extremely high workloads. The kernel includes code (in kernel/softlockup.c) to detect these situations and take action on them.

Issue

End users may see CPU soft lockup messages in the log files under heavy load. These are informational messages indicating that a CPU did not respond to the soft lockup timer within the timer window (currently 10 seconds on Red Hat Enterprise Linux). On their own, under heavy load, they do not indicate a problem with the system.

Solution

The current upstream setting for this soft lockup timer parameter is 60 seconds.

Raising kernel.softlockup_thresh from its default of 10 to 30 or above suppresses this message.

# sysctl -w kernel.softlockup_thresh=30

Add this line to /etc/sysctl.conf (takes effect on next reboot):

kernel.softlockup_thresh=30

Change value dynamically; only affects the system's current value:

echo 30 > /proc/sys/kernel/softlockup_thresh
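To confirm from a program that the new threshold is in effect, the value can be read back from the same /proc file. The helper below is a minimal sketch; read_sysctl_int() is a hypothetical name, not an existing API, and the softlockup_thresh file is assumed to exist as described above (on kernels that do not provide it, the helper simply returns -1).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read a single integer from a /proc/sys-style file.
 * Returns 0 on success and stores the value in *out; returns -1 if the
 * file cannot be opened or does not start with an integer.
 */
static int read_sysctl_int(const char *path, long *out)
{
    FILE *f;
    int ok;

    assert(path != NULL && out != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* The file holds one decimal number followed by a newline. */
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);

    return ok ? 0 : -1;
}
```

A caller would pass "/proc/sys/kernel/softlockup_thresh" as the path and compare the returned value with the one it wrote.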
