soft lockup

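/* Forward declaration of the timer callback. */
static void dump_softlock_debug(unsigned long data);

/* Statically define the watchdog timer; it is armed later with mod_timer(). */
DEFINE_TIMER(softlock_timer, dump_softlock_debug, 0, 0);

/* In the init path. DEFINE_TIMER already initializes the timer, so this call is redundant. */
init_timer(&softlock_timer);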


static void dump_softlock_debug(unsigned long data)
{
    int i, reboot = 0;   /* reboot must start at 0; it was uninitialized before */
    u64 system[NR_CPUS], num_jifs;

    /* Time elapsed since the last heartbeat. */
    num_jifs = jiffies - beattime;

    /* System-mode time each CPU accumulated since the last heartbeat. */
    for_each_possible_cpu(i)
        system[i] = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM] - heartbeats[i];

    for_each_possible_cpu(i) {
        /*
         * If elapsed time minus system time is under 10 ms, the CPU spent
         * almost the whole window in kernel mode: treat it as wedged.
         */
        if ((num_jifs - cputime_to_jiffies(system[i])) < msecs_to_jiffies(10)) {
            WARN(1, "cpu %d wedged\n", i);
            smp_call_function_single(i, smp_dumpstack, NULL, 1);
            reboot = 1;
        }
    }

    if (reboot) {
        panic_timeout = 10;
        trigger_all_cpu_backtrace();
        panic("Soft lock on CPUs\n");
    }
}
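
The per-CPU test in the callback can be tried out in user space. The sketch below is not kernel code: cpu_looks_wedged() and find_wedged() are hypothetical helper names, and all times are plain tick counts supplied by the caller. It also treats the case where the system-time delta slightly exceeds the elapsed time as wedged, on the assumption that this can occur through rounding; in a purely unsigned subtraction that case would wrap to a huge value and the CPU would be missed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel API: returns true when a CPU spent
 * all but `slack` ticks of the sampling window in system mode. */
static bool cpu_looks_wedged(uint64_t elapsed, uint64_t system_delta,
                             uint64_t slack)
{
    /* Guard against unsigned wrap-around when system_delta >= elapsed. */
    if (system_delta >= elapsed)
        return true;
    return (elapsed - system_delta) < slack;
}

/* Stores the indices of wedged CPUs in `out` (which must have room for
 * ncpus entries) and returns how many were found. */
static size_t find_wedged(const uint64_t *system_delta, size_t ncpus,
                          uint64_t elapsed, uint64_t slack, size_t *out)
{
    size_t n = 0;

    assert(system_delta != NULL && out != NULL);
    for (size_t i = 0; i < ncpus; i++)
        if (cpu_looks_wedged(elapsed, system_delta[i], slack))
            out[n++] = i;
    return n;
}
```

For example, with a window of 1000 ticks, a slack of 10 ticks, and per-CPU system-time deltas of 1000, 400 and 1003 ticks, find_wedged() reports CPUs 0 and 2.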

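Inside some tasklet function:

{
    int i;

    /* Record the heartbeat time and each CPU's system-time baseline. */
    beattime = jiffies;

    for_each_possible_cpu(i)
        heartbeats[i] = kcpustat_cpu(i).cpustat[CPUTIME_SYSTEM];

    /* Push the timer out again; it fires only if this tasklet stops running. */
    mod_timer(&softlock_timer, jiffies + SOFT_LOCK_TIME * HZ);
}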

================================================

How to Deliberately Crash a System when Soft Lockup Occurs

Information

When the system experiences soft lockups, e.g. "BUG: soft lockup - CPU#1 stuck for 15s! [swapper:0] Pid: 0", you need a vmcore generated at the time of the soft lockup so that the issue can be investigated further.

Details

Starting with Red Hat Enterprise Linux 5.3, a vmcore can be generated automatically at the time of a soft lockup.
To enable this, first set up and test kdump.
Then run the command below so that the system panics when a soft lockup occurs. To keep the setting across reboots, also add kernel.softlockup_panic=1 to /etc/sysctl.conf.
sysctl -w kernel.softlockup_panic=1
The system should now crash deliberately and generate a vmcore when a soft lockup occurs.
A soft lockup is a situation in which the kernel's scheduler subsystem has not been given a chance to do its job for more than 10 seconds.
Soft lockups can be caused by defects in the kernel, by hardware issues, or by extremely high workloads. The kernel includes code (in kernel/softlockup.c) to detect these situations and act on them.

Issue

End users may see CPU soft lockup messages in the log files under heavy load. These are informational messages indicating that a CPU did not respond to the soft lockup timer within the timer window (currently 10 seconds on Red Hat Enterprise Linux). When caused by heavy load, they do not indicate a problem with the system.

Solution

The current upstream default for this soft lockup threshold is 60 seconds.
Raising kernel.softlockup_thresh from its default of 10 to 30 or higher suppresses this message.
# sysctl -w kernel.softlockup_thresh=30
OR
Add this line to  /etc/sysctl.conf (takes effect on next reboot):
      kernel.softlockup_thresh=30
OR
Change value dynamically; only affects the system's current value:
      echo 30 > /proc/sys/kernel/softlockup_thresh
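
A program can check or change the threshold through the same /proc file. The following is a minimal sketch, assuming the value is stored as a single decimal integer; read_sysctl_int() and write_sysctl_int() are hypothetical helper names, and the path is passed in by the caller because the file named above may not exist on every kernel version. Writing under /proc/sys normally requires root.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one integer from a sysctl-style file.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed. */
static int read_sysctl_int(const char *path, int *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Hypothetical helper: writes one integer, like `echo N > path`.
 * Returns 0 on success, -1 on any open, write, or close error. */
static int write_sysctl_int(const char *path, int value)
{
    FILE *f;
    int ok;

    assert(path != NULL);
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = (fprintf(f, "%d\n", value) > 0);
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_sysctl_int("/proc/sys/kernel/softlockup_thresh", 30) would then be the counterpart of the echo command above, and checking the return value shows whether the kernel accepted the write.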

