Explorer
6,136 Views
Registered: ‎05-23-2017

Embedded linux system are not stable

My Linux system on the Zynq ARM is not stable.

Sometimes it hangs partway through booting.

Other times it boots, but it then sometimes crashes with the stall report shown below.

How can I debug the system?

 

Thanks.

[  231.020234] INFO: rcu_sched self-detected stall on CPU
[  231.025299]  0-...: (5249 ticks this GP) idle=afb/140000000000001/0 softirq=852/852 fqs=2625
[  231.033807]   (t=5250 jiffies g=238 c=237 q=10)
[  231.038313] Task dump for CPU 0:
[  231.041524] kworker/u8:0    R  running task        0     6      2 0x00000002
[  231.048566] Workqueue: edac-poller edac_device_workq_function
[  231.054282] Call trace:
[  231.056720] [<ffffff8008088138>] dump_backtrace+0x0/0x198
[  231.062100] [<ffffff80080882e4>] show_stack+0x14/0x20
[  231.067134] [<ffffff80080c04ac>] sched_show_task+0x94/0xf0
[  231.072602] [<ffffff80080c26d8>] dump_cpu_task+0x40/0x50
[  231.077899] [<ffffff800812e258>] rcu_dump_cpu_stacks+0xb4/0xe8
[  231.083714] [<ffffff80080e8474>] rcu_check_callbacks+0x67c/0x860
[  231.089703] [<ffffff80080ebb2c>] update_process_times+0x34/0x60
[  231.095605] [<ffffff80080fafe0>] tick_sched_handle.isra.4+0x38/0x48
[  231.101854] [<ffffff80080fb034>] tick_sched_timer+0x44/0x90
[  231.107409] [<ffffff80080ec5e8>] __hrtimer_run_queues+0xf0/0x178
[  231.113398] [<ffffff80080ec978>] hrtimer_interrupt+0x98/0x1c8
[  231.119130] [<ffffff80086db8e8>] arch_timer_handler_phys+0x30/0x40
[  231.125291] [<ffffff80080dee70>] handle_percpu_devid_irq+0x78/0x128
[  231.131542] [<ffffff80080d9b74>] generic_handle_irq+0x24/0x38
[  231.137268] [<ffffff80080da1ec>] __handle_domain_irq+0x5c/0xb8
[  231.143083] [<ffffff80080814cc>] gic_handle_irq+0x64/0xc0
[  231.148464] Exception stack(0xffffffc0758bbb90 to 0xffffffc0758bbcc0)
[  231.154888] bb80:                                   0000000000000000 ffffff800865ad88
[  231.162707] bba0: 0000000000000000 0000000000000001 ffffffc077f83598 ffffffc077f83580
[  231.170519] bbc0: 0000000000000000 0000000000000000 ffffffc0758aa960 ffffffc0758b8000
[  231.178331] bbe0: 0000000000000780 0000000000000000 000000000001480a 0000000000000000
[  231.186142] bc00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  231.193955] bc20: 0000000000000000 0000000000000000 0000000000000000 ffffff800865ad88
[  231.201767] bc40: 0000000000000000 ffffff8009499580 ffffff8009417000 ffffff8009417000
[  231.209579] bc60: ffffffc0758bbd48 ffffffc07586fc78 ffffffc07586fea8 ffffffc0758bbcc0
[  231.217391] bc80: ffffff800865afbc ffffffc0758bbcc0 ffffff80080ff958 0000000060000145
[  231.225203] bca0: ffffffc0758bbd40 ffffff800886511c ffffffffffffffff ffffff800809a088
[  231.233014] [<ffffff80080827b0>] el1_irq+0xb0/0x140
[  231.237870] [<ffffff80080ff958>] smp_call_function_single+0x88/0x128
[  231.244206] [<ffffff800865afbc>] cortex_arm64_edac_check+0x7c/0xd8
[  231.250369] [<ffffff80086571b0>] edac_device_workq_function+0x78/0xc0
[  231.256793] [<ffffff80080b0474>] process_one_work+0x
[  609.092229] INFO: rcu_sched self-detected stall on CPU
[  609.097285]  0-...: (99389 ticks this GP) idle=afb/140000000000001/0 softirq=852/852 fqs=49685
[  609.105971]   (t=99768 jiffies g=238 c=237 q=455)
[  609.110650] Task dump for CPU 0:
[  609.113862] kworker/u8:0    R  running task        0     6      2 0x00000002
[  609.120895] Workqueue: edac-poller edac_device_workq_function
[  609.126619] Call trace:
[  609.129055] [<ffffff8008088138>] dump_backtrace+0x0/0x198
[  609.134438] [<ffffff80080882e4>] show_stack+0x14/0x20
[  609.139471] [<ffffff80080c04ac>] sched_show_task+0x94/0xf0
[  609.144940] [<ffffff80080c26d8>] dump_cpu_task+0x40/0x50
[  609.150235] [<ffffff800812e258>] rcu_dump_cpu_stacks+0xb4/0xe8
[  609.156051] [<ffffff80080e8474>] rcu_check_callbacks+0x67c/0x860
[  609.162040] [<ffffff80080ebb2c>] update_process_times+0x34/0x60
[  609.167942] [<ffffff80080fafe0>] tick_sched_handle.isra.4+0x38/0x48
[  609.174192] [<ffffff80080fb034>] tick_sched_timer+0x44/0x90
[  609.179747] [<ffffff80080ec5e8>] __hrtimer_run_queues+0xf0/0x178
[  609.185737] [<ffffff80080ec978>] hrtimer_interrupt+0x98/0x1c8
[  609.191466] [<ffffff80086db8e8>] arch_timer_handler_phys+0x30/0x40
[  609.197628] [<ffffff80080dee70>] handle_percpu_devid_irq+0x78/0x128
[  609.203879] [<ffffff80080d9b74>] generic_handle_irq+0x24/0x38
[  609.209605] [<ffffff80080da1ec>] __handle_domain_irq+0x5c/0xb8
[  609.215421] [<ffffff80080814cc>] gic_handle_irq+0x64/0xc0
[  609.220802] Exception stack(0xffffffc0758bbb90 to 0xffffffc0758bbcc0)
[  609.227226] bb80:                                   0000000000000000 ffffff800865ad88
[  609.235045] bba0: 0000000000000000 0000000000000001 ffffffc077f83598 ffffffc077f83580
[  609.242857] bbc0: 0000000000000000 0000000000000000 ffffffc0758aa960 ffffffc0758b8000
[  609.250669] bbe0: 0000000000000780 0000000000000000 000000000001480a 0000000000000000
[  609.258481] bc00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  609.266293] bc20: 0000000000000000 0000000000000000 0000000000000000 ffffff800865ad88
[  609.274105] bc40: 0000000000000000 ffffff8009499580 ffffff8009417000 ffffff8009417000
[  609.281917] bc60: ffffffc0758bbd48 ffffffc07586fc78 ffffffc07586fea8 ffffffc0758bbcc0
[  609.289729] bc80: ffffff800865afbc ffffffc0758bbcc0 ffffff80080ff958 0000000060000145
[  609.297541] bca0: ffffffc0758bbd40 ffffff800886511c ffffffffffffffff ffffff800809a088
[  609.305353] [<ffffff80080827b0>] el1_irq+0xb0/0x140
[  609.310207] [<ffffff80080ff958>] smp_call_function_single+0x88/0x128
[  609.316544] [<ffffff800865afbc>] cortex_arm64_edac_check+0x7c/0xd8
[  609.322707] [<ffffff80086571b0>] edac_device_workq_function+0x78/0xc0
[  609.329130] [<ffffff80080b0474>] process_one_work+0x1bc/0x380
[  609.334858] [<ffffff80080b0680>] worker_thread+0x48/0x4a8
[  609.340239] [<ffffff80080b6294>] kthread+0xd4/0xe8
[  609.345013] [<ffffff8008082e80>] ret_from_fork+0x10/0x50

 

Accepted Solutions
Moderator
9,980 Views
Registered: ‎12-04-2016

Re: Embedded linux system are not stable

Hi 

 

May I know the details of the host PC on which you are running PetaLinux (RAM size, number of cores, etc.)? Are you using VirtualBox or a native Linux distro?

As a quick test, try disabling CONFIG_CPU_IDLE in PetaLinux and check whether it makes any difference:

petalinux-config -c kernel -> CPU Power Management -> CPU Idle -> CPU idle PM support

 

 

Best Regards

Shabbir

10 Replies
Newbie rschmitt
6,085 Views
Registered: ‎04-30-2017

Re: Embedded linux system are not stable

If it makes you feel any better, I'm seeing the same thing.  Same stack footprint you are.

 

I will let you know if I figure anything out, please do the same.

 

Thanks,

Rich

Explorer
6,058 Views
Registered: ‎05-23-2017

Re: Embedded linux system are not stable

Hi @rschmitt Thanks.

 

I thought I was the only one with this problem. Haha.

BTW, which version of PetaLinux are you using? I am using 2017.1.

Newbie rschmitt
6,029 Views
Registered: ‎04-30-2017

Re: Embedded linux system are not stable

Hi, I'm running the Xilinx Linux kernel tagged v2017.1 on a ZynqMP-based board. For the most part, it is just the standard Linux. We have a few patches, but nothing this low-level.

 

One thing I did do was to disable the cortex edac work queue and the issue went away.  I don't see this as a solution, rather a piece of evidence.  My thinking is we are probably receiving a lot of edac events which cause the workqueue to run non-stop.  There are two firsts involved in this.  The first "first" is that we had just upgraded from the 2016.4 based SDK and the second "first" is that this is a new board for me to test with.  In my previous testing with the 2016.4 Linux and a different board, I never saw this issue.

 

On Monday I will try a different board. My question to the original poster is whether he saw this issue on more than one board. Also, has he run the 2016.4 kernel, and if so, did he ever see the issue then?

 

I'll update you after I try another board.

Rich

 

Moderator
5,963 Views
Registered: ‎04-24-2017

Re: Embedded linux system are not stable

Rich,

 

Currently, disabling the EDAC drivers is the right solution, as this is a known issue. Alternatively, you can use any one of these workarounds.

 

Workaround 1: Add cpuidle.off=1 option in bootargs.

Workaround 2: Remove CONFIG_EDAC_ZYNQMP_OCM and CONFIG_EDAC_CORTEX_ARM64 options from linux.

Workaround 3: Disable low power mode "echo 1 > /sys/devices/system/cpu/cpu1/cpuidle/state1/disable"
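
Workaround 3 as quoted touches only cpu1. A minimal sketch of applying the same write to every CPU might look like the following. The function name and the CPU_ROOT variable are illustrative assumptions, not part of any Xilinx tool; the sysfs layout is taken from the path quoted above.

```shell
#!/bin/sh
# Sketch: write 1 to cpuidle/stateN/disable for every CPU that exposes it.
# CPU_ROOT and disable_idle_state are hypothetical names for illustration.
CPU_ROOT="${CPU_ROOT:-/sys/devices/system/cpu}"

disable_idle_state() {
    state="$1"   # idle state index, e.g. 1 for state1
    for f in "$CPU_ROOT"/cpu[0-9]*/cpuidle/state"$state"/disable; do
        [ -e "$f" ] || continue          # glob matched nothing, or no such state
        echo 1 > "$f" && echo "disabled: $f"
    done
}
```

Run as root with `disable_idle_state 1`. A sysfs write like this lasts only until the next reboot, so it is more useful for confirming the diagnosis than as a permanent fix.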

 

Also check your device-tree settings and see whether you can increase min-residency-us from 100 ms to 800 ms.

This keeps the cores from entering deeper idle states too aggressively, which might help on your device.

 

min-residency-us = <0x000186a0>; //100ms

 

min-residency-us = <0x000c3500>; // 800ms
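
The cell values are hexadecimal microseconds, which makes them easy to misread. A small sketch for checking the conversion is shown here; the helper name us_to_ms is an assumption for illustration only.

```shell
#!/bin/sh
# Sketch: convert a min-residency-us cell (hex microseconds) to milliseconds.
# us_to_ms is a hypothetical helper name.
us_to_ms() {
    echo $(( $1 / 1000 ))
}

us_to_ms 0x000186a0   # prints 100
us_to_ms 0x000c3500   # prints 800
```

This confirms that 0x000186a0 is 100000 us (100 ms) and 0x000c3500 is 800000 us (800 ms), matching the comments on the two lines above.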

Thanks,
Sandeep
PetaLinux Yocto | Embedded SW Support

---------------------------------------------------------------------------
Don’t forget to Reply, Kudo, and Accept as Solution.
---------------------------------------------------------------------------
5,948 Views
Registered: ‎06-26-2017

Re: Embedded linux system are not stable

Hello friends,

 

This is how I solved the problem (at least I think I solved it).

 

petalinux-config -c kernel

General setup > IRQ subsystem > enable "Expose hardware/virtual IRQ mapping"

General setup > Timers subsystem > Timer tick handling > choose "Periodic timer ticks (constant rate, no dynticks)"

petalinux-build

 

I think these modifications have made the problem go away. I have been testing my board for a few days and have not seen the problem yet, but please let us know if you still have it after making these changes.

 

Explorer
5,941 Views
Registered: ‎05-23-2017

Re: Embedded linux system are not stable

@shabbirk

Thanks very much.

You are right. After disabling "CPU idle PM support", the system runs well.

 

 

Explorer
5,906 Views
Registered: ‎03-22-2016

Re: Embedded linux system are not stable

The self-detected stall means that some kernel operation is taking too long. Usually that means that the CPU is bogged down and can't keep up or that something running in kernel space has a bug.
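
One way to narrow down what was taking too long is to look at the "Workqueue:" line that the kernel prints in the task dump of each stall report. The following is a rough sketch that pulls those lines out of a kernel log read from standard input; the function name is an assumption for illustration.

```shell
#!/bin/sh
# Sketch: list the distinct workqueues named in stall task dumps.
# Reads kernel log text on stdin; stalled_workqueues is a hypothetical name.
stalled_workqueues() {
    # Strip the "[ timestamp] Workqueue: " prefix and print only matching lines.
    sed -n 's/^\[[^]]*] Workqueue: //p' | sort -u
}
```

Typical use would be `dmesg | stalled_workqueues`. Fed the log from the first post, it prints `edac-poller edac_device_workq_function`, which is what points at the EDAC driver.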

 

I ran into this issue with the zcu102. Very repeatable, very frequent. I tried different hardware. I tried swapping out the kernel for a different version. I tried changing realtime options (preempt, etc). The only way I was able to fix it was to entirely disable EDAC (Error Detection and Correction) in the kernel. I'm not sure what the tradeoffs are, but seeing as EDAC is a fairly new feature, I think going without it will probably be ok.

 

It's under Device Drivers -> EDAC (Error Detection And Correction) reporting.
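
To confirm which of the options named in this thread actually ended up in a built kernel, a sketch like the one below can scan a .config-style file. The function name is hypothetical, and the option names are the ones quoted earlier in the thread.

```shell
#!/bin/sh
# Sketch: report whether the options discussed in this thread are enabled
# in a kernel .config-style file. check_thread_options is a hypothetical name.
check_thread_options() {
    cfg="$1"   # path to a kernel config file
    for opt in CONFIG_CPU_IDLE CONFIG_EDAC_CORTEX_ARM64 CONFIG_EDAC_ZYNQMP_OCM; do
        if grep -q "^${opt}=[ym]" "$cfg"; then
            echo "$opt enabled"
        else
            echo "$opt disabled"
        fi
    done
}
```

Point it at the kernel's generated .config file. On a running target, the same check can be done against a decompressed copy of /proc/config.gz, assuming the kernel was built with CONFIG_IKCONFIG_PROC.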

Moderator
5,737 Views
Registered: ‎04-24-2017

Re: Embedded linux system are not stable

AR https://www.xilinx.com/support/answers/69433.html is available for this.
Thanks,
Sandeep
Xilinx Employee
3,036 Views
Registered: ‎09-11-2014

Re: Embedded linux system are not stable

The other issue mentioned in this thread is likely addressed by:

https://www.xilinx.com/support/answers/69143.html