Comments (15)
@masayuki2009 could you help look at this issue? The deadlock can 100% repo on QEMU.
from incubator-nuttx.
I've just enabled CONFIG_DEBUG_FEATURES CONFIG_DEBUG_ERROR but it works.
diff --git a/boards/arm/imx6/sabre-6quad/configs/smp/defconfig b/boards/arm/imx6/sabre-6quad/configs/smp/defconfig
index f871b91a12..335915089f 100644
--- a/boards/arm/imx6/sabre-6quad/configs/smp/defconfig
+++ b/boards/arm/imx6/sabre-6quad/configs/smp/defconfig
@@ -21,6 +21,8 @@ CONFIG_ARCH_STACKDUMP=y
CONFIG_BOARD_LOOPSPERMSEC=99369
CONFIG_BOOT_RUNFROMSDRAM=y
CONFIG_BUILTIN=y
+CONFIG_DEBUG_ERROR=y
+CONFIG_DEBUG_FEATURES=y
CONFIG_DEBUG_FULLOPT=y
CONFIG_DEBUG_SYMBOLS=y
CONFIG_DEV_ZERO=y
@@ -28,7 +30,6 @@ CONFIG_EXAMPLES_HELLO=y
CONFIG_FS_PROCFS=y
CONFIG_HAVE_CXX=y
CONFIG_HAVE_CXXINITIALIZE=y
-CONFIG_HOST_WINDOWS=y
CONFIG_IMX6_UART1=y
CONFIG_IMX_DDR_SIZE=1073741824
CONFIG_INTELHEX_BINARY=y
from incubator-nuttx.
My defconfig is as follows:
Hmm, you modified many configs in boards/arm/imx6/sabre-6quad/configs/smp/defconfig.
So I've just tried your defconfig again and ran the nuttx on qemu on ubuntu18.04 but it worked.
from incubator-nuttx.
Just tried the latest version of the community, the problem did not repeat. Maybe my version is a little bit old. Thank you for your help. @masayuki2009
from incubator-nuttx.
Just tried the latest version of the community, the problem did not repeat. Maybe my version is a little bit old. Thank you for your help. @masayuki2009
Thanks for your updates.
I think it relates #774. which I fixed recently.
from incubator-nuttx.
@masayuki2009 we found that sabre-6quad:smp will always hang on QEMU if we add some log like the attached patch.
0001-Add-some-log-in-arm_decodeirq-to-repo-the-deadlock-i.zip
from incubator-nuttx.
@masayuki2009 we found that sabre-6quad:smp will always hang on QEMU if we add some log like the attached patch.
0001-Add-some-log-in-arm_decodeirq-to-repo-the-deadlock-i.zip
@xiaoxiang781216 I've just confirmed the deadlock. I will take a look at what is happening tomorrow.
from incubator-nuttx.
Sure, thank for take time to investigate SMP issue.
from incubator-nuttx.
Hmmm... just looking at the code, it looks like arm_decode_irq() would be a bad place to put any debug output. That is because arm_decode_irq() calls arm_doirq() and is where the interrupt mode is set:
88 /* Current regs non-zero indicates that we are processing an interrupt;
89 * CURRENT_REGS is also used to manage interrupt level context switches.
90 */
91
92 CURRENT_REGS = regs;
Then it is tested with:
60 bool up_interrupt_context(void)
61 {
62 return CURRENT_REGS != NULL;
63 }
Prior to CURRENT_REGS being set, the system has no idea that it is in an interrupt handler. Syslog will think it is in a normal task mode and will almost certainly do the wrong thing. I have not tracked down all of that, but I have looked at enough to see that arm_decode_irq() is not a valid place to call syslog() functions. All of these tests would to the incorrect thing:
ramlog.c: DEBUGASSERT(!up_interrupt_context());
syslog_device.c: if (up_interrupt_context() || getpid() == 0)
syslog_putc.c: if (up_interrupt_context() || sched_idletask())
syslog_putc.c: if (up_interrupt_context())
syslog_write.c: if (up_interrupt_context() || sched_idletask())
syslog_write.c: if (up_interrupt_context())
syslog_write.c: if (!up_interrupt_context() && !sched_idletask())
If you move the __err to _arm_doirq() after CURRENT_REGS is set, it may work. It should also work if you add logic to set CURRENT_REGS in arm_decode_irq() too.
from incubator-nuttx.
Another thing to be careful of is doing syslog output in the same logic patch that handles serial console output. If you generate serial console output everytime a serial interrupt occurs, then you will also get into a different kind of infinite loop. Best to check if the irq is the console irq and don't do syslog output.
from incubator-nuttx.
@xiaoxiang781216 I've just confirmed the deadlock. I will take a look at what is happening tomorrow.
(gdb) target extended-remote:1234
Remote debugging using :1234
up_testset () at armv7-a/arm_testset.S:120
120 mov r0, #SP_LOCKED
(gdb) info thread
Id Target Id Frame
* 1 Thread 1 (CPU#0 [running]) up_testset () at armv7-a/arm_testset.S:120
2 Thread 2 (CPU#1 [running]) 0x10802eb0 in spin_trylock_wo_note () at semaphore/spinlock.c:178
3 Thread 3 (CPU#2 [halted ]) 0x108081cc in up_idle () at chip/imx_idle.c:59
4 Thread 4 (CPU#3 [halted ]) 0x108081cc in up_idle () at chip/imx_idle.c:59
(gdb) where
#0 up_testset () at armv7-a/arm_testset.S:120
#1 0x10802e70 in spin_lock (lock=lock@entry=0x10829379 <g_cpu_paused+1> "\001") at semaphore/spinlock.c:89
#2 0x10801250 in up_cpu_pause (cpu=cpu@entry=1) at armv7-a/arm_cpupause.c:282
#3 0x10813164 in sched_addreadytorun (btcb=btcb@entry=0x10836eb0) at sched/sched_addreadytorun.c:280
#4 0x10808148 in up_unblock_task (tcb=0x10836eb0) at armv7-a/arm_unblocktask.c:102
#5 0x10802e28 in nxsem_post (sem=sem@entry=0x10837f94) at semaphore/sem_post.c:165
#6 0x10804068 in nxtask_exitwakeup (status=277052944, tcb=0x10837e10) at task/task_exithook.c:547
#7 nxtask_exithook (tcb=0x10837e10, status=status@entry=0, nonblocking=nonblocking@entry=0 '\000') at task/task_exithook.c:677
#8 0x10803438 in exit (status=0) at task/exit.c:96
#9 0x1080341c in nxtask_start () at task/task_start.c:151
#10 0x00000000 in ?? ()
(gdb) thread 2
[Switching to thread 2 (Thread 2)]
#0 0x10802eb0 in spin_trylock_wo_note () at semaphore/spinlock.c:178
178 return SP_UNLOCKED;
(gdb) where
#0 0x10802eb0 in spin_trylock_wo_note () at semaphore/spinlock.c:178
#1 0x10802638 in irq_waitlock (cpu=1) at irq/irq_csection.c:137
#2 0x10802734 in enter_critical_section () at irq/irq_csection.c:345
#3 0x10806154 in ramlog_addchar (priv=priv@entry=0x10829130 <g_sysdev>, ch=ch@entry=97 'a') at syslog/ramlog.c:230
#4 0x10806478 in ramlog_putc (ch=97) at syslog/ramlog.c:749
#5 0x10814c48 in syslogstream_putc (ch=<optimized out>, this=<optimized out>) at syslog/syslog_stream.c:174
#6 syslogstream_putc (this=0x1082fb8c <g_irqstack_alloc+4004>, ch=97) at syslog/syslog_stream.c:130
#7 0x10806d60 in vsprintf_internal (arglist=0x0, numargs=0, ap=..., fmt=<optimized out>, stream=0x1082fb8c <g_irqstack_alloc+4004>) at stdio/lib_libvsprintf.c:909
#8 lib_vsprintf (stream=0x1082fb8c <g_irqstack_alloc+4004>, stream@entry=0x1082fb84 <g_irqstack_alloc+3996>, fmt=fmt@entry=0x10820de6 "%s: cpu = %d, irq %d.\n", ap=...) at stdio/lib_libvsprintf.c:1278
#9 0x10814c1c in nx_vsyslog (priority=priority@entry=3, fmt=0x10820de6 "%s: cpu = %d, irq %d.\n", fmt@entry=0x10807434 <vsyslog+36> "\f\320\215\342\004\360\235\344T\221\202\020\016", ap=0x1082fbac <g_irqstack_alloc+4036>, ap@entry=0x1082fba4 <g_irqstack_alloc+4028>) at syslog/vsyslog.c:148
#10 0x10807434 in vsyslog (priority=priority@entry=3, fmt=fmt@entry=0x10807434 <vsyslog+36> "\f\320\215\342\004\360\235\344T\221\202\020\016", ap=..., ap@entry=...) at syslog/lib_syslog.c:84
#11 0x10807458 in syslog (priority=priority@entry=3, fmt=0x10820de6 "%s: cpu = %d, irq %d.\n") at syslog/lib_syslog.c:116
#12 0x10800dd8 in arm_decodeirq (regs=0x10833c28 <g_cpu1_idlestack+1832>) at armv7-a/arm_gicv2.c:397
As you can see, the thread2 running on cpu#1 is in interrupt context.
However, as @patacongo pointed out, enter_critical_section() processes as if it were in normal tasking environment, because _err() was called before setting CURRENT_REGS.
from incubator-nuttx.
As you can see, the thread2 running on cpu#1 is in interrupt context.
However, as @patacongo pointed out, enter_critical_section() processes as if it were in normal tasking environment, because _err() was called before setting CURRENT_REGS.
That is not expected to work. That is a coding error that should result in a crash or a hang or other fatal consequence.
from incubator-nuttx.
If you move the __err to _arm_doirq() after CURRENT_REGS is set, it may work.
I've just revert the patch which @xiaoxiang781216 attached and added the following code.
--- a/arch/arm/src/armv7-a/arm_doirq.c
+++ b/arch/arm/src/armv7-a/arm_doirq.c
@@ -91,6 +91,14 @@ static inline uint32_t *_arm_doirq(int irq, uint32_t *regs)
CURRENT_REGS = regs;
+#if 1
+ int cpu = up_cpu_index();
+ if (cpu != 0)
+ {
+ _err("cpu = %d, irq %d.\n", cpu, irq);
+ }
+#endif
+
/* Deliver the IRQ */
Actually hello app works on qemu. (smp and ostest apps also work)
qemu-system-arm -M sabrelite -smp 4 -kernel ./nuttx/nuttx -nographic -s
ABCDGHIJKNOPQ
NuttShell (NSH) NuttX-8.2.0
nsh> dmesg
_arm_doirq: cpu = 1, irq 1.
_arm_doirq: cpu = 2, irq 1.
_arm_doirq: cpu = 3, irq 1.
nsh> hello
Hello, World!!
nsh> dmesg
_arm_doirq: cpu = 1, irq 2.
from incubator-nuttx.
@masayuki2009 and @patacongo thank for the explanation.
Maybe we can change up_interrupt_context to check the hardware status bit instead of CURRENT_REGS, then syslog can be called in any point, but it is another story.
from incubator-nuttx.
Maybe we can change up_interrupt_context to check the hardware status bit instead of CURRENT_REGS, then syslog can be called in any point, but it is another story.
That might be much more compilicated than you think. You should study the probleml before jumping into an unnecessary, alternative solution. Nothing is broken.
Whenever, up_interrupt_context() returns true, the code also assumes that CURRENT_REGS is non-NULL. That is the accepted semantics. Those two things cannot be easily separated. up_assert() is a good example. In the interrupt context, it needs the CURRENT_REGS. These are probably other less obvious dependences.
Also, high priority, nested, zero latency interrupts worked very differently. They are not treated as interrupts at all.
I think this is not a good idea. It is, at least, not something that you should do impulsively or without some significant analysis of the consequences.
from incubator-nuttx.
Related Issues (20)
- Build error in sixaxis_main.c HOT 3
- Documentation: Custom Apps HOT 1
- Tiva C 123gxl CAN driver issue HOT 5
- STM32F411-MINIMUM Kconfig error HOT 1
- Can't create new folder on the dialog when vscode.window.showOpenDialog is called on WSL environment. HOT 3
- Debug Assertions asserting on `rv-virt:knsh32` HOT 1
- Incident with Pull Requests and Actions HOT 6
- cmake fail to generate build file for target rv-virt:ksmp64 HOT 1
- Simulator export fails HOT 16
- CMake Support for The JH7110 Board HOT 3
- [Discussion]How to elegantly implement ALLSYMS feat in CMake build HOT 3
- STM32H7 (& STM32F4): UART write issue HOT 4
- [Discussion] Rust integration in Nuttx HOT 29
- PolarFire SoC - C++ Exception Exits Before Catch HOT 17
- arm64 smp hangs on the sched_unlock call in nx_start function HOT 7
- [Discussion] Doubts regarding nxflat supported features HOT 21
- How to fix: undefined reference to `arm_netinitialize' when configuring pppd on nucleo-f103rb:nsh HOT 2
- Question about RISCV cpu_start HOT 8
- Is this an issue? In SAMA5 sam_tsd_read a call to enter_critical_section() can possibly block. HOT 7
- There are many references to CONFIG_SPI_DMAPRIO in the kernel but it is not defined in none Kconfig or source file HOT 6
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from incubator-nuttx.