Manually controlling OpenWrt hardware watchdog
Introduction to hardware watchdogs
The function of a hardware watchdog is to monitor whether the system is working correctly and, if it enters an unknown state (i.e. it freezes or stops working as expected), to return it to a known state, usually by resetting or rebooting the device.
Why should you care?
Why should you care about hardware watchdogs? Well, you shouldn't, unless you are an embedded engineer or you are building devices and displays for the "real world". As you can clearly see from this photo, some people should definitely think about it:
Know your enemy :)
The hardware watchdog on Linux and OpenWrt consists of three parts:
- Hardware (usually part of the embedded chip or CPU)
- Middleware (usually a Linux kernel driver that communicates directly with the hardware watchdog)
- Software (a tool or system daemon that enables, disables and configures the watchdog)
Hardware
The focus of this article is using the hardware watchdog on OpenWrt, and most embedded devices that OpenWrt runs on have a real hardware watchdog in their SoC.
To be honest, I haven't researched whether our laptops, desktops and servers have a hardware watchdog.
Middleware
The middle layer between the software tools and the hardware is the Linux kernel driver. You shouldn't need to worry about this part, because OpenWrt handles it "automagically". Once the hardware watchdog driver is loaded, a new device, /dev/watchdog, appears.
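A quick way to confirm that the driver is loaded is to check that the device node exists. The following is a minimal sketch; the function name watchdog_present is made up for this example. It only tests that the path is a character device and never opens it, so it does not start or disturb the watchdog.

```shell
#!/bin/sh
# watchdog_present [PATH] (hypothetical helper)
# Returns 0 if PATH is a character device, non-zero otherwise.
# PATH defaults to /dev/watchdog, the node mentioned above.
# The test does not open the device, so it is safe to run at any time.
watchdog_present() {
    dev="${1:-/dev/watchdog}"
    [ -c "$dev" ]
}

if watchdog_present; then
    echo "watchdog device found"
else
    echo "no watchdog device (driver not loaded or no watchdog hardware)"
fi
```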
Software and confusion
Originally, OpenWrt used the watchdog daemon to manage the hardware watchdog (there is a very detailed man page explaining exactly what the watchdog daemon does).
If for any reason the watchdog process stopped 'tickling' the hardware watchdog through the /dev/watchdog device, the hardware watchdog would trigger a hardware reset.
The confusion started in 2013, when the OpenWrt developers removed the watchdog daemon and implemented the watchdog feature in procd.
The biggest issue was that there was no mechanism to take control back from procd, so you could not decide for yourself whether or not to tickle the hardware watchdog.
This caused a lot of confusion and left many questions unanswered about how to properly use the hardware watchdog in OpenWrt for custom projects.
Some projects, like sudomesh, created a custom version of procd to take watchdog control back from procd.
Magicclose
In mid-2017 we finally got a patch (thanks to Hans Dedecker) that implemented the magicclose feature, which makes procd release its file lock on /dev/watchdog when the watchdog feature in procd is stopped.
This feature finally gave control of the hardware watchdog back to users. Freedom!!! :)
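To see whether magicclose is currently enabled, you can pick a single field out of the watchdog status that procd reports over ubus. The following is a rough sketch: watchdog_field is a made-up helper, and it assumes the status is printed as flat JSON with one key per line, as in the transcripts in the next section. It is not a general JSON parser.

```shell
#!/bin/sh
# watchdog_field NAME (hypothetical helper)
# Reads JSON on stdin and prints the value of the key NAME without
# surrounding quotes. Assumes one "key": value pair per line, which is
# the layout shown in the ubus transcripts in this article.
watchdog_field() {
    sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\}[[:space:]]*\$/\1/p"
}

# Example use on the device. This assumes that calling the watchdog
# method without arguments only reports the current state:
#   ubus call system watchdog | watchdog_field magicclose
```

If the field is missing from the output altogether, the procd build on the device probably predates the patch.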
How to manually control hardware watchdog?
If you just stop procd from tickling the hardware watchdog, you still can't tickle the watchdog manually:
root@OpenWrt:~# ubus call system watchdog '{"stop": true}'
{
"status": "stopped",
"timeout": 30,
"frequency": 5,
"magicclose": false
}
root@OpenWrt:~# echo 1 > /dev/watchdog
-ash: can't create /dev/watchdog: Resource busy
But if you enable the magicclose feature first, then you can:
root@OpenWrt:~# ubus call system watchdog '{"magicclose": true}'
{
"status": "running",
"timeout": 30,
"frequency": 5,
"magicclose": true
}
root@OpenWrt:~# ubus call system watchdog '{"stop": true}'
{
"status": "offline",
"timeout": 0,
"frequency": 0,
"magicclose": true
}
root@OpenWrt:~# echo 1 > /dev/watchdog
root@OpenWrt:~#
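Once procd has released the device, something else has to keep tickling it, otherwise the hardware watchdog will eventually reset the board. The following is a minimal sketch of such a tickler; the function name tickle_watchdog and its COUNT argument are made up for illustration (a real tickler would normally loop for as long as your application is healthy). It assumes the interval is comfortably shorter than the hardware timeout. Before closing the device it writes the character 'V', which is the Linux kernel's own "magic close" convention: drivers that support it stop the watchdog on close, provided the kernel was not built or loaded with nowayout. Whether your driver honours it is an assumption you should verify on your hardware.

```shell
#!/bin/sh
# tickle_watchdog DEV INTERVAL COUNT (hypothetical helper)
# Opens DEV once on file descriptor 3 and writes to it COUNT times,
# sleeping INTERVAL seconds between writes. Returns non-zero if DEV
# cannot be opened (for example "Resource busy" while procd holds it).
tickle_watchdog() {
    dev="$1"; interval="$2"; count="$3"
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '1' >&3              # any write counts as a tickle
            i=$((i + 1))
            if [ "$i" -lt "$count" ]; then
                sleep "$interval"
            fi
        done
        # Kernel "magic close": ask the driver to stop the watchdog when
        # the device is closed. Drivers without this feature ignore it.
        printf 'V' >&3
    } 3>"$dev"
}

# Example use, after magicclose was enabled and procd was stopped:
#   tickle_watchdog /dev/watchdog 5 12    # roughly one minute of tickling
```

Keeping the device open for the whole loop, instead of running echo 1 > /dev/watchdog repeatedly, avoids reopening and closing the device on every tickle.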