Comments (5)

ojousima commented on June 30, 2024

There are a few practical issues with this approach.

  1. The watchdog reload value cannot be changed after initialization without a reboot.
  2. This would require support in all the client libraries, which means a lot of coordinated effort forced on volunteers to update and maintain the error codes and their handling in backends. I'd prefer using GATT, maybe even as a plaintext description.
    5, 6. This is problematic in extreme cold, such as in freezers. The extra current draw can collapse the battery, and the battery may not have enough current capacity to reboot the tag. This issue is often observed on accelerometer- and DSP-heavy firmwares. The end result is that the usable lifetime of the tag can be shortened in the cold, which should be avoided.

Regarding other points:

  1. It would be possible to log abnormal events to UICR, for example temperature over 85 C, RH at 100% (possible condensation) etc.
  2. Both Android and iOS have NFC apps capable of reading and displaying plain text records, and there is already a plain text field "Data" in NFC. Those can be used for displaying the data.
  3. Yes, especially voltage droops in cold cause unpredictable behavior which is often fixed by reinitializing the board
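
The idea of logging abnormal events to UICR can be sketched on a host machine. This is only an illustration: it assumes the log lives in a single 32-bit word that behaves like flash (a write can clear bits but cannot set them without an erase), and the event names, bit positions and function names are hypothetical, not taken from the firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event bits; names and positions are illustrative only. */
typedef enum {
    EVENT_OVER_TEMPERATURE = 0, /* e.g. temperature over 85 C */
    EVENT_RH_SATURATED     = 1, /* e.g. RH at 100 %, possible condensation */
    EVENT_COUNT
} abnormal_event_t;

/* Erased state of the simulated word: every bit set, no event recorded. */
#define EVENT_LOG_ERASED UINT32_C(0xFFFFFFFF)

/* Simulates a flash-backed word: a write can clear bits but never set them. */
void flash_word_write(uint32_t *word, uint32_t value)
{
    *word &= value;
}

/* Record an event by clearing its bit. Marking the same event twice is harmless. */
void event_log_mark(uint32_t *log_word, abnormal_event_t event)
{
    assert(event < EVENT_COUNT);
    flash_word_write(log_word, ~(UINT32_C(1) << event));
}

/* An event has been seen if its bit is no longer set. */
bool event_log_seen(uint32_t log_word, abnormal_event_t event)
{
    assert(event < EVENT_COUNT);
    return (log_word & (UINT32_C(1) << event)) == 0u;
}
```

Because each event only ever clears its own bit, the record survives resets and never needs an erase, at the cost of storing "happened at least once" rather than a count or timestamp.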

from ruuvi.firmware.c.

DG12 commented on June 30, 2024
  1. I suppose what I meant by this is "Don't forget about the WDOG timer" especially when you get to number 5 and 6.

  2. As you mentioned, failures across various libraries end up at ruuvi_driver_error_check. Providing a specific error code for each would be ideal but is not really practical. Saving a module-specific code during initialization is easily accomplished. Saving a "something else bad happened" code, i.e. "it didn't happen during initialization", would be at least something.

  3. LED blinking and 6) delay: I did not intend that either of these should be so long as to deplete the battery. All these actions would only be taken when there is already a fatal failure anyway.
    Perhaps, to handle the unattended "can't see the LED" case of "looping until environment changes" (see O3), subsequent failures should use the stored failure information to switch to low-power sleep instead. The intention of the "local CPU based interrupt disabled, delay spin, not dependent on the RTC" was to prevent an interrupt from pulling execution out of the failure handling. Perhaps disabling interrupts is a better way.
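
The "module-specific code during initialization, generic code otherwise" idea from point 2 could look roughly like the sketch below. All names here are hypothetical and this is not how ruuvi_driver_error_check is implemented; it also assumes that an error value of 0 means success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical phase identifiers: one per module during init, one catch-all. */
typedef enum {
    PHASE_INIT_SENSOR = 1,
    PHASE_INIT_RADIO  = 2,
    PHASE_RUNTIME     = 0xFF /* "it didn't happen during initialization" */
} failure_phase_t;

typedef struct {
    bool     valid; /* true once a failure has been recorded */
    uint8_t  phase; /* failure_phase_t of the first failure */
    uint32_t error; /* error code of the first failure */
} failure_record_t;

/* Returns true if err is a failure. Only the first failure is recorded,
 * since later ones are often consequences of the first. */
bool failure_check(failure_record_t *rec, uint32_t err, failure_phase_t phase)
{
    assert(rec != NULL);
    if (err == 0u) {
        return false; /* Assumption: 0 means success. */
    }
    if (!rec->valid) {
        rec->valid = true;
        rec->phase = (uint8_t)phase;
        rec->error = err;
    }
    return true;
}
```

The record would still have to be written to non-volatile storage to survive the reset; that part is left out here.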

Your other points:
O1) Good idea to log other "possible caused damage" events.

O2) Regarding 3 NFC. Use of "Data" will be able to retrieve information without necessitating connecting a debugger to read UICR.

O3) Cold causes "temporary" failures, especially if the environment returns to a "within operating" range. This is an additional reason to even prolong a low-power delay. Good example for careful interpretation of the stored failure info.

In general none of the actions I proposed are guaranteed to be successful. Those that can succeed would be helpful.

If you flag this as help wanted, I would be willing to write code to implement as many of these ideas as I am able.

ojousima commented on June 30, 2024

In general, signaling an error to the user and waiting a few minutes for the user to notice it is not useful outside of development. Even if the user is carrying the tag with them, chances are they're not looking at the tag at the time of the error.

On the other hand, a developer can read the error with a debugger, and visual indications such as flashing LEDs are not the easiest way to check for the error case.

Storing the configuration of sensors to flash is on my TODO anyway, I'll come back to storing initialization status once I'm working with flash logging.

DG12 commented on June 30, 2024

You are correct regarding the interaction with the users.

It seems that the only 2 recoverable events are
A) temporary battery droop
B) temporary environmental conditions
Any other failures necessitate the same action on the part of the user: replace the tag!

In hopes that the battery has only temporarily drooped or the environment will return to operating conditions I propose the best actions are:
0) WDOG: deal with it to prevent reset looping

  1. Status saving: expecting that a reset occurred and that (in the future) a failing subsystem is not needed and can be ignored. It's really easy.
  2. BLE Message: let collectors/observers know. An application like Scrin's collector can take action, such as sending an email, but only if it gets a message that something went wrong. RuuviStation could alert the user that it is not just that the tag is out of range. Not too easy, but worth it, especially for the end user.
  3. LED blink not helpful unless LOG_ENABLE false. Implementation is not that hard.
  4. LOG helpful for developer, only if LOG_ENABLE true. Implementation is easy.
  5. NFC not worth the effort
  6. Delay loop: low power, long (perhaps 5 minutes?), non-interruptible, to prevent a rapid reset loop. This is necessary to prevent battery exhaustion. WDOG needs to be fed here.
  7. Finally a software initiated reset
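
The ordering of steps 0), 1), 4), 6) and 7) above can be tried out on a host with a sketch like the following. The hardware hooks, their names and the stubs are all hypothetical and none of them come from the Ruuvi code base; the BLE and LED steps are left out. Since the watchdog reload value cannot be changed after initialization, the long delay is split into chunks with a feed after each one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware hooks, so the sequence can run on a PC. */
typedef struct {
    void (*wdog_feed)(void);
    void (*save_status)(uint32_t error_code);
    void (*log_error)(uint32_t error_code);
    void (*sleep_ms)(uint32_t ms);
    void (*reset)(void);
} fatal_hal_t;

void fatal_error_sequence(const fatal_hal_t *hal, uint32_t error_code,
                          uint32_t delay_ms, uint32_t feed_interval_ms)
{
    assert(hal != NULL);
    if (feed_interval_ms == 0u) {
        feed_interval_ms = 1u; /* Avoid a loop that never makes progress. */
    }
    hal->wdog_feed();             /* 0) buy time before anything else */
    hal->save_status(error_code); /* 1) persist the failure reason */
    hal->log_error(error_code);   /* 4) only useful when logging is enabled */

    /* 6) long delay, feeding the watchdog after every chunk */
    uint32_t remaining = delay_ms;
    while (remaining > 0u) {
        uint32_t chunk = (remaining < feed_interval_ms) ? remaining : feed_interval_ms;
        hal->sleep_ms(chunk);
        hal->wdog_feed();
        remaining -= chunk;
    }
    hal->reset();                 /* 7) software-initiated reset */
}

/* Host-side stubs that only count calls. */
typedef struct { uint32_t feeds, slept_ms, saved_code, logs, resets; } fatal_trace_t;
fatal_trace_t g_trace;

static void stub_feed(void) { g_trace.feeds++; }
static void stub_save(uint32_t code) { g_trace.saved_code = code; }
static void stub_log(uint32_t code) { (void)code; g_trace.logs++; }
static void stub_sleep(uint32_t ms) { g_trace.slept_ms += ms; }
static void stub_reset(void) { g_trace.resets++; }

const fatal_hal_t host_hal = { stub_feed, stub_save, stub_log, stub_sleep, stub_reset };
```

Calling `fatal_error_sequence(&host_hal, code, 300000u, 1000u)` models a 5 minute delay with a feed every second. On real hardware the feed interval would have to stay below the configured watchdog timeout, and the sleep itself may not be trustworthy if the device is in an unknown state.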

I still think that saving status and configuration ("tunables") in UICR is easier than flash, and easier to find and modify by either debugger or NFC interface. ( Leaves one more page for long term history data too :-) )

An additional note: I looked into the WDOG event handler in ruuvi_nrf5_sdk15_watchdog and have similar suggestions for that, hoping that the same temporary conditions caused the failure. (Should this be a separate issue?)
W0) feed the dog FIRST (only 30ms before reset)
W1) save status. include PC from stack for developer
W2) LOG_ERROR (only developer will see, only if LOG_ENABLED)
W3) Delay loop
W4) Reset
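
For W1, "PC from stack" could mean reading the program counter that the core stacked when the watchdog interrupt was taken. The sketch below assumes a Cortex-M core, where exception entry pushes R0-R3, R12, LR, PC and xPSR in that order. The type and function names are hypothetical, and obtaining the frame pointer inside a real handler (choosing between MSP and PSP) is not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Basic register frame stacked by hardware on Cortex-M exception entry. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} exception_frame_t;

static_assert(sizeof(exception_frame_t) == 8u * sizeof(uint32_t),
              "frame must be eight consecutive words");

/* Hypothetical status record to be saved for the developer. */
typedef struct {
    uint32_t pc; /* instruction that was interrupted */
    uint32_t lr; /* return address of the interrupted function */
} wdog_status_t;

/* Extract the values worth saving from a stacked frame. */
wdog_status_t wdog_status_from_frame(const exception_frame_t *frame)
{
    wdog_status_t status = { frame->pc, frame->lr };
    return status;
}
```

The saved PC points at the code that was running when the watchdog fired, which is usually the place where the firmware was stuck.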

If you flag this as help wanted, I would be willing to write code to implement as many of these ideas as I am able.

ojousima commented on June 30, 2024

> Status saving: expecting that a reset occurred and that (in the future) a failing subsystem is not needed and can be ignored. It's really easy.

This is now implemented: the reason for the fatal error and the line are logged to flash and printed to the user.

> BLE Message: let collectors/observers know. An application like Scrin's collector can take action, such as sending an email, but only if it gets a message that something went wrong. RuuviStation could alert the user that it is not just that the tag is out of range. Not too easy, but worth it, especially for the end user.

This would require custom parsers and everyone to support it. The effort/benefit ratio does not justify the work.

> LED blink not helpful unless LOG_ENABLE false. Implementation is not that hard.

I agree that LED blinking is not beneficial in addition to logging. Let's revisit this once main.c has unit test coverage.

> Delay loop: low power, long (perhaps 5 minutes?), non-interruptible, to prevent a rapid reset loop. This is necessary to prevent battery exhaustion. WDOG needs to be fed here.

The problem here is that when we're in an unknown state, the sleep might fail. We have seen this in the 1.x firmware, where NFC occasionally hangs and consumes excess current even though the software thinks that the tag is asleep. A tag in the hands of an everyday user that is stuck in a reboot loop is faulty anyway, and expert users can probably power it via a devkit.

> If you flag this as help wanted, I would be willing to write code to implement as many of these ideas as I am able.

Thanks for the offer. Let's first get the unit test coverage and documentation of tasks in good shape.
For example src/tasks/peripherals/task_timer is a good place to start as it requires only updating the comments and writing a single unit test.
