Fridge Health Guard – Alert Testing Guide

This guide explains how to test each alert type in real-world conditions. Use it to verify your device correctly detects fridge issues.


Quick Reference

| Alert | What It Detects | How to Test | Time Needed |
| --- | --- | --- | --- |
| Power Outage | Power lost to monitor | Unplug sensor | 3 min |
| Extended Run | Compressor won’t stop | Prop door open | 1–3 hours |
| No Cooling | Compressor won’t start | Turn thermostat OFF | 4+ hours |
| Device Offline | Lost Wi-Fi/cloud connection | Disconnect Wi-Fi | 30 min |

Understanding Alert Severity

| Level | Meaning | App Display |
| --- | --- | --- |
| WARN ⚠️ | Issue detected, monitor situation | Yellow banner |
| CRIT 🔴 | Serious issue, act now | Red banner |
| Clear | Issue resolved | Notification only |

User Testing Guide

1. Power Outage Alert ✅

What it detects: The monitor lost power (outlet unplugged, breaker tripped, power outage).

How to Test

  1. Ensure the device LED is solid (connected)
  2. Unplug the Fridge Health Guard sensor from the wall outlet
  3. Wait 3–5 minutes
  4. Plug it back in
  5. Expected: Push notification within 30 seconds

What You’ll See

| Duration | Severity | Message |
| --- | --- | --- |
| 2 min – 2 hours | WARN ⚠️ | “Power was out for X minutes” |
| > 2 hours | CRIT 🔴 | “Power was out for X hours – check food safety” |

Why 2 Hours is Critical

FDA guidelines: Discard perishables if your fridge was above 40°F (4°C) for more than 2 hours.


2. Extended Run Alert ⚠️

What it detects: Compressor running much longer than normal (door ajar, bad seal, dirty coils).

How to Test

Method A: Coldest Thermostat Setting

  1. Wait for a solid LED (device connected)
  2. Turn the fridge thermostat to its COLDEST setting
    • The compressor will run longer trying to reach the colder temperature
  3. Wait 60–90 minutes (or until the threshold is exceeded)
  4. Expected: Push notification about extended run
  5. Return the thermostat to normal – the alert clears after ~5 minutes idle

Method B: Door Propped Open

  1. Wedge the fridge door open with a folded towel (1–2 inch gap)
  2. The compressor runs continuously trying to cool
  3. ⚠️ Risk: Food temperature rises – monitor closely!

What You’ll See

| Condition | Severity | Message |
| --- | --- | --- |
| Run > threshold | WARN ⚠️ | “Compressor running X min (normal: Y min)” |
| Run > 4× cycle time | CRIT 🔴 | “Compressor stuck running – service required” |

⚠️ Normal Usage Note

Turning your thermostat colder is normal behavior and may trigger this alert temporarily. The alert is designed to catch:

  • Sustained long runs (2+ consecutive)
  • Extreme runs (a single run longer than 2× the normal threshold)

A single longer run from adjusting the thermostat typically won’t trigger an alert because of the consecutive-run requirement, unless that run exceeds 2× the threshold.

Real Issues This Alert Catches

  • Door left ajar
  • Damaged door gasket/seal
  • Dirty condenser coils
  • Hot food loaded
  • Ambient temperature spike

3. No Cooling Alert ✅

What it detects: Compressor hasn’t run for too long (may indicate failure).

How to Test

Method A: No Load (Easiest)

  1. Plug the sensor into a wall outlet with nothing connected to it
    • Device sees 0W continuously (no compressor activity)
  2. Wait 4+ hours
  3. Expected: Push notification about no cooling
  4. Plug in any load (lamp, fridge) – alert clears when it detects power draw

Method B: Thermostat Off

  1. Turn fridge thermostat to OFF (or warmest setting)
    • This safely stops the compressor without unplugging
  2. Wait 4+ hours (the fallback threshold used before learning completes); after learning, check your app for the fridge’s typical cycle time, since the threshold adapts to it
  3. Expected: Push notification about no cooling
  4. Turn thermostat back to normal
  5. Alert clears when compressor starts

What You’ll See

| Duration Without Cooling | Severity | Message |
| --- | --- | --- |
| > 1× threshold (~4h) | WARN ⚠️ | “No cooling for X hours – compressor may not be starting” |
| > 2× threshold (~8h) | CRIT 🔴 | “No cooling for X hours – check compressor immediately” |

⚠️ Normal Usage Note

Turning your thermostat to OFF/warm is normal behavior (e.g., vacation mode, defrosting). If you intentionally turn off cooling:

  • You may receive a No Cooling alert after 4+ hours
  • Simply dismiss it – the alert will clear when you turn cooling back on
  • Consider temporarily disabling this alert (clear bit 2 of the push mask, i.e. mask = 11 with all other alerts enabled)

Real Issues This Alert Catches

  • Failed start relay or capacitor (most common)
  • Stuck thermostat
  • Compressor overload tripped
  • Control board failure
  • Compressor motor failure

4. Device Offline Alert

What it detects: Monitor disconnected from cloud (Wi-Fi or internet issue).

Note: This alert is generated by the cloud service, not the device itself.

How to Test

  1. Turn off your Wi-Fi router or move device out of range
  2. Wait ~30 minutes
  3. Expected: Push notification that monitor is offline
  4. Restore Wi-Fi
  5. Alert clears automatically

What You’ll See

| Condition | Message |
| --- | --- |
| Offline > 30 min | “Fridge monitor is offline. Check power and Wi-Fi.” |
| Back online | “Fridge monitor is back online.” |

After Testing: Verify Results

In the App

  1. Open the Alerts or Events section
  2. You should see entries for each alert you triggered
  3. Check that alerts show correct timestamps and values

Alert History Shows

  • type: Which alert (EXTENDED_RUN, POWER_OUTAGE, NO_COOLING)
  • severity: INFO (0) for a cleared alert, WARN (1), or CRIT (2)
  • val1/val2: Duration and threshold values

Managing Alert Notifications

Disable Specific Alerts

If you want to disable push notifications for certain alerts:

| Alert | Bit | To Disable |
| --- | --- | --- |
| Extended Run | 0 | mask = 14 |
| Power Outage | 1 | mask = 13 |
| No Cooling | 2 | mask = 11 |
| Offline | 3 | mask = 7 |

All enabled (default): mask = 15
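
The mask values in the table follow from clearing one bit of the default value 15 (binary 1111). The sketch below shows that arithmetic in C. The enum and function names are illustrative only and are not taken from the firmware; only the bit positions and mask values come from the table.

```c
#include <stdint.h>

/* Bit positions from the table above. The identifier names are hypothetical. */
enum alert_bit {
    ALERT_BIT_EXTENDED_RUN = 0,
    ALERT_BIT_POWER_OUTAGE = 1,
    ALERT_BIT_NO_COOLING   = 2,
    ALERT_BIT_OFFLINE      = 3
};

#define PUSH_MASK_ALL 0x0Fu  /* 15: all four alerts enabled (default) */

/* Clear one alert's bit so its push notifications are suppressed. */
uint8_t push_mask_disable(uint8_t mask, enum alert_bit bit)
{
    return (uint8_t)(mask & ~(1u << bit));
}

/* Set one alert's bit to re-enable its push notifications. */
uint8_t push_mask_enable(uint8_t mask, enum alert_bit bit)
{
    return (uint8_t)(mask | (1u << bit));
}
```

Because each alert owns one bit, several alerts can be disabled at once by clearing each bit in turn. For example, disabling both Extended Run and No Cooling gives mask = 10.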


Troubleshooting

Alert Didn’t Fire?

| Possible Cause | Solution |
| --- | --- |
| Still in learning period | Wait ~7 days for adaptive thresholds |
| Alert on cooldown | Wait for cooldown period to expire |
| Push notifications disabled | Check mask in app settings |
| Device offline | Check Wi-Fi connection |

Too Many Alerts?

| Symptom | Likely Cause |
| --- | --- |
| Frequent Extended Run | Door seal issue, dirty coils, or sensor too sensitive |
| Frequent No Cooling | Normal for manual-defrost fridges; check cycle times |

Engineering Reference

Threshold Details

Power Outage (from alert_power_outage.c)

| Parameter | Default | Description |
| --- | --- | --- |
| FHP_ALERT_PWR_GAP_THRESHOLD_MS | 120,000 (2 min) | Minimum gap to trigger |
| FHP_ALERT_PWR_CRIT_THRESHOLD_MS | 7,200,000 (2 hours) | Gap for CRIT severity |
| FHP_ALERT_PWR_COOLDOWN_MS | 600,000 (10 min) | Cooldown between alerts |

Test Steps (Engineering)

  1. Let fridge run normally for a few minutes (establish baseline)
  2. Unplug the sensor device (ESP32) for at least 2 minutes
  3. Plug it back in
  4. On first tick after reconnect, alert fires based on gap duration:
    • WARN if gap was 2 min to 2 hours
    • CRIT if gap was > 2 hours (food safety concern)

Event Payload

  • val1 = outage duration in minutes
  • val2 = 0 (unused)

Expected Message

  • WARN: “Power was out for {val1:.0f} minutes”
  • CRIT: “Power was out for {val1:.0f} minutes - check food safety”
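
As a rough illustration of how the two thresholds map a gap to a severity and message, here is a minimal C sketch. It is not the code in alert_power_outage.c. The function names, the SEV_NONE value, and the exact boundary handling (strictly greater than each threshold) are assumptions; the macro values, severity codes, and message text come from this guide. The cooldown is not modeled.

```c
#include <stdint.h>
#include <stdio.h>

/* Defaults from the parameter table above. */
#define FHP_ALERT_PWR_GAP_THRESHOLD_MS   120000u   /* 2 min   */
#define FHP_ALERT_PWR_CRIT_THRESHOLD_MS  7200000u  /* 2 hours */

/* WARN/CRIT codes as listed in the alert history; SEV_NONE is hypothetical. */
enum { SEV_NONE = -1, SEV_WARN = 1, SEV_CRIT = 2 };

/* Classify the gap between two consecutive ticks (hypothetical helper). */
int pwr_outage_severity(uint32_t gap_ms)
{
    if (gap_ms <= FHP_ALERT_PWR_GAP_THRESHOLD_MS)  return SEV_NONE;
    if (gap_ms >  FHP_ALERT_PWR_CRIT_THRESHOLD_MS) return SEV_CRIT;
    return SEV_WARN;
}

/* Write the expected message for a gap into buf; returns the severity. */
int pwr_outage_message(uint32_t gap_ms, char *buf, size_t len)
{
    int sev = pwr_outage_severity(gap_ms);
    double val1 = gap_ms / 60000.0;  /* val1 = outage duration in minutes */

    if (sev == SEV_CRIT) {
        snprintf(buf, len, "Power was out for %.0f minutes - check food safety", val1);
    } else if (sev == SEV_WARN) {
        snprintf(buf, len, "Power was out for %.0f minutes", val1);
    } else if (len > 0) {
        buf[0] = '\0';               /* no alert, no message */
    }
    return sev;
}
```

With these assumptions, a 3-minute gap (180,000 ms) yields WARN with the text "Power was out for 3 minutes".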

No Cooling (from alert_no_cooling.c)

| Parameter | Default | Description |
| --- | --- | --- |
| NO_COOLING_FALLBACK_HOURS | 4 | Threshold before learning ready |
| NO_COOLING_Z_FACTOR | 4.0 | Stddevs above mean cycle period |
| NO_COOLING_FLOOR_MULT | 2.0 | Min threshold as multiple of avg |
| NO_COOLING_WARN_MULT | 1.0 | WARN fires at 1× threshold |
| NO_COOLING_CRIT_MULT | 2.0 | CRIT fires at 2× threshold |
| NO_COOLING_COOLDOWN_HOURS | 4 | Cooldown after clear |
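
One plausible reading of the first three parameters is sketched below: before learning the threshold is the fixed fallback, and afterwards it is the mean cycle period plus a number of standard deviations, never dropping below a multiple of the mean. This combination is an assumption inferred from the parameter descriptions, not a copy of alert_no_cooling.c, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Defaults from the parameter table above. */
#define NO_COOLING_FALLBACK_HOURS 4.0
#define NO_COOLING_Z_FACTOR       4.0
#define NO_COOLING_FLOOR_MULT     2.0

/* Hypothetical helper: threshold in hours for the No Cooling alert. */
double no_cooling_threshold_hours(bool learned,
                                  double mean_period_h,
                                  double stddev_period_h)
{
    if (!learned) {
        return NO_COOLING_FALLBACK_HOURS;       /* learning not ready yet */
    }
    /* Statistical threshold: mean plus Z standard deviations. */
    double statistical = mean_period_h + NO_COOLING_Z_FACTOR * stddev_period_h;
    /* Floor: never below a fixed multiple of the average period. */
    double floor_h = NO_COOLING_FLOOR_MULT * mean_period_h;
    return statistical > floor_h ? statistical : floor_h;
}
```

Under this reading, a fridge with a very regular 1-hour cycle (stddev 0.1 h) would get the 2-hour floor, while an irregular one (stddev 0.5 h) would get 3 hours.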

Test Steps (Engineering)

  1. Turn fridge thermostat dial to OFF (or warmest setting)
  2. Wait 4+ hours (fallback threshold) or ~2× your fridge’s typical cycle period
  3. Alert fires:
    • WARN at 1× threshold (~4h)
    • CRIT at 2× threshold (~8h)
  4. Turn thermostat back to normal
  5. Alert clears when compressor starts
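
The WARN and CRIT steps above amount to comparing the time since the last cooling run against multiples of the threshold. A minimal sketch follows, assuming strict "greater than" comparisons and a hypothetical SEV_NONE value for "no alert"; the function name is also illustrative.

```c
/* Multipliers from the parameter table above. */
#define NO_COOLING_WARN_MULT 1.0
#define NO_COOLING_CRIT_MULT 2.0

/* WARN/CRIT codes as listed in the alert history; SEV_NONE is hypothetical. */
enum { SEV_NONE = -1, SEV_WARN = 1, SEV_CRIT = 2 };

/* Hypothetical helper: severity for a given time without cooling. */
int no_cooling_severity(double hours_since_run, double threshold_h)
{
    if (hours_since_run > NO_COOLING_CRIT_MULT * threshold_h) return SEV_CRIT;
    if (hours_since_run > NO_COOLING_WARN_MULT * threshold_h) return SEV_WARN;
    return SEV_NONE;
}
```

With the 4-hour fallback threshold, 5 hours without a run gives WARN and 9 hours gives CRIT.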

Event Payload

  • val1 = hours since last cooling run
  • val2 = threshold in hours (WARN threshold)

Expected Message

  • WARN: “No cooling activity for {val1:.1f} hours - compressor may not be starting”
  • CRIT: “No cooling for {val1:.1f} hours - check compressor immediately”
  • Clear: “Compressor running normally”

Extended Run (from alert_extended_run.c)

| Parameter | Default | Description |
| --- | --- | --- |
| EXTENDED_RUN_FALLBACK_ARM_MIN | 60 | Threshold before learning ready |
| EXTENDED_RUN_Z_FACTOR | 5.0 | Stddevs above mean duration |
| EXTENDED_RUN_FLOOR_MULT | 1.5 | Min threshold as multiple of avg |
| EXTENDED_RUN_CONSECUTIVE_REQUIRED | 2 | Long runs needed before alert |
| EXTENDED_RUN_CLEAR_IDLE_MIN | 5 | Idle time to clear alert |
| EXTENDED_RUN_COOLDOWN_HOURS | 12 | Cooldown between alerts |
| EXTENDED_RUN_CRIT_CYCLE_MULT | 4.0 | Multiplier for CRIT escalation |

Alert Logic

Two-tier trigger system:

  1. Normal path: Run exceeds threshold → count it. 2 consecutive long runs → WARN
  2. Extreme path: Run exceeds 2× threshold → immediate WARN (bypasses consecutive requirement)

CRIT escalation: If WARN active and run exceeds 4× cycle period → CRIT
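
The two-tier trigger and the CRIT escalation can be sketched as a small state machine. The version below is a simplified illustration, not the logic in alert_extended_run.c: it evaluates one run duration at a time, allows WARN and CRIT to be reached in the same call, and does not model the idle clear or the cooldown. The struct, the function names, and EXTENDED_RUN_EXTREME_MULT are hypothetical; the 2× and 4× factors come from the text above.

```c
#include <stdbool.h>

#define EXTENDED_RUN_CONSECUTIVE_REQUIRED 2
#define EXTENDED_RUN_CRIT_CYCLE_MULT      4.0
#define EXTENDED_RUN_EXTREME_MULT         2.0  /* hypothetical name for the "2x threshold" rule */

/* WARN/CRIT codes as listed in the alert history; SEV_NONE is hypothetical. */
enum { SEV_NONE = -1, SEV_WARN = 1, SEV_CRIT = 2 };

struct ext_run_state {
    int consecutive_long;  /* long runs seen in a row */
    int severity;          /* current alert level */
};

/* Feed one run duration (minutes); returns the resulting alert level. */
int ext_run_update(struct ext_run_state *s, double run_min,
                   double threshold_min, double cycle_period_min)
{
    if (run_min <= threshold_min) {
        s->consecutive_long = 0;          /* a normal run breaks the streak */
        return s->severity;
    }
    s->consecutive_long++;

    /* Normal path needs N long runs in a row; extreme path fires at once. */
    bool extreme = run_min > EXTENDED_RUN_EXTREME_MULT * threshold_min;
    if (s->severity == SEV_NONE &&
        (extreme || s->consecutive_long >= EXTENDED_RUN_CONSECUTIVE_REQUIRED)) {
        s->severity = SEV_WARN;
    }
    /* Escalate only from an active WARN. */
    if (s->severity == SEV_WARN &&
        run_min > EXTENDED_RUN_CRIT_CYCLE_MULT * cycle_period_min) {
        s->severity = SEV_CRIT;
    }
    return s->severity;
}

/* Replay a sequence of run durations and return the final alert level. */
int ext_run_replay(const double *runs_min, int n,
                   double threshold_min, double cycle_period_min)
{
    struct ext_run_state s = { 0, SEV_NONE };
    for (int i = 0; i < n; i++) {
        ext_run_update(&s, runs_min[i], threshold_min, cycle_period_min);
    }
    return s.severity;
}
```

With a 60-minute threshold and a 90-minute cycle period, runs of 70 then 75 minutes produce WARN through the normal path, a single 130-minute run produces WARN through the extreme path, and 70 followed by 400 minutes escalates to CRIT.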

Test Steps (Engineering)

  1. Prop fridge door open with a book (small gap, ~2 inches)
  2. Compressor will run continuously trying to cool
  3. First long run: noted but no alert (needs 2 consecutive), unless a single run exceeds 2× threshold, which fires WARN immediately
  4. If the door stays open through a second cycle → alert fires (WARN)
  5. Close door, let fridge idle 5+ minutes → alert clears

⚠️ Risks

  • Food temperature rises during test
  • Monitor with thermometer - don’t exceed 40°F/4°C for >2 hours
  • Wastes energy

Event Payload

  • val1 = current run duration in minutes
  • val2 = threshold in minutes (the value that was exceeded)

Expected Message

  • WARN: “Compressor running {val1:.0f} min (normal: {val2:.0f} min) - check door seal”
  • CRIT: “Compressor stuck running {val1:.0f} min - immediate service required”
  • Clear: “Cooling cycle returned to normal”

Synthetic Test Data

The fhp_engine/data/test/ directory contains synthetic test files for each alert:

| File | Alert | Description |
| --- | --- | --- |
| bootstrap-home.txt | N/A | Learning period (65 runs from home data) |
| alert-extended-run.txt | EXTENDED_RUN | Two 3.5h consecutive runs |
| alert-no-cooling.txt | NO_COOLING | 10h gap with no compressor runs |

Running Synthetic Tests

```shell
cd fhp_engine/test

# Test individual alerts
make test-alert-extended    # EXTENDED_RUN
make test-alert-nocooling   # NO_COOLING

# Run all
make test-alerts-all
```

POWER_OUTAGE Cannot Be Tested via Replay

The POWER_OUTAGE alert detects gaps in the tick stream – when no ticks are received for an extended period. This requires actual power loss to the sensor.

Why replay can’t simulate it:

  • Replay generates idle ticks between runs to keep the engine state machine running
  • There’s never a gap in the tick stream, so the alert never fires

How to test POWER_OUTAGE:

  1. On real hardware: Unplug the sensor for 2+ minutes
  2. In unit tests: Call fhp_engine_on_tick() with timestamp jumps

In both cases the alert fires on the first tick after a gap > 2 minutes.
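
The timestamp-jump idea can be illustrated with a stand-in tick handler. The real signature of fhp_engine_on_tick() is not shown in this guide, so the sketch below does not call it. Everything prefixed demo_ is hypothetical and only mimics the gap check; the 10-minute cooldown is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

#define FHP_ALERT_PWR_GAP_THRESHOLD_MS 120000u  /* 2 min, from the table above */

/* Stand-in for the engine's tick state; NOT the real fhp_engine API. */
struct demo_engine {
    bool     has_last;      /* false until the first tick arrives */
    uint64_t last_tick_ms;  /* timestamp of the previous tick */
};

/* Hypothetical tick handler: returns true if this tick reveals an outage. */
bool demo_on_tick(struct demo_engine *e, uint64_t now_ms)
{
    bool fired = e->has_last &&
                 (now_ms - e->last_tick_ms) > FHP_ALERT_PWR_GAP_THRESHOLD_MS;
    e->has_last = true;
    e->last_tick_ms = now_ms;
    return fired;
}

/* Feed a list of tick timestamps and count how many outages are detected. */
int demo_count_outages(const uint64_t *ticks_ms, int n)
{
    struct demo_engine e = { false, 0 };
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (demo_on_tick(&e, ticks_ms[i])) {
            count++;
        }
    }
    return count;
}
```

A unit test in this style feeds regular ticks, then one timestamp that jumps ahead by more than 2 minutes, and checks that exactly one alert is reported on the tick after the jump.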

Learning Period Notes

| Alert | Behavior During Learning (~7 days) | After Learning |
| --- | --- | --- |
| POWER_OUTAGE | ✅ Active (safety-critical) | ✅ Active |
| EXTENDED_RUN | Uses 60-min fallback | Adaptive threshold |
| NO_COOLING | Uses 4-hour fallback | Adaptive threshold |

The device learns your fridge’s normal behavior:

  • Average cooling run duration
  • Standard deviation of run times
  • Average cycle period (time between runs)
  • Defrost patterns (excluded from alerts)

After learning, thresholds automatically adjust to your specific fridge.
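
This guide does not describe how the averages and standard deviations are computed. One common way to maintain them on a small device without storing every run is Welford's online algorithm, sketched below as an assumption rather than as the engine's actual method. All names are hypothetical.

```c
/* Running statistics over run durations (Welford's online algorithm). */
struct run_stats {
    unsigned n;     /* number of samples seen */
    double   mean;  /* running mean */
    double   m2;    /* sum of squared deviations from the mean */
};

/* Fold one new sample (e.g. a run duration in minutes) into the stats. */
void run_stats_add(struct run_stats *s, double x)
{
    s->n++;
    double delta = x - s->mean;
    s->mean += delta / s->n;
    s->m2 += delta * (x - s->mean);
}

/* Sample variance; the standard deviation is its square root. */
double run_stats_variance(struct run_stats s)
{
    return s.n > 1 ? s.m2 / (s.n - 1) : 0.0;
}

/* Build stats from an array of samples. */
struct run_stats run_stats_from(const double *xs, int n)
{
    struct run_stats s = { 0, 0.0, 0.0 };
    for (int i = 0; i < n; i++) {
        run_stats_add(&s, xs[i]);
    }
    return s;
}
```

For runs of 20, 22, and 24 minutes this gives a mean of 22 and a sample variance of 4, so a standard deviation of 2 minutes.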