
Paper Abstract

Flash floods in Bangladesh's haor wetlands show up with almost no warning. They wreck the annual boro rice harvest. Current setups, built for riverine floods, miss backwater dynamics entirely. These basins are flat. Water does not behave like it does on the Brahmaputra. We built HaorFloodAlert, a deseasonalized machine learning ensemble that forecasts 72-hour flood probability for the Sunamganj Haor (approximately 8,000 km²). Temperature was acting as a seasonal cheat code: it inflated accuracy by 6.9 pp just because floods happen in warm months. We caught that. We also built an upstream Barak River Sentinel-1 SAR proxy from Silchar, Assam, giving about 36 hours of lead time. Otsu-thresholded SAR change detection validates at 84-91 percent spatial match. The operational ensemble (RF 0.5625 + XGBoost 0.4375) hits 89.6 percent LOOCV accuracy, 87.5 percent recall, and 0.943 AUC-ROC on 77 real Sentinel-1 events. A three-tier alert pipeline and a BRRI-calibrated boro rice damage estimator are included.

HaorFloodAlert: Deseasonalized ML Ensemble for 72-Hour Flood Prediction in Bangladesh Haor Wetlands
The haor wetlands of northeast Bangladesh are vast, flat basins that fill rapidly during pre-monsoon rains, often leaving local communities with only hours of warning before flash floods destroy the annual boro rice harvest. Existing flood forecasting systems are designed for deep, channelized rivers like the Brahmaputra and fail to account for the unique "backwater" dynamics of these shallow, bowl-like wetlands. This paper introduces HaorFloodAlert, a machine learning system specifically engineered to provide a 72-hour flood probability forecast for the Sunamganj region, helping to bridge the gap between technical data and the urgent survival needs of 3–4 million people.

Correcting for Seasonal Bias

A significant challenge in flood modeling is what the authors call the "seasonal cheat code." Because floods in this region occur during the warmer months, raw temperature acts as a calendar proxy rather than a physical cause of flooding. The researchers found that including raw temperature inflated model accuracy by 6.9 percentage points. To fix this, they replaced raw temperature with a "climatological anomaly": the difference between the observed temperature and the long-term monthly average. This ensures the model learns from genuine weather patterns rather than simply identifying the time of year.
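The anomaly described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the sample temperatures are hypothetical, and only the definition (observed value minus the long-term monthly mean) comes from the text.

```python
from collections import defaultdict
from statistics import mean


def monthly_climatology(records):
    """Mean temperature per calendar month from (month, temp_c) pairs."""
    by_month = defaultdict(list)
    for month, temp_c in records:
        by_month[month].append(temp_c)
    return {month: mean(temps) for month, temps in by_month.items()}


def temperature_anomaly(month, temp_c, climatology):
    """Observed temperature minus the long-term mean for that month."""
    return temp_c - climatology[month]


# Hypothetical history: (month number, temperature in degrees C).
history = [(4, 27.0), (4, 29.0), (4, 28.0), (12, 18.0), (12, 20.0), (12, 19.0)]
clim = monthly_climatology(history)

# A warm April day and a warm December day get the same anomaly,
# so the feature no longer encodes which month it is.
april_anomaly = temperature_anomaly(4, 30.0, clim)
december_anomaly = temperature_anomaly(12, 21.0, clim)
print(april_anomaly, december_anomaly)
```

In a cross-validated setup, the climatology would normally be computed from training data only, so that held-out events do not leak into the monthly means.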

Integrating Satellite and Upstream Data

The system moves beyond traditional gauge-based monitoring by using Sentinel-1 synthetic aperture radar (SAR) imagery to detect water movement. Because the haor basins are fed by water flowing down from the Barak River in Assam, India, the researchers built a "Barak River Sentinel-1 proxy" from observations at Silchar, Assam. By monitoring this upstream area, the system gains approximately 36 hours of lead time before water reaches the wetlands. The model combines this satellite data with rainfall forecasts, soil moisture levels, and wind speed to build a fuller picture of flood risk.
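The abstract reports that flood extent is mapped with Otsu-thresholded SAR change detection. The sketch below shows the general idea under stated assumptions: the input is a flat list of backscatter changes in dB (after minus before), newly flooded pixels are taken to be the strongly negative ones because open water lowers radar backscatter, and the pixel values are invented. The paper's actual preprocessing is not described here.

```python
def otsu_threshold(values, bins=64):
    """Return the cut that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                      # pixels at or below bin i
        sum0 += centers[i] * hist[i]
        w1 = total - w0                    # pixels above bin i
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = lo + (i + 1) * width  # upper edge of bin i
    return best_t


# Hypothetical per-pixel backscatter change in dB (after minus before).
changes = [-8.0, -7.5, -7.0, -8.5, -0.5, 0.0, 0.5, 1.0, -0.2]
threshold = otsu_threshold(changes)

# Pixels whose backscatter dropped below the threshold are marked as flooded.
flooded = [c < threshold for c in changes]
print(threshold, sum(flooded))
```

Otsu's method needs no hand-tuned cutoff, which is convenient when scene brightness varies between acquisitions. It does assume the change values are roughly bimodal.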

Operational Alerts and Crop Protection

HaorFloodAlert is designed to be a functional tool for the community, not just a research project. It features a three-tier alert pipeline that sends automated SMS, email, and WhatsApp messages to farmers and local officials when flood risk reaches critical levels. Additionally, the system includes a crop damage estimator calibrated by the Bangladesh Rice Research Institute (BRRI). This tool calculates potential yield loss based on the flood's timing relative to the rice growth cycle, providing farmers and disaster managers with actionable information to help mitigate economic devastation.
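A three-tier scheme of this kind can be pictured as a mapping from forecast probability to an alert level. The sketch below is an assumption throughout: the tier names and the cutoffs (0.4, 0.6, 0.8) are hypothetical placeholders, since this summary does not state the thresholds the system uses.

```python
# Hypothetical tiers and cutoffs, ordered from most to least severe.
TIERS = [
    ("emergency", 0.8),
    ("warning", 0.6),
    ("watch", 0.4),
]


def alert_tier(flood_probability):
    """Map a 72-hour flood probability to an alert tier, or None if below all cutoffs."""
    if not 0.0 <= flood_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for name, cutoff in TIERS:
        if flood_probability >= cutoff:
            return name
    return None


for p in (0.15, 0.45, 0.65, 0.9):
    print(p, alert_tier(p))
```

Keeping the cutoffs in one table makes them easy to recalibrate against observed false-alarm and miss rates.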

Performance and Limitations

The ensemble model, which combines Random Forest and XGBoost (weighted 0.5625 and 0.4375 respectively), achieved 89.6% leave-one-out cross-validation (LOOCV) accuracy, 87.5% recall, and an AUC-ROC of 0.943 on 77 real Sentinel-1 events. While these results are strong, the authors note several limitations. The system currently struggles to predict floods driven primarily by extreme rainfall when upstream river levels are low. Furthermore, the crop damage estimates carry a significant margin of uncertainty (±25–40%) and are intended for planning purposes rather than as precise insurance-grade assessments. The team plans to refine these estimates and improve validation in future versions of the system.
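One plausible reading of the two numbers in the abstract (RF 0.5625 + XGBoost 0.4375) is a weighted average of the two models' predicted probabilities. The sketch below illustrates that reading and how accuracy and recall are computed from the resulting predictions. The per-event probabilities, the labels, and the 0.5 decision cutoff are hypothetical, and the combination rule itself is an assumption rather than a confirmed detail of the paper.

```python
RF_WEIGHT = 0.5625   # from the abstract
XGB_WEIGHT = 0.4375  # from the abstract


def ensemble_probability(p_rf, p_xgb):
    """Weighted average of the two models' flood probabilities (assumed rule)."""
    return RF_WEIGHT * p_rf + XGB_WEIGHT * p_xgb


def accuracy_and_recall(y_true, y_pred):
    """Accuracy over all events; recall over actual flood events only."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_pos = sum(y_true)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return correct / len(y_true), recall


# Hypothetical per-event model outputs and ground truth (1 = flood).
rf_probs = [0.9, 0.2, 0.7, 0.4]
xgb_probs = [0.8, 0.1, 0.3, 0.6]
labels = [1, 0, 1, 1]

probs = [ensemble_probability(r, x) for r, x in zip(rf_probs, xgb_probs)]
preds = [int(p >= 0.5) for p in probs]  # 0.5 cutoff is an assumption
accuracy, recall = accuracy_and_recall(labels, preds)
print(preds, accuracy, recall)
```

Recall matters more than raw accuracy for an early-warning system, because a missed flood costs far more than a false alarm. That may be why the abstract reports both.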
