About Neer Vazhvu
An open-source water intelligence dashboard for Chennai, built to make public data accessible and actionable.
How “Days of Water Left” Works
We compute three scenarios based on current reservoir storage, daily consumption, and inflow patterns:
Default Assumptions
Users can adjust consumption and desalination values via sliders on the dashboard.
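As a rough illustration of how a "days left" figure can follow from storage, consumption, desalination, and inflow, here is a minimal sketch. The function name, the constant-drawdown formula, and the use of million litres per day (MLD) for all flows are assumptions made for this example; they are not confirmed details of the dashboard's implementation.

```python
import math

# 1 cubic foot is about 28.3168 litres, so 1 mcft is about 28.3168 million litres.
LITRES_PER_CUBIC_FOOT = 28.3168


def days_of_water_left(storage_mcft, consumption_mld,
                       desalination_mld=0.0, inflow_mld=0.0):
    """Estimate days until storage runs out (hypothetical formula).

    Assumes a constant net daily drawdown, with every flow in MLD:
        drawdown = consumption - desalination - inflow
    """
    storage_ml = storage_mcft * LITRES_PER_CUBIC_FOOT  # million litres
    drawdown_mld = consumption_mld - desalination_mld - inflow_mld
    if drawdown_mld <= 0:
        # Supply meets or exceeds demand, so storage does not fall.
        return math.inf
    return storage_ml / drawdown_mld
```

Raising the desalination value in this sketch reduces the net drawdown and therefore lengthens the estimate, which is the kind of effect the sliders are meant to expose.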
Data Sources & Aggregation
All data is collected automatically by our Python pipeline, which runs daily at 06:00 IST. Raw data is upserted into Supabase (PostgreSQL) and processed through ETL and intelligence stages.
Daily reservoir levels for 6 reservoirs: Poondi, Cholavaram, Red Hills, Chembarambakkam, Veeranam, and Kannankottai. Includes storage (mcft), water level (ft), inflow/outflow (cusecs), and rainfall (mm).
Primary weather source for Chennai (13.08°N, 80.27°E): precipitation, temperature, humidity, reference evapotranspiration (ET₀), and wind speed. Zero data lag, no API key required. ET₀ is used in the ARIMAX forecasting model to account for reservoir evaporation losses.
Fallback weather source. Satellite-derived data for Chennai: precipitation, max/min temperature, and relative humidity. Activated automatically when Open-Meteo is unreachable. 2-day data lag.
Ward-wise depth to water table (metres below ground level) for all 200 GCC wards across 15 zones. Sourced from CGWB/GCC monitoring wells. Data available from 2021 onwards.
Monthly reservoir storage data (mcft) for all 6 reservoirs, spanning 2003-2021. Used as historical seed for the forecasting model.
305 Chennai water bodies from the First Census of Water Bodies (2018-19) by the Ministry of Jal Shakti. Includes ownership, storage capacity (original vs present), encroachment status, depth, construction year, and basin information. Overlaid as markers on the Water Bodies map.
15 years of daily reservoir data (2004-2019) compiled by Sudalai Rajkumar. Used as additional historical training data for the forecasting model.
All current water bodies (lakes, tanks, reservoirs, ponds, marshes) within the Chennai metropolitan bounding box. Queried via the Overpass API and saved as a static GeoJSON. 1,635 polygon features, ~95,000 ha total surface. Also source for river polyline geometry (Cooum, Adyar, Buckingham Canal, Kosasthalaiyar) and industrial zone polygons in the north Chennai corridor. Data reflects OSM contributor edits as of the last script run.
15 manually curated lost or encroached water bodies, compiled from published research, court records, and environmental organisation reports. See the Water Bodies Map section below for per-record provenance.
Annual reports from the Central Pollution Control Board's National Water Monitoring Programme. Source for DO, BOD, pH, and conductivity readings at monitoring stations on the Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar rivers. Supplemented by IIT Madras / Anna University peer-reviewed studies and NGT Chennai bench orders.
7 major industrial facilities in the Ennore-Manali corridor, curated from NGT Southern Bench orders (2017-2022), TNPCB enforcement records, CPCB industrial monitoring reports, and academic studies. Each facility entry includes pollutant types, documented incidents with volumes and dates, and NGT order summaries.
CFLOWS model flood hazard zones (5 categories), 2015 Chennai flood hotspots with vulnerability ratings, 2015 inundation depth readings, 2020 Cyclone Nivar hotspots, and return period flood maps (5-200yr).
10,308 official storm water drain segments from GCC survey (2023) across 197 wards, with street name, drain type, depth, width, length, material, and condition status.
CMWSSB sewerage infrastructure: 8 sewage treatment plants (STPs) with capacity, 348 pumping stations (SPS) linked to STPs, and 3,834 pumping main segments with pipe material and size.
AI-generated city and ward narratives that connect reservoir, groundwater, and risk data (Claude Sonnet for the city narrative, Haiku for ward narratives).
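The automatic switch from Open-Meteo to the fallback weather source described above can be sketched as a simple try-then-fall-back wrapper. The function `fetch_weather` and its two callable arguments are hypothetical names for this example; the real pipeline may detect failures differently.

```python
def fetch_weather(primary, fallback):
    """Return (source_name, data), preferring the primary source.

    `primary` and `fallback` are hypothetical zero-argument callables that
    return a dict of weather fields, or raise OSError (which covers
    connection errors and timeouts) when the source is unreachable.
    """
    try:
        return "primary", primary()
    except OSError as exc:
        # Primary source unreachable: log the reason and use the fallback.
        print(f"Primary weather source failed ({exc!r}); using fallback")
        return "fallback", fallback()
```

Returning the source name alongside the data lets downstream stages record which provider supplied each day's values, which matters here because the fallback lags by two days.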
Intelligence Layer
Beyond raw data display, Neer Vazhvu runs three intelligence modules daily to generate actionable insights.
Reservoir Forecasting
The dashed violet line on the storage trend chart shows an ARIMAX-based forecast for each reservoir, extending 6 months into the future. The shaded band around it represents an 80% confidence interval - the range within which actual storage is expected to fall, 4 out of 5 times.
Technique
We use AutoARIMA from the statsforecast library with exogenous regressors (ARIMAX). AutoARIMA automatically selects the best ARIMA(p,d,q) order and seasonal component by testing multiple model configurations and choosing the one with the lowest information criterion (AICc).
- Each reservoir is forecasted independently; six separate models are fitted.
- The model is re-trained daily as new data arrives from the CMWSSB scraper.
- Exogenous variables: Inflow and outflow (cusecs) are fed as external regressors alongside storage.
- Future flow estimation: Since future inflow/outflow are unknown, the model uses historical seasonal averages as proxy values for the forecast horizon.
- Graceful fallback: If a reservoir has sparse inflow/outflow data (fewer than 30% of readings in the last 2 years are non-zero), the model automatically falls back to pure ARIMA without exogenous variables.
- All predictions are clamped to [0, reservoir capacity].
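The rules in the list above that surround the model fit (the sparse-data check, the seasonal proxy values, and the clamping) can be sketched in plain Python. All function names are hypothetical, and averaging by calendar month is an assumption about what "seasonal averages" means; the model fitting itself is left out.

```python
from collections import defaultdict


def has_sparse_flows(flows, threshold=0.30):
    """True if fewer than `threshold` of the readings are non-zero."""
    if not flows:
        return True
    non_zero = sum(1 for value in flows if value)
    return non_zero / len(flows) < threshold


def monthly_averages(dated_values):
    """Average historical values per calendar month.

    `dated_values` is an iterable of (month, value) pairs, month in 1-12.
    The result can stand in for unknown future inflow or outflow.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for month, value in dated_values:
        totals[month][0] += value
        totals[month][1] += 1
    return {month: total / count for month, (total, count) in totals.items()}


def clamp_forecast(values, capacity_mcft):
    """Clamp each predicted storage value to [0, capacity]."""
    return [min(max(value, 0.0), capacity_mcft) for value in values]
```

In this sketch, a reservoir for which `has_sparse_flows` returns True would be fitted without regressors, and every forecast would pass through `clamp_forecast` before being plotted.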
Groundwater Map
The choropleth map shows depth to water table in metres below ground level (mbgl) for each of Chennai’s 200 GCC wards. Lower values mean the water table is closer to the surface (healthier). Thresholds are based on CGWB classification for South Indian alluvial aquifers.
Year-over-year trends compare the same month across consecutive years. A change of more than 0.5 m is classified as improving (water table closer to the surface) or declining (water table deeper).
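A minimal sketch of the year-over-year classification follows. The function name and the "stable" label for changes within the 0.5 m threshold are assumptions for this example; the direction follows from depth being measured below ground level, so a larger value means a deeper water table.

```python
def classify_trend(depth_now_m, depth_last_year_m, threshold_m=0.5):
    """Classify a ward's year-over-year change in depth to water table.

    Depths are in metres below ground level for the same calendar month.
    """
    change = depth_now_m - depth_last_year_m
    if change > threshold_m:
        return "declining"   # water table is more than 0.5 m deeper
    if change < -threshold_m:
        return "improving"   # water table is more than 0.5 m shallower
    return "stable"          # assumed label for changes within the threshold
```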
Water Bodies Map
The Water Bodies page shows two overlapping datasets: surviving water bodies sourced from OpenStreetMap, and a curated set of 15 historically significant water bodies that have been lost or severely encroached upon.
Lost & Encroached Water Bodies - Per-Record Sources
River Health Map
The river map shows four rivers - Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar - colour-coded by overall water quality status derived from CPCB monitoring data.
Lake Restoration Priority
The restoration ranker scores all 1,635 water bodies on restoration priority using a 5-component spatial analysis model. Each component is scored 0-100 and combined as a weighted average:
Water Body Size (25%): Larger water bodies provide greater groundwater recharge and flood mitigation impact.
Proximity to Lost Water Bodies (20%): Water bodies near historically lost lakes are in stressed areas where restoration compensates for lost water surface.
Proximity to Polluted Rivers (20%): Water bodies near dead or degraded river stretches (by dissolved oxygen readings) could serve as settling or treatment wetlands.
Industrial Pollution Proximity (15%): Water bodies near industrial discharge zones face greater contamination risk; restoring them helps protect groundwater.
Water Body Type (20%): Reservoirs and natural lakes are prioritised over canals, drains, and wastewater infrastructure.
Scores are computed from static spatial data and do not account for population density, land ownership, or restoration cost. The ranking is designed to support GCC budget allocation for lake restoration programmes.
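The weighted average over the five components listed above can be sketched as follows. The weights come from that list; the dictionary keys and the function name are hypothetical, and how each 0-100 component score is derived is not shown.

```python
# Weights from the five components listed above; the key names are hypothetical.
WEIGHTS = {
    "size": 0.25,
    "lost_proximity": 0.20,
    "river_proximity": 0.20,
    "industrial_proximity": 0.15,
    "type": 0.20,
}


def restoration_score(components):
    """Combine five 0-100 component scores into one 0-100 priority score."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

For example, a water body scoring 80 on size and 50 on every other component would get 0.25 × 80 + 0.75 × 50 = 57.5.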
Disclaimer
Not an official government tool. Neer Vazhvu is an independent, open-source project. It is not affiliated with, endorsed by, or connected to CMWSSB, GCC, CGWB, or any government body.
Informational purposes only. All data, estimates, and forecasts are provided “as is” for general awareness. Always refer to official CMWSSB advisories for critical decisions.
No personal data collected. Neer Vazhvu does not collect, store, or process any personal information. There are no user accounts, cookies, or analytics trackers.
Known Limitations
- Estimates are approximations. Actual water availability depends on factors not modeled (groundwater extraction, Krishna water transfer, distribution losses, industrial use).
- CMWSSB data may occasionally be stale (weekends, holidays). The dashboard shows a freshness indicator.
- Groundwater data from OpenCity may lag by months. The map always shows the most recent available period.
- Forecasts use ARIMAX (AutoARIMA with inflow/outflow as exogenous regressors) and work best with 2+ years of daily data.
- Risk scores are relative indicators for comparison between wards, not absolute measures of water safety.
Known Data Quality Issues
Government census data is invaluable but not perfect. We document known issues here for transparency. If you spot an error, please report it on GitHub.
Census: Mixed units in water_spread_area
The MoJS census methodology specifies hectares for water spread area, but 39 of 286 Chennai records appear to use square metres instead. Example: RETTAI ERI is recorded as 1,053,177, which is plausible only as square metres (~105 ha); this is confirmed against satellite imagery and Wikipedia (87–114 ha actual). Because of this inconsistency, census markers on the map use a uniform size as location indicators rather than representing actual water body area.
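The unit arithmetic behind the RETTAI ERI example, plus one possible way to flag suspect records, is sketched below. The plausibility threshold of 5,000 ha and both function names are assumptions for illustration only; the dashboard itself sidesteps the problem by using uniform marker sizes.

```python
SQ_M_PER_HECTARE = 10_000


def sq_m_to_hectares(area_sq_m):
    """Convert an area in square metres to hectares."""
    return area_sq_m / SQ_M_PER_HECTARE


def looks_like_sq_m(recorded_area, max_plausible_ha=5_000):
    """Flag a value too large to be hectares (hypothetical threshold)."""
    return recorded_area > max_plausible_ha
```

Applied to the recorded value, `sq_m_to_hectares(1_053_177)` gives roughly 105.3 ha, which falls inside the 87–114 ha range cited above.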
Census: Encroachment vs. storage capacity mismatch
Storage capacity and encroachment were surveyed independently. Some water bodies show 70%+ encroachment but 100% storage capacity remaining - the capacity figure was not revised to reflect lost area. These cases are flagged with an amber warning in the detail panel.
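A check of this kind could look like the sketch below. The exact rule behind the amber warning is not documented here, so the thresholds (70% encroachment, 100% capacity remaining) simply mirror the example in the paragraph above, and the function name is hypothetical.

```python
def needs_capacity_warning(encroachment_pct, capacity_remaining_pct,
                           encroachment_threshold=70.0,
                           capacity_threshold=100.0):
    """True when heavy encroachment coexists with an unrevised capacity figure."""
    return (encroachment_pct >= encroachment_threshold
            and capacity_remaining_pct >= capacity_threshold)
```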
Census: Point coordinates only, no boundary polygons
The census provides only a single lat/lon point per water body, not boundary shapes. Where possible, census records are matched to nearby OpenStreetMap polygons (within 200m) so the actual water body shape is shown and census metadata (ownership, encroachment, capacity) appears in the detail panel. Unmatched census records are shown as small dots at the reported location.
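The 200 m matching step can be sketched with a great-circle distance check. This example measures distance to each polygon's centroid, which is an assumption; the actual matcher may measure distance to the polygon boundary instead. The function names and the input shape (a dict of id to `(lat, lon)`) are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_census_to_osm(census_point, osm_centroids, max_distance_m=200.0):
    """Return the id of the nearest OSM feature within range, else None.

    `census_point` is (lat, lon); `osm_centroids` maps id -> (lat, lon).
    """
    best_id, best_dist = None, max_distance_m
    for osm_id, (lat, lon) in osm_centroids.items():
        dist = haversine_m(census_point[0], census_point[1], lat, lon)
        if dist <= best_dist:
            best_id, best_dist = osm_id, dist
    return best_id
```

A census record for which this returns None would correspond to the unmatched case, shown as a small dot at the reported location.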
Open Source
Neer Vazhvu is fully open source. The code, data pipeline, and methodology are transparent and available on GitHub. Contributions, bug reports, and data corrections are welcome.
View on GitHub