About Neer Vazhvu
An open-source water intelligence platform for Tamil Nadu cities - starting with Chennai and Madurai - built to make public data accessible and actionable.
Reading this dashboard
How “Days of Water Left” Works
We compute three scenarios based on current reservoir storage, daily consumption, and inflow patterns:
Default Assumptions
Users can adjust consumption and desalination values via sliders on the dashboard.
What each page shows
Dashboard & reservoirs
The dashed violet line on the storage trend chart shows an ARIMAX-based forecast for each reservoir, extending 6 months into the future. The shaded band around it represents an 80% confidence interval - the range within which actual storage is expected to fall, 4 out of 5 times.
Technique
We use the AutoARIMA model from the statsforecast library with exogenous regressors (ARIMAX). AutoARIMA automatically selects the best ARIMA(p,d,q) order and seasonal component by testing multiple model configurations and choosing the one with the lowest information criterion (AICc).
- Each reservoir is forecasted independently; six separate models are fitted.
- The model is re-trained daily as new data arrives from the CMWSSB scraper.
- Exogenous variables: Inflow and outflow (cusecs) are fed as external regressors alongside storage.
- Future flow estimation: Since future inflow/outflow are unknown, the model uses historical seasonal averages as proxy values for the forecast horizon.
- Graceful fallback: If a reservoir has sparse inflow/outflow data (less than 30% non-zero in the last 2 years), the model automatically falls back to pure ARIMA without exogenous variables.
- All predictions are clamped to [0, reservoir capacity].
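The fallback and clamping rules above can be sketched in a few lines. This is a minimal illustration rather than the pipeline's code: the function names are hypothetical, and it assumes inflow and outflow are each checked against the 30% threshold separately over the last two years of readings.

```python
from typing import Sequence


def has_enough_flow_data(flows: Sequence[float], min_nonzero_share: float = 0.30) -> bool:
    """True when at least `min_nonzero_share` of the readings are non-zero."""
    if not flows:
        return False
    nonzero = sum(1 for value in flows if value != 0)
    return nonzero / len(flows) >= min_nonzero_share


def choose_model(inflow: Sequence[float], outflow: Sequence[float]) -> str:
    """Pick ARIMAX when both flow series are dense enough, else plain ARIMA.

    Assumption: both series must pass the sparsity check independently.
    """
    if has_enough_flow_data(inflow) and has_enough_flow_data(outflow):
        return "arimax"
    return "arima"


def clamp_forecast(predictions: Sequence[float], capacity_mcft: float) -> list[float]:
    """Clamp every predicted storage value to [0, reservoir capacity]."""
    return [min(max(p, 0.0), capacity_mcft) for p in predictions]
```

Clamping after the fact keeps the statistical model unconstrained while guaranteeing the chart never shows negative storage or storage above capacity.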
How we measure reservoir catchment rainfall
This is used for the dashboard catchment rainfall card for the four core Chennai supply reservoirs.
- We use reviewed operational catchment polygons for Poondi, Red Hills, Chembarambakkam, and Cholavaram. These are hybrid review geometries built from HydroBASINS, MERIT Hydro, and local drainage review rather than simple circles around reservoir centroids.
- For each catchment, we sum CHIRPS rainfall over the last 7, 30, and 90 days.
- We compare those totals with a same-season historical baseline built from the prior 20 years of CHIRPS windows.
- The app does not expose the raw rainfall rasters. It only shows the bucketed result: well below, below, near normal, above, or well above normal.
Groundwater page
The choropleth map shows depth to water table in metres below ground level (mbgl) for each of Chennai’s 200 GCC wards. Lower values mean the water table is closer to the surface (healthier). Thresholds are based on CGWB classification for South Indian alluvial aquifers.
Year-over-year trends compare the same month across consecutive years. A change of more than 0.5m is classified as improving or declining.
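As a rough sketch of the year-over-year rule, assuming depth is measured in metres below ground level so that a larger number means a deeper (worse) water table. The function name and the "stable" label for changes within the threshold are assumptions for illustration.

```python
def classify_yoy(depth_now_m: float, depth_last_year_m: float, threshold_m: float = 0.5) -> str:
    """Compare the same month in consecutive years.

    Depth is in metres below ground level, so a positive change means
    the water table has dropped further from the surface.
    """
    change = depth_now_m - depth_last_year_m
    if change > threshold_m:
        return "declining"
    if change < -threshold_m:
        return "improving"
    return "stable"  # assumed label for changes of 0.5 m or less
```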
Live CGWB station overlay
The ward choropleth is sourced from OpenCity's monthly ward-level groundwater dataset, which is authoritative but usually lags by weeks to months. To pair it with ground-truth readings, we also plot ~35 CGWB (Central Ground Water Board) monitoring stations in Chennai district as circle markers, pulled directly from the India WRIS Ground Water Level API.
Manual vs Telemetric stations - why the two can differ sharply
- Manual stations are quarterly CGWB field-crew readings, usually from shallow dug wells (~5-11 m deep) sampling the unconfined aquifer. This is the water table residents actually pump from.
- Telemetric stations are DWLR sensors that transmit readings daily. They are usually installed on deeper bore wells or piezometers tapping confined or semi-confined aquifers (often 19-200 m deep), so they answer a different question than the manual dug wells and the two readings should not be conflated.
- The station panel surfaces the well type, total well depth, and aquifer type from WRIS metadata so you can tell which well is which before comparing readings.
Sensor quality flags
DWLRs fail silently - a broken sensor keeps reporting the same depth forever, which would poison a naive dashboard. The groundwater_wris_latest database view scores every station with a data_quality_flag so suspect readings are surfaced explicitly rather than averaged into the ward colours:
- stuck - Telemetric station with >=10 readings in the last 60 days whose median daily delta is under 1 cm. This is robust to one-off step changes: a genuinely steady aquifer still passes, but a flat-lined sensor gets caught.
- stale - The latest reading is older than the station's expected cadence. Mode-aware: Telemetric becomes stale after 14 days (a DWLR should report daily), manual only after 180 days (CGWB resurveys it seasonally).
- ok - Station has at least one recent reading and is neither stuck nor stale.
On the map, suspect stations render with a neutral grey fill and a dashed amber ring so they never get confused with trustworthy readings, and the station panel shows an amber 'Possible sensor failure' banner with the exact 60-day range and reading count. The legend's sensor-status sub-section exposes toggles so reviewers can hide stuck or stale markers entirely.
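The scoring rules can be approximated as follows. This is a sketch in Python, not the SQL behind the groundwater_wris_latest view: the function name is hypothetical, "delta" is taken as the absolute difference between consecutive readings, stale is checked before stuck, and a station with no readings is treated as stale. All of those are assumptions.

```python
from datetime import date, timedelta
from statistics import median


def quality_flag(mode: str, readings: list[tuple[date, float]], today: date) -> str:
    """Score one station. `readings` are (day, depth in mbgl) pairs in any order."""
    if not readings:
        return "stale"  # assumption: no data at all counts as stale
    ordered = sorted(readings)
    latest_day = ordered[-1][0]

    # Mode-aware staleness: DWLRs report daily, manual wells are resurveyed seasonally.
    max_age_days = 14 if mode == "telemetric" else 180
    if (today - latest_day).days > max_age_days:
        return "stale"

    if mode == "telemetric":
        recent = [(d, v) for d, v in ordered if (today - d).days <= 60]
        if len(recent) >= 10:
            deltas = [abs(b[1] - a[1]) for a, b in zip(recent, recent[1:])]
            # Median (not mean) so a single step change does not mask a flat line.
            if median(deltas) < 0.01:  # 1 cm expressed in metres
                return "stuck"
    return "ok"
```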
Water bodies & restoration
The Water Bodies page combines current OpenStreetMap polygons, a curated set of 15 historically significant lost or encroached water bodies, and a new satellite context layer for a reviewed Phase 1 target set. For selected lakes and reservoirs, the detail panel now shows historical persistence, current spread versus the usual seasonal baseline, and an observation freshness/confidence label.
How we measure water-body spread by season
This is used for the "Satellite Context" block shown on selected lakes and reservoirs.
- We start from a curated Phase 1 target list instead of all 1,787 mapped water bodies, so we can QA the outputs and avoid noisy tiny ponds or industrial water features.
- Current spread is estimated from a 45-day Sentinel-2 NDWI composite. NDWI compares green and near-infrared light to detect water, including turbid and dark water that other classifiers miss. We turn the water signal inside each polygon into an observed water-spread area in hectares.
- Seasonal baseline comes from JRC Global Surface Water monthly recurrence for the same calendar month. This gives us an expected wet-area footprint for March vs April vs monsoon months, instead of comparing everything to one annual average.
- We compare observed spread to the seasonal baseline, compute a simple anomaly ratio, and label it as much lower, lower, near normal, higher, or much higher. We also compute historical persistence as the share of months where the water body meaningfully holds water.
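To make the spread calculation concrete, here is a small sketch using the standard NDWI formula, (green - NIR) / (green + NIR). It is illustrative only: the function names, the NDWI > 0 water threshold, and the plain-list pixel representation are assumptions, and the real pipeline runs in Earth Engine on a 45-day composite.

```python
def ndwi(green: float, nir: float) -> float:
    """Normalized Difference Water Index for one pixel."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def water_area_ha(pixels: list[tuple[float, float]],
                  threshold: float = 0.0,
                  pixel_area_m2: float = 100.0) -> float:
    """Observed water spread inside a polygon, in hectares.

    `pixels` holds (green, nir) reflectance pairs for pixels inside the polygon.
    Sentinel-2 green and NIR bands are 10 m, so each pixel covers 100 m2.
    The threshold of 0 is an assumed cut-off, not the project's tuned value.
    """
    wet = sum(1 for green, nir in pixels if ndwi(green, nir) > threshold)
    return wet * pixel_area_m2 / 10_000


def anomaly_ratio(observed_ha: float, baseline_ha: float):
    """Observed spread divided by the seasonal baseline; None if no baseline."""
    if baseline_ha <= 0:
        return None
    return observed_ha / baseline_ha
```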
How we produce reviewed satellite evidence frames
For flagship lakes and reservoirs, the detail panel offers a "See Satellite Evidence" button that opens a dialog with actual Sentinel-2 true-color imagery and a toggleable NDWI water-mask overlay.
- For each flagship water body and each monthly reference date, the pipeline searches Sentinel-2 imagery within a configurable window.
- Scenes are ranked by usable coverage, proximity to the reference date, and cloud percentage. The best scene is selected and downloaded as a true-color thumbnail.
- An NDWI water-mask overlay is computed from the same Sentinel-2 scene's green and near-infrared bands, clipped to the water body boundary, and stored alongside the true-color image.
- Frames are visually reviewed before being published. Only reviewed frames appear in the evidence dialog by default.
How we simplify the outputs for users
- Water-body spread uses an observed/baseline ratio: below 0.60 = much lower, 0.60-0.85 = lower, 0.85-1.15 = near normal, 1.15-1.40 = higher, above 1.40 = much higher.
- Catchment rainfall uses anomaly buckets against the historical seasonal baseline: <= -50% well below, <= -20% below, < 20% near normal, < 50% above, and >= 50% well above.
- Low-confidence satellite rows are hidden from the water-body detail panel. We only show the summary when optical coverage is good enough to be useful.
- Not every mapped water body shows this yet. Phase 1 is limited to a reviewed target set so we can quality-check the outputs before expanding coverage.
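The two bucketing rules above translate directly into threshold checks. The sketch below uses hypothetical function names; for water-body spread, the source ranges share their endpoints, so treating 0.85 as "near normal" and 1.15 and 1.40 as inclusive upper bounds is an assumption.

```python
def spread_label(ratio: float) -> str:
    """Bucket an observed/baseline water-spread ratio."""
    if ratio < 0.60:
        return "much lower"
    if ratio < 0.85:
        return "lower"
    if ratio <= 1.15:
        return "near normal"
    if ratio <= 1.40:
        return "higher"
    return "much higher"


def rainfall_anomaly_pct(total_mm: float, baseline_mm: float) -> float:
    """Percent departure of a rainfall total from its seasonal baseline."""
    return (total_mm - baseline_mm) / baseline_mm * 100


def rainfall_label(anomaly_pct: float) -> str:
    """Bucket a catchment rainfall anomaly (percent vs. seasonal baseline)."""
    if anomaly_pct <= -50:
        return "well below"
    if anomaly_pct <= -20:
        return "below"
    if anomaly_pct < 20:
        return "near normal"
    if anomaly_pct < 50:
        return "above"
    return "well above"
```

For example, a 30-day total of 60 mm against a 100 mm baseline is a -40% anomaly, which falls in the "below" bucket.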
Lake Restoration Priority
The restoration ranker scores all 1,787 water bodies on restoration priority using a 5-component spatial analysis model. Each component is scored 0-100 and combined as a weighted average:
Water Body Size (25%): Larger water bodies provide greater groundwater recharge and flood mitigation impact.
Proximity to Lost Water Bodies (20%): Water bodies near historically lost lakes are in stressed areas where restoration compensates for lost water surface.
Proximity to Polluted Rivers (20%): Water bodies near dead or degraded river stretches (by dissolved oxygen readings) could serve as settling or treatment wetlands.
Industrial Pollution Proximity (15%): Water bodies near industrial discharge zones face greater contamination risk; restoring them helps protect groundwater.
Water Body Type (20%): Reservoirs and natural lakes are prioritised over canals, drains, and wastewater infrastructure.
Scores are computed from static spatial data and do not account for population density, land ownership, or restoration cost. The ranker is designed to support GCC budget allocation for lake restoration programmes.
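The weighted average can be written out as below. The weights are the ones listed above; the dictionary keys and function name are hypothetical, and how each 0-100 component score is derived is outside this sketch.

```python
# Component weights from the list above (they sum to 1.0).
WEIGHTS = {
    "size": 0.25,
    "lost_proximity": 0.20,
    "polluted_river_proximity": 0.20,
    "industrial_proximity": 0.15,
    "type": 0.20,
}


def restoration_priority(component_scores: dict[str, float]) -> float:
    """Combine five 0-100 component scores into one 0-100 priority score."""
    missing = WEIGHTS.keys() - component_scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(weight * component_scores[name] for name, weight in WEIGHTS.items())
```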
Lost & Encroached Water Bodies - Per-Record Sources
Rivers page
The river map shows four rivers - Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar - colour-coded by overall water quality status derived from CPCB monitoring data.
Flood page
The Flood page overlays historical flood hazard zones from OpenCity on the ward map together with the Greater Chennai Corporation storm water drain (SWD) network. The goal is to let residents see whether their street sits inside a documented hazard footprint and whether a drain is mapped nearby.
My Ward page
A single-page rollup of every spatial layer the dashboard knows about - reservoirs, groundwater depth, ward risk score, water bodies, rivers, flood zones, drains, sewerage, CGWB stations - all filtered to one ward. It is the deep-link target when you click a ward anywhere in the app.
Ward Report Card
Each of Chennai's 200 wards is ranked on 5 governance-quality metrics. Percentile-based A-F grades compare every ward against the full city. All density metrics are area-normalized (per sq km). The composite score is a weighted sum of per-metric percentiles.
| Metric | Weight | Direction |
|---|---|---|
| Drainage coverage | 25% | Higher = better |
| Sewerage infrastructure | 25% | Higher = better |
| Flood exposure | 25% | Lower = better |
| Water body health | 15% | Lower = better |
| Water body density | 10% | Higher = better |
Grades: A (80th+ percentile), B (60-79th), C (40-59th), D (20-39th), F (below 20th). The overall grade uses the same thresholds on the composite score's percentile rank.
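A simplified sketch of the grading logic follows. The weights, directions, and grade cut-offs come from the table and text above; the metric keys, function names, and the exact percentile definition (share of wards with a strictly worse value) are assumptions and may differ from computeWardRankings.

```python
# metric -> (weight, higher_is_better), taken from the table above
METRICS = {
    "drainage_coverage": (0.25, True),
    "sewerage_infrastructure": (0.25, True),
    "flood_exposure": (0.25, False),
    "water_body_health": (0.15, False),
    "water_body_density": (0.10, True),
}


def percentile_rank(value: float, population: list[float], higher_is_better: bool) -> float:
    """Percent of wards whose value is strictly worse than `value` (assumed definition)."""
    if higher_is_better:
        worse = sum(1 for other in population if other < value)
    else:
        worse = sum(1 for other in population if other > value)
    return 100.0 * worse / len(population)


def composite_score(ward: dict[str, float], all_wards: list[dict[str, float]]) -> float:
    """Weighted sum of the ward's per-metric percentiles."""
    return sum(
        weight * percentile_rank(ward[name], [w[name] for w in all_wards], higher)
        for name, (weight, higher) in METRICS.items()
    )


def grade(percentile: float) -> str:
    """Map a percentile to an A-F grade using the thresholds above."""
    if percentile >= 80:
        return "A"
    if percentile >= 60:
        return "B"
    if percentile >= 40:
        return "C"
    if percentile >= 20:
        return "D"
    return "F"
```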
Uplift Planner
The uplift planner answers: "If I had INR X crore for my ward, where should I invest it?" It uses a greedy budget optimizer to allocate a hypothetical budget across 5 intervention types, maximizing the ward's composite improvement per crore spent.
How it works
- Gap analysis: compares the ward's current value on each metric against the city distribution to identify where it lags.
- Greedy optimizer: at each step, evaluates every feasible intervention and picks the one with the highest composite-score improvement per crore. Repeats until the budget is spent or all caps are hit.
- Exact projection: builds a modified ward profile with the projected metric values and reruns the full ranking engine (computeWardRankings) to determine the exact after-state grade and percentile - not an approximation.
Data-backed caps
Each intervention is capped by real ward data: flood mitigation limited to the actual number of high/very-high hazard zones; water body restoration limited to bodies rated critical or high; revival limited to documented lost bodies. Infrastructure interventions (drains, sewerage) have practical per-ward caps.
Cost estimates
All cost ranges come from published GCC, CMWSSB, Smart Cities Mission, NDMA, and NGO project reports. Each allocation shows a low-high range; the optimizer uses the midpoint. These are illustrative - actual costs depend on site conditions, land, and procurement.
| Intervention | Cost/unit | Metric |
|---|---|---|
| Build storm drains | 1.5-3.0 Cr/km | Drainage coverage |
| Extend sewage network | 3.0-6.0 Cr/km | Sewerage infrastructure |
| Flood zone mitigation | 5-15 Cr/zone | Flood exposure |
| Restore water bodies | 2-8 Cr/body | Water body health |
| Revive lost water bodies | 10-25 Cr/body | Water body density |
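The greedy loop can be illustrated with a stripped-down sketch. It uses the midpoint of each cost range, as described above, but simplifies one important thing: each intervention here carries a fixed, hypothetical gain_per_unit, whereas the real planner re-evaluates the composite-score improvement at every step and reruns the ranking engine for the final projection. Class and function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    cost_low_cr: float
    cost_high_cr: float
    gain_per_unit: float  # hypothetical composite-score points per unit
    cap_units: int        # data-backed or practical per-ward cap

    @property
    def unit_cost_cr(self) -> float:
        """Midpoint of the published low-high cost range."""
        return (self.cost_low_cr + self.cost_high_cr) / 2


def allocate(budget_cr: float, interventions: list[Intervention]):
    """Repeatedly fund the unit with the best gain per crore until nothing fits."""
    remaining = budget_cr
    units = {item.name: 0 for item in interventions}
    while True:
        feasible = [
            item for item in interventions
            if units[item.name] < item.cap_units and item.unit_cost_cr <= remaining
        ]
        if not feasible:
            break
        best = max(feasible, key=lambda item: item.gain_per_unit / item.unit_cost_cr)
        units[best.name] += 1
        remaining -= best.unit_cost_cr
    return units, remaining
```

With a 15 crore budget, two capped drain units (2.25 Cr each at the midpoint) are funded first because they score higher per crore, and the remaining 10.5 Cr covers one flood-zone unit at its 10 Cr midpoint.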
Chennai Water Facts page
A journalist-ready snapshot page at /facts that surfaces Chennai's water state as quotable numbers with sources, dates, and methodology attached. Organised into four freshness tiers so staleness is never hidden: Today (live from monitoring feeds), This Year (latest government publications with vintage year), Chennai Water History (milestone events and peak records), and Infrastructure (structural capacity). Every card has copy-quote, tweet, and copy-link buttons. Powered by Schema.org Dataset + Observation structured data for search engines, with a public JSON API at /api/facts.
Intelligence & AI Narratives
Beyond raw data display, Neer Vazhvu runs three intelligence modules daily to generate actionable insights.
Cascade reconstruction methodology - Chennai
Chennai's tanks were once organised into chained cascades (system kanmoi): water from upper tanks overflowed through feeder channels into lower tanks, which fed the next, and so on. Most cascade channels are now broken by encroachment. The cascade overlay surfaces a terrain-derived hypothesis of how the cascade structure should have been organised, given the actual elevation and flow direction of the land.
See cascade health scores: the "Tank cascades at risk - Chennai" page ranks every documented and auto-derived cascade by fragility and priority, with citations and court / restoration anchors where known.
What you are seeing
- Sky-blue circles (720 tanks): one per OpenStreetMap water-body polygon at least 1 ha in size. Size encodes cascade depth (deeper-in-the-chain tanks render larger).
- Sky-blue lines (430 edges): predicted tank-to-tank cascade links. Each upstream tank has at most one outflow.
- Amber lines (50 outflows): tanks whose flow direction points to a river within ~2 km, modelling the river itself as the terminal sink.
Inputs
- Tank polygons: OpenStreetMap water=* features. Any water_type in {river, canal, stream, drain, ditch, wastewater} is excluded so river segments don't get treated as tanks.
- Elevation: WWF/HydroSHEDS/03CONDEM - HydroSHEDS conditioned DEM at 3 arc-second (~90 m) resolution. "Conditioned" means sinks have been pre-filled so flow routing behaves predictably.
- Flow direction: WWF/HydroSHEDS/03DIR - the corresponding ESRI D8 flow-direction raster. Each pixel encodes which of its eight neighbours water drains to.
- River barriers: the {city}-rivers.geojson we already use on the map.
Algorithm (per tank)
- Compute centroid; sample DEM elevation and D8 flow direction at that point in a single batched Earth Engine call.
- Find all other tanks within 3 km whose elevation is lower.
- Reject candidates that fall outside ±67.5° of the upstream tank's flow-direction bearing - terrain-aware directionality, not just "is downhill".
- Reject candidates whose straight-line edge would cross a mapped river segment - water doesn't flow across rivers.
- Pick the single steepest remaining candidate (elevation drop / distance) as this tank's outflow.
- For tanks with no tank-to-tank outflow but a flow direction pointing to a river within 2 km: mark drains_to_river and draw an amber arrow to the nearest in-cone river point.
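The per-tank selection steps can be sketched in plain Python. This is not the pipeline code: the Tank class, function names, and the pluggable crosses_river callback are hypothetical, elevation and D8 codes are assumed to be already sampled, and the river-outlet step is left out. The D8 lookup follows the ESRI encoding (1 = east, then doubling clockwise).

```python
import math
from dataclasses import dataclass

# ESRI D8 flow-direction codes mapped to compass bearings in degrees.
D8_BEARING = {1: 90, 2: 135, 4: 180, 8: 225, 16: 270, 32: 315, 64: 0, 128: 45}


@dataclass
class Tank:
    tank_id: str
    lat: float
    lon: float
    elevation_m: float
    d8_code: int


def distance_km(a: Tank, b: Tank) -> float:
    """Great-circle (haversine) distance between two tank centroids."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlat, dlon = p2 - p1, math.radians(b.lon - a.lon)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0088 * math.asin(math.sqrt(h))


def bearing_deg(a: Tank, b: Tank) -> float:
    """Initial compass bearing from a to b, in [0, 360)."""
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dlon = math.radians(b.lon - a.lon)
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360


def angle_diff(x: float, y: float) -> float:
    """Smallest absolute difference between two bearings."""
    return abs((x - y + 180) % 360 - 180)


def pick_outflow(tank, others, max_km=3.0, cone_half_angle=67.5,
                 crosses_river=lambda a, b: False):
    """Return the steepest downhill, in-cone, unblocked neighbour, or None."""
    best, best_slope = None, 0.0
    for other in others:
        if other.tank_id == tank.tank_id or other.elevation_m >= tank.elevation_m:
            continue  # must be a different tank that sits lower
        dist = distance_km(tank, other)
        if dist == 0 or dist > max_km:
            continue
        if angle_diff(bearing_deg(tank, other), D8_BEARING[tank.d8_code]) > cone_half_angle:
            continue  # outside the flow-direction cone
        if crosses_river(tank, other):
            continue  # straight-line edge would cross a mapped river
        slope = (tank.elevation_m - other.elevation_m) / dist  # metres per km
        if slope > best_slope:
            best, best_slope = other, slope
    return best
```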
What this is NOT
- Not a registry of historical channels. We don't claim that any specific cascade link historically existed; we claim the terrain would have organised water this way.
- Not full hydrological flow accumulation. A stricter approach would trace flow paths pixel-by-pixel through the DEM. We use a "downhill within a flow-direction cone" heuristic that's correct for most obvious cases but can miss subtle terrain features that aren't river-mapped.
- Not a real-time water transport model. Edge existence does not imply current water flow.
- Not a model of any inflow that isn't tank-to-tank. Reservoirs receive water from at least four sources that this graph cannot represent: (a) direct rainfall on the lake surface, (b) catchment runoff via unmapped channels and overland flow, (c) the river the reservoir dams (rivers are deliberately excluded from cascade nodes), and (d) engineered canals, pipelines, and trans-basin diversions. A reservoir showing 0 cascade inflows here is not isolated in real life - Chembarambakkam Lake, for example, is fed by all four kinds of inflow (its 71.6 km² Adyar catchment, the upper Adyar itself, plus Krishna water via the Kandaleru-Poondi canal and Cauvery water from Veeranam) yet none of those appears in this layer. The cascade graph is solely about tank-to-tank structure derived from terrain.
Known limitations
- DEM resolution ~90 m. Adequate for district-scale cascade structure; may miss very small channels. In flat terrain (e.g. coastal Chennai) elevation differences often round to the same integer metre, so the flow-direction cone does most of the work.
- Single outflow per tank (default). Real tanks often have one feeder channel and one separate surplus channel; the V1 algorithm models only the steepest candidate edge per upstream tank. A per-district allow_multi_outflow opt-in relaxes this and keeps near-tied candidates (within 30% of the best score by default), modelling tanks with both feeder and surplus. Off by default for Chennai; we plan to enable it for plateau-geography districts where terrain gradients are weaker and multi-branch cascades are documented in the historical record.
- River-coverage gaps. The river-crossing barrier is only as complete as the OSM river polylines. Where the polyline is sparse, edges may slip through.
- Edges are labelled predicted only. A future iteration will cross-check predicted edges against OSM waterway=* tags and Sentinel-1/2 monsoon imagery, then label each edge as intact / partial / broken / encroached.
- OSM water_type=reservoir is ambiguous in this region. In Madurai roughly 87% of cascade nodes carry that tag, including many traditional kanmoi tanks that historically fed downstream cascades. The algorithm therefore does NOT auto-classify reservoirs as terminal sinks. A per-district curation hook (terminal_sink_osm_ids) exists for marking specific known engineered reservoirs (large dams whose outflow is via spillway / canal rather than via gravity to another tank); it is currently empty pending validation against TN PWD / DHAN inventories.
Reading cascade_position = 1: headwater, not source
Tanks at cascade_position = 1 have no tank-to-tank inflow in this graph. They are the shallowest nodes in the network, not the literal source of water in the basin. Real inflow into these tanks comes from rainfall on the lake surface, surface runoff from the surrounding catchment via channels not in OpenStreetMap, and (in dammed basins) the river itself - none of which are modelled here.
We call these headwater tanks rather than "sources" to avoid implying the cascade graph accounts for where water actually originates. A reservoir with cascade_position = 1 is not isolated from rainfall and runoff; it just sits at the top of whatever tank-to-tank chain the terrain organises.
Edge confidence
Each predicted edge carries a confidence field bucketed by its score_m_per_km (elevation drop normalised by edge length). Thresholds:
- HIGH (≥ 5 m/km): a clear downhill gradient, unambiguous even given HydroSHEDS 90 m elevation noise.
- MEDIUM (1-5 m/km): plausible cascade link with moderate confidence. Most kanmoi-cascade edges fall here.
- LOW (< 1 m/km): below 0.2 m drop per 200 m. Near the noise floor of the conditioned DEM; the edge may be terrain noise as much as real flow.
For Chennai: 126 high (29%), 248 medium (58%), 56 low (13%).
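The bucketing is a two-threshold check on the gradient. A minimal sketch, with a hypothetical function name:

```python
def edge_confidence(drop_m: float, length_km: float) -> str:
    """Bucket an edge by score_m_per_km (elevation drop / edge length)."""
    score_m_per_km = drop_m / length_km
    if score_m_per_km >= 5:
        return "HIGH"
    if score_m_per_km >= 1:
        return "MEDIUM"
    return "LOW"
```

For instance, a 2 m drop over a 2 km edge scores 1 m/km and lands at the bottom of the MEDIUM bucket.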
Isolated tanks: why each one is isolated
A tank is "isolated" in this graph when it has no tank-to-tank inflow, no tank-to-tank outflow, and no river sink. The pipeline re-walks the candidate-evaluation gates for each such tank and stamps it with one of these reasons, surfaced in the on-map hover tooltip:
- elevation_sampling_failed - the HydroSHEDS DEM returned no value at the tank's centroid, so the algorithm has nothing to compare against. Usually a data-coverage gap at the DEM's 90 m resolution boundaries.
- no_neighbors_in_range - no other tanks within the 3 km radius the cascade window uses. Real geographic effect, common on the rural fringe of the district.
- all_neighbors_uphill - in-range tanks exist but every one of them is at a higher elevation. The tank sits at a local basin low; water has nowhere downhill to go through the tank network in this window.
- all_neighbors_out_of_cone - downhill tanks exist in range, but all sit outside the ±67.5° cone aligned with the upstream tank's D8 flow direction. The terrain wants water to go somewhere other than where the nearest downhill tank is.
- all_neighbors_river_blocked - downhill, in-cone, in-range tanks exist, but every edge to them would cross a mapped river LineString. May indicate either real river-cut isolation or a gap where the OSM river polylines are over-segmented relative to ground truth.
- unknown_isolation - defensive fallback. Should be empty in practice.
What you can use it for today
- Spot likely historical hubs: tanks with high in-degree are where multiple terrain-driven flow paths converge. Maximum cascade depth in Chennai is 6.
- Surface river-front tanks: anything with an amber outflow is a tank that drains directly into a river - useful for restoration prioritisation since the ecological functions differ from internal-cascade tanks.
- Identify isolated tanks: tanks with neither inflow, outflow, nor river sink carry an isolation_reason field distinguishing genuine basin orphans from data-coverage gaps. See the bucket-by-bucket breakdown above.
Parameter rationale + sensitivity
Each tunable parameter was chosen with a stated rationale. The sensitivity tables below show how each output statistic responds when the parameter is varied. Generated by the cascade pipeline's sensitivity stage; raw data at public/data/cascade/chennai-cascade-sensitivity.json.
max_downstream_distance_km (default: 3)
How far an upstream tank looks for a downhill neighbour. 3 km is the historical median spacing between tanks in well-documented kanmoi networks (DHAN Vayalagam field data). Below 1.5 km the graph fragments sharply; above 5 km the algorithm starts connecting tanks that have no historical relationship.
| value | nodes | edges | isolated | max depth | outlets |
|---|---|---|---|---|---|
| 1.5 | 720 | 228 | 325 | 5 | 58 |
| 2 | 720 | 315 | 228 | 5 | 54 |
| 3 | 720 | 430 | 130 | 6 | 50 |
| 4 | 720 | 488 | 90 | 7 | 48 |
| 5 | 720 | 525 | 73 | 8 | 46 |
cone half-angle, in degrees (default: 67.5)
How wide the directional cone around the upstream tank's D8 flow direction must be for a candidate to qualify. A 67.5-degree half-angle admits the principal D8 sector plus one neighbouring sector on each side (3 of the 8 D8 sectors, a 135-degree cone in total). The default trades local D8 instability (the 90 m DEM produces noisy flow directions in flat terrain) against false-positive edges (a 90-degree half-angle admits half-plane candidates that the water would never actually reach).
| value | nodes | edges | isolated | max depth | outlets |
|---|---|---|---|---|---|
| 22.5 | 720 | 262 | 306 | 5 | 33 |
| 45 | 720 | 372 | 178 | 6 | 45 |
| 67.5 | 720 | 430 | 130 | 6 | 50 |
| 90 | 720 | 468 | 98 | 7 | 51 |
min_tank_area_ha (default: 1)
Minimum OSM water_type polygon size to enter the graph. 1 ha excludes most temple tanks, garden ponds, and roadside catchments while preserving the structural cascade. Raising the threshold thins the graph rapidly: at 5 ha Madurai keeps 72% of nodes; at 10 ha only 56%.
| value | nodes | edges | isolated | max depth | outlets |
|---|---|---|---|---|---|
| 1 | 720 | 430 | 130 | 6 | 50 |
| 2 | 570 | 324 | 107 | 6 | 45 |
| 5 | 418 | 214 | 98 | 6 | 32 |
| 10 | 324 | 135 | 104 | 5 | 29 |
max_river_outlet_distance_km (default: 2)
Distance budget within which a tank with no tank-to-tank outflow can register a 'drains to river' arrow. 2 km matches typical surplus-channel lengths in TN sub-basin engineering. Tightening to 1 km loses roughly a third of Chennai's river-outlet arrows (50 to 32); loosening to 3 km adds plausible-but-uncertain outlets that may be drainage rather than designed surplus.
| value | nodes | edges | isolated | max depth | outlets |
|---|---|---|---|---|---|
| 1 | 720 | 430 | 140 | 6 | 32 |
| 2 | 720 | 430 | 130 | 6 | 50 |
| 3 | 720 | 430 | 122 | 6 | 62 |
Data Source Index
All operational data is collected by the Python pipeline and supporting scripts that power the dashboard. Raw source data and Earth Engine summaries are upserted into Supabase (PostgreSQL) and then exposed as small, readable product signals.
Reservoirs & weather
Daily reservoir levels for 6 reservoirs: Poondi, Cholavaram, Red Hills, Chembarambakkam, Veeranam, and Kannankottai. Includes storage (mcft), water level (ft), inflow/outflow (cusecs), and rainfall (mm).
Primary weather source for Chennai (13.08°N, 80.27°E): precipitation, temperature, humidity, reference evapotranspiration (ET₀), and wind speed. Zero data lag, no API key required. ET₀ is used in the ARIMAX forecasting model to account for reservoir evaporation losses.
Fallback weather source. Satellite-derived data for Chennai: precipitation, max/min temperature, and relative humidity. Activated automatically when Open-Meteo is unreachable. 2-day data lag.
Monthly reservoir storage data (mcft) for all 6 reservoirs, spanning 2003-2021. Used as historical seed for the forecasting model.
56-year monthly rainfall history (1970-2025) from IMD's 0.25-degree gridded dataset, extracted for the Chennai grid cell. Includes annual totals and long-term monthly normals for drought/flood/Day Zero year identification.
Groundwater
Station-level groundwater time series for ~35 CGWB monitoring stations in Chennai district, pulled daily from the India WRIS Ground Water Level API. Mix of Manual (quarterly dug wells, unconfined aquifer) and Telemetric (daily DWLR bore wells, confined aquifer) stations with well type, well depth, and aquifer metadata. Each station is scored server-side with a stuck/stale/ok data quality flag.
Block-level groundwater exploitation data (2011-2024) from CGWB via India WRIS ArcGIS API. Shows classification (Safe to Over-Exploited), development percentage, net availability, and extraction draft for ~15 blocks in and around Chennai.
Ward-wise depth to water table (metres below ground level) for all 200 GCC wards across 15 zones. Sourced from CGWB/GCC monitoring wells. Data available from 2021 onwards.
Water bodies & historical
305 Chennai water bodies from the First Census of Water Bodies (2018-19) by the Ministry of Jal Shakti. Includes ownership, storage capacity (original vs present), encroachment status, depth, construction year, and basin information. Overlaid as markers on the Water Bodies map.
15 years of daily reservoir data (2004-2019) compiled by Sudalai Rajkumar. Used as additional historical training data for the forecasting model.
15 manually curated lost or encroached water bodies, compiled from published research, court records, and environmental organisation reports. See the Water Bodies Map section below for per-record provenance.
Rivers & pollution
9 restoration projects across Adyar, Cooum, Buckingham Canal, and Kosasthalaiyar rivers from the Chennai Rivers Restoration Trust. Includes project status, budget, area, implementing agencies, and outcome metrics.
Annual reports from the Central Pollution Control Board's National Water Monitoring Programme. Source for DO, BOD, pH, and conductivity readings at monitoring stations on the Cooum, Adyar, Buckingham Canal, and Kosasthalaiyar rivers. Supplemented by IIT Madras / Anna University peer-reviewed studies and NGT Chennai bench orders.
31 geo-located sewage inlets along the Cooum river, Otteri Nullah tributary, and Buckingham Canal. Discharge volumes (m3/day) from PWD Chennai, published in Nature Environment and Pollution Technology, Vol. 16, No. 3. Supplemented by Sheriff & Hussain (2012) groundwater contamination study.
7 major industrial facilities in the Ennore-Manali corridor, curated from NGT Southern Bench orders (2017-2022), TNPCB enforcement records, CPCB industrial monitoring reports, and academic studies. Each facility entry includes pollutant types, documented incidents with volumes and dates, and NGT order summaries.
Flood & drainage
CFLOWS 1.0 flood hazard zones (5 categories, operationalized Nov 2019 by IIT Bombay + IIT Madras + NCCR; not publicly updated since), 2015 Chennai flood hotspots with vulnerability ratings, 2015 inundation depth readings, 2020 Cyclone Nivar hotspots, and return period flood maps (5-200yr). Newer models (JICA Chennai Flood Control Master Plan 2024; TN RTFF & SDSS live Oct 2025 at chennaifloodmonitor.tn.gov.in) are not publicly redistributable as GIS.
10,308 official storm water drain segments from GCC survey (2023) across 197 wards, with street name, drain type, depth, width, length, material, and condition status.
CMWSSB sewerage infrastructure: 13 operational sewage treatment plants (STPs) with 745 MLD total installed capacity across 6 major campuses (Kodungaiyur, Koyambedu, Nesapakkam, Perungudi, Alandur, Sholinganallur). Map shows 8 treatment-site points; several campuses have multiple plant units commissioned in different years. Also 348 pumping stations (SPS) linked to STPs, and 3,834 pumping main segments with pipe material and size.
Satellite & Earth Engine
NDWI (Normalized Difference Water Index) water masks computed from Sentinel-2 green and near-infrared bands via Google Earth Engine. Used for both the water spread summary numbers and the satellite evidence overlay, replacing Dynamic World for more accurate detection of turbid and dark water.
JRC Global Surface Water monthly recurrence used as the seasonal baseline for the same calendar month. This is how we judge whether current spread is lower or higher than usual for this time of year.
CHIRPS daily rainfall over reviewed reservoir catchments. We use this for 7, 30, and 90 day rainfall totals and seasonal anomaly buckets on the dashboard.
True-color satellite imagery for reviewed evidence frames. Sentinel-2 captures 10 m resolution optical imagery every 5 days, used to produce visual evidence of water presence at flagship water bodies.
HydroBASINS and MERIT Hydro support the reviewed operational catchment polygons used for the four core Chennai supply reservoirs. These geometries are reviewed for storytelling use, not presented as formal legal boundaries.
Base geography
All current water bodies (lakes, tanks, reservoirs, ponds, marshes) within the Chennai metropolitan bounding box. Queried via the Overpass API and saved as a static GeoJSON. 1,635 polygon features, ~95,000 ha total surface. Also source for river polyline geometry (Cooum, Adyar, Buckingham Canal, Kosasthalaiyar) and industrial zone polygons in the north Chennai corridor. Data reflects OSM contributor edits as of the last script run.
AI narratives
AI-generated city and ward narratives connecting reservoir, groundwater, and risk data (Claude Sonnet for city, Haiku for wards)
Data quality & limitations
How we classify river health
CPCB publishes two parallel river water-quality classification systems, and they don't always agree. Knowing which one drives our river status badges matters for reading the dashboard honestly.
Designated Best-Use classes (A-E)
Computed from current dissolved-oxygen, BOD and coliform thresholds at each NWMP station. Updates every reading. Class A = drinking with disinfection only; Class B = outdoor bathing; Class C = drinking with conventional treatment; Class D = fisheries/wildlife; Class E = irrigation only. Below E = practically dead.
Polluted River Stretch (PRS) Priority I-V
A historical, multi-year stretch-level designation reflecting cumulative pollution. Slow to update; once a stretch is on the list it tends to stay there even if recent readings improve. Priority I = worst, Priority V = least bad of the polluted stretches.
Our status badges ("dead", "severely degraded", "degraded", "stressed", "healthy") are computed from current readings via the Designated Best-Use thresholds, not from the PRS Priority list. We take the worst classification across a river's monitored stations and surface that as the river-level status. Cooum, Adyar, and Buckingham Canal hold their labels under both signals (current data and PRS Priority I agree). Where the two disagree, as with the Vaigai (PRS Priority III vs Class C/D NWMP readings), the badge reflects what the data shows now.
Methodology lives in src/lib/utils/river-classification.ts; readings come from CPCB NWMP annual River Water Quality reports.
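The worst-station rule described above can be sketched in a few lines. This is an illustrative Python version, not the project's TypeScript implementation; the `CLASS_ORDER` list, the `below_E` label, and the function name are assumptions made for this example, and the mapping from class to badge text is deliberately left out.

```python
# Designated Best-Use classes ordered from best to worst.
# "below_E" is a hypothetical label for readings that fail even Class E.
CLASS_ORDER = ["A", "B", "C", "D", "E", "below_E"]


def worst_class(station_classes):
    """Return the worst class among a river's stations, or None if no data."""
    if not station_classes:
        return None
    # A higher index in CLASS_ORDER means worse water quality.
    return max(station_classes, key=CLASS_ORDER.index)


# Hypothetical river with three monitored stations.
river_status = worst_class(["C", "D", "B"])
# river_status == "D"
```

Taking the worst station is a conservative choice: one badly polluted stretch sets the river-level status even if upstream stations read better.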
Known Data Quality Issues
Government census data is invaluable but not perfect. We document known issues here for transparency. If you spot an error, please report it on GitHub.
Census: Mixed units in water_spread_area
The MoJS census methodology specifies hectares for water spread area, but 39 of 286 Chennai records appear to use square meters instead. Example: RETTAI ERI is recorded as 1,053,177 - this is sq m (~105 ha), confirmed against satellite imagery and Wikipedia (87–114 ha actual). Due to this inconsistency, census markers on the map use a uniform size as location indicators rather than representing actual water body area.
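The RETTAI ERI example is a unit conversion: 1 ha = 10,000 sq m, so 1,053,177 sq m is about 105.3 ha. Below is a minimal sketch of how such records could be screened. The plausibility cutoff and the function name are assumptions for illustration; the source does not say how the 39 records were identified.

```python
SQ_M_PER_HECTARE = 10_000

# Hypothetical cutoff: a value above this is treated as implausible in
# hectares for an urban tank and is assumed to be in square meters instead.
PLAUSIBLE_MAX_HA = 5_000


def interpret_area(raw_value, max_ha=PLAUSIBLE_MAX_HA):
    """Return (area_in_hectares, suspected_sq_m) for a census area value."""
    if raw_value > max_ha:
        return raw_value / SQ_M_PER_HECTARE, True
    return float(raw_value), False


area_ha, suspected = interpret_area(1_053_177)
# area_ha is about 105.3, suspected is True
```

A magnitude check like this cannot catch small water bodies recorded in square meters, which is one reason the map falls back to uniform marker sizes.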
Census: Encroachment vs. storage capacity mismatch
Storage capacity and encroachment were surveyed independently. Some water bodies show 70%+ encroachment but 100% storage capacity remaining - the capacity figure was not revised to reflect lost area. These cases are flagged with an amber warning in the detail panel.
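A check for this mismatch might look like the sketch below. The 70% and 100% thresholds come from the example in the paragraph above; the function name and the exact rule behind the amber warning are assumptions.

```python
def capacity_mismatch(encroachment_pct, capacity_remaining_pct,
                      encroachment_floor=70.0, capacity_floor=100.0):
    """Flag records where heavy encroachment coexists with full reported capacity.

    Both thresholds are illustrative defaults, not the project's actual rule.
    """
    return (encroachment_pct >= encroachment_floor
            and capacity_remaining_pct >= capacity_floor)


# Hypothetical records: (encroachment %, storage capacity remaining %)
flagged = capacity_mismatch(75.0, 100.0)      # heavily encroached, capacity unrevised
not_flagged = capacity_mismatch(75.0, 40.0)   # capacity was revised downward
```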
Census: Point coordinates only, no boundary polygons
The census provides only a single lat/lon point per water body, not boundary shapes. Where possible, census records are matched to nearby OpenStreetMap polygons (within 200m) so the actual water body shape is shown and census metadata (ownership, encroachment, capacity) appears in the detail panel. Unmatched census records are shown as small dots at the reported location.
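The 200 m matching step can be illustrated with a great-circle distance check. This is a simplified sketch: it measures distance from the census point to the nearest polygon vertex rather than to the polygon edge or interior, and all names and coordinates are hypothetical. A production version would more likely use a geometry library.

```python
import math

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_polygon(point, polygons, max_distance_m=200.0):
    """Return (name, distance_m) of the closest polygon within range, else None.

    `polygons` maps a name to a list of (lat, lon) vertices. Distance is
    measured to the nearest vertex, which is an approximation.
    """
    best = None
    for name, vertices in polygons.items():
        d = min(haversine_m(point[0], point[1], lat, lon) for lat, lon in vertices)
        if best is None or d < best[1]:
            best = (name, d)
    if best is not None and best[1] <= max_distance_m:
        return best
    return None


# Hypothetical census point and OSM polygons (vertices only).
census_point = (13.000, 80.200)
osm_polygons = {
    "near_tank": [(13.001, 80.200), (13.002, 80.201)],  # about 111 m away
    "far_lake": [(13.010, 80.200), (13.011, 80.201)],   # over 1 km away
}
match = nearest_polygon(census_point, osm_polygons)
```

Here `match` names `near_tank`; a census point with no polygon within 200 m returns `None` and would be drawn as a small dot.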
Satellite: seasonal baseline is month-level, not day-level
The current satellite context compares a recent 45-day observation window to JRC monthly recurrence for the same calendar month. This is a strong seasonal reference, but it does not mean we know the exact expected spread for every day of the month.
Catchments: reviewed operational geometry, not legal survey boundary
Poondi, Red Hills, Chembarambakkam, and Cholavaram catchments are built from a mix of HydroBASINS, MERIT Hydro, and local drainage review. They are appropriate for rainfall context and inflow-support storytelling, but should not be treated as official cadastral boundaries.
Known Limitations
- Estimates are approximations. Actual water availability depends on factors not modeled (groundwater extraction, Krishna water transfer, distribution losses, industrial use).
- CMWSSB data may occasionally be stale (weekends, holidays). The dashboard shows a freshness indicator.
- Groundwater data from OpenCity may lag by months. The map always shows the most recent available period.
- Forecasts use ARIMAX (AutoARIMA with inflow/outflow as exogenous regressors) and work best with 2+ years of daily data.
- Risk scores are relative indicators for comparison between wards, not absolute measures of water safety.
- Satellite spread is a summary of surface water extent, not a direct measure of storage volume, water quality, or inflow source. A lake can look broad and still hold less usable water than expected.
- Reservoir catchment polygons are reviewed operational geometries for rainfall context, not official legal boundaries. This matters especially in Chennai's managed canal and transfer system.
- Current satellite context relies on optical Sentinel-2 observations. During persistently cloudy periods, some water bodies may temporarily lose this insight until a radar fallback is added.
About the project
Disclaimer
Not an official government tool. Neer Vazhvu is an independent, open-source project. It is not affiliated with, endorsed by, or connected to CMWSSB, GCC, CGWB, or any government body.
Informational purposes only. All data, estimates, and forecasts are provided “as is” for general awareness. Always refer to official CMWSSB advisories for critical decisions.
No personal data collected. Neer Vazhvu does not collect, store, or process any personal information. There are no user accounts, cookies, or analytics trackers.
Open Source
Neer Vazhvu is fully open source. The code, data pipeline, and methodology are transparent and available on GitHub. Contributions, bug reports, and data corrections are welcome.
View on GitHub
Support this project
Neer Vazhvu is free and open source. If you find it useful, consider supporting us on Patreon to help cover satellite data, hosting, and API costs.
Support on Patreon