/ 01 — Research · 2026-06-09

Where the real data comes from.

Every source the pipeline needs — rentals, geocoding, schools, air quality, metro, commute — researched live with cost, legal, and reliability notes. Confidence is labelled honestly; nothing here is dressed up as certain when it isn't.

VERIFIED: confirmed live this session
PARTIAL: exists, terms not fully read
KNOWLEDGE: not re-verified, spot-check

/ 02 — Supply

Rental listings

The broken Apify path. Current: one hardcoded housing.com URL, node disabled. Apify Store actors queried live — all pay-per-event, run counts shown as a reliability proxy.

Apify · easyapi/housing-com-scraper

verified
pay-per-event · 640 runs

Matches the current target portal. Drop-in replacement for the hardcoded actor.

Pick 1 — fastest fix, parameterize the locality.

Apify · stealth_mode/99acres

verified
pay-per-event · 2,059 runs

Highest usage of any India rental actor — most battle-tested.

Pick 2 — fallback if housing.com flakes.

Apify · thirdwatch/nobroker

verified
pay-per-event · 213 runs

Owner-direct listings — mirrors bengaluru.rent's zero-brokerage data.

Strategic — owner-direct fits the moat.

Portal official APIs

knowledge
none / partner-only

Housing / 99acres / MagicBricks / NoBroker have no public dev API. Affiliate deals need a business agreement.

Dead end for a solo dev.

Legal · be honest

Scraping these portals violates their ToS (med–high risk). Apify shifts operational risk to the actor; commercial-use legal exposure remains. Fine for an MVP / personal tool — get counsel before a public launch, or pivot to crowdsourced owner-direct data once trust is restored.

/ 03 — Geocoding

Places · geocoding · POI

Current: Mapbox geocoding v5 (legacy) + Foursquare for metro/schools. Foursquare pricing read live; Ola Maps pricing read live.

Ola Maps (Krutrim)

verified
500,000 free calls/month · all APIs

India-native. Geocoding, Places Nearby, Distance Matrix, Directions, tiles. "Data stays in India", SOC2/ISO27001. Caching allowed.

Best value — could replace Mapbox + Foursquare in one vendor.

OSM Overpass + Nominatim

partial
free · self-hostable

Free POI + geocoding. ODbL (attribution). Cacheable. Bangalore metro is in OSM. Community-variable coverage.

Best free — POI + caching-friendly fallback.

Foursquare Places

verified
10,000 free Pro calls · then PAYG

Pivoted to GIS "Spatial" products. New service-key model (app already on it). Text-search quality for "ICSE school" is poor.

Keep small — weak for board-level schools.

Google Places (New)

knowledge
monthly credit · then PAYG

Best global data. But ToS prohibits caching most results — fights the APICache design. Cost adds up.

Avoid as primary — no-cache rule conflicts.

/ 04 — Board data

Schools — with board (ICSE / CBSE)

The honest-differentiator data. Requirement: schools by BOARD with lat/lon. POI text-search cannot do this — "school" is the only tag, no board attribute. This is the core problem behind the fabricated claims.

UDISE+ (govt)

knowledge
free · downloadable

National school dataset: name, UDISE code, district, lat/lon (variable). Board affiliation weak/absent.

Base layer — all schools + coords, not board alone.

CISCE directory

knowledge
free · scrape (ToS risk)

Authoritative ICSE/ISC list with board. No lat/lon, no bulk download. ~200–300 Bangalore schools.

Only ICSE source — geocode after extract.

CBSE directory

knowledge
free · scrape (ToS risk)

Authoritative CBSE affiliated-school list. Name, affiliation no., address, board. No bulk export.

Only CBSE source — geocode after extract.

POI APIs for board

verified
n/a

Google / OSM / Foursquare tag schools as "school" only. Board filtering is NOT achievable from any POI API.

Can't — the current fabrication's root cause.

Key finding

No single source has board + reliable lat/lon. The honest path is a one-time hybrid ingest into Postgres: UDISE+ base + scraped CISCE/CBSE board lists, merged and geocoded via Ola/Nominatim. Until that exists, the report must say "N schools within X km (board not verified)" — never claim ICSE counts it can't prove.

/ 05 — Environment

Air quality · metro · commute

AQI already works (Open-Meteo). CPCB official confirmed live on data.gov.in. Metro + commute options below.

Open-Meteo Air Quality

knowledge
free · no key · in use

us_aqi, pm2_5, pm10 per lat/lon. Modelled (CAMS/ECMWF), coarse but consistent for relative ranking.

Keep — free per-locality ranking source.

CPCB via data.gov.in

verified
free key · catalog live

Official Indian AQI, all-India stations, ~10–15 in Bangalore. Catalog updated 2026-06-09, has an API.

Cross-check — official number when a station is near.

Namma Metro · OSM Overpass

partial
free · cacheable

Station coords + line geometry in OSM. Confirm a GTFS feed (mobilitydatabase.org) if Purple/Green/Yellow/Pink route structure is needed.

Pragmatic — free station source.

Commute · Ola Matrix / OSRM

partial
free tier / self-host

Revive the dead get_travel_times via Ola Distance Matrix (free tier) or self-hosted OSRM (free, unlimited).

Horizon 2 — powers what-if commute.

/ 06 — Build steps

Implementation playbook

Buildable next time — exact endpoints, working query shapes, sample functions. Decision: no Google Maps scraping (can't return school board, ToS bans caching, brittle anti-bot). The robust free path is OSM Overpass + UDISE/CISCE/CBSE + Nominatim/Ola. Everything caches.

6.1 OSM Overpass — metro / schools / hospitals verified

Free, no key, ODbL (caching allowed). Endpoint + Bangalore bbox:

POST https://overpass-api.de/api/interpreter (body: data=<Overpass QL>)
bbox (lat_min,lon_min,lat_max,lon_max) = 12.83,77.45,13.14,77.78
Gotcha: use a BBOX, never area["name"="Bengaluru"]. The area-name lookup returned 0 elements (the boundary match is unreliable), while the BBOX query returned 1,803 hospital + supermarket elements live. The public instance rate-limits back-to-back heavy queries (HTTP 429): space them ~5–10 s apart or self-host.
# metro stations
[out:json][timeout:30];
node["station"="subway"](12.83,77.45,13.14,77.78);
out tags center;

# schools (POI only, NO board; nwr catches ways/relations)
[out:json][timeout:40];
nwr["amenity"="school"](12.83,77.45,13.14,77.78);
out center tags;

# hospitals + supermarkets, one call (bbox repeated in each filter)
[out:json][timeout:40];
(
  nwr["amenity"="hospital"](12.83,77.45,13.14,77.78);
  nwr["shop"="supermarket"](12.83,77.45,13.14,77.78);
);
out center tags;
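# backend/sources/osm.py (routes through APICache)
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"
BLR_BBOX = "12.83,77.45,13.14,77.78"  # lat_min,lon_min,lat_max,lon_max

def overpass_pois(amenity_filter, bbox=BLR_BBOX, kind="nwr"):
    # get_cached_response / save_cached_response are the existing APICache helpers
    cache_key = f"overpass:{kind}:{amenity_filter}:{bbox}"
    data = get_cached_response("osm_overpass", cache_key)
    if data is None:
        q = f'[out:json][timeout:60];{kind}{amenity_filter}({bbox});out center tags;'
        r = requests.post(OVERPASS_URL, data={"data": q}, timeout=90,
                          headers={"User-Agent": "ooru/1.0 (contact@maahaa.dev)"})
        r.raise_for_status()  # do not cache a 429 or other error body
        data = r.json()
        save_cached_response("osm_overpass", cache_key, data)
    # map elements -> {name, lat, lon, tags}; nodes carry lat/lon, ways/relations carry center
    pois = []
    for el in data.get("elements", []):
        pos = el.get("center", el)
        tags = el.get("tags", {})
        pois.append({"name": tags.get("name"), "lat": pos.get("lat"),
                     "lon": pos.get("lon"), "tags": tags})
    return pois

# nearest-N: haversine over the cached list in Python, zero extra API calls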
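The nearest-N step mentioned in the code comment above can be sketched as follows. This is a minimal illustration, assuming the cached list holds dicts with `lat` and `lon` keys; the function names `haversine_km` and `nearest_n` are hypothetical, not existing project code.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0088  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_n(pois, lat, lon, n=3, max_km=None):
    """Return up to n POIs sorted by distance, each with a 'km' field added.

    pois is assumed to be the cached list of {name, lat, lon, tags} dicts.
    """
    ranked = []
    for p in pois:
        km = haversine_km(lat, lon, p["lat"], p["lon"])
        if max_km is None or km <= max_km:
            ranked.append({**p, "km": round(km, 2)})
    ranked.sort(key=lambda p: p["km"])
    return ranked[:n]
```

Because the ranking runs over the cached list, changing the landmark or the radius costs no further Overpass calls.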

6.2 Schools WITH board (ICSE/CBSE) — one-time ingest partial

No POI API gives board. Build a schools(name, board, lat, lon, source) table ONCE, then query locally. Reachability: udiseplus.gov.in 200, cisce.org 403 (bot-blocked, use browser), saras.cbse.gov.in 503 (intermittent).
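A rough sketch of the "build once, query locally" idea, using an in-memory SQLite database as a stand-in for the Postgres table. The rows are invented placeholders (not real schools), and `build_demo_db` and `schools_within` are hypothetical helper names.

```python
import math
import sqlite3


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0088 * math.asin(math.sqrt(a))


def build_demo_db():
    """Create the schools table and load placeholder rows (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE schools (name TEXT, board TEXT, lat REAL, lon REAL, source TEXT)"
    )
    conn.executemany("INSERT INTO schools VALUES (?, ?, ?, ?, ?)", [
        ("Placeholder School A", "ICSE", 12.9360, 77.6250, "cisce"),
        ("Placeholder School B", "CBSE", 12.9400, 77.6300, "cbse"),
        ("Placeholder School C", "ICSE", 13.0500, 77.7000, "cisce"),
        ("Placeholder School D", None, 12.9300, 77.6200, "udise"),
    ])
    return conn


def schools_within(conn, lat, lon, km, board=None):
    """Bounding-box prefilter in SQL, exact haversine check in Python."""
    dlat = km / 111.0  # slightly generous degrees-per-km, so the box never undershoots
    dlon = km / (111.0 * math.cos(math.radians(lat)))
    sql = ("SELECT name, board, lat, lon, source FROM schools "
           "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?")
    args = [lat - dlat, lat + dlat, lon - dlon, lon + dlon]
    if board is not None:
        sql += " AND board = ?"
        args.append(board)
    rows = conn.execute(sql, args).fetchall()
    return [r for r in rows if haversine_km(lat, lon, r[2], r[3]) <= km]
```

With a table like this, a board claim is only ever a count of rows whose `board` column was filled from an authoritative list; rows with a NULL board still count as schools but never as ICSE or CBSE.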

Steps: (1) UDISE+ Karnataka CSV → base (coords + existence). (2) CISCE + CBSE Bangalore lists via browser → board + address. (3) geocode missing coords (Nominatim/Ola). (4) fuzzy-match board onto UDISE base (rapidfuzz ≥85). (5) "3 ICSE within 2km" becomes a local SQL + haversine — fully honest.
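Step (4) could look roughly like the sketch below. It swaps in the standard library's `difflib` for rapidfuzz so it runs without extra installs; the two libraries score differently, so the 85 cutoff would need re-checking against real data. With rapidfuzz, `process.extractOne(..., score_cutoff=85)` would play the same role. The row shape (dicts with a `name` key) and the function names are assumptions.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    name = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(name.split())


def similarity(a, b):
    """0-100 similarity on normalized names (difflib ratio, not rapidfuzz)."""
    return 100.0 * SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def attach_board(base_rows, board_rows, cutoff=85.0):
    """Copy 'board' onto each base row whose best name match clears the cutoff.

    base_rows:  UDISE-style dicts with at least 'name'.
    board_rows: CISCE/CBSE-style dicts with 'name' and 'board'.
    Rows without a confident match get board=None rather than a guess.
    """
    out = []
    for base in base_rows:
        best, best_score = None, 0.0
        for cand in board_rows:
            score = similarity(base["name"], cand["name"])
            if score > best_score:
                best, best_score = cand, score
        board = best["board"] if best is not None and best_score >= cutoff else None
        out.append({**base, "board": board, "match_score": round(best_score, 1)})
    return out
```

Name-only matching will confuse branches of the same school chain, so in practice the match would probably also need an address or distance check before a board is attached.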

Interim — until this table exists, report "N schools within X km (board not verified)" from OSM. Never claim ICSE/CBSE counts the data can't prove.

6.3 Nominatim — geocode landmark verified

Free; 1 req/sec, descriptive User-Agent required, cacheable. Live: "Koramangala, Bengaluru" → 12.9357, 77.6241.

GET https://nominatim.openstreetmap.org/search?q=<landmark>&format=json&limit=1&countrycodes=in
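A small sketch of how the request above might be built and paced to respect the 1 req/sec limit. `Throttle`, `nominatim_url` and `parse_first_hit` are hypothetical names; the only assumption about the response is that Nominatim's JSON format returns a list whose entries carry `lat` and `lon` as strings.

```python
import time
from urllib.parse import urlencode

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"


def nominatim_url(landmark):
    """Build the search URL with the same parameters as the request above."""
    params = {"q": landmark, "format": "json", "limit": 1, "countrycodes": "in"}
    return f"{NOMINATIM_URL}?{urlencode(params)}"


class Throttle:
    """Blocks so that successive wait() calls are at least min_interval apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def parse_first_hit(results):
    """Return (lat, lon) floats from the first result, or None if empty."""
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])
```

Call `throttle.wait()` before each uncached request and send a descriptive User-Agent header; cached landmarks skip both the wait and the call.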

6.4 Ola Maps — geocode / places / matrix tier verified

India-native. 500k free calls/month all APIs, caching allowed, data stays in India. maps.olakrutrim.com 200. Verify exact paths in their API ref before wiring:

GET https://api.olamaps.io/places/v1/geocode?address=<q>&api_key=<KEY>
GET https://api.olamaps.io/routing/v1/distanceMatrix?...&api_key=<KEY> (revives dead get_travel_times)

6.5 AQI — Open-Meteo + CPCB reachable

Keep Open-Meteo for ranking ALL localities. Add CPCB official number when a station is near.

GET https://air-quality-api.open-meteo.com/v1/air-quality?latitude=&longitude=&current=us_aqi,pm2_5,pm10
GET https://api.data.gov.in/resource/<resource_id>?api-key=<KEY>&format=json (CPCB; 429=live)
Gotcha — data.gov.in ROOT returns 404. You must hit /resource/<id>, not the root host.
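As an illustration of the ranking use, the sketch below builds the Open-Meteo request and orders localities by `us_aqi`. It assumes the response carries a `current` object keyed by the requested variable names; check that shape against a live response before relying on it. The function names and locality labels are hypothetical.

```python
from urllib.parse import urlencode

OPEN_METEO_AQ = "https://air-quality-api.open-meteo.com/v1/air-quality"


def open_meteo_url(lat, lon):
    """Build the air-quality request for one lat/lon, keeping commas readable."""
    params = {"latitude": lat, "longitude": lon, "current": "us_aqi,pm2_5,pm10"}
    return f"{OPEN_METEO_AQ}?{urlencode(params, safe=',')}"


def rank_by_aqi(readings):
    """Order localities from cleanest to dirtiest air.

    readings: {locality_name: parsed response dict}. Localities with no
    us_aqi value sort last instead of being treated as clean.
    """
    def sort_key(item):
        aqi = item[1].get("current", {}).get("us_aqi")
        return (aqi is None, aqi if aqi is not None else 0)

    return [name for name, _ in sorted(readings.items(), key=sort_key)]
```

Since the Open-Meteo values are modelled, the ordering is more trustworthy than the absolute numbers, which is why the CPCB figure is only added where a station is close by.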

6.6 Rentals — Apify actor, NOT DIY-Google verified

Google has no rental data. A maintained Apify actor is cheaper-to-own than fighting housing.com anti-bot yourself. Parameterize one, cache by locality. Test one run first — output field names differ per actor.

POST https://api.apify.com/v2/acts/<user~actor>/runs?token=<APIFY_TOKEN>
GET https://api.apify.com/v2/datasets/<id>/items
(easyapi/housing-com-scraper · pick 1)
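A hedged sketch of parameterizing one actor and caching by locality, following the two calls above. The actor input is deliberately left to the caller because each actor defines its own input schema; the helper names and the cache-key format are assumptions. `start_run` relies on the run response exposing `data.defaultDatasetId`, which is worth confirming on the first test run.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

APIFY_BASE = "https://api.apify.com/v2"


def actor_path_id(actor):
    """'user/actor' -> 'user~actor', the form used in API paths."""
    return actor.replace("/", "~")


def run_url(actor, token):
    """URL for starting a run of the given actor."""
    return f"{APIFY_BASE}/acts/{actor_path_id(actor)}/runs?{urlencode({'token': token})}"


def dataset_items_url(dataset_id):
    """URL for reading the items a finished run wrote to its dataset."""
    return f"{APIFY_BASE}/datasets/{quote(dataset_id, safe='')}/items"


def rentals_cache_key(actor, locality):
    """Hypothetical cache key: one entry per actor + normalized locality."""
    return f"apify:{actor}:{locality.strip().lower()}"


def start_run(actor, token, actor_input):
    """POST the actor input as JSON and return the run's 'data' object.

    actor_input must follow the chosen actor's own input schema.
    """
    req = urllib.request.Request(
        run_url(actor, token),
        data=json.dumps(actor_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["data"]
```

Checking the cache key before calling `start_run` keeps pay-per-event costs tied to new localities rather than repeat lookups.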

/ 07 — Verdict

Recommended source stack

What to swap, and why. Everything routes through the existing APICache — OSM, Ola and Open-Meteo all permit caching; Google and Mapbox restrict it.

Pipeline need | Current | Recommended | Why
Geocode landmark | Mapbox v5 | Ola Maps Geocoding | 500k/mo free, India data, cacheable
Nearby metro / POI | Foursquare (discarded) | OSM Overpass | free, cacheable, real coords
Schools + board | Foursquare text (fabricated) | UDISE+ + CISCE/CBSE hybrid | only honest path to board claims
AQI ranking | Open-Meteo (keep) | + CPCB cross-check | free ranking + official accuracy
Rentals | broken Apify (1 URL) | Apify housing/99acres actor | cheap pay-per-event, cacheable
Commute (future) | dead get_travel_times | Ola Matrix / OSRM | free tier / self-host

One India-native vendor — Ola Maps — could consolidate geocoding, POI and matrix under a 500k/month free tier. Simpler and cheaper than the Mapbox + Foursquare split it replaces.

/ 08 — Read next