dev.gxuri.in

Tue Feb 10 2026

Breathe Better - AI-Powered Air Quality Platform


Breathe Better is an environmental intelligence platform that makes real-time air quality data understandable for everyone through voice, maps, risk scoring, and AI-generated future visualizations.

What is Breathe Better?

An AI-powered air quality platform that turns invisible data into something anyone can feel.


Most people have no idea what's in the air they're breathing right now. They know it's bad. They've heard the warnings, seen the haze. But "PM2.5: 87 µg/m³" means nothing to them. It doesn't help you decide whether to take your kids to the park. It doesn't tell you what's happening to your lungs. It just sits there, a number without a body.

I built Breathe Better to fix that. It's a full-stack environmental intelligence app that takes real-time air quality data and makes it immediately understandable: not just visually, but conversationally, predictively, and viscerally.


The Problem

India has some of the worst air quality in the world, and yet the existing tools to understand it (government dashboards, weather apps) are built for meteorologists, not people. They show raw AQI indices, acronyms, and maps full of colored dots with no explanation of what any of it means for your daily life.

The questions people actually have (Is it safe to go for a run? Should my child be outside? What's causing this? Is it getting worse?) go entirely unanswered.


What It Does

Breathe Better pulls from multiple live data sources (AQICN, OpenWeatherMap) and combines them into a coherent picture of your local environment. But the interesting part is what it does with that data.

Talk to it. The /aqi page is built around a voice interface. You tap a button, ask your question out loud, and get a spoken answer back. The app understands and responds in over 22 regional Indian languages, with speech output powered by Sarvam AI, a TTS provider built specifically for natural-sounding Indian language output. Under the hood, this is a multi-model pipeline: your voice is transcribed by Whisper (via Groq) with very low latency, the question and environmental data are sent to Gemini 2.5 Flash to generate a plain-language answer, and that answer is spoken back to you in the language you used. The whole flow takes a few seconds. The response arrives the same way the question did: as a voice.
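As a rough illustration (not the production code), the three stages can be written as one function with each provider injected as a dependency. The interface and method names here (VoiceProviders, transcribe, answer, speak) are hypothetical and do not mirror the real Groq, Gemini, or Sarvam AI client signatures; the sketch only shows the ordering of the steps and how the detected language is carried through to the spoken reply.

```typescript
// Hypothetical provider interfaces: the real Groq, Gemini, and Sarvam AI
// clients have their own signatures, which are not reproduced here.
interface Transcript {
  text: string;
  language: string; // e.g. "mr" for Marathi
}

interface VoiceProviders {
  transcribe(audio: Uint8Array): Promise<Transcript>;
  answer(question: string, context: string, language: string): Promise<string>;
  speak(text: string, language: string): Promise<Uint8Array>;
}

interface VoiceReply {
  question: string;
  answerText: string;
  language: string;
  audio: Uint8Array;
}

async function answerByVoice(
  audio: Uint8Array,
  environmentSummary: string,
  providers: VoiceProviders,
): Promise<VoiceReply> {
  // 1. Speech to text; the detected language is reused by every later step.
  const { text, language } = await providers.transcribe(audio);
  // 2. The question plus the current environmental data go to the language model.
  const answerText = await providers.answer(text, environmentSummary, language);
  // 3. The answer is synthesized in the language the user spoke.
  const speech = await providers.speak(answerText, language);
  return { question: text, answerText, language, audio: speech };
}
```

Injecting the providers keeps the orchestration testable with stubs and makes it possible to swap any one model without touching the other two stages.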

See the risk. The dashboard computes a composite outdoor activity score in real time, factoring in not just AQI but temperature, humidity, wind speed, and weather conditions together. It also pulls 24 hours of historical air data to show whether conditions are improving or worsening, and how many hours you've already been exposed to PM2.5 above WHO limits. Risk isn't presented as a number. It's presented as a verdict: Good day to run. Stay inside.
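To make the idea concrete, here is a minimal sketch of a composite score and an exposure counter. Every weight, cut-off, and the middle verdict string are illustrative assumptions, not the values the app uses; the 15 µg/m³ threshold is the WHO 2021 guideline for 24-hour mean PM2.5.

```typescript
interface Conditions {
  aqi: number; // AQI index, assumed to be on a 0 to 500 scale
  temperatureC: number;
  humidityPct: number;
  windSpeedMs: number;
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Returns 0 (ideal) to 100 (worst). Weights and cut-offs are assumptions.
function outdoorRiskScore(c: Conditions): number {
  const airPenalty = clamp01(c.aqi / 300);
  const heatPenalty = clamp01(Math.abs(c.temperatureC - 22) / 20);
  const humidityPenalty = clamp01((c.humidityPct - 60) / 40);
  // Light wind lets pollutants accumulate, so calm air is penalized.
  const stagnationPenalty = clamp01((3 - c.windSpeedMs) / 3);
  const score =
    0.6 * airPenalty +
    0.2 * heatPenalty +
    0.1 * humidityPenalty +
    0.1 * stagnationPenalty;
  return Math.round(score * 100);
}

// Turns the number into a verdict. The thresholds are assumptions.
function verdict(score: number): string {
  if (score < 30) return "Good day to run.";
  if (score < 60) return "Keep it short and light.";
  return "Stay inside.";
}

const WHO_PM25_24H_LIMIT = 15; // µg/m³, WHO 2021 24-hour guideline

// Counts how many hourly PM2.5 readings exceeded the limit.
function hoursAboveLimit(
  hourlyPm25: number[],
  limit: number = WHO_PM25_24H_LIMIT,
): number {
  return hourlyPm25.filter((v) => v > limit).length;
}
```

Keeping the score and the verdict as separate functions means the thresholds for the wording can be tuned without touching the underlying arithmetic.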

See the future. This was the technically ambitious part. Upload a photo of any outdoor space (a street, a park, your neighborhood) and the app shows you what it would look like if current air quality trends continue for another decade. The pipeline sends the image through Gemini 2.5 Flash to generate a precise scene description, builds a Stable Diffusion prompt calibrated to the current AQI severity (from "light haze" to "apocalyptic toxic atmosphere"), then passes everything to a Cloudflare Workers AI image generation endpoint. What comes back is the same place, transformed: vegetation dying, skies turning brown, surfaces darkening. It's not a chart. It's a picture of a future you can actually understand.
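The severity calibration can be pictured as a lookup from AQI bands to visual-effect phrases. In this sketch only "light haze" and "apocalyptic toxic atmosphere" come from the app; the band edges and every other phrase are assumptions made for illustration.

```typescript
interface SeverityBand {
  maxAqi: number; // inclusive upper bound of the band
  effects: string[];
}

// Band edges and intermediate vocabulary are illustrative assumptions.
const SEVERITY_BANDS: SeverityBand[] = [
  { maxAqi: 100, effects: ["light haze", "slightly muted colors"] },
  { maxAqi: 200, effects: ["dense smog", "yellow-grey sky", "wilting vegetation"] },
  { maxAqi: 300, effects: ["thick brown smog", "dying vegetation", "soot-darkened surfaces"] },
  { maxAqi: Infinity, effects: ["apocalyptic toxic atmosphere", "dead vegetation", "brown sky"] },
];

// Picks the first band whose upper bound covers the given AQI.
function effectsForAqi(aqi: number): string[] {
  const band = SEVERITY_BANDS.find((b) => aqi <= b.maxAqi);
  // The last band is unbounded, so a match always exists for numeric input.
  return band ? band.effects : [];
}
```

The returned phrases are then appended to the scene description when the image prompt is assembled.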

See where it's worst. The map view samples air quality across a multi-ring grid of points around your location (concentric circles plus a diagonal lattice) and renders a live heat map using Leaflet. You can see in one glance whether the air two kilometers away is any better than where you're standing.
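One way to generate such a grid is sketched below. The ring radii, points per ring, and the reading of "diagonal lattice" as points stepped along the four diagonals are all assumptions, and the kilometre-to-degree conversion is a flat-earth approximation that is adequate only for short distances.

```typescript
interface LatLon {
  lat: number;
  lon: number;
}

const KM_PER_DEG_LAT = 111.32; // approximate length of one degree of latitude

// Moves a point by the given distances east and north, in kilometres.
function offset(center: LatLon, eastKm: number, northKm: number): LatLon {
  const latRad = (center.lat * Math.PI) / 180;
  return {
    lat: center.lat + northKm / KM_PER_DEG_LAT,
    lon: center.lon + eastKm / (KM_PER_DEG_LAT * Math.cos(latRad)),
  };
}

const DIAGONALS: ReadonlyArray<readonly [number, number]> = [
  [1, 1],
  [1, -1],
  [-1, 1],
  [-1, -1],
];

function samplePoints(
  center: LatLon,
  ringRadiiKm: number[],
  pointsPerRing: number,
  latticeStepKm: number,
  latticeSteps: number,
): LatLon[] {
  const points: LatLon[] = [center];
  // Concentric rings: evenly spaced bearings at each radius.
  for (const radius of ringRadiiKm) {
    for (let i = 0; i < pointsPerRing; i++) {
      const theta = (2 * Math.PI * i) / pointsPerRing;
      points.push(offset(center, radius * Math.cos(theta), radius * Math.sin(theta)));
    }
  }
  // Diagonal lattice: points stepped outward along NE, SE, NW, and SW.
  for (let k = 1; k <= latticeSteps; k++) {
    const d = k * latticeStepKm;
    for (const [sx, sy] of DIAGONALS) {
      points.push(offset(center, sx * d, sy * d));
    }
  }
  return points;
}
```

Each returned point would be sent to the air quality API, and the resulting values handed to the heat map layer as weighted coordinates.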


Accessibility First

Breathe Better was designed with accessibility as a core constraint, not a checkbox. The most important decision was making voice the primary interface. Reading and interpreting environmental data is a barrier for people with low literacy or visual impairments. By letting users ask questions in their own language and hear the answer spoken back, the app becomes usable for people who would be locked out of a traditional dashboard entirely.

The voice interface supports 22+ regional Indian languages, which matters in a country where hundreds of millions of people are not comfortable in English or Hindi. Sarvam AI was chosen specifically because it produces natural, accent-accurate speech in languages like Marathi, Tamil, Bengali, and Kannada, not robotic transliterations.

On the animation side, a useReducedMotion hook propagates through the entire GSAP and Lottie animation system. Users who have enabled reduced motion in their OS preferences get a calm, static experience automatically, with no in-app setting required. Color choices across the interface are contrast-aware, and interactive controls are sized and spaced for touch-first use on mobile.
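The core of a reduced-motion check is a single media query. The sketch below is framework-free so it can run outside a browser; the helper names (prefersReducedMotion, motionConfig) are hypothetical and this is not the app's actual hook. The media query string itself is the standard prefers-reduced-motion feature.

```typescript
// Minimal shape of window.matchMedia, so the helper can be exercised
// without a browser by passing in a stub.
type MediaMatcher = (query: string) => { matches: boolean };

const REDUCED_MOTION_QUERY = "(prefers-reduced-motion: reduce)";

function prefersReducedMotion(matchMedia?: MediaMatcher): boolean {
  // Assumption: with no matchMedia available (for example during server
  // rendering) fall back to full motion and re-check on the client.
  return matchMedia ? matchMedia(REDUCED_MOTION_QUERY).matches : false;
}

interface MotionConfig {
  animate: boolean;
  durationSec: number;
}

// Collapses any animation to a static state when reduced motion is on.
function motionConfig(reduced: boolean, durationSec: number): MotionConfig {
  return reduced
    ? { animate: false, durationSec: 0 }
    : { animate: true, durationSec };
}
```

In a React hook, the same check would typically be called with (q) => window.matchMedia(q) inside an effect, and the resulting config passed down to whatever creates the GSAP timelines and Lottie players.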


How It's Built

The stack is Next.js 16 with React 19 and TypeScript throughout. The backend lives entirely in Next.js API routes, six of them, each handling a distinct concern: AQI fetching, grid sampling for the heatmap, the voice Q&A pipeline, the image transformation pipeline, the risk scoring engine, and text-to-speech.

The frontend uses GSAP and ScrollTrigger for the landing page, which features a Lottie animation synchronized to scroll position and has separate implementations for desktop (pinned side-by-side layout) and mobile (pinned top with horizontal scroll). The voice UI is a custom component with an animated blob that responds to recording and playback state, a real-time waveform visualization, and a processing indicator that transitions between "Listening," "Thinking," and "Speaking" states.
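The indicator states lend themselves to a small state machine. This is a sketch under assumptions: the idle state, the event names, and the idle label are hypothetical additions; only the three labels "Listening," "Thinking," and "Speaking" come from the component described above.

```typescript
type VoiceState = "idle" | "listening" | "thinking" | "speaking";
type VoiceEvent =
  | "startRecording"
  | "stopRecording"
  | "answerReady"
  | "playbackEnded"
  | "error";

// Allowed transitions; any event not listed for a state is ignored.
const TRANSITIONS: Record<VoiceState, Partial<Record<VoiceEvent, VoiceState>>> = {
  idle: { startRecording: "listening" },
  listening: { stopRecording: "thinking", error: "idle" },
  thinking: { answerReady: "speaking", error: "idle" },
  speaking: { playbackEnded: "idle", error: "idle" },
};

function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  return TRANSITIONS[state][event] ?? state;
}

// The idle label is an assumption; the other three match the indicator text.
const STATE_LABELS: Record<VoiceState, string> = {
  idle: "Tap to ask",
  listening: "Listening",
  thinking: "Thinking",
  speaking: "Speaking",
};
```

Driving the blob, waveform, and label from one state value keeps them from drifting out of sync, for example showing "Speaking" while the microphone is still open.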

The backend integrates six external providers: OpenWeatherMap and AQICN for environmental data, Groq (Whisper) for transcription, Gemini 2.5 Flash for language understanding and image analysis, Sarvam AI for multilingual TTS, and Cloudflare Workers AI for image generation. Coordinating all of them reliably, with graceful degradation when any one fails, was most of the backend work.
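Graceful degradation of this kind is often handled by wrapping each provider call in a timeout with a fallback value. The helper below is a generic sketch of that pattern, not the app's code; the names withFallback and timeoutAfter and the idea of returning a degraded flag are assumptions.

```typescript
// Rejects after `ms` milliseconds unless cancelled first.
function timeoutAfter(ms: number): { promise: Promise<never>; cancel: () => void } {
  let cancel: () => void = () => {};
  const promise = new Promise<never>((_, reject) => {
    const id = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    cancel = () => clearTimeout(id);
  });
  return { promise, cancel };
}

interface Outcome<T> {
  value: T;
  degraded: boolean; // true when the fallback was used
}

// Runs `primary`; on error or timeout, resolves with `fallback` instead.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: T,
  timeoutMs: number,
): Promise<Outcome<T>> {
  const timeout = timeoutAfter(timeoutMs);
  try {
    const value = await Promise.race([primary(), timeout.promise]);
    return { value, degraded: false };
  } catch {
    return { value: fallback, degraded: true };
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    timeout.cancel();
  }
}
```

A route could use this to, say, return the text answer without audio when speech synthesis is unavailable, and use the degraded flag to tell the client which parts of the response are missing.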


What I Learned

The hardest part was the image generation pipeline. Getting Stable Diffusion to produce a photorealistic degradation of a specific real-world scene (rather than a generic "polluted city") required understanding how SD models parse prompts. Multi-line structured text doesn't work; dense, comma-separated tags do. Getting Gemini's scene description into the right format, calibrating the visual effects vocabulary to AQI severity bands, and tuning the strength and guidance parameters to preserve scene identity while transforming atmosphere required a lot of iteration.
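The reformatting step can be sketched as a function that flattens a multi-line description into comma-separated tags and appends the severity effects. The function names, the "photorealistic" tag, and the strength and guidance numbers are illustrative assumptions, not tuned values from the project.

```typescript
// Flattens a multi-line or bulleted description into short lowercase tags.
function toTags(description: string): string[] {
  return description
    .split(/[\n,.;]+/)
    .map((part) => part.replace(/^[\s\-*•]+/, "").trim().toLowerCase())
    .filter((part) => part.length > 0);
}

// Joins scene tags and effect tags into one dense prompt, without duplicates.
function buildPrompt(sceneDescription: string, effects: string[]): string {
  const tags = [...toTags(sceneDescription), ...effects, "photorealistic"];
  return Array.from(new Set(tags)).join(", ");
}

interface Img2ImgRequest {
  prompt: string;
  strength: number; // how far the output may drift from the source photo
  guidance: number; // how closely the output should follow the prompt
}

// Starting values are assumptions; in practice they need per-scene tuning.
function buildRequest(sceneDescription: string, effects: string[]): Img2ImgRequest {
  return {
    prompt: buildPrompt(sceneDescription, effects),
    strength: 0.55,
    guidance: 7.5,
  };
}
```

The trade-off sits mostly in strength: too low and the atmosphere barely changes, too high and the result stops looking like the place in the original photo.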

The voice pipeline taught me something different: how much latency matters to the feel of an interaction. Groq's Whisper endpoint is fast enough that transcription doesn't feel like a bottleneck. Gemini 2.5 Flash is fast enough that the answer arrives before the wait becomes uncomfortable. When you string enough fast models together, the interaction starts to feel conversational rather than transactional.


Built for Techathon 3.0.