I Tried to Predict Singapore's Rain and It Humbled Me
In a country with three weather modes – hot, hotter, and raining sideways – I built a real-time weather dashboard with LightGBM rainfall forecasting, SHAP explainability, animated radar, and 8 years of tropical weather data. Now with interactive charts so you can explore the data yourself.
Why build a weather app?
Honestly? Because I wanted a playground to learn a bunch of things at once. Not just "call an API and show a number" but the full loop: real-time data ingestion, exploratory data analysis, feature engineering, ML training, model explainability, frontend visualisation, and deployment. A weather app turned out to be the perfect vehicle because the data is free, always updating, and everyone has an opinion about whether the forecast is right.
The idea started at NUSS, drinking Ang Moh Liang Teh when the sky opened up and we got completely stuck in the rain. Apple Weather had been showing "Cloudy" all day — no warning, no umbrella, just vibes and regret. NEA's forecasts are actually solid, but the default weather app on our phones didn't quite catch this one. We decided right there that we had to take things into our own hands — if we could predict the rain even a few hours out, we'd know whether to grab an umbrella or when it's safe to head back. Predicting rain in Singapore is genuinely hard, and I wanted to see how far I could push a gradient-boosted model against that chaos.
This project was also a chance to get serious about EDA — not just
df.describe() on a toy dataset, but actually understanding 8 years
of messy real-world sensor data.
Try it yourself
Head to lionweather.kooexperience.com. Drop a pin anywhere on the Singapore map and the dashboard shows you:
- Current conditions from NEA: temperature, humidity, wind, rainfall, UV index, pressure, and visibility
- 4-day and hourly forecasts from NEA's official API
- ML rainfall predictions for the next 1, 3, 6, and 12 hours with confidence levels
- Animated radar overlay showing real-time rain movement across the island
- Sun/moon arc with live position tracking (sunrise, sunset, golden hour)
- Full ML analysis dashboard with EDA charts, SHAP plots, confusion matrices, and NEA benchmarks
Your location never leaves the browser. Everything is stored in localStorage. I'm not tracking you; I just want to show you if it's going to rain.
How it all fits together
LionWeather is a two-service setup on Railway: a FastAPI backend and a React frontend. They talk through a Vite proxy in dev, and Railway handles routing in production.
┌─────────────────────────────────────────────────────┐
│ Backend (FastAPI + Uvicorn) │
│ │
│ 17 API routers │
│ ├── /api/weather Current conditions │
│ ├── /api/forecasts 4-day + hourly │
│ ├── /api/ml/rain-forecast ML predictions │
│ ├── /api/ml/full-analysis EDA + SHAP + benchmarks │
│ ├── /api/radar Animated rain imagery │
│ └── /admin/* Retrain, export, health │
│ │
│ APScheduler (background jobs) │
│ ├── Every 10 min → Collect NEA observations │
│ ├── Every 1 hour → Collect official forecasts │
│ ├── Every 2 min → Fetch radar frames │
│ └── Sunday 2 AM → Retrain ML model │
│ │
│ PostgreSQL (Railway) / SQLite (local) │
└─────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────┐
│ Frontend (React 18 + Vite + Tailwind) │
│ │
│ ├── Interactive Leaflet map with pin placement │
│ ├── Detailed weather cards (feels-like, UV, wind) │
│ ├── ML forecast comparison panel │
│ ├── Animated radar layer │
│ ├── Full EDA + SHAP analysis dashboard │
│ └── Browser notifications on rain start/stop │
└─────────────────────────────────────────────────────┘
Yes, 17 routers is a lot. The project grew organically as I kept adding features. If I were starting over I'd probably consolidate a few, but honestly each one does one thing and they're easy to find, so I'm at peace with it.
Collecting the data
The backend polls multiple data sources on different intervals:
- NEA (Singapore) via data.gov.sg: temperature, humidity, rainfall, wind speed/direction, pressure, UV, and visibility. Polled every 10 minutes.
- Open-Meteo for Malaysia and Indonesia readings: gives regional context for incoming weather systems. Same 10-minute interval.
- weather.gov.sg for radar imagery: no official API, just direct image fetches from their CDN. Polled every 2 minutes.
- NEA forecasts: 2-hour nowcast, 24-hour outlook, 4-day forecast. Polled hourly.
Everything normalises to a WeatherRecord dataclass before hitting the database. There's bounds checking too: if the temperature comes in at 60 °C or rainfall is negative, something went wrong and the reading gets flagged. Singapore is tropical but not that tropical.
# Validation bounds (Singapore tropical range)
TEMP_MIN, TEMP_MAX = 15.0, 42.0 # °C
HUMIDITY_MIN, HUMIDITY_MAX = 20, 100 # %
WIND_MAX = 60.0 # km/h
# Plus 3-sigma statistical outlier detection
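Those bounds translate into a small validator. This is a sketch of the idea rather than the repo's actual code: the `validate_reading` helper and its return shape are mine.

```python
# Validation bounds for Singapore's tropical range (from the post);
# the helper name and flag strings are illustrative, not the repo's API.
TEMP_MIN, TEMP_MAX = 15.0, 42.0       # °C
HUMIDITY_MIN, HUMIDITY_MAX = 20, 100  # %
WIND_MAX = 60.0                       # km/h

def validate_reading(temp_c, humidity_pct, rainfall_mm, wind_kmh):
    """Return a list of flags; an empty list means the reading looks sane."""
    flags = []
    if not (TEMP_MIN <= temp_c <= TEMP_MAX):
        flags.append("temperature_out_of_range")
    if not (HUMIDITY_MIN <= humidity_pct <= HUMIDITY_MAX):
        flags.append("humidity_out_of_range")
    if rainfall_mm < 0:
        flags.append("negative_rainfall")
    if not (0 <= wind_kmh <= WIND_MAX):
        flags.append("wind_out_of_range")
    return flags

ok = validate_reading(29.5, 80, 0.0, 12.0)    # typical afternoon
bad = validate_reading(60.0, 80, -1.0, 12.0)  # sensor glitch
```

Flagged readings are kept but excluded from training, which is usually safer than silently dropping them.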
The database uses a unique constraint on (timestamp, country, location)
with upsert logic, so duplicate polls are harmless. This matters because sometimes
NEA returns the same data twice, and I'd rather handle that at the DB level than
add brittle deduplication logic everywhere.
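The upsert idea is easiest to see in raw SQL. A minimal sketch with stdlib `sqlite3` (the app itself goes through SQLAlchemy, and the table and column names here are illustrative):

```python
import sqlite3

# The unique constraint on (timestamp, country, location) makes a duplicate
# poll a harmless no-op update instead of an error or a duplicate row.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE weather_records (
        timestamp TEXT, country TEXT, location TEXT, temp_c REAL,
        UNIQUE (timestamp, country, location)
    )
""")

row = ("2024-06-01T14:00:00", "SG", "Ang Mo Kio", 31.2)
upsert = """
    INSERT INTO weather_records (timestamp, country, location, temp_c)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (timestamp, country, location)
    DO UPDATE SET temp_c = excluded.temp_c
"""
conn.execute(upsert, row)
conn.execute(upsert, row)  # NEA returned the same data twice: harmless

count = conn.execute("SELECT COUNT(*) FROM weather_records").fetchone()[0]
```

Handling duplicates at the constraint level means every code path that writes records gets deduplication for free.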
Meet the data
Before we get into the heavy statistical analysis, let's look at what the data actually feels like. The charts below are rendered from LionWeather's live analysis API — the same data that powers the ML dashboard.
Start with the hourly pattern. Singapore's rainfall has a very distinctive daily rhythm that anyone living here knows instinctively: mornings are usually dry, and the afternoon hours bring the thunderstorms. The data confirms it.
Hourly patterns
Annual rainfall trends
How much does it actually rain each year? And is it getting worse? The annual totals tell a story of variability — some years are significantly wetter than others, driven by El Niño / La Niña cycles and shifting monsoon patterns.
EDA: learning what the data actually looks like
If you're new to time series analysis, this section is for you. EDA (Exploratory Data Analysis) is the step where you look at your data before building any model. It sounds obvious, but skipping it is one of the most common reasons ML projects go sideways.
I had 8 years of NEA historical data (2016 to 2024) and I wanted to understand it
properly before throwing it at a model. The training script runs a full statistical
workup and saves everything to a full_analysis.json that the frontend
renders as an interactive dashboard. Here's each technique, what it means, and what
it revealed.
Step 1: Descriptive statistics
Start simple. Before any fancy analysis, compute the basics: mean, min, max, standard deviation, and percentile distributions. For LionWeather, this meant looking at annual rainfall totals, percentage of rainy hours, breakdown by intensity category (no rain, light, heavy, thundery), temperature ranges, and humidity.
What I found: In 2017, Singapore recorded 3,198 rainy hours (36.7% of all hours) with 790 thundery events. Mean temperature across years hovered around 27 to 28°C, with max readings touching 33 to 34°C. Average humidity sat around 78 to 81%. These numbers set your baseline expectations before you go deeper.
If you want to learn more about starting with descriptive statistics, this Towards Data Science guide walks through a solid six-step EDA framework for time series.
Step 2: Time-series decomposition (STL)
What is STL? STL stands for Seasonal and Trend decomposition using Loess. It splits a time series into three components: trend (the long-term direction), seasonal (repeating patterns at fixed intervals), and residual (everything left over that the first two can't explain). Think of it like separating a song into bass (trend), melody (seasonal), and noise (residual).
Why it matters: If the seasonal component is strong, your model can exploit those repeating patterns. If the residual is large, your data has a lot of randomness and your model will struggle. STL tells you upfront how predictable your data is.
What I found: The seasonal component was crystal clear. The northeast monsoon (November to January) brings sustained rain, while the southwest monsoon (May to September) brings shorter, more intense afternoon showers. But the residual was large, meaning rainfall has a lot of unexplained variance. This told me early on: don't expect 90% accuracy. Tropical rain is inherently noisy.
I used the statsmodels STL implementation. For a more detailed walkthrough of decomposition techniques, Sandeep Pawar's forecasting series covers STL alongside other decomposition methods with code examples.
Step 3: Autocorrelation (ACF and PACF)
What is ACF? Autocorrelation Function (ACF) measures how correlated a time series is with itself at different lag intervals. If the ACF at lag 1 is high, it means the current value is strongly related to the value one time step ago. PACF (Partial Autocorrelation) does the same but removes the indirect effects of intermediate lags, giving you the direct relationship at each lag. Think of it like asking a chain of people how accurately a message reaches person N — ACF measures the total distortion, PACF measures how much each individual person adds.
Why it matters: ACF and PACF tell you which lag features are worth creating. They also help determine the order of traditional time series models (AR, MA, ARMA). Even if you're using gradient boosting like I did, understanding autocorrelation guides your feature engineering.
What I found: Rainfall's ACF showed significant correlation at lag 1 (0.55) that dropped sharply to 0.20 at lag 2 and faded to near-zero by lag 6 to 8. This means: if it's raining now, it will probably still be raining in an hour (lag 1), maybe in two hours (lag 2), but by six hours the signal is gone. Temperature showed strong 24-hour periodicity, which makes sense since days are warm and nights are cool, even in Singapore.
These results directly informed my lag features: I created rainfall lags at 1h, 3h, 6h, and 24h. The ACF told me that anything beyond 6h for rainfall is mostly noise. For a deeper dive into reading ACF/PACF plots, this Towards Data Science article explains the interpretation with visual examples.
Step 4: FFT spectral analysis
What is FFT? Fast Fourier Transform converts your time-domain data into the frequency domain. Instead of asking "what happened at each time step?" you're asking "what cycles exist in this data and how strong are they?" It's like using a prism on sunlight — it separates the signal into its constituent frequencies. FFT is foundational to audio processing, signal analysis, and many areas of engineering.
Why it matters: FFT reveals hidden periodicities that might not be obvious from just plotting the raw data. Annual cycles, weekly patterns, or diurnal (day/night) rhythms all show up as distinct peaks.
What I found: The diurnal (~24h) cycle was the dominant peak for rainfall, driven by afternoon convective storms. Longer-period peaks corresponding to monsoon transitions also appeared in the spectrum. Toggle to temperature to see an even sharper 24-hour spike. These periodicities validated my decision to include time-of-day encoding and monsoon flags as features.
All of this lives in the ML Analysis tab of the app. I wanted it visible, not buried in a Jupyter notebook, because the whole point was to learn how to present EDA findings in a way that's useful to anyone curious. You can explore the charts yourself at lionweather.kooexperience.com.
Stationarity: can we even model this?
Before throwing data at a model, there's a fundamental question: is the data stationary? A stationary time series has a constant mean and variance over time. If it's not stationary, many statistical techniques fall apart, and even ML models can be fooled by drifting distributions.
The Augmented Dickey-Fuller (ADF) test checks this. It tests the null hypothesis that the series has a unit root (non-stationary). A very negative test statistic and a small p-value (< 0.05) mean we can reject the null and conclude the data is stationary. Good news: it means the patterns we see are stable enough to learn from.
Feature engineering
Raw sensor readings aren't enough. The model needs context. Here's what I engineered from the base data:
Temporal features
Why sin/cos encoding? If you feed "hour = 23" and "hour = 0" to a model as raw integers, it thinks they're 23 units apart. But in reality, 11 PM and midnight are one hour apart. By encoding time as sin(2π × hour/24) and cos(2π × hour/24), you preserve the circular nature of time. Same idea for day of year. This is a standard trick in time series ML and it genuinely improves model performance.
I also added monsoon flags: NE monsoon (November to January) and SW monsoon (May to September). These are binary features that tell the model which seasonal regime it's operating in.
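In pandas both the cyclical encoding and the monsoon flags are a few lines. A sketch assuming a DataFrame with a `ts` timestamp column (the column names are mine, not necessarily the repo's):

```python
import numpy as np
import pandas as pd

# Four hours straddling midnight, to show the encoding treats 23:00 and
# 00:00 as neighbours rather than 23 units apart.
df = pd.DataFrame({"ts": pd.date_range("2024-11-30 22:00", periods=4, freq="h")})

hour = df["ts"].dt.hour
df["hour_sin"] = np.sin(2 * np.pi * hour / 24)
df["hour_cos"] = np.cos(2 * np.pi * hour / 24)

month = df["ts"].dt.month
df["ne_monsoon"] = month.isin([11, 12, 1]).astype(int)      # Nov–Jan
df["sw_monsoon"] = month.isin([5, 6, 7, 8, 9]).astype(int)  # May–Sep

# Euclidean distance between 23:00 and 00:00 on the (sin, cos) circle:
# small, as it should be for adjacent hours.
midnight_gap = float(np.hypot(df["hour_sin"].iloc[1] - df["hour_sin"].iloc[2],
                              df["hour_cos"].iloc[1] - df["hour_cos"].iloc[2]))
```

Any adjacent pair of hours ends up the same distance apart on the circle, which is the whole point of the encoding.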
Lag features
What are lag features? A lag feature is simply a past value of a
variable used as input. rainfall_lag_1h is "what was the rainfall one
hour ago." The ACF analysis told me which lags carry signal (1h, 3h have strong
correlation; beyond 6h it fades). So I created lags at 1, 3, 6, and 24 hours.
The data leakage trap: This is the most common mistake in time series
ML. If you don't use .shift() correctly, your lag features can accidentally
include future data. I learned this the hard way when my first model hit 95% accuracy
and I got suspicious. Sure enough, it was peeking into the future. After fixing the
alignment, accuracy dropped to ~75%. That's the real number. Painful, but honest.
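A leakage-safe version of the lag features looks like this. A sketch, assuming hourly data in a timestamp-indexed DataFrame (column names are illustrative):

```python
import numpy as np
import pandas as pd

# shift(k) moves past values forward in time, so the row at hour t only
# ever sees data from t-1h, t-3h, etc. Never shift by a negative amount.
ts = pd.date_range("2024-06-01", periods=48, freq="h")
df = pd.DataFrame({"rainfall": np.arange(48, dtype=float)}, index=ts)

for lag in (1, 3, 6, 24):
    df[f"rainfall_lag_{lag}h"] = df["rainfall"].shift(lag)

# The first rows have no history; drop them rather than fill with future data.
df = df.dropna()
```

Using a monotonically increasing series makes the check trivial: every lag column at row t must equal the value from exactly that many hours earlier, never later.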
Thunderstorm indicators
Singapore's rain often comes from convective thunderstorms that build up in the afternoon. I engineered features to catch the warning signs:
- Pressure drop (3h): a falling barometer is the classic signal
- Humidity change (1h): sudden spikes before a storm
- Temperature drop (1h): cold air from a downdraft
- Wind direction change (1h): with 360-degree wrap-around handling, because wind doesn't know about our number line
- Afternoon flag: 2 PM to 6 PM, peak convective storm window
- Westerly wind flag: 225 to 315 degrees, associated with squall lines
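The wrap-around handling deserves a concrete sketch, since it's easy to get wrong. The function name is mine; naively subtracting degrees makes a 350° → 10° shift look like a 340° swing when it's really 20°:

```python
def wind_direction_change(prev_deg: float, curr_deg: float) -> float:
    """Signed smallest angle from prev to curr, in (-180, 180]."""
    # Shift into [0, 360), recentre to (-180, 180]: the modulo does the wrap.
    delta = (curr_deg - prev_deg + 180.0) % 360.0 - 180.0
    return 180.0 if delta == -180.0 else delta

small_swing = wind_direction_change(350.0, 10.0)  # crosses north: 20°, not 340°
big_swing = wind_direction_change(90.0, 250.0)    # genuine 160° swing
```

The sign tells you the rotation direction too (positive = clockwise), which can itself be a useful feature.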
Spatial features
Latitude, longitude, distance from CBD (Haversine), and a coastal flag. Coastal stations tend to get different rain patterns due to sea breeze convergence.
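The distance-from-CBD feature is a standard Haversine computation. A sketch using Raffles Place as the CBD anchor (the repo's exact reference point may differ):

```python
import math

CBD_LAT, CBD_LON = 1.2840, 103.8510  # Raffles Place (assumed anchor)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Changi, on the east coast, is roughly 18 km from the CBD as the crow flies.
changi_km = haversine_km(CBD_LAT, CBD_LON, 1.3644, 103.9915)
```

At Singapore's scale a flat-earth approximation would also work, but Haversine costs nothing extra and never surprises you.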
The star feature: dry spell hours
How many consecutive hours without rain before the prediction window. This ended up being the most important feature by LightGBM's built-in gain metric, and ranks in the top 5 by SHAP. Makes intuitive sense: a long dry spell in tropical Singapore often means conditions are building toward a release.
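Counting consecutive dry hours has a neat vectorised form in pandas. A sketch of the cumsum-reset trick (column names are mine):

```python
import pandas as pd

# Hourly rainfall in mm; < 0.1 mm counts as dry, matching the class scheme.
rain = pd.Series([0.0, 0.0, 0.0, 2.4, 0.0, 0.0, 0.0, 0.0, 1.1, 0.0])

is_dry = rain < 0.1
# Every rain hour bumps the run id, so each dry spell gets its own group;
# cumsum within a group then counts hours since the last rain.
run_id = (~is_dry).cumsum()
dry_spell_hours = is_dry.groupby(run_id).cumsum()
```

Rain hours come out as 0 and each dry run counts up from 1, which is exactly the "hours since last rain" semantics the feature needs.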
Training the model
I went with LightGBM over XGBoost or random forests for a few reasons: it handles categorical features natively, trains fast on large datasets, and has good support for multi-class classification. Also, I'd used XGBoost before and wanted to try something different. Learning was the whole point.
Data split
This matters more than people realise. A random split would leak temporal patterns. I used a strict year-based split:
- Train: 2016 to 2022 (6 years)
- Validation: 2023
- Test: 2024 (completely held out until final evaluation)
No shuffling. No interleaving. If your lag features reference data from 3 hours ago, and that 3-hour-ago sample is in the test set while you're training, congratulations: you've just built a time machine, not a model.
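The split itself is deliberately boring. A sketch assuming a timestamp-indexed DataFrame:

```python
import pandas as pd

# Stand-in for the 2016–2024 hourly dataset.
ts = pd.date_range("2016-01-01", "2024-12-31 23:00", freq="h")
df = pd.DataFrame({"rainfall": 0.0}, index=ts)

# Strict year-based split: contiguous blocks, no shuffling, no interleaving.
year = df.index.year
train = df[year <= 2022]
val = df[year == 2023]
test = df[year == 2024]
```

Because the blocks are contiguous in time, no lag feature in the training set can ever reference a test-set timestamp.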
Two types of models
I trained both regression (predict rainfall amount in mm) and classification (predict rain category) models. The classification models use a 3-class scheme for the app's UI:
- Class 0: No rain (< 0.1 mm/hr)
- Class 1: Light rain (0.1 to 7.6 mm/hr)
- Class 2: Heavy + thundery (≥ 7.6 mm/hr)
And a 4-class scheme for benchmarking against NEA's own forecast categories, which separates heavy rain from thundery showers.
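The 3-class labelling above boils down to two thresholds. A sketch (the function name is illustrative; the thresholds are the ones from the list):

```python
def rain_class(mm_per_hr: float) -> int:
    """0 = no rain (< 0.1), 1 = light (0.1–7.6), 2 = heavy/thundery (>= 7.6)."""
    if mm_per_hr < 0.1:
        return 0
    if mm_per_hr < 7.6:
        return 1
    return 2

labels = [rain_class(x) for x in (0.0, 0.05, 0.1, 3.2, 7.6, 25.0)]
```

The 7.6 mm/hr boundary is worth noting: it's a conventional heavy-rain threshold, so the classes line up with how forecast categories are usually reported.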
Four prediction horizons
Separate models for 1h, 3h, 6h, and 12h ahead. Each was trained on 48,687 samples, validated on 8,296, and tested on 8,765 (the held-out 2024 data). As expected, accuracy drops with the horizon:
- 1 hour: 75.2% accuracy (3-class) – the model does well short-term
- 3 hours: 66.0% accuracy – still useful, but uncertainty grows
- 6 hours: 61.0% accuracy – getting noisy
- 12 hours: 58.7% accuracy – barely better than a weighted coin flip
Loss curves
Every model's training story is told by its loss curves. These show how the model improved over training rounds, and more importantly, whether it started overfitting (when validation loss starts climbing while training loss keeps falling).
Reading the confusion matrix: The 1-hour model correctly labelled 4,616 "no rain" hours but misclassified 905 light-rain hours as "no rain." It's conservative: it would rather tell you it's dry than cry wolf. Recall for heavy rain is lower since these events are rarer. Class imbalance is a real challenge — most hours in Singapore are technically dry.
The 12-hour model is barely better than a coin flip. In Singapore's climate, that's not surprising — a thunderstorm can form and dissipate in 30 minutes. Some things aren't meant to be predicted 12 hours ahead.
Hyperparameters
# LightGBM config (classification)
params = {
    "objective": "multiclass",
    "num_class": 3,
    "learning_rate": 0.05,
    "num_leaves": 63,
    "min_child_samples": 50,
    "feature_fraction": 0.8,
    "bagging_fraction": 0.8,
    "bagging_freq": 5,
    "reg_alpha": 0.1,   # L1
    "reg_lambda": 0.1,  # L2
    "n_estimators": 500,
}
Nothing exotic. Conservative regularisation to avoid overfitting on 8 years of data. The model artifacts range from 90 KB (12h) to 14 MB (1h classification), small enough to commit to git and load at startup.
SHAP: why did you predict that?
What is SHAP? SHAP (SHapley Additive exPlanations) is a method from game theory that explains individual predictions. Instead of just saying "the model predicts heavy rain," SHAP tells you why: "the model predicts heavy rain because humidity is at 95%, there hasn't been rain for 8 hours, and wind just accelerated." Each feature gets a score showing how much it pushed the prediction up or down. It turns a black box into something you can actually reason about.
Why it matters: If your model is wrong, SHAP helps you figure out why. Maybe it's relying too heavily on a leaky feature, or ignoring something important. Without explainability, you're just trusting a number. With SHAP, you can validate whether the model's reasoning makes physical sense.
Here are the top features from LionWeather's 1-hour regression model, ranked by mean absolute SHAP value (higher means the feature has more influence on predictions):
- rain_spatial_std (6.65) – how unevenly rain is distributed across stations. High variance means localised showers, which strongly predicts nearby rainfall.
- rain_max_station (3.43) – the heaviest rainfall at any single station. If one station is getting hammered, neighbours are next.
- rain_west (3.08) – rainfall in western Singapore. Weather systems often approach from the west, so this acts as an early warning.
- rain_region_max (1.96) – peak rainfall across all regions.
- dry_spell_hours (1.76) – how long since the last rain. A long dry spell in tropical Singapore often means conditions are building for a release.
What surprised me: pressure features ranked much lower than I expected. In temperate climates, a falling barometer is the go-to rain predictor. In the tropics, spatial rainfall patterns and humidity dynamics matter more. Singapore's pressure variance is tiny compared to, say, London's. The data showed me this; I wouldn't have guessed it.
The monsoon flags also appear in the importance rankings, suggesting the model distinguishes between monsoon regimes. Combined with the strong showing of spatial features (rain_west, rain_spatial_std), this indicates the model picked up on the directional nature of Singapore's weather systems without me explicitly encoding it.
For a deeper introduction to SHAP in the context of time series, this Towards Data Science article on SHAP for time series covers the key concepts and pitfalls.
Results: confusion matrices and benchmarks
Numbers are nice, but a confusion matrix tells the full story. It shows not just overall accuracy, but what kinds of mistakes the model makes. Think of it like a fire alarm — high recall catches every fire but may give false alarms (annoying but safe), while high precision means every alarm is real but you might miss some fires (dangerous).
3-Class confusion matrix
NEA benchmark: ML vs. the national forecast
The real test: how does a gradient-boosted model compare to Singapore's official weather agency? NEA uses numerical weather prediction (NWP) with satellite data, doppler radar, and human meteorologists. My model uses 8 years of historical sensor data and some clever features. I also tested a 60/40 ensemble blend of ML and NEA.
The radar overlay
This was the "fun feature that turned out to be annoying to build" part. NEA publishes radar imagery on weather.gov.sg, but there's no official API. The backend fetches the images directly from their CDN, caches them with a 120-second TTL, and serves them to the frontend.
The Leaflet map overlays these frames as an animation, so you can watch rain systems move across the island. The bounds are hardcoded to Singapore's coordinates: 1.155°N to 1.475°N, 103.565°E to 104.130°E. There's a 500ms throttle between image fetches to be polite to their server.
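The 120-second cache is simple enough to sketch in pure Python. This is the idea, not the repo's code: the class name is mine, and the fetcher is injected so the example runs without actually hitting weather.gov.sg:

```python
import time

class TTLCache:
    """Cache fetched payloads per URL, refetching only after ttl_s seconds."""

    def __init__(self, fetch, ttl_s=120.0, clock=time.monotonic):
        self._fetch, self._ttl, self._clock = fetch, ttl_s, clock
        self._store = {}  # url -> (expires_at, payload)

    def get(self, url):
        now = self._clock()
        hit = self._store.get(url)
        if hit and hit[0] > now:
            return hit[1]             # still fresh: serve the cached frame
        payload = self._fetch(url)    # miss or expired: fetch and re-stamp
        self._store[url] = (now + self._ttl, payload)
        return payload

calls = []
cache = TTLCache(lambda url: calls.append(url) or b"frame-bytes")
first = cache.get("/radar/frame.png")
second = cache.get("/radar/frame.png")  # within the TTL: no second fetch
```

Injecting the clock also makes TTL expiry trivially testable, which matters more than it sounds when the thing you're caching only misbehaves in production.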
It's one of those features that took a disproportionate amount of time but makes the app feel alive. Watching a rain blob drift toward your location on the map is weirdly satisfying, even if it means your laundry is about to get wet.
Frontend: making it feel like a real app
The frontend is React 18 with Vite, Tailwind, Leaflet for maps, and Recharts for the EDA visualisations. A few things I'm proud of:
NEA area snapping
When you drop a pin, it snaps to the nearest official NEA neighborhood (Ang Mo Kio, Tampines, etc.) via lat/lon distance matching. This means the weather readings actually correspond to a real station, not some random point in a park.
Sun and moon arc
Calculated client-side using SunCalc. No API needed. The card shows a live arc with the sun's current position, golden hour, and after sunset flips to show the moon and tomorrow's sunrise. A small detail, but it makes the dashboard feel polished.
Rain notifications
The app sends browser notifications when rain starts or stops at your saved locations. Crucially, it only fires on state transitions with a 1-hour cooldown. Nobody wants 50 notifications during a patchy drizzle.
The ML analysis dashboard
This is the EDA nerd section. It renders the entire full_analysis.json
as interactive charts: annual rainfall breakdown, STL decomposition, ACF/PACF with
confidence intervals, FFT spectrograms, SHAP waterfall plots, confusion matrices by
horizon, precision/recall/F2 scores, and sortable NEA benchmark tables.
Is it too much? Probably. But this was the whole point of the project: to take EDA from a notebook and put it in front of anyone who's curious. Not everyone cares about SHAP waterfall plots, but everyone wants to know if they should bring an umbrella. The app works for both audiences.
Deployment on Railway
Two services on Railway: backend on a Python buildpack, frontend served through Vite's preview mode. A few things I learned:
- Database auto-detection: SQLAlchemy checks if DATABASE_URL starts with "postgresql". Locally it falls back to SQLite. This means I develop on a single file and deploy to Postgres without changing any code.
- Connection pooling matters: PostgreSQL with a pool size of 10 and max overflow of 20. SQLite gets StaticPool with check_same_thread=False. Getting this wrong in production means random connection errors at 3 AM.
- Model artifacts in git: The joblib files are small enough (largest is 14 MB) to commit directly. Railway builds from the repo, so the models are available at startup. No S3, no model registry. Keep it simple.
- Scheduled retraining: Every Sunday at 2 AM, the backend kicks off a retrain job using the data collected that week. It runs in a subprocess so the main API stays responsive. The admin dashboard shows training progress and logs.
Lessons learned
This project taught me more than any course I've taken. Here are the big ones:
EDA is not optional, it's the job. Before I did proper EDA, my first model was garbage. After spending a week on decomposition, autocorrelation, and feature analysis, the second model was meaningfully better. Not because I used a fancier algorithm, but because I understood the data.
Data leakage is sneaky and will flatter your metrics.
My first model hit 95% accuracy. I was thrilled for about ten minutes, then
realised my lag features were leaking future data. After fixing the split and
using proper .shift() alignment, accuracy dropped to ~75%. That's the
real number. It hurt, but it's honest.
Tropical weather is humbling. Convective thunderstorms can form in 15 minutes and vanish in 30. A model trained on 8 years of data still only hits ~59% at the 12-hour horizon. To be fair, if Singapore's weather were predictable, we wouldn't all carry umbrellas in our bags 365 days a year. Sometimes the best answer is "I don't know, check the radar."
SHAP makes you a better engineer.
When you can see that your model relies on dry_spell_hours more than
pressure, you start questioning your assumptions. The features that
"should" matter don't always match the features that do. Let the data tell you.
Ship the messy version. LionWeather started as a single-page app with three API calls. Now it has 17 routers, 4 prediction horizons, animated radar, and a full EDA dashboard. None of that would exist if I'd waited until it was "ready." I shipped early, showed friends, got feedback, and iterated. That's the only way to build something real.
If you made it this far, go open the app and see what Singapore's weather is doing right now. And if the model says "no rain" and it's pouring outside your window, well, welcome to Singapore. The weather here has been ignoring forecasts since before machine learning existed. At least my wrong predictions come with SHAP explanations.
References
- LionWeather (live app)
- Source code on GitHub
- NEA data.gov.sg API
- Open-Meteo for regional weather data
- LightGBM documentation
- SHAP documentation
- statsmodels STL decomposition
- Leaflet for interactive maps
- SunCalc for sun/moon position calculations
- Recharts for React charting
- FastAPI
- Railway for deployment
- weather.gov.sg for radar imagery