Go Deeper
The hypothesis was simple on the surface: see if whitewater kayakers can use AI to predict when a rain-fed creek or river is going to be runnable. The best creeks are often deep in the mountains, down dirt logging roads, with no weather service rain gauges or civilization nearby to report if they're runnable. In the few places where gauges do exist, they're often at the *bottom* of a whitewater run, so by the time the gauge spikes it's already too late. We knew a run was in only after it was in. Boaters could often make educated guesses about conditions, but those guesses were only confirmed by physically driving to the site and looking at the water in the creek. Public weather radar maps are available, but they're generally not granular enough to see the rainfall in a particular mountain watershed that might only be a few miles in diameter. Commercial products that provide much better granularity also exist, but they're cost-prohibitive for a hobby like whitewater. We knew raw radar data was available, but developing a system to parse that data and turn it into something usable for boaters would have required a full development team, which is also not reasonable for a hobby. I knew what my whitewater community wanted, I just didn't know how to get there... that is, until I discovered Claude Code.
I started with non-agentic data collection, just to see if it was possible. After some back and forth, I was able to streamline collection of all the USGS gauge and precipitation data, as well as bring in the 'raw' radar data I mentioned above (through products known as the NOAA QPF and QPE: the National Oceanic and Atmospheric Administration's Quantitative Precipitation Forecast and Quantitative Precipitation Estimate). Simply cracking this problem was a story all unto itself, but there's nothing really 'agentic' about it, so I'll just say Claude Code helped me solve a problem that I would not have been able to solve otherwise.

Once I had the data set, though, the question was what to do with it, and this is where the agents came in. I set trigger conditions based on my own personal knowledge of when I think a creek would be 'close' to being in, usually based on a certain amount of rainfall over a certain timeframe in a certain area. When those conditions were met, a larger workflow gathered *all* the data possible for the area (the QPE, QPF, USGS gauge data, soil moisture content, topography, and personal notations from boaters) and fed it into an API call to Anthropic with a simple question: "Based on this data, will this creek reach boatable conditions, and when?" The result would then make 'triggers' appear on the website and fire off an alert into a Signal group chat with my local boating community.

All of this was built, and we had our first live triggers a few weeks back, which were about 70% correct in their predictions. A vast improvement over no predictions, but there was still room for improvement. So I built a second agentic workflow, the Creek Calibrator Bot. This bot's job is also super simple in theory: review the predictions, see which ones worked (where a gauge, or my manual boater notation, could confirm boatable conditions or not), and then adjust the trigger variables for the next event.
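To make the trigger idea concrete, here's a minimal sketch of the two pieces involved: a threshold check on recent basin rainfall, and bundling the gathered data into the question for the model. The creek names, thresholds, and field names are illustrative assumptions, not the real tuned values.

```python
import json

# Hypothetical per-creek trigger thresholds -- the real values are tuned
# from personal knowledge (and, later, by the calibrator bot).
TRIGGERS = {
    "big_piney": {"rain_in": 1.5, "window_hr": 6},   # 1.5" of rain in 6 hours
    "richland":  {"rain_in": 2.0, "window_hr": 12},
}

def check_trigger(creek: str, hourly_qpe: list[float]) -> bool:
    """True when basin rainfall over the last N hours crosses the creek's threshold."""
    t = TRIGGERS[creek]
    return sum(hourly_qpe[-t["window_hr"]:]) >= t["rain_in"]

def build_prediction_prompt(creek: str, data: dict) -> str:
    """Bundle everything gathered for the area into the question sent to the model."""
    return (
        f"Based on this data, will {creek} reach boatable conditions, and when?\n"
        + json.dumps(data, indent=2)
    )
```

When `check_trigger` fires, the assembled prompt is what gets sent in the Anthropic API call; the model's answer drives the website triggers and Signal alerts.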
This bot now does its job on time-based triggers, building recursive self-improvement into the prediction model, and has already made adjustments to the predictor algorithm over the past few weeks. The final agent came about after the first two were in place, and came from asking a bigger question: what if this same method for predicting when whitewater creeks are boatable could be used for public safety on the popular Buffalo River? While not a whitewater river per se, it's used by tens of thousands of visitors in the spring, yet still subject to large and somewhat unpredictable flood conditions due to heavy rainfall deep in the same mountains that feed the whitewater streams. The same learning was applied to this final agent... collect data, determine trends, classify trends, and build a database of what is actually happening with rainfall and river levels. Make predictions, see if those predictions come true, and if they don't, improve the next prediction. I'm calling this final agent a 'study' because the goal is more loosely defined than a simple 'can we boat a creek' or not... the ultimate goal here would be an automated app that sits on the phone of every park ranger, rental outfitter, and public boater on the Buffalo, giving them real-time, tested and proven alerts for abrupt changes in river conditions.
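The review-and-adjust loop all three agents share can be sketched roughly like this. The record fields and the 10% nudge size are my illustrative assumptions, not the actual calibrator logic:

```python
# A minimal sketch of the calibration idea: score past predictions against
# gauge-confirmed (or manually noted) outcomes, then nudge the rainfall
# threshold for the next event. Field names and step size are assumptions.

def calibrate(threshold_in: float, history: list[dict]) -> float:
    """history items look like {"predicted_runnable": bool, "actually_runnable": bool}."""
    false_alarms = sum(1 for h in history
                       if h["predicted_runnable"] and not h["actually_runnable"])
    misses = sum(1 for h in history
                 if not h["predicted_runnable"] and h["actually_runnable"])
    if false_alarms > misses:    # triggering too eagerly -> require more rain
        return round(threshold_in * 1.1, 2)
    if misses > false_alarms:    # missing real events -> require less rain
        return round(threshold_in * 0.9, 2)
    return threshold_in          # balanced -> leave the threshold alone
```

In the real system this adjustment is reasoned out by the model rather than a fixed percentage, but the loop is the same: predict, verify, adjust, repeat.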
Stack Used
Hardware is a pair of Raspberry Pi 5s: one does all the data downloads, parsing, LLM calls, and trigger events, and the other just runs the public-facing website. In general, the data collection scripts all run on cron jobs, every 15 minutes, 1 hour, or 3 hours depending on the type of data being pulled. Then there's a 15-minute 'assemble and push' cron that brings all the data together into a single JSON, which is sent over to the web server and also triggers the LLM workflows for deeper analysis and predictions if trigger criteria are met. The scripts are pretty much all simple Python, and the prompts are pretty much all YAML and Markdown files. Longer-term data sets (like the hourly QPE rainfall amounts, which are NOT stored long term by NOAA because of their size) are stored in a local NumPy database for analysis if needed. For models, I'm pretty well using the Claude suite for everything. Predictions are done by Sonnet (quick, cheap, and just needs a yes, a when, and a how much as output), and calibrations are done by Opus batch jobs (more nuanced reasoning required, and not time sensitive).