The Bitter Lesson Leads To Evolutionary Computation

I saw a note about the bitter lesson go by and I left a comment. It’s an idea I’ve had for a long time but never really said out loud. Recall the bitter lesson (via Claude): The Bitter Lesson, articulated by Rich Sutton in 2019, argues that in artificial intelligence research, methods that leverage computation and large amounts of data have historically outperformed approaches based on human knowledge and hand-crafted rules. The “bitter” part is that our human intuitions about how to solve problems often turn out to be less effective than simple methods that scale well with computing power. This has been demonstrated repeatedly in areas like computer chess, speech recognition, and machine translation, where brute-force computational approaches ultimately surpassed carefully designed human-engineered solutions. ...

January 25, 2025 · 3 min · Jason Brownlee

Ergodicity and Path Dependence in Machine Learning

I was thinking about ergodicity yesterday in the context of machine learning. Specifically, the path dependence of training the “best” model, and our typical solution to this challenge: train a Monte Carlo ensemble of final models (e.g. same learning algorithm and training data, varied random seeds) and combine their predictions. Could we do better? Are we really stuck and is this really the best way out? My introduction to ergodicity came initially from Taleb’s books. I think it was Antifragile that dug into the related idea of Jensen’s inequality for the payoff in games, and his book Skin in the Game that dug into ergodicity in games with ruin. ...
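A minimal sketch of the seed-ensemble idea described above, assuming a toy 1D regression problem and a tiny SGD learner (both are illustrative stand-ins, not anything from the post). Each seed changes the initial weights and the shuffle order, so each run follows a different training path on the same data. The predictions are then averaged.

```python
import random
import statistics

def make_data(n=200, seed=0):
    # Toy data (an assumption for illustration): y = 3x + 0.5 + noise.
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [3.0 * x + 0.5 + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

def train_sgd(xs, ys, seed, epochs=5, lr=0.05):
    # The seed controls the random init and the sample order: the "path".
    rng = random.Random(seed)
    w, b = rng.gauss(0, 1), rng.gauss(0, 1)
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b

def predict(model, x):
    w, b = model
    return w * x + b

xs, ys = make_data()

# Same algorithm, same training data, ten different seeds.
models = [train_sgd(xs, ys, seed=s) for s in range(10)]

x_new = 0.25
member_preds = [predict(m, x_new) for m in models]
ensemble_pred = statistics.fmean(member_preds)  # combine by averaging
spread = statistics.pstdev(member_preds)        # how much the path mattered

print(f"ensemble prediction: {ensemble_pred:.3f} (spread across seeds: {spread:.3f})")
```

The spread across members is a rough measure of how path dependent the learner is on this data; the average is the usual way of hedging against any single path.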

January 25, 2025 · 8 min · Jason Brownlee

Stacking Ensemble With Dropout Regularization

I was thinking about stacking ensembles (stacked generalization) in the sauna. Stacked ensembles overfit, so we need to regularize. Generally, we use cross-validation to ensure that the meta-model is fit on out-of-fold predictions. This is to avoid data leakage, but we could say it has a “regularizing” effect. For reference, here’s how that works via Claude Sonnet 3.5 (light editing from me): Let me explain how the meta-model in stacked generalization is fit using cross-validation: ...
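A rough sketch of fitting the meta-model on out-of-fold predictions, as described above. The two base learners (a least-squares line and a 1-nearest-neighbor lookup), the toy data, and the two-weight linear meta-model are all assumptions chosen to keep the example dependency-free. It covers only the cross-validation step, not the dropout idea in the title.

```python
import math
import random

def fit_linear(xs, ys):
    # Ordinary least squares for y = w*x + b.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = my - w * mx
    return lambda x: w * x + b

def fit_nearest(xs, ys):
    # 1-nearest neighbor: in-sample it is perfect, which is exactly the
    # kind of leakage that out-of-fold predictions avoid.
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def out_of_fold(fit, xs, ys, k=5):
    # Each point is predicted by a model that never saw it during fitting.
    n = len(xs)
    oof = [None] * n
    for fold in range(k):
        held = [i for i in range(n) if i % k == fold]
        kept = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in kept], [ys[i] for i in kept])
        for i in held:
            oof[i] = model(xs[i])
    return oof

def fit_meta(p1, p2, ys):
    # Least squares for y ~ a*p1 + c*p2 via the 2x2 normal equations.
    s11 = sum(u * u for u in p1)
    s12 = sum(u * v for u, v in zip(p1, p2))
    s22 = sum(v * v for v in p2)
    t1 = sum(u * y for u, y in zip(p1, ys))
    t2 = sum(v * y for v, y in zip(p2, ys))
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det

rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(100)]
ys = [math.sin(3 * x) + rng.gauss(0, 0.1) for x in xs]

# Level 0: out-of-fold predictions from each base learner.
oof_linear = out_of_fold(fit_linear, xs, ys)
oof_nearest = out_of_fold(fit_nearest, xs, ys)

# Level 1: the meta-model only ever sees out-of-fold predictions.
a, c = fit_meta(oof_linear, oof_nearest, ys)

# For new inputs, refit the base learners on all the data.
linear, nearest = fit_linear(xs, ys), fit_nearest(xs, ys)

def stacked(x):
    return a * linear(x) + c * nearest(x)

print(f"meta weights: linear={a:.3f}, nearest={c:.3f}; stacked(0.3)={stacked(0.3):.3f}")
```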

January 24, 2025 · 6 min · Jason Brownlee

Magical Tokens for LLMs

Prompts for LLMs matter, a lot. I’m no expert at prompting (see this), but there are additional input tokens that offer a lot of leverage in producing better output. For example: “think through this step by step”. And: “As an expert in [this specific topic] explain [this thing I want to know]”. And: “Output in bullet points”. And: “Explain [this specific topic] for the audience [described here]”. And so on. This might be an older idea (e.g. GPT-3.5/GPT-4, where my thinking is probably stuck); these sequences are probably baked in now. ...

January 24, 2025 · 9 min · Jason Brownlee

Conceptual Blending and LLMs

I tripped over the concept of “Conceptual Blending” (thanks Kat). From Wikipedia: According to this theory, elements and vital relations from diverse scenarios are “blended” in a subconscious process, which is assumed to be ubiquitous to everyday thought and language. Again, this relates back to ensembles/multiple perspectives, and connects to the bag-of-analogies thinking from the other day. The best presentation of the idea is the book “The Way We Think: Conceptual Blending and The Mind’s Hidden Complexities” by Gilles Fauconnier and Mark Turner. ...

January 23, 2025 · 6 min · Jason Brownlee

Incomprehensible Artifacts From Our AIs

LLMs, or models like them, are going to start giving us artifacts that we cannot (easily) comprehend. We’ve been in this boat for a while, first with stochastic optimization algorithms (the classic NASA evolved antenna), and later with automated theorem proving. Algorithms optimize for an objective and we get a thing that looks like it solves the problem we want, but it’s opaque (strange, large, complex, etc.). This came to mind because of a tweet the other day (January 20, 2025) by George Hotz. ...

January 23, 2025 · 4 min · Jason Brownlee

LLM Prompt Optimization

LLM prompts matter. The quality and nature of the prompt influence the quality and nature of the response. There is a space of candidate input prompts and a corresponding map of LLM responses, some of which are better/worse for whatever problem we are working on. We can frame a black box optimization problem that proposes and tunes candidate prompts for an LLM to optimize a target response. In effect, we would be finding good/better starting points in latent space from which to retrieve the desired output. ...
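A toy sketch of the black box framing above, assuming a simple hill climber over on/off prompt fragments. The `FRAGMENTS` list and `toy_score` are hypothetical: in a real setup the score would come from sending the prompt to an LLM and rating the response, which is not done here.

```python
import random

# Hypothetical prompt fragments the optimizer can switch on or off.
FRAGMENTS = [
    "Think through this step by step.",
    "Answer as an expert in the topic.",
    "Output in bullet points.",
    "Keep the answer under 100 words.",
]

def build_prompt(task, mask):
    # A candidate is a bit mask over FRAGMENTS, prepended to the task.
    parts = [frag for frag, on in zip(FRAGMENTS, mask) if on]
    return " ".join(parts + [task])

def toy_score(prompt):
    # HYPOTHETICAL stand-in for "call the LLM, then rate the response".
    # It rewards two phrases and charges a small cost per character.
    score = 0.0
    if "step by step" in prompt:
        score += 2.0
    if "bullet points" in prompt:
        score += 1.0
    return score - 0.01 * len(prompt)

def hill_climb(task, score, steps=50, seed=0):
    # Black box search: only score(prompt) is observed, never its internals.
    rng = random.Random(seed)
    mask = [False] * len(FRAGMENTS)
    best = score(build_prompt(task, mask))
    for _ in range(steps):
        candidate = mask[:]
        i = rng.randrange(len(candidate))
        candidate[i] = not candidate[i]  # flip one fragment
        s = score(build_prompt(task, candidate))
        if s > best:                     # keep strict improvements only
            mask, best = candidate, s
    return build_prompt(task, mask), best

task = "Explain ergodicity."
best_prompt, best_score = hill_climb(task, toy_score)
print(best_prompt)
```

Any black box optimizer could replace the hill climber; the only interface it needs is a function from prompt text to a number.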

January 23, 2025 · 4 min · Jason Brownlee

Abstraction and Analogies with LLMs

I was listening to a recent episode of the Machine Learning Street Talk podcast (a very fine podcast!). Specifically: “How Do AI Models Actually Think?” with Laura Ruis. Fantastic episode! Just great. I need a re-listen. Early in the conversation they touch on (I’m paraphrasing, probably wrongly) whether LLMs think/reason as Douglas Hofstadter suggests, abstracting via a collection of analogies. It’s agreed they do. The host touches on Hofstadter’s often-repeated quote: ...

January 22, 2025 · 9 min · Jason Brownlee

Gyms For All The Skills That LLMs Are Eating?

Use it or lose it. We used to do manual labor which had the dual benefit of getting the things we needed (hunt->food, work->money, etc.) and keeping us in reasonable physical (and mental) condition. No longer for many of us, so our bodies atrophy. To fight the entropy, many of us go to the gym. We simulate the labor we used to do in order to keep our bodies in good condition and reap the rewards (energy, look/feel better, longer life, etc.). ...

January 22, 2025 · 4 min · Jason Brownlee

LLMs as Fitness Functions in Stochastic Optimization

The hard part of stochastic optimization is the evaluation function. You get whatever you’re optimizing for, and it’s always a trade-off, even if you can’t see it at first. This got me thinking: there must be tons of problems that we cannot optimize easily because we don’t have good (cheap) fitness functions, and where an LLM could step in. I know I’ve read papers on something like this in the open-endedness literature. ...
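One way the idea above might look in code: a small evolutionary loop where the fitness function is a pluggable judge. The `judge` function here is a hypothetical stand-in for an LLM rating a candidate (no model is called), and the word list and candidate encoding are made up for illustration. The cache is there because a real LLM call would be the expensive part.

```python
import random
from functools import lru_cache

WORDS = ["fast", "slow", "clear", "vague", "short", "long", "kind", "rude"]

@lru_cache(maxsize=None)
def judge(candidate):
    # HYPOTHETICAL stand-in for an LLM scoring a candidate from 0 to 1.
    # Cached so that a repeated candidate never costs a second "call".
    liked = {"fast", "clear", "short", "kind"}
    return sum(1.0 for word in candidate if word in liked) / len(candidate)

def mutate(candidate, rng):
    # Replace one randomly chosen word.
    child = list(candidate)
    child[rng.randrange(len(child))] = rng.choice(WORDS)
    return tuple(child)

def evolve(generations=30, pop_size=8, length=4, seed=1):
    rng = random.Random(seed)
    pop = [tuple(rng.choice(WORDS) for _ in range(length)) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(rng.choice(pop), rng) for _ in range(pop_size)]
        # Deduplicate in a stable order, then keep the best (elitist selection).
        merged = list(dict.fromkeys(pop + children))
        pop = sorted(merged, key=judge, reverse=True)[:pop_size]
    return pop[0], judge(pop[0])

best, score = evolve()
print(best, score, judge.cache_info())
```

The optimizer never looks inside `judge`, so swapping the stand-in for a real model call would leave the loop unchanged. Whether the model's ratings are a fitness signal worth optimizing is the open question.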

January 22, 2025 · 5 min · Jason Brownlee