Fair question. While I'm not an expert on AI alignment, I'd be surprised if any AI alignment approach did not involve real math at some point, given that all machine learning algorithms are inherently mathematical and computational in nature.
For example, I would imagine one has to know how various reward functions work, what happens in the modern variants of attention mechanisms, how different backpropagation strategies affect the overall result, etc., in order to come up with (and effectively leverage) reinforcement learning from human feedback.
I did a little searching. Here's a 2025 review I found by entering "AI Alignment" into Google Scholar, and it has at least one serious-looking mathematical equation: https://dl.acm.org/doi/full/10.1145/3770749 (section 2.2). That said, maybe you have examples of historical breakthroughs in AI Alignment that didn't involve doing / understanding the mathematical concepts I mentioned in the previous paragraph?
In the context of the above article, I think it's possible that some people who talk to ChatGPT on a buzzword level end up thinking that alignment can be solved via "fractal recursion of human in the loop validation sessions", for example. It seems like a modern incarnation of people thinking they can trisect the angle: https://www.ufv.ca/media/faculty/gregschlitt/information/Wha...
> maybe you have examples of historical breakthroughs in AI Alignment that didn't involve doing / understanding the mathematical concepts I mentioned in the previous paragraph?
Multi-agent systems appear to have strong potential. Will that work out? I don’t know. But I can see the potential there.
Many alignment problems are solved not by math formulas, but by insights into how to better prepare training data and design validation steps.