Why do you think a breakthrough in AI Alignment should require doing math? Many ...

YossarianFrPrez · 2026-01-05T21:36:47 1767649007

Fair question. While I'm not an expert on AI Alignment, I'd be surprised if any AI alignment approach did not involve real math at some point, given that all machine learning algorithms are inherently mathematical-computational in nature.

I would imagine one has to know things like how various reward functions work, what happens in the modern variants of attention mechanisms, and how different back-propagation strategies affect the overall result in order to come up with (and effectively leverage) reinforcement learning from human feedback.

I did a little searching. Here's a 2025 review I found by entering "AI Alignment" into Google Scholar, and it has at least one serious-looking mathematical equation: https://dl.acm.org/doi/full/10.1145/3770749 (section 2.2). That said, maybe you have examples of historical breakthroughs in AI Alignment that didn't involve doing / understanding the mathematical concepts I mentioned in the previous paragraph?

In the context of the above article, I think it's possible that some people are talking to ChatGPT on a buzzword level and end up thinking that alignment can be solved via "fractal recursion of human in the loop validation sessions," for example. It seems like a modern incarnation of people thinking they can trisect the angle: https://www.ufv.ca/media/faculty/gregschlitt/information/Wha...

DenisM · 2026-01-05T22:51:41 1767653501

> maybe you have examples of historical breakthroughs in AI Alignment that didn't involve doing / understanding the mathematical concepts I mentioned in the previous paragraph?

Multi-agent systems appear to have strong potential. Will that work out? I don’t know. But I know the potential is there.

gaigalas · 2026-01-05T23:18:44 1767655124

> maybe you have examples of historical breakthroughs in AI Alignment

OpenAI's confessions work is a good example of a largely non-mathematical insight:

https://arxiv.org/abs/2512.08093

I don't know, I think it's good stuff. Would you agree?

> I think it's possible that some people are talking to ChatGPT on a buzzword level

I never said this is not happening. This definitely happens.

What I said is very different. I'm saying that you don't need to be a mathematician to have good insights into novel ways of improving AI alignment.

You definitely need good epistemic intuition though.