Do you know about the dopamine reward prediction error hypothesis? https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6721851/ Is it so wrong? What does cognitive psychology have to say about how these neurons work? This is a lot more recent than the 1940s and behaviorism.
Dopamine reward signals operate on a different time scale from the one these error-correction models require. I don't remember the exact paper and will need to look it up, but the difference in response times was orders of magnitude.
Edit: for an authoritative reference on biologically plausible learning, see anything by Edmund Rolls [1]. He explicitly stated in his recent book [2] that back-propagation and similar error-correction mechanisms have no supporting evidence in the experimental data collected so far.