It started with the x86 and PDP-11 thing. Some CPU architectures have a specific stack pointer register, and some don't. IBM mainframes don't have a dedicated stack pointer; the stack is a software construct. Neither do SPARC machines.
There are also real stack machines, where instructions take operands from the stack and push results on the stack. The Burroughs 5000 and the English Electric LEO-Marconi KDF9 introduced that architecture around 1958. It lives on in the x86 FPU register stack.
FORTRAN and COBOL originally forbade recursion. Compilers could allocate all temporary memory at compile time. Stack overflow is impossible. Some real-time systems still work that way.
The x87 FPU "stack" is not truly a stack though: it's rather a ring buffer of 8 registers.
And you can model it pretty well even without a "head pointer", just with moves between all registers, e.g. push(x) is ST(7) = ST(6), ST(6) = ST(5), ..., ST(1) = ST(0), ST(0) = x.
I've implemented it as such in a lifter from x86 machine code and all the redundant moves just disappear, if the uses of the pseudo-stack are balanced, and you're left with traditional registers.
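To make the "just moves" model concrete, here is a minimal sketch (my own illustration, not the lifter's actual code) that compares it against a ring buffer with a top pointer. The class names `MoveStack` and `RingStack` are hypothetical, and tag bits and overflow/underflow exceptions are deliberately ignored. The pop-side rotation is an assumption chosen to mirror what the ring buffer does.

```python
class MoveStack:
    """x87-style stack modelled purely as moves between 8 registers."""

    def __init__(self):
        self.st = [0.0] * 8  # st[i] plays the role of ST(i)

    def push(self, x):
        # ST(7) = ST(6), ST(6) = ST(5), ..., ST(1) = ST(0), ST(0) = x
        for i in range(7, 0, -1):
            self.st[i] = self.st[i - 1]
        self.st[0] = x

    def pop(self):
        x = self.st[0]
        # ST(0) = ST(1), ..., ST(6) = ST(7)
        for i in range(7):
            self.st[i] = self.st[i + 1]
        # Assumption: the popped value rotates into ST(7), as in a ring.
        self.st[7] = x
        return x


class RingStack:
    """Same behaviour, using 8 fixed registers and a 'top' index."""

    def __init__(self):
        self.regs = [0.0] * 8
        self.top = 0

    def st(self, i):
        # ST(i) is addressed relative to the current top, modulo 8
        return self.regs[(self.top + i) % 8]

    def push(self, x):
        self.top = (self.top - 1) % 8
        self.regs[self.top] = x

    def pop(self):
        x = self.regs[self.top]
        self.top = (self.top + 1) % 8
        return x


if __name__ == "__main__":
    m, r = MoveStack(), RingStack()
    for v in (1.0, 2.0, 3.0):
        m.push(v)
        r.push(v)
    m.pop()
    r.pop()
    # Both models expose the same logical ST(0)..ST(7)
    print(m.st == [r.st(i) for i in range(8)])
```

In the move-based form every ST(i) is an ordinary named value, so when pushes and pops are balanced a compiler's copy propagation can fold the chains of moves away, which is the effect described above.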
Some microcontrollers (e.g. some PICs) even have a hardware stack for storing return addresses. This stack is on the processor die, like the registers, rather than in RAM, which makes it even more fundamental.
Regarding "IBM mainframes don't have a dedicated stack pointer; the stack is a software construct": Gene Amdahl, when asked why the 360 architecture didn't have a stack, said "Too expensive". One might suspect that wasn't the whole story, as the 8085 (back when it was $5 retail, quantity one) did have a stack.
The Burroughs 5000 and the rest of their line had quite an interesting architecture. Arthur Sale once told me that he would use the B6700 architecture as a "universal counterexample" when talking about language conventions, such as C using zero as NULL.