Could something like this enable SSI distributed computing entirely in user space? Is it practical to run some sort of distributed shared memory, relying on user-managed page faults to ensure coherence?
Possibly dumb question, but if I have some code that tries to blindly jump to, say, address 0x4000, would userfaultfd let me redirect accesses to 0x4000 to an address I've allocated elsewhere?
Sort of, but not exactly. You can't redirect the program to a different address, but you can respond to the fault by mapping memory at address 0x4000 and placing whatever you want in it. For instance, you can map the memory as an alias of some other page, so that both refer to the same underlying physical memory. But the program will still see the memory as living at 0x4000; it won't jump to a different address. (Though I suppose if you're specifically thinking of the instruction pointer faulting at 0x4000, you could respond by filling in that address with a jump instruction that jumps to the address of your choice...)
To map memory at a specific address, you use mmap() with the MAP_FIXED flag. This lets you tell the kernel: "Please put these pages at this virtual address, replacing what's already there."
Note that userfaultfd is not the only way to detect and handle faults. You can also register a signal handler for SIGSEGV. The signal handler function receives information from the kernel specifying what address caused the fault. If, in the signal handler, you use mmap() to map memory at that address, and then return, the program will continue as if nothing happened.
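To make the SIGSEGV approach concrete, here is a minimal sketch, not a definitive implementation. It reserves one page as PROT_NONE so the first access is guaranteed to fault, and the handler then maps a real page over it with MAP_FIXED. The names on_segv and demo_lazy_page are made up for illustration. One caveat: mmap() is not on the POSIX list of async-signal-safe functions, so calling it from a handler relies on Linux behaviour, where it is a thin system call wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static char  *region_base;  /* start of the reserved (inaccessible) range */
static size_t region_len;   /* its length, one page in this sketch */

/* Hypothetical handler: map a zero-filled page over the faulting page and
   return, so the kernel retries the instruction that faulted. */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)region_base;
    if (addr < base || addr >= base + region_len)
        _exit(1);  /* a genuine crash somewhere else: do not paper over it */
    uintptr_t page = addr & ~((uintptr_t)region_len - 1);
    void *p = mmap((void *)page, region_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED)
        _exit(1);  /* returning would just fault again, forever */
}

/* Returns the byte read back after the faulting write, or -1 on setup failure. */
int demo_lazy_page(void) {
    region_len = (size_t)sysconf(_SC_PAGESIZE);

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;  /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;

    /* Reserve address space with no access rights; any touch faults. */
    void *r = mmap(NULL, region_len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return -1;
    region_base = r;

    volatile char *p = region_base;
    p[10] = 42;        /* faults once; the handler maps the page; retried */
    int value = p[10];

    munmap(region_base, region_len);
    signal(SIGSEGV, SIG_DFL);
    return value;
}
```

Calling demo_lazy_page() from main should return 42 when the setup succeeds: the write faults, the handler installs a page at that address, and the retried write then lands in it.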
userfaultfd allows you to do the same but from a different thread or process. The SIGSEGV signal handler would actually run inside the thread that faulted, but with userfaultfd the faulting thread pauses and some other thread or process receives the notification.
I think specifically for 0x4000 you can't map anything there with a typical Linux configuration. The minimum address where you can map things (the vm.mmap_min_addr sysctl) is now commonly 0x10000. This broke my StoneKnifeForth and Max Bernstein fixed it: https://github.com/tekknolagi/stoneknifecpp/commit/905105b44...
Thank you for the in-depth answer! It was partially motivated by seeing if userfaultfd could work around mmap_min_addr (it can’t) and partially to see if it is possible to use it for things like emulation.
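If you want to check what the limit is on a particular machine, the value is exposed through procfs. A small sketch, assuming /proc is mounted; read_mmap_min_addr and below_min_addr are names invented here, not part of any library:

```c
#include <assert.h>
#include <stdio.h>

/* Returns the current vm.mmap_min_addr value, or -1 if it cannot be read
   (for example when /proc is not mounted). */
long read_mmap_min_addr(void) {
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    if (!f)
        return -1;
    long value = -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Returns 1 if addr lies below the limit, 0 otherwise (including when the
   limit is unknown). Processes holding CAP_SYS_RAWIO may still be allowed
   to map such addresses, so treat the answer as advisory. */
int below_min_addr(unsigned long addr) {
    long min = read_mmap_min_addr();
    return min >= 0 && addr < (unsigned long)min;
}
```

On a system where the limit is 65536, below_min_addr(0x4000) would report that a fixed mapping at 0x4000 falls under it.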
You don't need userfaultfd for that. Just mmap the same file twice, at different addresses:
int mem_fd = memfd_create("", 0);
ftruncate(mem_fd, 4096);
void* mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, mem_fd, 0);
void* alias = mmap((void*) 0x4000, 4096, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_SHARED, mem_fd, 0);
close(mem_fd);
// alias now points to the same physical memory as mem
This is commonly used to create ring buffers that wrap around transparently.
I could see it being useful to map the same area twice, adjacently; not specifically at 0x4000 though. You'd still wrap pointers at some point, but as long as sizes are limited to fit into your buffer, you could do memcpy and whatnot without having to check boundaries. If you were mapped at 0x4000 and aliased at 0x5000, you could memcpy something to 0x4FFF -> 0x5003, and not have to do a split write of 0x4FFF and then 0x4000 -> 0x4003.
(If your buffer is bigger than a page, this will still work; just use bigger addresses, etc. If your buffer is smaller than a page, you're stuck.)
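Here is a rough sketch of that adjacent double mapping, letting the kernel pick the address instead of hard-coding 0x4000/0x5000. It first reserves twice the buffer size as PROT_NONE so that both halves are guaranteed to be contiguous, then maps the same memfd over each half with MAP_FIXED. ring_map and ring_demo are illustrative names, and the sketch assumes Linux with a glibc that provides memfd_create().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `size` bytes (must be a multiple of the page size) twice, back to back.
   Returns the base of the 2*size window, or NULL on failure. */
static char *ring_map(size_t size) {
    int fd = memfd_create("ring", 0);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    /* Reserve one contiguous range so the two halves end up adjacent. */
    char *base = mmap(NULL, 2 * size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    /* Both halves map offset 0 of the same file, so they alias each other. */
    void *lo = mmap(base, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    void *hi = mmap(base + size, size, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);  /* the mappings keep the memory alive */
    if (lo == MAP_FAILED || hi == MAP_FAILED) {
        munmap(base, 2 * size);
        return NULL;
    }
    return base;
}

/* Writes 5 bytes across the seam with a single memcpy and checks that the
   tail shows up at the start of the buffer. Returns 1 on success, 0 if the
   data did not wrap, -1 if the mapping could not be set up. */
int ring_demo(void) {
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    char *ring = ring_map(size);
    if (!ring)
        return -1;

    memcpy(ring + size - 1, "hello", 5);  /* last byte, then wraps around */
    int ok = ring[size - 1] == 'h' && memcmp(ring, "ello", 4) == 0;

    munmap(ring, 2 * size);
    return ok;
}
```

The read and write positions still have to be reduced modulo size, but any single copy of at most size bytes starting inside the first half can go through one memcpy.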
https://tech.nextroll.com/blog/data/2016/11/29/traildb-mmap-...