There are several soft HSMs, but they share the address space with your frontline daemon, which (IMHO) defeats the purpose.
While webservers' support for PKCS#11 is annoying, PKCS#11 itself is well supported by lots and lots of other stuff (usually client-side stuff like ssh, browsers etc. though). You can get webservers to do PKCS#11 today, and there are docs on how to do it. They usually start with "download the source, and run configure with this pile of options."
Isn't that just because PKCS#11 is an in-process API, so not really meant for calls over a socket? So wouldn't you need to actually write a PKCS#11-compliant library to plug into the server, a software HSM, and then some form of serializing protocol to talk between the two? Or is there a standard way to do PKCS#11 over a socket? A quick look at the spec made it look like a "here's how our structs are packed" kind of standard.
Yup, that pretty much sums it up. I'm currently trying to figure out if dbus could be that serialisation, since it takes care of a reasonable amount of the hard work for you. But I'm no expert on GObject, so it's slow going. (Also, I'm not sure that I'm the best person to be writing this... I don't really have that much security knowledge, I just spent a whole pile of time trying to figure out how to secure my (client) keys recently and wondered why we didn't do something sensible for server keys.)
In that case why not skip the middle man and just implement enough soft-HSM for whatever Apache/nginx needs with a simple serialization protocol just for that? Emulating all of PKCS#11 sounds like a chore for very little gain.
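To make "simple serialization protocol" a bit more concrete, a sign request could be framed something like the sketch below. The wire format, the opcode and every name in it are made up for illustration; none of it comes from PKCS#11 or from any existing soft HSM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire format, invented for this sketch:
 * [1 byte opcode][4 bytes key handle, big-endian]
 * [4 bytes data length, big-endian][data bytes] */
#define OP_SIGN    1u
#define HEADER_LEN 9u

struct request {
    uint8_t        opcode;
    uint32_t       key_handle;
    uint32_t       data_len;
    const uint8_t *data;      /* borrowed pointer, not owned */
};

static void put_u32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

static uint32_t get_u32(const uint8_t *in)
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}

/* Writes one frame into buf; returns the frame size, or 0 if buf is too small. */
size_t encode_request(const struct request *req, uint8_t *buf, size_t cap)
{
    if (cap < HEADER_LEN || req->data_len > cap - HEADER_LEN)
        return 0;
    buf[0] = req->opcode;
    put_u32(buf + 1, req->key_handle);
    put_u32(buf + 5, req->data_len);
    if (req->data_len > 0)
        memcpy(buf + HEADER_LEN, req->data, req->data_len);
    return HEADER_LEN + req->data_len;
}

/* Parses exactly one frame; returns 1 on success, 0 if it is truncated or malformed.
 * On success out->data points into buf. */
int decode_request(const uint8_t *buf, size_t len, struct request *out)
{
    if (len < HEADER_LEN)
        return 0;
    out->opcode     = buf[0];
    out->key_handle = get_u32(buf + 1);
    out->data_len   = get_u32(buf + 5);
    if (out->data_len != len - HEADER_LEN)
        return 0;
    out->data = buf + HEADER_LEN;
    return 1;
}
```

The webserver side would call encode_request and write the frame to a socket, and the key-holding process running as another user would call decode_request on what it reads. Replies, errors and framing across partial reads are all left out here.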
Having looked at PKCS#11, I'm not sure what bits you could get away with not implementing. It does have functions for things like "get random bytes" (C_GenerateRandom), which I guess you might not want, but stubbing that out is barely any code: (CK_RV C_GenerateRandom(CK_SESSION_HANDLE s, CK_BYTE_PTR buf, CK_ULONG len) { return CKR_FUNCTION_NOT_SUPPORTED; }).
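Spelled out, a stub like that might look roughly as follows. C_GenerateRandom and CKR_FUNCTION_NOT_SUPPORTED are the names the PKCS#11 spec uses for the random-bytes call and the "not implemented" return code; the typedefs are local stand-ins for what a real pkcs11.h would give you, so this is a sketch and not something to link into a real module as-is.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for definitions normally pulled in from pkcs11.h.
 * They are declared here only to keep the sketch self-contained. */
typedef unsigned long CK_ULONG;
typedef CK_ULONG      CK_RV;
typedef CK_ULONG      CK_SESSION_HANDLE;
typedef unsigned char CK_BYTE;
typedef CK_BYTE      *CK_BYTE_PTR;

#define CKR_OK                     0x00000000UL
#define CKR_FUNCTION_NOT_SUPPORTED 0x00000054UL

/* Stub: this module declines to hand out random data. The arguments are
 * ignored and the output buffer is left untouched. */
CK_RV C_GenerateRandom(CK_SESSION_HANDLE hSession,
                       CK_BYTE_PTR pRandomData,
                       CK_ULONG ulRandomLen)
{
    (void)hSession;
    (void)pRandomData;
    (void)ulRandomLen;
    return CKR_FUNCTION_NOT_SUPPORTED;
}
```

Every function you don't care about can be stubbed the same way, so the "optional" parts of the API cost a few lines each.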
All the complexity in this proposal is in the serialisation/deserialisation, which is about the same amount of work whether it's PKCS#11 or some custom thing.
Custom API:
Pro: Marginally simpler to implement.
Pro: If the webserver fork()s it by default, then more users get the benefit in the case where an attacker can read the webserver's memory.
Con: Doesn't protect against attacks that can read files readable by the webserver.
Con: Becomes complicated when you want to move to a real HSM.
Con: Isn't reusable between webservers, let alone for your mail server, xmpp server, web browsers, ssh clients and so on.
Using PKCS#11:
Pro: Can start with a PKCS#11 softhsm running as a separate user today, migrate to a hardware HSM with little change tomorrow.
Pro: Reusable across multiple webservers, already usable by browsers and ssh clients.
Pro: A well defined, maintained, open standard with a wide variety of implementations that already exist.
Con: Slightly more complex than a custom protocol, but I'd argue that the custom protocol would grow to cover at least what PKCS#11 supports. I'm currently investigating using dbus for the protocol, so serialisation/deserialisation is mostly taken care of.
>Con: Doesn't protect against attacks that can read files readable by the webserver.
This isn't a con for the custom API, it's a con for a forking solution as opposed to serialization between two different users.
If PKCS#11 is indeed a good fit for that protocol then it's a much better solution than something custom, for the reasons you mention. Good luck with the implementation.