Rustler is great. Though this gets me thinking about how you can maintain as many Elixir invariants and conventions as possible, even while escaping them under the covers. Being able to call FeGraph.set/2 and have db actually be mutated violates Elixir's common call patterns, even if it's technically allowed.
For example: I wonder if it wouldn't be more "erlangy"/"elixiry" to model the mutable ops behind a GenServer that you send messages to. In the Elixir world it's perfectly normal to call GenServer.call/3 and expect the target PID to change its internal state in a non-deterministic way. It's one of the only APIs that explicitly blesses this. The ETS API is another.
Alternatively, you could have the ref store both a DB sequence and a ref ID (set to the last DB sequence), and compare them on operations. If you call FeGraph.set/2 with the same db ref two times, you compare the ref ID to the sequence and panic if they aren't equal. Callers always need to operate with the latest ref. Then at least the local semantics are maintained.
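To make the sequence-check idea concrete, here's a rough sketch in plain Rust (std only, no Rustler wiring). All the names (`Db`, `DbRef`, `StaleRef`) are made up for illustration and are not from the article, and I return an error where the comment above says "panic", since that's easier to surface to the caller.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical mutable store; `seq` is bumped on every successful write.
struct Db {
    seq: u64,
    data: HashMap<String, i64>,
}

// Hypothetical handle: a shared pointer to the store plus the sequence
// number the store had when this handle was issued.
#[derive(Clone)]
pub struct DbRef {
    db: Arc<Mutex<Db>>,
    ref_seq: u64,
}

// Returned when a handle is older than the store it points at.
#[derive(Debug, PartialEq)]
pub struct StaleRef {
    pub ref_seq: u64,
    pub db_seq: u64,
}

impl DbRef {
    pub fn new() -> DbRef {
        let db = Db { seq: 0, data: HashMap::new() };
        DbRef { db: Arc::new(Mutex::new(db)), ref_seq: 0 }
    }

    // Mutates the store in place, but only if this handle is the latest one.
    // On success the caller gets a fresh handle and must use it from now on.
    pub fn set(&self, key: &str, value: i64) -> Result<DbRef, StaleRef> {
        let mut db = self.db.lock().unwrap();
        if db.seq != self.ref_seq {
            return Err(StaleRef { ref_seq: self.ref_seq, db_seq: db.seq });
        }
        db.data.insert(key.to_string(), value);
        db.seq += 1;
        Ok(DbRef { db: Arc::clone(&self.db), ref_seq: db.seq })
    }

    // Reads are checked the same way, so a stale handle never observes
    // writes made through a newer one.
    pub fn get(&self, key: &str) -> Result<Option<i64>, StaleRef> {
        let db = self.db.lock().unwrap();
        if db.seq != self.ref_seq {
            return Err(StaleRef { ref_seq: self.ref_seq, db_seq: db.seq });
        }
        Ok(db.data.get(key).copied())
    }
}
```

The data is still mutated in place with no copy, but from the caller's side it behaves like rebinding: `db = set(db, ...)` works, and reusing the old `db` fails loudly instead of silently seeing the new state.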
Maybe this is less relevant for the FeGraph example, since Elixir libs dealing with data are more willing to treat the DB as a mutable thing (ETS, Digraph). But it's not universal. Postgrex, for example, follows the DB-as-PID convention. Defaulting to an Elixiry pattern for Rustler implementations is probably a good practice.
That's an interesting point that I should perhaps have covered in the original article.
The real code that this is based on is in fact hidden behind a GenServer for this exact reason -- to maintain the expectations of other Elixir code that has to interact with it. The advantage of the escape hatch, as another commenter mentions, is allowing efficient sparse mutations of a large chunk of data, without having to pay a copy penalty every time. I definitely wouldn't recommend sharing the db handle widely.
Did you consider a port (written in Rust) instead of a NIF?
When you're presenting a GenServer-like message-passing interface, a port is a natural fit, with none of the risks related to linking a NIF into the VM itself.
(admittedly those risks are much lower with Rust than C)
In our case one of these NIF stores is created per user for a specific task; ironically, with the amount of polish that Rustler puts around NIFs I suspect it would have been more work and more risk to go down the port route and manage everything manually.
Have you measured performance? If mutating from Elixir like this can bring serious benefits, maybe there's a place for mutable versions of libraries like Explorer and Nx.
Explorer does actually use Rust (and polars) for a lot of its work -- it's one of the libraries I looked at while figuring out my memory management issues.
No, it doesn't -- looking at the website that's an explicit trade-off of pure performance vs 'Elixir-ish-ness'. It would certainly break a lot of expectations to have data mutating like that without it being hidden away somewhere, so I can understand why they went that way.
In my case the data I'm dealing with is more of a store than a single data item, so I'm leaning on the example of things like ETS. Also it's within a single application rather than being a large generally-available library, so the trade-offs are different. It would be interesting to know if they did tests though.
> For example: I wonder if it wouldn't be more "erlangy"/"elixiry" to model the mutable ops behind a genserver that you send messages to.
It depends on the use case. For example, when creating a resource (basically a refcounted datastructure), it might make sense to allow mutable access only through a process as the "owner" of the resource. But if you have only read-only data behind that resource, sharing the resource similar to ETS might be what you want.
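As a loose analogy only, here's how that split looks in plain Rust, using threads and channels in place of BEAM processes. The names (`Owner`, `Msg`, `share_read_only`) are hypothetical and nothing here is Rustler API: mutable state lives in one owner thread that you can only reach by message, while read-only data is handed out freely behind a refcount.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread;

// Requests the owner understands; each carries a channel for the reply,
// roughly like the `from` argument of a GenServer call.
enum Msg {
    Set { key: String, value: i64, reply: Sender<()> },
    Get { key: String, reply: Sender<Option<i64>> },
}

// Cloneable handle to the owner thread. It holds no data itself.
#[derive(Clone)]
pub struct Owner {
    tx: Sender<Msg>,
}

impl Owner {
    // Spawns the owner thread. The map is local to that thread, so nothing
    // else can mutate it. The loop ends once every handle has been dropped.
    pub fn start() -> Owner {
        let (tx, rx) = mpsc::channel::<Msg>();
        thread::spawn(move || {
            let mut state: HashMap<String, i64> = HashMap::new();
            for msg in rx {
                match msg {
                    Msg::Set { key, value, reply } => {
                        state.insert(key, value);
                        let _ = reply.send(());
                    }
                    Msg::Get { key, reply } => {
                        let _ = reply.send(state.get(&key).copied());
                    }
                }
            }
        });
        Owner { tx }
    }

    // Synchronous write: send the request, then wait for the acknowledgement.
    pub fn set(&self, key: &str, value: i64) {
        let (reply, done) = mpsc::channel();
        let msg = Msg::Set { key: key.to_string(), value, reply };
        self.tx.send(msg).expect("owner thread stopped");
        done.recv().expect("owner thread stopped");
    }

    // Synchronous read, answered by the owner thread.
    pub fn get(&self, key: &str) -> Option<i64> {
        let (reply, done) = mpsc::channel();
        let msg = Msg::Get { key: key.to_string(), reply };
        self.tx.send(msg).expect("owner thread stopped");
        done.recv().expect("owner thread stopped")
    }
}

// Read-only data needs no owner: wrap it in a refcount and share it.
pub fn share_read_only(data: Vec<i64>) -> Arc<Vec<i64>> {
    Arc::new(data)
}
```

Every write is serialized through the owner, so callers see the usual "send a message, state changes over there" behaviour, while the `Arc` case has no coordination cost because nothing can change.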