Is there no need to tune the number of threads with an approach like this? Or is there a general notion of the appropriate number of threads given the number of CPU cores?
Correct. ForkJoinPool's no-argument constructor sets its parallelism to Runtime.getRuntime().availableProcessors(), and you can pass a different level to the constructor if you want to adjust it. The reducers library (https://github.com/clojure/clojure/commit/89e5dce0fdfec4bc09...) seems to initialize the pool with the default constructor.
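For anyone who wants to check this on their own machine, here is a minimal sketch in plain Java. The class and method names (PoolParallelism, defaultParallelism, explicitParallelism) are made up for illustration and are not part of the reducers library; only ForkJoinPool and Runtime are real JDK APIs.

```java
import java.util.concurrent.ForkJoinPool;

public class PoolParallelism {
    // Hypothetical helper: parallelism of a pool built with the no-arg constructor.
    static int defaultParallelism() {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.getParallelism();
        } finally {
            pool.shutdown(); // release the pool's resources
        }
    }

    // Hypothetical helper: parallelism of a pool built with an explicit level.
    static int explicitParallelism(int level) {
        ForkJoinPool pool = new ForkJoinPool(level);
        try {
            return pool.getParallelism();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("default pool parallelism: " + defaultParallelism());
        System.out.println("explicit pool parallelism: " + explicitParallelism(2));
    }
}
```

On a typical machine the first two numbers should match. Note that parallelism is a target level rather than a hard cap on threads, so the pool may briefly run more workers when tasks block.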