Absolutely. Pick a complicated problem and keep breaking it down with an existing model (whatever's SOTA) until you get consistent output for each step of the problem.
Then stitch all the step outputs together into a single coherent response for your training pipeline.
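Roughly what I mean, as a sketch (not my exact pipeline): `call_model` is a stand-in for whichever SOTA API you're using, and the consistency check and step prompts are placeholders you'd tune per problem.

```python
def call_model(prompt: str) -> str:
    """Stand-in for your SOTA model call (API or local)."""
    raise NotImplementedError

def solve_step(step_prompt: str, n_samples: int = 3) -> str | None:
    """Sample a step a few times and only keep it if the outputs agree."""
    outputs = [call_model(step_prompt) for _ in range(n_samples)]
    if len(set(outputs)) == 1:   # crude consistency check
        return outputs[0]
    return None                  # inconsistent -> break the step down further

def build_training_example(problem: str, step_prompts: list[str]) -> str | None:
    """Solve each sub-step, then stitch the pieces into one coherent response."""
    step_outputs = []
    for prompt in step_prompts:
        out = solve_step(f"{problem}\n\nStep: {prompt}")
        if out is None:
            return None          # this step still needs further decomposition
        step_outputs.append(out)
    # Stitch: one final pass to merge the step outputs into a single answer.
    return call_model(
        "Combine these step results into one coherent answer:\n"
        + "\n".join(step_outputs)
    )
```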
After that, you can do things like create Q&A pairs about the input and output values, which helps the model learn the relationships involved.
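Something in this direction, reusing the `call_model` stub from above. The question template and the "Q:/A:" parsing are assumptions; you'd write prompts and parsing that fit your actual domain and output format.

```python
def make_qa_pairs(problem: str, stitched_answer: str) -> list[dict]:
    """Ask the SOTA model for Q&A pairs grounded in one finished example."""
    raw = call_model(
        "Given this problem and its solution, write 5 question/answer pairs that "
        "probe how the input values relate to the output values.\n\n"
        f"Problem:\n{problem}\n\nSolution:\n{stitched_answer}"
    )
    # Assuming the model returns alternating "Q: ..." / "A: ..." lines.
    lines = [l.strip() for l in raw.splitlines() if l.strip()]
    pairs = []
    for q, a in zip(lines[0::2], lines[1::2]):
        if q.startswith("Q:") and a.startswith("A:"):
            pairs.append({"question": q[2:].strip(), "answer": a[2:].strip()})
    return pairs
```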
With that, your training loss should be pretty reasonable for whatever task you're training on.
The other thing: don't try to embed knowledge. Train thought patterns for when the specific knowledge is available in the context window.
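Concretely, I mean formatting examples so the facts live in the prompt and only the reasoning/answer is the training target. A minimal sketch; the field names and instruction wording are just assumptions to adapt to whatever your trainer expects:

```python
def format_example(reference_text: str, question: str, reasoned_answer: str) -> dict:
    """Put the facts in the prompt; make the model learn the reasoning pattern."""
    prompt = (
        "Use only the reference below to answer.\n\n"
        f"Reference:\n{reference_text}\n\n"
        f"Question: {question}\n"
    )
    return {"prompt": prompt, "completion": reasoned_answer}
```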
It's a 13B-param model that isn't meant to be general purpose, but to excel at the limited set of tasks I've trained it on.
You'll see more like this soon.