
Try out my model vs GPT-4 on the same tasks (the ones I explicitly trained on) and compare: https://huggingface.co/Tostino/Inkbot-13B-8k-0.2

It's a 13B-parameter model that isn't meant to be general purpose, but is meant to excel at the limited set of tasks I've trained it on.

You'll see more like this soon.



Any suggestions for creating training data? Did you just manually create your own dataset or did you use any synthetic methods?


Absolutely. Pick a complicated problem and keep breaking it down with an existing model (whatever's SOTA) until you get consistent output for each step of your problem.

And then stitch all the step outputs together into a single coherent response for your training pipeline.
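
Something like this, as a rough sketch (call_model is just a stub for whatever SOTA model/API you're using, and the prompts are placeholders):

    def call_model(prompt: str) -> str:
        # Stand-in for your actual client (OpenAI, Anthropic, a local model, ...).
        raise NotImplementedError("plug in your model client here")

    def build_training_example(task: str, steps: list[str]) -> dict:
        # Solve each sub-step on its own until the outputs are consistent,
        # then stitch them into one coherent response for the training set.
        step_outputs = [
            call_model(f"Task: {task}\nSub-step: {step}\nAnswer only this sub-step.")
            for step in steps
        ]
        stitched = call_model(
            "Combine these partial answers into one coherent response:\n\n"
            + "\n\n".join(step_outputs)
        )
        return {"prompt": task, "response": stitched}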

After that, you can do things like create Q&A pairs about the input and output values, which help the model understand the relationships involved.
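
For example (same caveat, call_model is a stand-in and the prompt/parsing format is arbitrary):

    def make_qa_pairs(example: dict, n: int = 5) -> list[dict]:
        # Ask the model to probe the relationship between the input and the
        # stitched output, then add the pairs to the training set alongside
        # the original example.
        raw = call_model(
            f"Input:\n{example['prompt']}\n\nOutput:\n{example['response']}\n\n"
            f"Write {n} question/answer pairs about how the output relates to the input, "
            "one per line as 'Q: ... | A: ...'."
        )
        pairs = []
        for line in raw.splitlines():
            if line.strip().startswith("Q:") and "|" in line:
                q, a = line.split("|", 1)
                pairs.append({
                    "prompt": q.strip().removeprefix("Q:").strip(),
                    "response": a.strip().removeprefix("A:").strip(),
                })
        return pairs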

With that, your training loss should be pretty reasonable for whatever task you're training on.

The other thing is: don't try to embed knowledge. Train thought patterns that apply when the specific knowledge is available in the context window.
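
In practice that just means the training samples look roughly like this (field names are arbitrary), with the knowledge in the prompt and the target demonstrating the reasoning over it:

    def format_sample(context_docs: list[str], question: str, reasoning_and_answer: str) -> dict:
        # Knowledge goes in the prompt; the response teaches the model how to
        # use it, rather than trying to bake the facts into the weights.
        prompt = "Context:\n" + "\n---\n".join(context_docs) + f"\n\nQuestion: {question}"
        return {"prompt": prompt, "response": reasoning_and_answer}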


Interesting, thank you!


Is this a Llama 2 fine tune?


Yeah, check out the showcase I posted above for some more info: https://news.ycombinator.com/item?id=38482347


OpenAI has the benefit that it's a hosted service. Even if you can set something up at home, not everybody wants to do that.


I'm not competing with OpenAI... I did a whole bunch of work, and released it for anyone who wants to use it.

It does what I trained it on well. Use it if you want to, or don't. Either way.


Never meant to imply anything against your model. The fact that you released one at all is still more than I have to say for myself.



