
If I am not mistaken, they actually did release their code. Yesterday there was a change to the repo that added a train.py file. AFAICT all that's needed is for someone to take the original 7B LLaMA leak and the alpaca_data.json file and run train.py on some beefy hardware. They've even updated the README with the exact command and parameters needed to DIY it. I'm half expecting someone will release the resulting weights in the next few days.
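For reference, the README command is just a standard torchrun launch of their train.py; it looks roughly like the sketch below (the port, paths, and hyperparameters are placeholders here -- the real values and the exact argument names are in the repo's README, so treat this as an approximation, not the authoritative command):

  torchrun --nproc_per_node=8 --master_port=<port> train.py \
    --model_name_or_path <path_to_hf_converted_llama_7b> \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir <output_dir> \
    <remaining hyperparameters (epochs, batch size, learning rate, FSDP settings) as given in the README>

The heavy lifting is done by the HuggingFace Trainer with FSDP sharding across the GPUs, which is why it fits on a single 8x A100 node.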


That's awesome! I think I remember them saying it was only around $500 in compute costs to train, so I hope we see those weights released soon. I'm also hoping someone releases a fine-tuned 13B model.


$100.

“For our initial run, fine-tuning a 7B LLaMA model took 3 hours on 8 80GB A100s, which costs less than $100 on most cloud compute providers. We note that training efficiency can be improved to further reduce the cost.”

($500 was what they paid OpenAI to generate the fine-tuning dataset.)
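(Rough sanity check, assuming on-demand A100 pricing of roughly $4 per GPU-hour: 8 GPUs x 3 hours x $4/hr ≈ $96, which is consistent with the sub-$100 figure.)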



