If I am not mistaken, they actually did release their code. Yesterday a change to the repo added a train.py file. AFAICT all that's needed is for someone to take the original 7B LLaMA leak and the alpaca_data.json file and run train.py on some beefy hardware. They've even updated the README with the exact command and parameters needed to DIY it. I'm somewhat expecting a release from someone in the next few days.
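If anyone wants a feel for what that run involves, here is a minimal single-GPU sketch in the HuggingFace style. It is not the actual Stanford train.py: the model path, output dir, abbreviated prompt template, and hyperparameters are illustrative assumptions, and the real run uses torchrun + FSDP across 8 80GB A100s with the exact flags from the README.

    # Rough sketch of Alpaca-style instruction tuning -- NOT the Stanford train.py.
    # Paths, the prompt template, and hyperparameters are illustrative; see the
    # repo README for the real torchrun/FSDP command.
    import json

    import torch
    from torch.utils.data import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    PROMPT = ("Below is an instruction that describes a task. "
              "Write a response that appropriately completes the request.\n\n"
              "### Instruction:\n{instruction}\n\n### Response:\n")


    class AlpacaDataset(Dataset):
        # Wraps alpaca_data.json as prompt+response token sequences for causal-LM
        # training. (The real data also has an optional "input" field, ignored here,
        # and the real train.py masks the prompt tokens out of the loss.)
        def __init__(self, path, tokenizer, max_len=512):
            with open(path) as f:
                records = json.load(f)
            self.examples = [
                tokenizer(PROMPT.format(instruction=r["instruction"]) + r["output"]
                          + tokenizer.eos_token,
                          truncation=True, max_length=max_len)["input_ids"]
                for r in records
            ]

        def __len__(self):
            return len(self.examples)

        def __getitem__(self, i):
            return {"input_ids": self.examples[i]}


    model_path = "path/to/hf-converted-llama-7b"   # hypothetical local checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)

    # LLaMA's tokenizer has no pad token; add one and resize the embeddings.
    if tokenizer.pad_token is None:
        tokenizer.add_special_tokens({"pad_token": "[PAD]"})
        model.resize_token_embeddings(len(tokenizer))

    args = TrainingArguments(
        output_dir="alpaca-7b-out",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        learning_rate=2e-5,
        warmup_ratio=0.03,
        lr_scheduler_type="cosine",
        bf16=True,
        logging_steps=10,
    )

    Trainer(
        model=model,
        args=args,
        train_dataset=AlpacaDataset("alpaca_data.json", tokenizer),
        # mlm=False gives plain causal-LM labels (shifting happens inside the model).
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()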
That's awesome! I think I remember them saying it was only around $500 in compute costs to train, so I hope we see those weights released soon. I'm also hoping someone releases a fine-tuned 13B model.
“For our initial run, fine-tuning a 7B LLaMA model took 3 hours on 8 80GB A100s, which costs less than $100 on most cloud compute providers. We note that training efficiency can be improved to further reduce the cost.”
($500 was what they paid OpenAI to generate the fine-tuning dataset.)
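(Back-of-envelope, assuming on-demand 80GB A100s at roughly $3-4 per GPU-hour at the time: 8 GPUs x 3 hours = 24 GPU-hours, i.e. roughly $72-96, consistent with the "less than $100" figure.)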
ah, right, I did notice that because people were running queries against the training data.
why is there a general assumption that unreleased weights are better? is that something we can do: a free-weights community that solves this recurring issue?