Even more than the complex prompts, they give the bot access to a variety of functions implementing primitive actions, which are crafted to give helpful hints if the bot is struggling. For example, the code for mining a block:
This function keeps a global count of how many times it's been called to mine a block that doesn't exist nearby, and will warn the bot that it should try exploring first instead.
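As a rough illustration, the logic probably looks something like this. This is a minimal sketch of the mechanism described above, not the actual Voyager source: the counter name, the failure threshold, and the message wording are all assumptions, and `bot.findBlocks()`/`bot.chat()` just stand in for the mineflayer-style calls the real primitive uses.

```javascript
// Sketch of the "helpful hint" pattern: a global counter tracks failed
// mining attempts and bot.chat() surfaces a warning back to the LLM.
let mineBlockFailCount = 0; // global across calls, as described above

function mineBlock(bot, name) {
  // `bot` can be anything exposing findBlocks() and chat();
  // the search parameters here are illustrative, not Voyager's.
  const nearby = bot.findBlocks({ matching: name, maxDistance: 32 });
  if (nearby.length === 0) {
    mineBlockFailCount += 1;
    if (mineBlockFailCount >= 10) {
      // After repeated misses, escalate the hint: stop trying to mine
      // and go explore instead.
      bot.chat(
        `mineBlock failed ${mineBlockFailCount} times: no ${name} nearby. ` +
        `You should explore the world to find ${name} first.`
      );
    } else {
      bot.chat(`No ${name} nearby, please explore first.`);
    }
    return;
  }
  // ...actually dig the found blocks (omitted in this sketch)...
}
```

The interesting design choice is that the error channel is the in-game chat: the LLM sees the warning as part of its next observation, so a plain-English message doubles as structured feedback.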
There's nothing necessarily wrong with all that - it's an important research question to understand how much hand-holding the agent needs to be able to do these sorts of tasks. But readers should be aware that it's hardly dropping an agent into the world and leaving it to its own devices.
I do love their use of `bot.chat()` as, in effect, throwing an exception to the AI. Maybe this finally gives developers an incentive to document their code and throw intelligible errors - the machines need them to figure out what they're doing wrong!
ChatGPT: “I am sorry for the confusion. I am a large language model (LLM) trained on natural language and my training data cutoff is September 2021. Hexadecimal references to memory addresses are not part of my training data set.”
https://github.com/MineDojo/Voyager/blob/main/voyager/contro...