
An often overlooked feature of the Gemini models is that they can write and execute Python code directly via their API.

My llm-gemini plugin supports that: https://github.com/simonw/llm-gemini

  uv tool install llm
  llm install llm-gemini
  llm keys set gemini
  # paste key here
  llm -m gemini-2.5-flash-preview-04-17 \
    -o code_execution 1 \
    'render a mandelbrot fractal in ascii art'
I ran that just now and got this: https://gist.github.com/simonw/cb431005c0e0535343d6977a7c470...

They don't charge anything extra for code execution, you just pay for input and output tokens. The above example used 10 input tokens and 1,531 output tokens, which at $0.15/million input and $3.50/million output for Gemini 2.5 Flash (with thinking enabled) works out to 0.536 cents, just over half a cent, for this prompt.
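
To check that arithmetic yourself (rates as quoted above; actual billing may of course differ):

  input_tokens, output_tokens = 10, 1531
  cost = input_tokens * 0.15 / 1e6 + output_tokens * 3.50 / 1e6
  print(f"${cost:.5f}")  # ~$0.00536, i.e. just over half a cent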



Saw a full example in a few commands using uv, thought "wow, I bet that Simon guy from Twitter would love this" ... it's already him.


> An often overlooked feature of the Gemini models is that they can write and execute Python code directly via their API.

Could you elaborate? I thought function calling was a common feature among models from different providers.


The Gemini API runs the Python code for you as part of your single API call, without you having to handle the tool call request yourself.
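
Rough sketch of what that looks like with the google-genai Python SDK (class and field names are from memory, so treat this as a sketch and check the current docs):

  # Sketch only: enable Gemini's built-in code execution tool.
  # Exact names (Tool, ToolCodeExecution) are from memory and may differ.
  from google import genai
  from google.genai import types

  client = genai.Client(api_key="YOUR_KEY")
  response = client.models.generate_content(
      model="gemini-2.5-flash",
      contents="render a mandelbrot fractal in ascii art",
      config=types.GenerateContentConfig(
          tools=[types.Tool(code_execution=types.ToolCodeExecution())],
      ),
  )
  # The response parts include the generated Python, its execution output,
  # and the model's text -- no tool-call round trip on your side.
  for part in response.candidates[0].content.parts:
      print(part)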


This is so much cheaper than re-prompting each tool use.

I wish this were extended to things like letting you give the model an API endpoint it can call to execute JS code, with the only requirement being that your API has to respond within 5 seconds (maybe less, actually).

I wonder if this is what OpenAI is planning to do in the upcoming API update to support tools in o3.


I imagine there wouldn’t be much of a cost to the provider on the API call there, so much longer times may be possible. It’s not like this would hold up the LLM in any way: execution would get suspended while the call is made and the TPU/GPU would serve another request.


They need to keep the KV cache to avoid prompt reprocessing, so they would have to move it to RAM/NVMe during longer API calls in order to use the GPU for another request.


That common feature requires the user of the API to implement the tool; in this case, the user is responsible for running the code the API outputs. The post you replied to says that Gemini will run the code for you behind the API call.
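
In other words, the usual flow is something like this (pure pseudo-code; call_model and run_python are hypothetical stand-ins, not a real SDK):

  # Conventional function calling: the model only *requests* the tool call,
  # your code has to execute it and send the result back in a second request.
  messages = [{"role": "user", "content": "plot a mandelbrot set"}]
  reply = call_model(messages, tools=["run_python"])      # hypothetical helper
  while reply.tool_call is not None:
      output = run_python(reply.tool_call.code)           # you run the code
      messages.append({"role": "tool", "content": output})
      reply = call_model(messages, tools=["run_python"])  # and re-prompt
  print(reply.text)
  # Gemini's code_execution tool does that whole loop server-side,
  # inside a single generateContent call.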


That was how I read it as well, as if it had a built-in Lambda-type service in the cloud.

If we're just talking about some API support to call python scripts, that's pretty basic to wire up with any model that supports tool use.


I wish Gemini could do this with Go. It generates plenty of junk/non-parseable code and I have to feed it the error messages and hope it properly corrects it.



