I've also been creating HTML tools over the years whenever I couldn't find client-side-only websites for PDF to SVG, CSV to Sheet, Audio to Video, Video to MP4.
I've wondered myself why visionOS Safari didn't lean more into the idea that the DOM already has semantics (e.g. <header>, <footer>, <nav>) and that CSS can already convey depth information.
vLLM and Ollama assume different settings and hardware. vLLM, backed by PagedAttention, expects a lot of requests from multiple users, whereas Ollama is usually for a single user on a local machine.
> Google's pricing also assumes you're running it 24/7 for an entire month
What makes you think that?
The Cloud Run [pricing page](https://cloud.google.com/run/pricing) explicitly says: "charge you only for the resources you use, rounded up to the nearest 100 millisecond"
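To make that rounding concrete, here is a rough sketch of what "rounded up to the nearest 100 millisecond" would mean for a single billed interval. The function names and the per-second rate argument are my own placeholders, not anything from Google's pricing page, and real billing has more dimensions (CPU, memory, GPU, requests) than this covers.

```python
import math


def billed_seconds(duration_ms: float, increment_ms: int = 100) -> float:
    """Round a usage duration up to the next billing increment, in seconds."""
    increments = math.ceil(duration_ms / increment_ms)
    return increments * increment_ms / 1000


def interval_cost(duration_ms: float, rate_per_second: float) -> float:
    """Cost of one interval at a hypothetical flat per-second rate."""
    return billed_seconds(duration_ms) * rate_per_second
```

So a request that runs for 250 ms would be billed as 300 ms under this reading, not as a full hour or month.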
Because the pricing shown when creating an instance gives me the cost for the entire month, then works out the average hourly price based on that. This is just creating a GPU VM instance; I don't see any other way to see the cost of different NVIDIA GPUs.
If you wanted to show hourly pricing, you would show that first, then calculate the monthly price from the hourly rate. I have no idea whether the monthly cost includes a sustained use discount, or what the hourly cost is for running it for just one hour.
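The two directions of that calculation look roughly like this. It is only a sketch: the 730 hours per month figure is my assumption (365 * 24 / 12), the `discount` parameter is a hypothetical stand-in for whatever discount may or may not be baked into the monthly number, and none of the values come from Google's console.

```python
HOURS_PER_MONTH = 730  # assumption: 365 days * 24 hours / 12 months


def hourly_from_monthly(monthly_cost: float, hours: float = HOURS_PER_MONTH) -> float:
    """Average hourly price derived from a full-month estimate."""
    return monthly_cost / hours


def monthly_from_hourly(hourly_rate: float, hours_used: float, discount: float = 0.0) -> float:
    """Monthly cost built up from an hourly rate and the hours actually used.

    `discount` is a hypothetical fraction (0.0 to 1.0) taken off the total.
    """
    return hourly_rate * hours_used * (1 - discount)
```

If the monthly figure already has a discount applied, `hourly_from_monthly` understates the real on-demand rate for a single hour, which is exactly the ambiguity I'm complaining about.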