
There is no support for ed25519 host keys (confirmed using ssh-audit). Would be nice to have though.

As an aside, you should run ssh-audit to get recommendations on which less-than-ideal options and configs to disable.
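For example, a basic scan is just (the hostname here is a placeholder):

  ssh-audit example.com

It flags weak kex/cipher/MAC algorithms and lists the host key types the server offers, which is how the missing ed25519 support shows up.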



I currently use Banktivity, which is OK. Would love to hear from others who have used Banktivity and migrated to something else. Ideally, there should be OFX support.


Missed a chance to call this swiftamp instead and avoid namespace collision.


Or swamp to be shorter


machamp


Isn't that a Pokémon?


Seems fishy


Agreed. I also wonder why they chose to test against a Mac Studio with only 64GB instead of 128GB.


Hi, author here. I crowd-sourced the devices for benchmarking from my friends. It just happened that one of my friends has this device.


FYI you should have used llama.cpp to do the benchmarks. It performs almost 20x faster than Ollama for the gpt-oss-120b model. Here are some sample results on my Spark:

  ggml_cuda_init: found 1 CUDA devices:
    Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes
  | model                          |       size |     params | backend    | ngl | n_ubatch | fa |            test |                  t/s |
  | ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | CUDA       |  99 |     2048 |  1 |          pp4096 |       3564.31 ± 9.91 |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | CUDA       |  99 |     2048 |  1 |            tg32 |         53.93 ± 1.71 |
  | gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | CUDA       |  99 |     2048 |  1 |          pp4096 |      1792.32 ± 34.74 |
  | gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | CUDA       |  99 |     2048 |  1 |            tg32 |         38.54 ± 3.10 |
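For anyone wanting to reproduce these numbers: this is llama-bench output, and a command along these lines should match the settings in the table (the model path is a guess):

  llama-bench -m gpt-oss-120b-mxfp4.gguf -ngl 99 -ub 2048 -fa 1 -p 4096 -n 32

Here -ngl 99 offloads all layers to the GPU, -ub sets the micro-batch size, -fa 1 enables flash attention, and -p 4096 / -n 32 correspond to the pp4096 and tg32 rows.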


Is this the full-weight model or a quantized version? The GGUFs distributed on Hugging Face that are labeled as MXFP4 quantization have layers quantized to int8 (q8_0) instead of the bf16 suggested by OpenAI.

For example, looking at blk.0.attn_k.weight, it's q8_0, as are other layers:

https://huggingface.co/ggml-org/gpt-oss-20b-GGUF/tree/main?s...

Looking at the same weight on Ollama, it's BF16:

https://ollama.com/library/gpt-oss:20b/blobs/e7b273f96360
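If anyone wants to check this locally rather than through the web viewers, the gguf Python package ships a dump script that lists each tensor's type (the filename below is a placeholder):

  pip install gguf
  gguf-dump gpt-oss-20b-mxfp4.gguf | grep attn_k

Each tensor is printed with its shape and quantization type (F32, Q8_0, MXFP4, etc.), so you can see exactly which layers were re-quantized.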


I see! Do you know what's causing the slowdown for Ollama? They should be using the same backend...


Dude, ggerganov is the creator of llama.cpp. Kind of a legend. And of course he is right, you should've used llama.cpp.

Or you can just ask the ollama people about the ollama problems. Ollama is (or was) just a Go wrapper around llama.cpp.


Was. They've been diverging.


Now this looks much more interesting! Is the top one input tokens and the second one output tokens?

So 38.54 t/s on 120B? Have you tested filling the context too?


Yes, I provided detailed numbers here: https://github.com/ggml-org/llama.cpp/discussions/16578


Makes sense you have one of the boxes. What's your take on it? [Respecting any NDAs/etc/etc of course]


Curious how this compares to running on a Mac.


TTFT (time to first token) on a Mac is terrible and only increases as the context grows; that's why many are selling their M3 Ultra 512GB.


So, so many… an eBay search shows only 15 results, 6 of them being ads for new systems…

https://www.ebay.com/sch/i.html?_nkw=mac+studio+m3+ultra+512...


Does anyone know what was used to produce the graphs?


Do you mean charts? If so, it's Datawrapper: https://www.datawrapper.de/charts

One of the quite expensive paid plans, as the free one has to show "Created with Datawrapper" attribution at the bottom. I would guess they've vibe-coded their way to a premium version without paying, as the alternative is definitely outside an individual's budget (>$500/month).


Inspecting the page, I can see a "dw-chart" class, so I looked it up and got to this: https://www.datawrapper.de/charts. Looks a bit different on the page, but I think that's it.


It is indeed Datawrapper, as other posters have said. It works well for interactivity on a static blog built with Hugo.


I saw a TikTok of someone saying that farmers are not stupid (given the wide variety of skills needed to farm successfully) and were just betting on Trump not actually going through with the tariffs.

It's hard to have any sympathy for such cynical behavior when they're simultaneously asking for handouts, especially since the same people probably voted against others getting social services.


It also hurts when I drop the iPad mini on my face. In fact, I was considering getting a Pro Max to replace both an iPhone Pro and an iPad mini, but figured it might be too big of a compromise.

I wonder if anyone has successfully gone down this path.


Do you have a link to a GitHub repo for this? Also, will it be mosh compatible?


Not yet, but I hope to have something up in September! It’s unfortunately not mosh compatible - I thought about that, but didn’t see a lot of value, and there were some downsides, like re-implementing an encryption layer that doesn’t make sense if you use WebRTC. Just curious, what’s your use case for mosh compatibility?


Does Whispering support semantic correction? I was unable to find confirmation while doing a quick search.


Hmm, we support prompts at both 1. the model level (Whisper supports a "prompt" parameter that sometimes works) and 2. the transformations level (inject the transcribed text into a prompt and get the output from an LLM of your choice). Unsure how else semantic correction could be implemented, but we're always open to expanding the feature set greatly over the next few weeks!
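For context, here is roughly what the model-level prompt looks like against an OpenAI-style transcription endpoint (the file and prompt are placeholders):

  curl https://api.openai.com/v1/audio/transcriptions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -F file="@meeting.wav" \
    -F model="whisper-1" \
    -F prompt="Glossary: GGUF, llama.cpp, MXFP4"

The prompt biases the decoder toward the vocabulary and style you supply, which is the closest thing Whisper itself has to semantic correction; anything beyond that goes through the transformations step.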


They might not know how Whisper works. I suspect that the answer to their question is 'yes', and that the reason they can't find a straightforward answer through your project is that the answer is so obvious to you that it's hardly worth documenting.

Whisper transcription essentially transforms audio data into LLM output. The transcripts generally have proper casing and punctuation, and can usually stick to a specific domain based on the surrounding context.

