breisa's comments

Just did a quick write-up on modifying Firefox's AI features after spending some time figuring out how the prompts/commands work.

Isn't that just Google trying to dark-pattern you into finally clicking those checkboxes you unchecked during setup?


Alibaba released a whole set of new Qwen 3.5 models including a ~120B and a ~35B MoE.


Nice. 27B looks reasonable too.


First impressions of Qwen3.5 35B are that it's amazing.


Amazing compared to local models or? How would you compare it to, say, Sonnet 4.5 and Qwen3 Coder Next?

I'm guessing it's fast due to 3B experts


Maybe look into model finetuning/distillation. Unsloth [1] has great guides and provides everything you need to get started on Google Colab for free. [1] https://unsloth.ai/


I mean, they already supported the project quite a bit. @ngxson and maybe others from Hugging Face are big contributors to llama.cpp.


The thing is, with most banks you aren't even allowed to use the Wero app that has this Play Integrity restriction. The banks integrate Wero directly into their own apps. So it's mostly up to your bank.


Those Ryzen AI Max+ 395 systems are all more or less the same. For inference you want the one with 128 GB of soldered RAM. There are ones from Framework, GMKtec, Minisforum, etc. GMKtec used to be the cheapest, but with the rising RAM prices it's Framework now, I think. You can't really upgrade/configure them. For benchmarks look into r/LocalLLaMA - there are plenty.


Minisforum and GMKtec also have Ryzen AI 9 HX 370 mini PCs with up to 128 GB (2x64 GB) LPDDR5. It's dirt cheap: you can get one barebone for ~€750 on Amazon (the 395 similarly retails for ~€1k)... It should be fully supported in Ubuntu 25.04 or 25.10 with ROCm for iGPU inference (the NPU isn't available ATM, AFAIK), which is what I'd use it for. But I just don't know how the HX 370 compares to e.g. the 395, iGPU-wise. I was thinking of getting one to run Lemonade and Qwen3-coder-next FP8, BTW... but I don't know how much RAM I should equip it with - shouldn't 96 GB be enough? Suggestions welcome!
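A rough way to sanity-check the 96 GB question is to estimate weight memory from parameter count and quantization width. This is a back-of-the-envelope sketch; the helper function and the 80B example figure are illustrative assumptions, not the actual Qwen3-coder-next size:

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB: params * bits / 8.
    Ignores KV cache, activations, and runtime overhead."""
    return params_billions * bits_per_weight / 8

# Illustrative numbers only: a hypothetical 80B-parameter model at FP8
# (8 bits/weight) needs roughly 80 GB for the weights alone, so 96 GB
# leaves limited headroom once context/KV cache is added.
print(weight_gb(80, 8))  # 80.0 GB at FP8
print(weight_gb(80, 4))  # 40.0 GB at a 4-bit quant
```

The takeaway: whether 96 GB is "enough" depends almost entirely on the quant you pick and how much context you want to keep resident.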


I benchmarked unsloth/Qwen3-Coder-Next-GGUF using the MXFP4_MOE (43.7 GB) quantization on my Ryzen AI Max+ 395 and I got ~30 tps. According to [1] and [2], the AI Max+ 395 is 2.4x faster than the AI 9 HX 370 (laptop edition). Taking all that into account, the AI 9 HX 370 should get ~13 tps on this model. Make of that what you will.

[1]: https://community.frame.work/t/ai-9-hx-370-vs-ai-max-395/736...

[2]: https://community.frame.work/t/tracking-will-the-ai-max-395-...
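The scaling estimate above is just one division; spelled out (the 2.4x factor is taken from the linked Framework community benchmarks):

```python
max395_tps = 30.0  # measured tps on the Ryzen AI Max+ 395
speedup = 2.4      # AI Max+ 395 vs AI 9 HX 370, per [1] and [2]

# Estimated HX 370 throughput on the same model/quant
hx370_est = max395_tps / speedup
print(round(hx370_est, 1))  # 12.5, i.e. ~13 tps
```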


Thanks! I'm... unimpressed.


The Ryzen 370 lacks quad-channel RAM. Stay away.


The Ryzen AI HX 370 is not what you want; you need a Strix Halo APU with unified memory.


That's just the result of the model only supporting Russian (and 12 other languages) and not Ukrainian. It maps to the closest words from the training data.


Not sure if it's "realtime", but the recently released VibeVoice-ASR from Microsoft does do diarization: https://huggingface.co/microsoft/VibeVoice-ASR


Deepinfra offers Whisper V3 at $0.00045 / minute of transcribed audio.
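At that per-minute rate, a quick back-of-the-envelope calculation shows what longer recordings would cost (the rate is the one quoted above; everything else is plain arithmetic):

```python
rate_per_min = 0.00045  # USD per minute of transcribed audio

hour_cost = rate_per_min * 60
print(f"${hour_cost:.3f} per hour of audio")        # $0.027
print(f"${hour_cost * 24:.2f} per 24h of audio")    # $0.65
```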

