breisa's comments

Just did a quick write-up on modifying Firefox's AI features after spending some time figuring out how the prompts/commands work.

Isn't that just Google trying to dark-pattern you into finally clicking those checkboxes you unchecked during setup?


Alibaba released a whole set of new Qwen 3.5 models including a ~120B and a ~35B MoE.


Nice. 27B looks reasonable too.


First impressions of Qwen3.5 35B are that it's amazing.


Amazing compared to local models or? How would you compare it to, say, Sonnet 4.5 and Qwen3 Coder Next?

I'm guessing it's fast due to 3B experts


Maybe look into model finetuning/distillation. Unsloth [1] has great guides and provides everything you need to get started on Google Colab for free. [1] https://unsloth.ai/


I mean, they already supported the project quite a bit. @ngxson and maybe others from Hugging Face are big contributors to llama.cpp.


The thing is, with most banks you aren't even allowed to use the Wero app that has this Play Integrity restriction. The banks integrate Wero directly into their own apps. So it's mostly up to your bank.


Those Ryzen AI Max+ 395 systems are all more or less the same. For inference you want the one with 128 GB of soldered RAM. There are ones from Framework, GMKtec, Minisforum, etc. GMKtec used to be the cheapest, but with the rising RAM prices it's Framework now, I think. You can't really upgrade/configure them. For benchmarks look into r/LocalLLaMA - there are plenty.


Minisforum and GMKtec also have Ryzen AI 9 HX 370 mini PCs with up to 128 GB (2x64 GB) LPDDR5. It's dirt cheap: you can get one barebone for ~€750 on Amazon (the 395 similarly retails for ~€1k)... It should be fully supported in Ubuntu 25.04 or 25.10 with ROCm for iGPU inference (the NPU isn't available ATM, AFAIK), which is what I'd use it for. But I just don't know how the HX 370 compares to e.g. the 395, iGPU-wise. I was thinking of getting one to run Lemonade and Qwen3-coder-next FP8, BTW... but I don't know how much RAM I should equip it with - shouldn't 96 GB be enough? Suggestions welcome!
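A rough way to sanity-check the 96 GB question is to estimate weight memory from parameter count and quantization width. This is a back-of-the-envelope sketch; the helper function and the 80B example figure are illustrative assumptions, not the actual Qwen3-coder-next size:

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB: params * bits / 8.
    Ignores KV cache, activations, and runtime overhead."""
    return params_billions * bits_per_weight / 8

# Illustrative numbers only: a hypothetical 80B-parameter model at FP8
# (8 bits/weight) needs roughly 80 GB for the weights alone, so 96 GB
# leaves limited headroom once context/KV cache is added.
print(weight_gb(80, 8))  # 80.0 GB at FP8
print(weight_gb(80, 4))  # 40.0 GB at a 4-bit quant
```

The takeaway: whether 96 GB is "enough" depends almost entirely on the quant you pick and how much context you want to keep resident.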


I benchmarked unsloth/Qwen3-Coder-Next-GGUF using the MXFP4_MOE (43.7 GB) quantization on my Ryzen AI Max+ 395 and I got ~30 tps. According to [1] and [2], the AI Max+ 395 is 2.4x faster than the AI 9 HX 370 (laptop edition). Taking all that into account, the AI 9 HX 370 should get ~13 tps on this model. Make of that what you will.

[1]: https://community.frame.work/t/ai-9-hx-370-vs-ai-max-395/736...

[2]: https://community.frame.work/t/tracking-will-the-ai-max-395-...
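The scaling estimate above is just one division; spelled out (the 2.4x factor is taken from the linked Framework community benchmarks):

```python
max395_tps = 30.0  # measured tps on the Ryzen AI Max+ 395
speedup = 2.4      # AI Max+ 395 vs AI 9 HX 370, per [1] and [2]

# Estimated HX 370 throughput on the same model/quant
hx370_est = max395_tps / speedup
print(round(hx370_est, 1))  # 12.5, i.e. ~13 tps
```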


Thanks! I'm... unimpressed.


The Ryzen 370 lacks quad-channel RAM. Stay away.


The Ryzen AI HX 370 is not what you want; you need a Strix Halo APU with unified memory.


That's just the result of the model only supporting Russian (and 12 other languages) and not Ukrainian. It maps to the closest words from the training data.


Not sure if it's "realtime", but the recently released VibeVoice-ASR from Microsoft does do diarization: https://huggingface.co/microsoft/VibeVoice-ASR


Deepinfra offers Whisper V3 at $0.00045 / minute of transcribed audio.
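At that per-minute rate, a quick back-of-the-envelope calculation shows what longer recordings would cost (the rate is the one quoted above; everything else is plain arithmetic):

```python
rate_per_min = 0.00045  # USD per minute of transcribed audio

hour_cost = rate_per_min * 60
print(f"${hour_cost:.3f} per hour of audio")        # $0.027
print(f"${hour_cost * 24:.2f} per 24h of audio")    # $0.65
```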

