snovv_crash · 2026-03-28T07:13:48 1774682028

My vote: AI induced psychosis via sycophantic assurances that the results are real. Plus a heap of Dunning-Kruger by allowing someone with just enough knowledge to be dangerous to get far enough to waste everyone's time.

snovv_crash · 2026-03-27T18:49:15 1774637355

I wonder if they also only want agents to read it, not people.

snovv_crash · 2026-03-25T15:57:01 1774454221

LLM prose is very bland and smooth, in the same way that bland white factory bread is bland and smooth. It also typically uses a lot of words to convey very simple ideas, simply because the data is typically based on a small prompt that it tries to decompress. LLMs are capable of very good data transformation and good writing, but not when they are asked to write an article based on a single sentence.

TeMPOraL · 2026-03-25T16:29:33 1774456173

That's true. I.e. it's not that they're not capable of doing better, it's just whoever's prompting them is typically too lazy to add an extra sentence or three (or a link) to steer it to a different region of the latent space. There's easily a couple dozen dimensions almost always left at their default values; it doesn't take much to alter them and nudge the model to sample from a more interesting subspace style-wise.

(Still, it makes sense to do it as a post-processing style transfer step, as verbosity is a feature while the model is still processing the "main" request - each token produced is a unit of computation; the more terse the answer, the dumber it gets (these days it's somewhat mitigated by "thinking" and agentic loops)).

snovv_crash · 2026-03-25T10:55:04 1774436104

Capex vs. opex

snovv_crash · 2026-03-24T07:28:33 1774337313

Unless you're aware of hyperspectral image adapters for LLMs, they aren't capable of that either.

snovv_crash · 2026-03-23T15:57:12 1774281432

The real improvement will be when the software engineers get into the training loop. Then we can have MoE that use cache-friendly expert utilisation and maybe even learned prefetching for what the next experts will be.

zozbot234 · 2026-03-23T16:29:44 1774283384

> maybe even learned prefetching for what the next experts will be

Experts are predicted by layer and the individual layer reads are quite small, so this is not really feasible. There's just not enough information to guide a prefetch.

yorwba · 2026-03-23T17:26:36 1774286796

It's feasible to put the expert routing logic in a previous layer. People have done it: https://arxiv.org/abs/2507.20984

snovv_crash · 2026-03-23T16:34:43 1774283683

Manually no. It would have to be learned, and making the expert selection predictable would need to be a training metric to minimize.

zozbot234 · 2026-03-23T16:40:08 1774284008

Making the expert selection more predictable also means making it less effective. There's no real free lunch.

snovv_crash · 2026-03-20T11:57:23 1774007843

For CPU with bigger K you would put the centroids in a search tree, to take advantage of the sparsity, while a GPU would calculate the full NxK distance matrix. So from my understanding the bottleneck they are fixing doesn't show up on CPU.
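
A minimal sketch of the two assignment strategies contrasted above, in pure Python. The helper names (`assign_brute`, `build_kdtree`, `nearest`, `assign_tree`) are made up for illustration and do not come from the project under discussion; a real CPU implementation would use an optimized tree library, and a real GPU one would compute the distance matrix in a dense kernel.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length tuples.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign_brute(points, centroids):
    # Evaluates all N x K distances: the dense pattern a GPU would run.
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]

def build_kdtree(items, depth=0):
    # items is a list of (centroid_index, coords); returns a nested tuple tree.
    if not items:
        return None
    axis = depth % len(items[0][1])
    items = sorted(items, key=lambda it: it[1][axis])
    mid = len(items) // 2
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def nearest(node, p, best=None):
    # best is (squared_distance, centroid_index); ties go to the lower index,
    # which matches what min() does in assign_brute.
    if node is None:
        return best
    (idx, coords), axis, left, right = node
    cand = (sq_dist(p, coords), idx)
    if best is None or cand < best:
        best = cand
    diff = p[axis] - coords[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, p, best)
    # Only descend into the far side if the splitting plane is close enough
    # to hide a better centroid; this pruning is where the savings come from.
    if diff * diff <= best[0]:
        best = nearest(far, p, best)
    return best

def assign_tree(points, centroids):
    # Builds the tree over the K centroids once, then queries it per point.
    tree = build_kdtree(list(enumerate(centroids)))
    return [nearest(tree, p)[1] for p in points]

random.seed(0)
pts = [tuple(random.random() for _ in range(3)) for _ in range(500)]
cents = [tuple(random.random() for _ in range(3)) for _ in range(32)]
print(assign_tree(pts, cents) == assign_brute(pts, cents))
```

Both functions return the same assignment; the tree version just skips centroids it can rule out by geometry, and that pruning weakens as the dimension grows.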

xavxav · 2026-03-20T12:19:11 1774009151

search trees tend not to scale well to higher dimensions though, right?

from what I've seen I had the impression that Yinyang k-means was the best way to take advantage of the sparsity.

snovv_crash · 2026-03-20T17:20:15 1774027215

Most data I've used is for geospatial with D<=4 (xyzt) so for me search trees worked great. But for things like descriptor or embedding clustering yes, trees wouldn't be useful.

snovv_crash · 2026-03-20T09:35:19 1773999319

Models, however, can reproduce copyleft code verbatim, and are being redistributed. Doesn't that count?

Licences like AGPL also don't have redistribution as their only restriction.

shagie · 2026-03-20T13:54:07 1774014847

Stack Overflow has verbatim copied GPL code in some of its questions and answers. As presented by SO, that code is not under the GPL license (this also applies to other licenses - the BSD advertising clause and the original JSON license will cause similar problems).

Arguably, the use of the code in the Stack Overflow question and answer is fair use.

The problem occurs not when someone reads the Q&A with the improperly licensed code but rather when they then copy that code verbatim into their own non-GPL product and distribute that without adherence to the GPL.

It's the last step - some human distributing the improperly licensed software - that is the violation of the GPL.

This same chain of what is allowed and what is not is equally applicable to LLMs. Providing examples from GPL licensed material to answer a question isn't a license violation. The human copying that code (from any source) and pasting it into their own software is a license violation.

---

Some while back I had a discussion with a Swiss developer about the indefinite article used before "hobbit" in a text game. They used "an hobbit" and in the discussion of fixing it, I quoted the first line of The Hobbit. "In a hole in the ground there lived a hobbit." That cleared it up and my use of it in that (and this) discussion is fair use.

If someone listening to that conversation (or reading this one) thought that the bit that I quoted would be great on a T-shirt and then printed that up and distributed it - that would be a copyright violation.

Google's use of thumbnails for images was found to be fair use. https://en.wikipedia.org/wiki/Perfect_10,_Inc._v._Amazon.com...

    The Ninth Circuit did, however, overturn the district court's decision that Google's thumbnail images were unauthorized and infringing copies of Perfect 10's original images. Google claimed that these images constituted fair use, and the circuit court agreed. This was because they were "highly transformative."

If I were to then take those thumbnails from a Google image search and distribute them as an icon library, I would then be guilty of copyright infringement.

I believe that Stack Overflow, Google Images, and LLMs and their output are all examples of transformative fair use. What someone does with that output is where copyright infringement happens.

My claim isn't that AI vendors are blameless but rather that in the issue of copyright and license adherence it is the human in the process that is the one who has agency and needs to follow copyright (and for AI agents that were unleashed without oversight, it is the human that spun them up or unleashed them).

snovv_crash · 2026-03-17T15:08:46 1773760126

Curious how this would deal with things like Kahan Summation, which corrects floating point errors that theoretically wouldn't exist if you had infinite precision representations.
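
For reference, a minimal sketch of Kahan summation next to a plain loop. Under exact real arithmetic the compensation term `c` is always zero, so a tool that reasons as if precision were infinite could simplify it away; in floating point it is what recovers the lost low-order bits. The function names and the test values are chosen here for illustration only.

```python
import math

def naive_sum(values):
    # Plain left-to-right accumulation.
    total = 0.0
    for x in values:
        total += x
    return total

def kahan_sum(values):
    # Compensated summation: c carries the rounding error of the last add.
    total = 0.0
    c = 0.0
    for x in values:
        y = x - c            # apply the correction from the previous step
        t = total + y        # low-order bits of y may be lost here
        c = (t - total) - y  # algebraically zero; numerically the lost part
        total = t
    return total

# 1e-16 is below half an ulp of 1.0, so each plain addition rounds back to 1.0.
values = [1.0] + [1e-16] * 10_000

print(naive_sum(values))
print(kahan_sum(values))
print(math.fsum(values))  # correctly rounded reference from the stdlib
```

The plain loop never moves off 1.0, while the compensated loop lands close to the `math.fsum` reference.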

snovv_crash · 2026-03-15T16:50:37 1773593437

But it's like crypto then, good for buying other crypto, or illegal stuff.

Also people are using CC for the cheap access to the model, otherwise they'd be using opencode.