Hacker News | eric15342335's comments

Just kidding: will someone make a uBlock filter list for this website?


My first impression is that the model sounds slightly more human and a little more prone to praise. Still comparing its capabilities.


Interesting. Have already sent 6 emails :)


Correct me if I am wrong, but I think this is related to the terms "covariate shift" (a change in the model's input distribution x) and "concept drift".


The interesting part is that true AGI is then not possible with the current approach, since there are no ceilings/boundaries to "contain" it?


Update: it is available at https://aistudio.google.com now!


correlation is not causation


What about Pylint? IIRC Pylint has a code duplication check as well. Is it the same thing?


Pylint's duplication check is text-based (compares lines), while pyscn uses tree edit distance on ASTs. This means pyscn can catch structural clones even when variable/function names differ.
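To illustrate the difference: a toy sketch of structural clone detection with Python's `ast` module. This is a simplification (it normalizes identifiers and compares whole trees for equality, rather than computing tree edit distance as pyscn does), but it shows why renamed clones that line-based tools miss still match structurally:

```python
import ast

class Normalize(ast.NodeTransformer):
    """Replace every identifier with a placeholder so only structure remains."""
    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.name = "_"
        node.args.args = [ast.arg(arg="_") for _ in node.args.args]
        return node

def skeleton(src: str) -> str:
    """Serialize the name-normalized AST of a code snippet."""
    return ast.dump(Normalize().visit(ast.parse(src)))

# Same structure, completely different names: a text diff sees
# zero identical lines, but the normalized ASTs are equal.
a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
b = "def acc(vals):\n    r = 0\n    for v in vals:\n        r += v\n    return r\n"

print(skeleton(a) == skeleton(b))  # True
```

Tree edit distance generalizes this from exact structural equality to a similarity score, so near-clones (a statement added or removed) are caught as well.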


Not sure if I get it correctly:

They trained a component to mimic the full attention distribution while keeping only the top-k (k=2048) most important attention tokens, so that as the context window grows, the compute for the query-key attention step stays roughly constant rather than growing with context length. Total compute still grows linearly in the graph, because an "indexer" still needs to scan the entire context window, which is O(L), but it does so only very roughly, precisely to speed things up.
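A minimal NumPy sketch of the idea described above, for a single query. Names like `idx_w` (a hypothetical low-dimensional indexer projection) are my own illustration, not the paper's: a cheap scorer touches all L keys, then exact softmax attention runs only over the selected top-k.

```python
import numpy as np

def topk_sparse_attention(q, K, V, idx_w, k=8):
    """Cheap indexer scores every key (O(L) but in a small dim),
    then exact attention runs only over the top-k keys (O(k))."""
    # Rough relevance scores in a reduced dimension d_idx << d.
    rough = (idx_w @ q) @ (K @ idx_w.T).T          # shape (L,)
    top = np.argsort(rough)[-k:]                   # k most promising positions
    # Full-precision attention restricted to those k keys.
    scores = K[top] @ q / np.sqrt(q.size)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V[top]

rng = np.random.default_rng(0)
L, d, d_idx = 64, 16, 4
q = rng.normal(size=d)
K = rng.normal(size=(L, d))
V = rng.normal(size=(L, d))
idx_w = rng.normal(size=(d_idx, d))
out = topk_sparse_attention(q, K, V, idx_w, k=8)
print(out.shape)  # (16,)
```

In the real system the indexer is trained to imitate the dense attention distribution, so the top-k selection discards little of the probability mass.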


I am a second-year university student learning the basic mathematics and statistics behind neural networks. One thing that shocks me is that there isn't an "incremental" way to build a larger (more parameters) AI model like GPT-4 from a smaller existing one, e.g. GPT-3.5 (I see the term "incremental (compilation)" nearly everywhere in the software engineering industry). I am curious why this is not possible theoretically?


It is possible, just not practical in many cases. For incremental computation you need to be able to either reverse the computation or store the inputs _and_ the intermediate results. And you still have to repeat some non-trivial share of the computation anyway, possibly all of it. For AI training this is prohibitively expensive, and it is simpler to train from scratch. I am not saying it is impossible, but the demand so far is not there.
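The general principle is the same one build systems use: store a fingerprint of each stage's input plus its intermediate result, and rerun a stage only when the fingerprint changes. A toy sketch (nothing to do with training itself, just the mechanism the parent comment describes):

```python
import hashlib

cache = {}        # (stage name, input fingerprint) -> stored intermediate
recomputed = []   # which stages actually ran

def fingerprint(data):
    return hashlib.sha256(repr(data).encode()).hexdigest()

def stage(name, fn, data):
    """Run fn(data) only if this exact input hasn't been seen before."""
    key = (name, fingerprint(data))
    if key not in cache:
        recomputed.append(name)
        cache[key] = fn(data)
    return cache[key]

raw = [3, 1, 2]
total = stage("sum", sum, stage("sort", sorted, raw))        # both stages run
total_again = stage("sum", sum, stage("sort", sorted, raw))  # both served from cache
print(total, recomputed)  # 6 ['sort', 'sum']
```

For a compiler the cached intermediates are object files, which are small; for model training they would be optimizer states and activations across billions of parameters and trillions of tokens, which is why storing them rarely pays off.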

