Hacker News | eric15342335's comments

Just kidding: will someone make a uBlock filter list for this website?


My first impression is that the model sounds slightly more human and a little more prone to praise. Still comparing its capabilities.


Interesting. Have already sent 6 emails :)


Correct me if I am wrong, but I think this is related to the terms "covariate shift" (a change in the model's input distribution x) and "concept drift".


The interesting part is that true AGI is then not possible with the current approach, since there are no ceilings/boundaries to "contain" it?


Update: it is available at https://aistudio.google.com now!


correlation is not causation


What about Pylint? IIRC Pylint has a code duplication check as well. Is it the same thing?


Pylint's duplication check is text-based (compares lines), while pyscn uses tree edit distance on ASTs. This means pyscn can catch structural clones even when variable/function names differ.
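To illustrate the difference: a toy sketch of structural clone detection with Python's `ast` module. This is a simplification (it normalizes identifiers and compares whole trees for equality, rather than computing tree edit distance as pyscn does), but it shows why renamed clones that line-based tools miss still match structurally:

```python
import ast

class Normalize(ast.NodeTransformer):
    """Replace every identifier with a placeholder so only structure remains."""
    def visit_Name(self, node):
        return ast.copy_location(ast.Name(id="_", ctx=node.ctx), node)

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.name = "_"
        node.args.args = [ast.arg(arg="_") for _ in node.args.args]
        return node

def skeleton(src: str) -> str:
    """Serialize the name-normalized AST of a code snippet."""
    return ast.dump(Normalize().visit(ast.parse(src)))

# Same structure, completely different names: a text diff sees
# zero identical lines, but the normalized ASTs are equal.
a = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
b = "def acc(vals):\n    r = 0\n    for v in vals:\n        r += v\n    return r\n"

print(skeleton(a) == skeleton(b))  # True
```

Tree edit distance generalizes this from exact structural equality to a similarity score, so near-clones (a statement added or removed) are caught as well.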


Not sure if I get it correctly:

They trained a component to mimic the full attention distribution while keeping only the top-k (k=2048) most important attention tokens, so that as the context window grows, the compute for the query-key attention step stays roughly constant rather than growing with context length. Total compute still grows linearly in the graph, because an "indexer" still needs to scan the entire context window, which is O(L), but it does so only very roughly, precisely to speed things up.
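A minimal NumPy sketch of the idea described above, for a single query. Names like `idx_w` (a hypothetical low-dimensional indexer projection) are my own illustration, not the paper's: a cheap scorer touches all L keys, then exact softmax attention runs only over the selected top-k.

```python
import numpy as np

def topk_sparse_attention(q, K, V, idx_w, k=8):
    """Cheap indexer scores every key (O(L) but in a small dim),
    then exact attention runs only over the top-k keys (O(k))."""
    # Rough relevance scores in a reduced dimension d_idx << d.
    rough = (idx_w @ q) @ (K @ idx_w.T).T          # shape (L,)
    top = np.argsort(rough)[-k:]                   # k most promising positions
    # Full-precision attention restricted to those k keys.
    scores = K[top] @ q / np.sqrt(q.size)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V[top]

rng = np.random.default_rng(0)
L, d, d_idx = 64, 16, 4
q = rng.normal(size=d)
K = rng.normal(size=(L, d))
V = rng.normal(size=(L, d))
idx_w = rng.normal(size=(d_idx, d))
out = topk_sparse_attention(q, K, V, idx_w, k=8)
print(out.shape)  # (16,)
```

In the real system the indexer is trained to imitate the dense attention distribution, so the top-k selection discards little of the probability mass.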


I am a second-year university student learning the basic mathematics and statistics behind neural networks. One thing that shocks me is that there isn't an "incremental" way to build a larger (more parameters) AI model like GPT-4 from a smaller existing one, e.g. GPT-3.5 (I see the term "incremental (compilation)" nearly everywhere in the software engineering industry). I am curious why this is not possible theoretically?


It is possible, just not practical in many cases. For incremental computation you need to be able to either reverse the computation or store the inputs _and_ the intermediate results. And you still have to repeat some non-trivial share of the computation anyway, possibly all of it. For AI training this is prohibitively expensive, and it is simpler to train from scratch. I am not saying it is impossible, but the demand so far is not there.
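The general principle is the same one build systems use: store a fingerprint of each stage's input plus its intermediate result, and rerun a stage only when the fingerprint changes. A toy sketch (nothing to do with training itself, just the mechanism the parent comment describes):

```python
import hashlib

cache = {}        # (stage name, input fingerprint) -> stored intermediate
recomputed = []   # which stages actually ran

def fingerprint(data):
    return hashlib.sha256(repr(data).encode()).hexdigest()

def stage(name, fn, data):
    """Run fn(data) only if this exact input hasn't been seen before."""
    key = (name, fingerprint(data))
    if key not in cache:
        recomputed.append(name)
        cache[key] = fn(data)
    return cache[key]

raw = [3, 1, 2]
total = stage("sum", sum, stage("sort", sorted, raw))        # both stages run
total_again = stage("sum", sum, stage("sort", sorted, raw))  # both served from cache
print(total, recomputed)  # 6 ['sort', 'sum']
```

For a compiler the cached intermediates are object files, which are small; for model training they would be optimizer states and activations across billions of parameters and trillions of tokens, which is why storing them rarely pays off.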

