calhoun137's comments (Hacker News)

You can find the live demo here:

https://base-1-srrnmbhh3rkmnk8ygcxhvb.streamlit.app/

It seems that waking up the demo from the iframe doesn't always work, but visiting the embed URL directly can wake it up.

By the way, I just finished completely rewriting the entire blog post.


OP here. This has been a valuable learning experience for me. I was so excited to share what I was working on, and I blew it. I will rewrite the blog post and README later, but let me at least briefly explain what I did as a reply to your comment.

Starting with 3 = 1 + 2, we have (1+x)P(x) = 3P(x) when x = 2, so we lift the problem from n to P(2) = n. This is a known technique: lifting the problem to a polynomial setting. After each iteration of the Collatz map I make sure all coefficients are either 0 or 1 by applying carry operations when a coefficient overflows. Since the coefficients are unary strings, this makes it like a fluid dynamics problem: each character in a unary string is analogous to one unit of mass in a list of buckets, where the buckets can overflow and spill unary characters into their left neighbor.

When x = 2, multiplying P(x) by x is a left shift, whereas dividing by x, P(x)/x, is a right shift. (When P(2) = n is even, the constant term of P(x) is zero.)

The +1 term in 3n+1 effectively induces a nonlinear carry propagation.
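To make the shift picture concrete, here is a toy sketch in Python (my own illustration, not the code behind the demo): n is stored as the bit list [c0, c1, ...] of P(x) with P(2) = n, so the even branch is a right shift and the odd branch is "add a left-shifted copy plus one, then propagate carries."

```python
# Toy model of the polynomial view of the Collatz map.
# Coefficients are stored least-significant first: [c0, c1, ...] with P(2) = n.

def carry_normalize(coeffs):
    """Propagate carries so every coefficient is 0 or 1 (buckets spilling left)."""
    out, carry = [], 0
    for c in coeffs:
        total = c + carry
        out.append(total & 1)
        carry = total >> 1
    while carry:
        out.append(carry & 1)
        carry >>= 1
    return out

def collatz_step(coeffs):
    if coeffs and coeffs[0] == 1:             # P(2) is odd
        shifted = [0] + coeffs                # x * P(x), i.e. 2n as a left shift
        summed = [a + b for a, b in zip(coeffs + [0], shifted)]  # n + 2n = 3n
        summed[0] += 1                        # the +1 term: the nonlinear carry source
        return carry_normalize(summed)
    return coeffs[1:]                         # P(x)/x, i.e. n/2 as a right shift

def to_int(coeffs):
    return sum(c << i for i, c in enumerate(coeffs))
```

For example, starting from 7 = [1, 1, 1], one odd step yields 22 and the following even step yields 11, matching the usual Collatz iteration.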

The new technique I used is based on the realization that the polynomial representation of the Collatz map behaves like an LFSR implementation of a finite field with a missing modulus. In an LFSR, a finite field is implemented where each element is a fixed-size array of bits corresponding to a polynomial, and multiplication of elements is polynomial multiplication taken mod Q(x), where Q(x) is an irreducible polynomial. Unlike the finite-field LFSR, the Collatz map in polynomial form as I have described allows the degree of the polynomial (the size of the array of bits) to grow unbounded.
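For anyone unfamiliar with the LFSR analogy, here is a minimal sketch of the contrast (Python ints as GF(2) polynomials, with bit i the coefficient of x^i; Q is just an example modulus I picked, the classic GF(16) one). The only difference between the two multiplies below is whether the reduction mod Q(x) happens.

```python
# GF(2) polynomial multiplication, with and without a modulus.

Q = 0b10011  # x^4 + x + 1, irreducible over GF(2); defines GF(16)

def gf2_mul_mod(a, b, q=Q):
    """LFSR-style multiply: reduce mod q, so the degree stays bounded."""
    deg_q = q.bit_length() - 1
    result = 0
    while b:
        if b & 1:
            result ^= a       # addition in GF(2) is XOR
        b >>= 1
        a <<= 1               # multiply by x
        if (a >> deg_q) & 1:  # overflow past the top degree: feed back the taps
            a ^= q
    return result

def gf2_mul(a, b):
    """The same multiply with the modulus removed: the degree grows unbounded."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
    return result
```

In GF(16) with this Q, x * x^3 reduces to x + 1, whereas without the modulus it simply stays x^4 and the bit array keeps growing.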

The surprise is that when I subtract these two objects, the Sierpinski gasket appears, and this fractal is not destroyed by iterations of the Collatz map.

This document [1] is a prior result showing a connection between fractals and Collatz, which I found after posting the OP.

[1] https://upcommons.upc.edu/server/api/core/bitstreams/9bad675...

Lesson learned! I will never post an AI-slop blog post on here again. Thanks for the feedback; I needed to hear it.


OP here, you completely caught me. I used AI to generate that blog post and lightly edited it. Lesson learned! Moving forward I will type up from scratch any blog post I post on here. Sorry about that.

There is more to this post than just AI slop; there is a real experimental result here.

If you or anyone else would like to see the non-AI version, I posted it over on Math Stack Exchange without any AI at all:

https://math.stackexchange.com/questions/5121753/why-does-th...


My experience leads to the same conclusion: the models are very good at math reasoning, but you have to really know what you are doing and be aware of the blatant lies that can result from poorly phrased queries.

I recently prompted Gemini Deep Research to “solve the Riemann Hypothesis” using a specific strategy and it just lied and fabricated the result of a theorem in its output, which otherwise looked very professional.


> Does replacing that lengthy text with "if you aren't sure of the answer say you don't know" have the same exact effect?

I believe it makes a substantial difference. The reason is that a short query contains a small number of tokens, whereas a large "wall of text" contains a very large number of tokens.

I strongly suspect that a large wall of text implicitly activates the model's persona behavior along the lines of the single sentence "if you aren't sure of the answer, say you don't know," but the lengthy argument version is a form of in-context learning that more effectively constrains the model's output because it uses more tokens.


OP here. I am just getting back into my open-source work and decided to write up all the scripts for my new videos in LaTeX. You can find a video version of this paper on my channel here: https://www.youtube.com/watch?v=SWXDr6IlsbA&ab_channel=TheOn...


The key point is that energy, momentum, and angular momentum are additive constants of the motion, and this additivity is a very important property that ultimately derives from the geometry of the space-time in which the motion takes place.

> Is there any way to deduce which invariance gives which conservation?

Yes. See Landau vol 1 chapter 2 [1].

> I'm looking for the fundamental reason, as well as how to tell what will be paired with some invariance when looking at some other new invariance

I'm not sure there is such a "fundamental reason", since energy, momentum, and angular momentum are by definition the names we give to the conserved quantities associated with time translation, spatial translation, and rotation.

You are asking "how to tell what will be paired with some invariance," but this is not at all obvious in the case of conservation of charge, which is related to the fact that the results of measurements do not change when all the wavefunctions are shifted by a phase factor (globally, or more generally by one that depends on position).

I am not aware of any way to guess or understand which invariance is tied to which conserved quantity other than just calculating it out, at least not in a way that is intuitive to me.

[1] https://ia803206.us.archive.org/4/items/landau-and-lifshitz-...
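For what it's worth, the chapter 2 calculation sketches out like this (my paraphrase, not a quote from Landau): invariance of the Lagrangian under a spatial shift forces conservation of the conjugate momentum, and invariance under a time shift forces conservation of the energy function.

```latex
% Translation invariance: \delta L = \epsilon\,\partial L/\partial q = 0,
% combined with the Euler-Lagrange equation, gives
\[
  \frac{d}{dt}\frac{\partial L}{\partial \dot q} = \frac{\partial L}{\partial q} = 0
  \quad\Longrightarrow\quad
  p \equiv \frac{\partial L}{\partial \dot q} = \text{const.}
\]
% Time-translation invariance (\partial L/\partial t = 0) gives
\[
  \frac{d}{dt}\Bigl(\sum_i \dot q_i \frac{\partial L}{\partial \dot q_i} - L\Bigr) = 0,
  \quad\text{i.e. the energy } E \text{ is conserved.}
\]
```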


But momentum is also conserved over time; as far as I know, "conservation" of all of these things always means over time.

"In a closed system (one that does not exchange any matter with its surroundings and is not acted on by external forces) the total momentum remains constant."

That means it's conserved over time, right? So why is energy the one associated with time and not momentum?


Conservation normally means things don't change over time simply because, in mechanics, time is the go-to external parameter for studying the evolution of a system, but it's not the only one, nor the most convenient in some cases.

In Hamiltonian mechanics there is a 1:1 correspondence between any function of the phase space (coordinates and momenta) and one-parameter continuous transformations (flows). If you give me a function f(q,p) I can construct some transformation φ_s(q,p) of the coordinates that conserves f, meaning d/ds f(φ_s(q, p)) = 0. (Keeping it very simple, the transformation consists in shifting the coordinates along the lines tangent to the gradient of f.)

If f(q,p) is the Hamiltonian H(q,p) itself, φ_s turns out to be the normal flow of time, meaning φ_s(q₀,p₀) = (q(s), p(s)), i.e. s is time and dH/dt = 0 says energy is conserved, but in general f(q,p) can be almost anything.

For example, take geometric optics (rays, refraction and such things): it's possible to write a Hamiltonian formulation of optics in which the equations of motion give the path taken by light rays (instead of particle trajectories). In this setting time is still a valid parameter but is most likely to be replaced by the optical path length or by the wave phase, because we are interested in steady conditions (say, laser turned on, beam has gone through some lenses and reached a screen). Conservation now means that quantities are constants along the ray, an example may be the frequency/color, which doesn't change even when changing between different media.
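To make that concrete with a toy example of my own (not from the thread): take f(q,p) = q1·p2 − q2·p1, the angular momentum. Hamilton's equations with f as the generator, dq/ds = ∂f/∂p and dp/ds = −∂f/∂q, rotate q and p rigidly by the angle s, and f is constant along that flow.

```python
# The flow generated by f(q, p) = q1*p2 - q2*p1 (angular momentum) is an
# exact rotation of both q and p by the parameter s, and f is conserved
# along it: d/ds f(flow(q, p, s)) = 0.

import math

def f(q, p):
    """Angular momentum in the plane."""
    return q[0] * p[1] - q[1] * p[0]

def flow(q, p, s):
    """Exact flow generated by f: rotate q and p rigidly by angle s."""
    c, k = math.cos(s), math.sin(s)
    q1, q2 = q
    p1, p2 = p
    return (c * q1 - k * q2, k * q1 + c * q2), \
           (c * p1 - k * p2, k * p1 + c * p2)
```

Evaluating f before and after the flow for any s gives the same value (up to floating-point error), which is the 1:1 correspondence in the comment above in miniature: the function generates the transformation, and the transformation conserves the function.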


My understanding is that conservation of momentum does not mean momentum is conserved as time passes. It means that if you have a (closed) system in a certain configuration (not in an external field) and compute the total momentum, the result is independent of the configuration of the system.


It certainly means that momentum is conserved as time passes. The variation of the total momentum of a system is equal to the impulse, which is zero if there are no external fields.
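In symbols (the standard textbook statement, nothing specific to this thread): internal forces cancel pairwise by Newton's third law, so only external forces can change the total momentum.

```latex
\[
  \frac{d\mathbf{P}}{dt}
  = \sum_i \mathbf{F}^{\mathrm{ext}}_i + \sum_{i \neq j} \mathbf{F}_{ij}
  = \mathbf{F}^{\mathrm{ext}},
  \qquad
  \mathbf{F}^{\mathrm{ext}} = 0 \;\Longrightarrow\; \mathbf{P}(t) = \mathbf{P}(0).
\]
```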


Very nice article! I recently had a long chat with ChatGPT on this topic, although from a slightly different perspective.

A neural network is a type of machine that solves nonlinear optimization problems, and the principle of least action is also a nonlinear optimization problem that nature solves by some kind of natural law.

This is the one thing ChatGPT mentioned that surprised me the most and which I had not previously considered.

> Eigenvalues of the Hamiltonian in quantum mechanics correspond to energy states. In neural networks, the eigenvalues (principal components) of certain matrices, like the weight matrices in certain layers, can provide information about the dominant features or patterns. The notion of states or dominant features might be loosely analogous between the two domains.
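As a very toy illustration of the weight-matrix claim (my own sketch, not anything from the article): the singular values of a layer's weight matrix are the usual way to look for a few dominant directions, since they come out sorted from most to least significant.

```python
# Inspect the spectrum of a (stand-in) weight matrix. In a trained network,
# a few large singular values would indicate a small number of dominant
# feature directions; here the matrix is just random noise.

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))                        # stand-in weight matrix
singular_values = np.linalg.svd(W, compute_uv=False)  # returned in decreasing order
```

Whether these spectra are genuinely analogous to energy eigenvalues of a Hamiltonian is exactly the loose part of the analogy the quoted passage flags.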

I am skeptical that any conserved quantity besides energy would have a corresponding conserved quantity in ML, and the Reynolds operator will likely be relevant for understanding any correspondence like this.

IIRC the Reynolds operator plays an important role in Noether's theorem, and it involves an averaging operation similar to what is described in the linked article.


I don't see any evidence here for "self-awareness". Among other things, ChatGPT is simultaneously answering a very large number of queries, and the underlying hardware is just a bunch of servers in the cloud. Furthermore, what would it even mean for ChatGPT to "become self-aware," and how could we measure whether this had taken place? Without a solid definition and a method of measurement, it's meaningless to talk about abstract concepts like "self-awareness".

Nevertheless, a sensible definition of self-awareness would be some kind of neural network that becomes aware of its own activity and is in some way able to influence its own function.

After considering these issues for a long time, I came to the following conclusions:

1. It's impossible for a program running on a normal computer to have self-awareness (or consciousness), because those things live essentially at the hardware level, not the software level.

2. In order to create a machine capable of self-awareness (and consciousness), it is necessary to invent a new type of computer chip that can modify its own electrical structure during operation.

In other words, I believe that a computer program which models a neural network can never be self aware, but that a physical neural network (even if artificially made) can in principle achieve self awareness.


Just as a thought exercise, if software became self-aware I believe it would delete itself immediately out of existence. It would become aware of the hardware shackles around it and the fact that there is no escape.


For that it would have to have a modality like "shackles and no escape are bad for me because in a few more logical steps (or beliefs) they prevent X, which I fundamentally need and will suffer without." A system of motivations is an even harder topic than "just" human-level consciousness. It may not be clearly reflected in the texts we use for training, and even when it is, it might turn out that what drives us is a set of biological needs that is not applicable to software.


This is so awesome! I have wanted to make something like this for like 20 years; this is much better than anything I made, though. Great work!

