The responses you've gotten so far are pretty bleak, even if they are accurate. It's true that once you start mixing randomness or get into algorithms, the only thing statistical tests can really tell you is whether it's broken.

Those tests can be used on raw sources to learn about the quality of those inputs. In this case, applying them directly to analogRead() on a specific piece of hardware (your entire circuit and manufacturing process will affect this, and it will even vary from board to board) can give you an estimate of how much entropy you can expect from each call.
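
As a rough illustration of what "an estimate per call" can look like, here is a minimal Python sketch you could run on the host against logged readings. It uses a simplified most-common-value estimate (minus log2 of the most frequent value's share). The `readings` list is made-up data, not a measurement from any real board, and the estimate assumes samples are independent, so treat the result as an optimistic upper bound.

```python
import math
from collections import Counter

def min_entropy_per_sample(samples):
    """Simplified most-common-value estimate, in bits per sample."""
    counts = Counter(samples)
    p_max = max(counts.values()) / len(samples)
    return -math.log2(p_max)

# Hypothetical 10-bit analogRead() values logged from a floating pin.
readings = [512, 513, 512, 511, 512, 514, 512, 513,
            512, 511, 512, 511, 513, 512, 510, 512]

estimate = min_entropy_per_sample(readings)
print(f"about {estimate:.2f} bits of min-entropy per call")
```

With real hardware you would want far more than sixteen samples, and you would repeat the measurement on each board and under different conditions.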

Understanding where that entropy is coming from is significantly more important. Gate voltage breakdown, fluctuations from the pins acting as antennas, and the current temperature and humidity are where the entropy from analogRead() on a floating pin largely comes from. Other sources can be radioactive decay of particles, or the timing of events that are outside of the system (such as the time between a device being plugged in and the first time a person touches a key).

These all provide small amounts of entropy (except for radioactive decay, that's a really good one). The next step is mixing entropy. There is a lot of good math showing that with proper mixing, even adding known inputs from an attacker into an entropy pool doesn't decrease the entropy in the pool (it's no less random). If time isn't an issue, you can add in a large number of readings from the same source, though sampling faster than the source changes won't gain you any additional entropy.
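
To make the mixing idea concrete, here is a small sketch of a hash-based pool. `EntropyPool`, its method names, and the sample values are all hypothetical, and hashing the old state together with each input using SHA-256 is just one common way to mix, not the only one. The point it tries to show is that an input the attacker knows gets folded into the state without replacing what was already there.

```python
import hashlib

class EntropyPool:
    """Hypothetical pool: each input is hashed together with the old state."""

    def __init__(self):
        self._state = bytes(32)  # fixed 32-byte starting state

    def mix(self, data: bytes) -> None:
        # The new state depends on both the previous state and the new input.
        self._state = hashlib.sha256(self._state + data).digest()

    def seed(self) -> bytes:
        # Derive the seed from the state rather than handing out the state.
        return hashlib.sha256(b"seed" + self._state).digest()

def pooled_seed(readings, attacker_input: bytes) -> bytes:
    pool = EntropyPool()
    for value in readings:
        pool.mix(value.to_bytes(2, "big"))
    pool.mix(attacker_input)  # known to the attacker, still mixed in
    return pool.seed()

# Two devices with different (made-up) readings and the same attacker input.
seed_a = pooled_seed([512, 498, 530, 507], b"attacker-chosen")
seed_b = pooled_seed([512, 498, 530, 508], b"attacker-chosen")
print(seed_a != seed_b)
```

Because the attacker's bytes are hashed in alongside the existing state, whoever doesn't know the earlier readings still can't work out the resulting seed.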

That mixing allows you to get up to a minimum threshold of randomness (the seed) at which you can use a cryptographically secure pseudorandom number generator (CSRNG). These also have proofs, of a different type, showing that input bits have an equal chance of modifying any bit of the output. That output can then be mixed back into the seed, giving you a very, very large amount of effectively good randomness that can be used for keys and the like.
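
Here is a toy sketch of that seed-then-expand step, using HMAC-SHA256 in counter mode and replacing the key after every request (a rough stand-in for "mixing output back into the seed"). `SketchGenerator` is a name I made up, and this is for illustration only. For anything real, use a vetted generator rather than something like this.

```python
import hashlib
import hmac

class SketchGenerator:
    """Illustrative only: HMAC-SHA256 in counter mode with rekeying."""

    def __init__(self, seed: bytes):
        self._key = seed
        self._counter = 0

    def _block(self) -> bytes:
        self._counter += 1
        message = self._counter.to_bytes(16, "big")
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def random_bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            out += self._block()
        # Feed generator output back in as the next key, so capturing a
        # later state does not directly reveal earlier outputs.
        self._key = self._block()
        return out[:n]

# Hypothetical seed; in practice this would come from the entropy pool.
seed = hashlib.sha256(b"hypothetical pooled readings").digest()
generator = SketchGenerator(seed)
first = generator.random_bytes(40)
second = generator.random_bytes(40)
print(len(first), first != second)
```

Note that everything the generator produces is fully determined by the seed, which is why the quality of what goes in matters so much.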

The trick here is that you're effectively at war with attackers: the more of your entropy sources an attacker can predict or control, the weaker your overall input to the CSRNG is going to be. If they can get this down to a small possibility space, they can predict the input to the CSRNG and in turn fully predict its output, which will reveal your keys.
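
A quick sketch of why a small possibility space is fatal: if the seed only has 16 bits of real uncertainty, an attacker can simply try every candidate. The generator here (`keystream`) is a hypothetical stand-in built from SHA-256, and the 16-bit seed is an assumption chosen to keep the demo fast.

```python
import hashlib

def keystream(seed_value: int, n: int = 16) -> bytes:
    """Stand-in generator: strong algorithm, but a tiny seed space."""
    material = b"demo-generator" + seed_value.to_bytes(4, "big")
    return hashlib.sha256(material).digest()[:n]

def recover_seed(observed: bytes, seed_bits: int):
    """Try every possible seed until one reproduces the observed output."""
    for candidate in range(2 ** seed_bits):
        if keystream(candidate, len(observed)) == observed:
            return candidate
    return None

victim_seed = 0xBEEF                  # unknown to the attacker
observed = keystream(victim_seed)     # output the attacker gets to see
found = recover_seed(observed, seed_bits=16)
print(hex(found))
```

The output bytes look perfectly random, yet 65,536 guesses are enough to recover the seed and with it everything else the generator will ever produce.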

If an attacker has a way to measure timings on the device a large number of times, they may be able to infer the internal state of the system and once again get your keys.

So the quality of the final output isn't really the problem, yet that's largely what people doing these projects analyze with these tests.

One final bit I'd like to cover. These tests can give you some information about the final quality of the output (mostly whether it's broken or not), but even for that they're usually used incorrectly. If the CSRNG is implemented correctly but, say, you always seed it with the value "0", it will pass the tests with flying colors.
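
You can see this for yourself with a sketch like the following. It expands the fixed seed "0" with SHA-256 in counter mode (my stand-in for a correctly implemented generator, not any particular library's) and runs a simple monobit frequency test in the style of the NIST SP 800-22 suite on the result. The function names are my own.

```python
import hashlib
import math

def stream_from_seed(seed: bytes, n_bytes: int) -> bytes:
    """Expand a seed with SHA-256 in counter mode (illustrative stand-in)."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

def monobit_p_value(data: bytes) -> float:
    """Frequency (monobit) test: are ones and zeros roughly balanced?"""
    n = len(data) * 8
    ones = sum(bin(byte).count("1") for byte in data)
    s_obs = abs(2 * ones - n) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

stream = stream_from_seed(b"0", 10_000)
p_value = monobit_p_value(stream)
print(f"monobit p-value: {p_value:.3f}")

# The same "random" bytes come out on every single boot.
same_again = stream_from_seed(b"0", 10_000)
print("identical on every boot:", stream == same_again)
```

The statistical test only looks at one stream, so it has no way to notice that every device produces exactly the same stream.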

Devices like these should be fully reset, have a small amount of randomness output, be fully reset and sampled again, and so on, thousands to millions of times. This will help you determine whether the range of possible inputs to the system is inherently flawed. Most projects I've seen (including this one) don't seem to do that.
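
As a sketch of how the results of such a reset test could be analyzed on the host, assuming you have logged the first output after each reset: count how many distinct values showed up and how often the most common one repeats. The data below is simulated (a made-up device that only ever boots into one of 8 states), and `analyze_first_outputs` is a hypothetical helper.

```python
import hashlib
import math
from collections import Counter

def analyze_first_outputs(first_outputs):
    """Summarize the first output collected after each of many resets."""
    counts = Counter(first_outputs)
    total = len(first_outputs)
    _, hits = counts.most_common(1)[0]
    return {
        "resets": total,
        "distinct": len(counts),
        "most_common_share": hits / total,
        "min_entropy_bits": -math.log2(hits / total),
    }

# Simulated flawed device: every output looks random on its own, but the
# seed only ever takes one of 8 values across 1000 resets.
simulated = [hashlib.sha256(bytes([i % 8])).hexdigest() for i in range(1000)]

report = analyze_first_outputs(simulated)
print(report)
```

Each individual output here would sail through a statistical test, but across resets there are only 8 distinct values, which is about 3 bits of uncertainty for an attacker to guess.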