I've done a bit of NES programming and really enjoy trying to cram stuff into such a tiny system.
One benefit of developing games for these old systems is that they are not moving targets. For hobby programming, you probably don't want to keep testing and recompiling old stuff to keep up with OS updates.
Even better, you'll have a small army of emulator developers making sure your games will work forever on every new platform. That includes browsers, since there are JavaScript emulators for many systems. If your games are particularly tricky to emulate, that's no problem. They will probably be added to everyone's test suites.
Something like an STM32 Discovery board is a good option for recapturing the mid-90s magic. You can get a ~200-MHz Cortex-M4 or M7 with a few MB of flash, external SDRAM, and a display for less than $100. They have really basic hardware 2D accelerators.
The on-chip peripherals are well-documented, but off-chip peripherals require some digging to figure out how to program correctly.
You can debug with GDB surprisingly easily, or find a Forth to throw on there and just start poking registers.
No. Microcontrollers are the improper solution for this problem.
You can run full-blown Linux efficiently on 500 MHz or 600 MHz processors like the STM32MP1, powered by AA batteries or other small battery packs.
There's also SAMA5D2, and a few other competitors in this space (both above, and below, the STM32MP1).
When we're talking about "consoles", that's "plug-and-play executables", meaning you now want a proper compile / library -> ELF + loader == Linux kernel, security, etc. etc.
Besides, a DDR2 chip gets you like 512MB of RAM for $4 and easily fits within the power-constraints of AA-batteries. There's very little benefit to going to the microwatt-scale devices like STM32 Discovery.
----------
Microprocessors for the win. Entry-level MPUs exist for a reason, and there's a ton of them well below the Raspberry Pi in terms of power / performance.
There are many at the 2D level of graphical performance, but 500 MHz is still a bit low for this. You'll probably want to reach for faster 1 GHz MPUs and push into the STM32MP2 if you're reaching into 3D levels of performance. (Which is beginning to look like a cut-down cellphone chip, really.)
I guess it depends on which part you think is fun. Using a big microcontroller is more about pushing the hardware to its limits. Using a small Linux system is about taking advantage of existing libraries. The Playdate has an STM32F7 and it seems to do pretty well as a console.
I liked the 32F746GDISCOVERY which is $56 at Digikey. It has a Cortex-M7 CPU, 1 MB built-in flash, 8 MB of SDRAM, and a 480x272-pixel touchscreen. Games can go on a microSD card. There's a USB OTG port you can use for input.
A low-res screen like this works well because the chip can't rescale its video output.
ST provides libraries for all the peripherals so it's pretty easy to jump in if you know C. I think MicroPython works on a lot of these boards, too.
I've been bitten by reordering. In my case, the toolchain developers implemented the reordering step in the assembler as an extra optimization step (on by default of course), so I had to disassemble the binary to even find the problem. They had redefined the assembly language semantics to require "volatile" keywords wherever you needed ordering maintained. I turned that particular optimization off.
It's tough but doable. You have to get on people's calendars a week or two out whenever you can, and if you're lucky, it eventually turns into easy, low-stress, open-ended playdates.
Meeting other parents is a huge effort, though! It's basically dating all over again. If your kids ride the school bus, that's a big help because you automatically meet nearby parents who are home in the afternoon. Otherwise, you have to go to lots of events and ask parents for their phone numbers, but the majority don't work out for random reasons.
I think the parent isn't complaining about whether it's doable so much as saying that the concept of play dates as a thing at all is what's infuriating. I agree with this as well.
When I was young, my parents knew the parents of maybe one or two of my friends, and that was largely because they knew each other from somewhere else. We didn't need to have our parents organize and sync up their schedules to go play together. We'd just go and meet up. If no one was around outside, we might go up and knock on the doors of a few friends to see if they wanted to do something. But ultimately, we largely had free rein.
Now, at a later age, I also really wouldn't want to have to get to know the parents of my kids' friends either. Meet once or twice to get to see them face-to-face, maybe get some basic contact info just in case, but for the most part, I don't really want my kids' social relationships to be based on how well I can get along with other parents (with a few small exceptions.)
A company I worked at had one of the guys behind the TrackPoint come in and give a technical talk once. He was great, absolutely obsessed with building a great user experience. The amount of thought and testing that went into that little nub was incredibly impressive. I'm not surprised that people love it so much.
Touchpads have a hard time competing on user experience because they need to be physically large to work well, and that's expensive.
I interned at Sandia Livermore and Los Alamos in college, then worked at Sandia's main site for a few years before moving out to the Bay Area to work in the more dynamic world of consumer electronics.
The labs are not for everyone, but it's the perfect job for some. If you want to work with fantastically smart people and don't mind following a lot of arbitrary rules, it can be a lot of fun. Most of my coworkers intended to spend their entire careers there.
Just like anywhere else, a lot of the day-to-day experience depends on the group you work with. In general, it's somewhere between a university campus and a defense contractor, and the mix is different for each project. The good part is that once you get a security clearance and make some friends in other groups, you can move around.
There might be some culture shock. Most employees have to be US citizens, so the labs are probably less diverse places than you might be used to. And you will really be hitting the brakes while you wait for a security clearance.
I'd say look at the job postings and give it a try! It didn't end up being for me, but I don't regret the time I spent at the labs. And it's tough to beat the work-life balance. You can't take a lot of the work home, and most people take every other Friday off (9/80 schedule).
But do consider the location carefully. For example, Sandia and Los Alamos are both huge and have a huge variety of projects, but you're stuck in Albuquerque or Los Alamos which can be limiting unless you really enjoy hiking.
If you can afford it, do it! The worst that can happen is you'll just get another job.
I quit a job about 15 years ago with no plans and enough money to last me a year or two. It taught me a lot about myself. I had planned on hiking and skiing and maybe starting my own company, but mostly I just sat around catching up on TV shows and movies because all my friends were working full-time. I also did some volunteer work to fill the time and keep up contact with other people.
After about a year of that, I was really bored, and it only took a month or two to line up another job, even in the 2008 recession. I took a pay cut relative to my previous job, but I met great new people, learned a lot of new stuff, and eventually got enough equity to more than make up the salary difference.
Google asked if I would do a phone screen a few years ago for an SRE position, and I thought that process was hilarious. In my case the questions had nothing to do with my background (microcontroller firmware for sensors) and it was only by incredible luck that I could answer them. For example, I had just read about some data structure the day before.
The recruiter didn't spend much effort telling me what the job was or why I should consider leaving my current great position for it. I said, "no thanks!" when I found out what an SRE actually does.
It seems like Google's interview process is designed to measure how much a candidate wants to work at Google. This is probably okay for them, but it's going to result in them overpaying for good people.
You can do a little better with a minimax polynomial that you can derive with the Remez algorithm. If you have MATLAB, you can use the excellent Chebfun package's remez command to do this almost instantly for any well-behaved function.
The best approximation I could find calculated sin(x) and cos(x) for [-pi/4,pi/4] and then combined these to cover all other inputs. These functions need only 4 or 5 terms to get +/-1 ULP accuracy for 32-bit floats.
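A minimal sketch of that quadrant scheme, for illustration only. The function names are made up here, and the coefficients are plain truncated Taylor terms standing in for real minimax coefficients, so this will not reach the +/-1 ULP figure quoted above. It only shows how the two small-range polynomials get combined.

```python
import math

# Placeholder coefficients: truncated Taylor series, NOT minimax coefficients.
def sin_core(r):
    """Approximate sin(r) for |r| <= pi/4 with a 5-term odd polynomial."""
    r2 = r * r
    return r * (1.0 + r2 * (-1.0 / 6 + r2 * (1.0 / 120
                + r2 * (-1.0 / 5040 + r2 / 362880))))

def cos_core(r):
    """Approximate cos(r) for |r| <= pi/4 with a 5-term even polynomial."""
    r2 = r * r
    return 1.0 + r2 * (-0.5 + r2 * (1.0 / 24
                + r2 * (-1.0 / 720 + r2 / 40320)))

def sin_approx(x):
    """sin(x) for moderate |x|, built from the two core polynomials."""
    k = round(x / (math.pi / 2))   # nearest multiple of pi/2
    r = x - k * (math.pi / 2)      # remainder, roughly in [-pi/4, pi/4]
    quadrant = k % 4               # Python's % is non-negative for negative k
    if quadrant == 0:
        return sin_core(r)         # sin(r)
    if quadrant == 1:
        return cos_core(r)         # sin(r + pi/2)   =  cos(r)
    if quadrant == 2:
        return -sin_core(r)        # sin(r + pi)     = -sin(r)
    return -cos_core(r)            # sin(r + 3*pi/2) = -cos(r)
```

The naive `x - k * (pi/2)` step is only good for small inputs; it loses accuracy quickly as |x| grows.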
I thought the harder problem was the argument reduction to enable calculating something like sin(10^9). It turns out that you need a lot of bits of pi to do this accurately. While testing my implementation, I was surprised to learn that the x87 fsin implementation only has 66 bits of pi, so it gives extremely wrong answers for large inputs.
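To get a feel for why the number of bits matters, here is a rough sketch that measures how far off the reduced argument is when pi is rounded to a given number of bits. It uses exact rational arithmetic and a 50-digit decimal value of pi as the reference. The helper names are hypothetical and this is not how any real libm or the x87 does its reduction; it only isolates the effect of a truncated pi.

```python
from fractions import Fraction

# pi to 50 decimal places, treated as the "exact" reference value.
PI_REF = Fraction("3.14159265358979323846264338327950288419716939937510")

def pi_rounded(bits):
    """pi rounded to `bits` significant binary digits, as an exact fraction."""
    scale = 2 ** (bits - 2)        # pi lies in [2, 4), so 2 bits sit above the point
    return Fraction(round(PI_REF * scale), scale)

def reduction_error(x, bits):
    """Absolute error in x - k*(pi/2) when pi carries only `bits` bits."""
    k = round(Fraction(x) / (PI_REF / 2))          # multiple of pi/2 to subtract
    exact = Fraction(x) - k * PI_REF / 2
    approx = Fraction(x) - k * pi_rounded(bits) / 2
    return float(abs(approx - exact))

for bits in (24, 53, 66, 128):
    print(bits, reduction_error(1e9, bits))
```

The error is just k times the error in pi/2, so it grows linearly with the input. For 1e9 a single-precision pi leaves the reduced argument off by many radians, and a fixed-width pi of any size eventually runs out as the input grows.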
> You can do a little better with a minimax polynomial that you can derive with the Remez algorithm.
Yup! This was something I meant to mention in a footnote. If I could get equivalent [infinite precision] error bounds using a minimax polynomial of a lower degree, I think this would make a difference. Otherwise, even if the theoretical error bounds are lower for a minimax polynomial of the same degree, it wouldn't do anything to combat the noise introduced by rounding errors, assuming you keep everything else the same - and that noise seems to be by far the limiting factor. But that's just conjecture - no way to know for certain without trying it.
I do still like the Chebyshev polynomial approach because it's the only non-iterative approach I know of. It's something that I could implement without the aid of a CAS program pretty trivially, if I wanted to.
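For reference, a bare-bones version of the non-iterative Chebyshev fit looks something like this. It is the generic textbook construction (sample at the Chebyshev nodes, take a cosine sum, evaluate with the Clenshaw recurrence), not necessarily what the article does, and the function names are made up for this sketch.

```python
import math

def cheb_coeffs(f, a, b, n):
    """First n Chebyshev coefficients of f on [a, b], from n node samples."""
    # Sample f at the zeros of T_n, mapped from [-1, 1] onto [a, b].
    samples = []
    for k in range(n):
        t = math.cos(math.pi * (k + 0.5) / n)
        samples.append(f(0.5 * (b - a) * t + 0.5 * (b + a)))
    # Closed-form cosine sum: no iteration to convergence is involved.
    return [
        (2.0 / n) * sum(fk * math.cos(math.pi * j * (k + 0.5) / n)
                        for k, fk in enumerate(samples))
        for j in range(n)
    ]

def cheb_eval(c, a, b, x):
    """Evaluate c[0]/2 + sum_{j>=1} c[j]*T_j(t) using the Clenshaw recurrence."""
    t = (2.0 * x - a - b) / (b - a)    # map x from [a, b] to [-1, 1]
    b1 = b2 = 0.0
    for cj in reversed(c[1:]):
        b1, b2 = 2.0 * t * b1 - b2 + cj, b1
    return t * b1 - b2 + 0.5 * c[0]

# 12 coefficients give a degree-11 approximation of sin on [-pi, pi].
coeffs = cheb_coeffs(math.sin, -math.pi, math.pi, 12)
```

Since sin is odd, the even-index coefficients come out as rounding noise and could be dropped.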
> The best approximation I could find calculated sin(x) and cos(x) for [-pi/4,pi/4] and then combined these to cover all other inputs.
Yes, that also seems to be the way to go (at least for precision). I didn't mention it in the article, but even though the code I showed was Rust, the actual implementation is in the form of an AST for a plugin system that only supports f32 addition, multiplication, division, and modulo as operations (no branching, f32<->int conversions, etc). My entire motivation was to create a sine function for this system. It's known that the inputs to the function will be in (-pi, pi), so I don't have to perform reduction over any larger range. I don't see a practical way to reduce down to (-pi/4, pi/4) using the operations available to me, but I could be overlooking something.
Under those constraints, Chebyshev approximation will probably outperform minimax. Minimax approximations have non-zero error at the endpoints, so they wouldn't quite hit zero at sin(pi) and your relative error would be horrible.
If you want exactly zero at the end points, you could do something like the post does, of approximating sin(x) / (x(pi+x)(pi-x)), or similar. You can still do that with the Remez algorithm.
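Here's a small sketch of the divide-out-the-zeros idea. It fits the quotient with a plain Chebyshev interpolation in u = x^2 rather than Remez, purely to keep the example short, and all the names are invented for illustration. The point is only that multiplying back by x(pi - x)(pi + x) pins the zeros at 0 and +/-pi no matter how rough the fitted quotient is.

```python
import math

PI2 = math.pi * math.pi

def quotient(u):
    """sin(x) / (x*(pi - x)*(pi + x)) in terms of u = x*x, for 0 < u < pi^2."""
    x = math.sqrt(u)
    return math.sin(x) / (x * (PI2 - u))

# Fit the smooth quotient on u in [0, pi^2] with N Chebyshev coefficients.
# The interior nodes never land on u = 0 or u = pi^2, so there is no 0/0.
N = 6
NODES = [math.cos(math.pi * (k + 0.5) / N) for k in range(N)]
SAMPLES = [quotient(0.5 * PI2 * (t + 1.0)) for t in NODES]
COEFFS = [
    (2.0 / N) * sum(s * math.cos(math.pi * j * (k + 0.5) / N)
                    for k, s in enumerate(SAMPLES))
    for j in range(N)
]

def sin_pinned(x):
    """sin(x) on [-pi, pi], with exact zeros at 0 and +/-pi by construction."""
    t = 2.0 * (x * x) / PI2 - 1.0      # map u = x*x from [0, pi^2] to [-1, 1]
    b1 = b2 = 0.0
    for cj in reversed(COEFFS[1:]):    # Clenshaw recurrence
        b1, b2 = 2.0 * t * b1 - b2 + cj, b1
    q = t * b1 - b2 + 0.5 * COEFFS[0]
    return x * (math.pi - x) * (math.pi + x) * q
```

Because the error is also multiplied by the vanishing factor, it shrinks toward the end points along with sin(x) itself.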
Also, a while ago I realized that you can tweak the Remez algorithm to minimize relative error (rather than absolute error) for strictly-positive functions - it's not dissimilar to how this blog post does it for Chebyshev polynomials, in fact. I should really write a blog post about it, but it's definitely doable.
So combining those two, you should be able to get a good "relative minimax" approximation for sin(x), which might be better than the Chebyshev approximation depending on your goals. Of course, you still need to worry about numerical error, and it looks like a lot of the ideas in the original post on how to deal with that would carry over exactly the same.
For nice enough smooth functions, the minimax approximation can only improve on Chebyshev by at most ~2 bits; this paper (http://www.uta.edu/faculty/rcli/papers/li2004.pdf) shows that the improvement is a fraction of a bit for elementary functions. I'm really not sure it's worth the effort and incurring more error around every pole to tamp things down around Chebyshev's worst case. For single floats, I did find it interesting to restrict the search to coefficients that can be represented exactly as single floats. For doubles, rounding of coefficients is probably noise unless you're writing a libm.
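As a toy illustration of searching only over single-float coefficients (not the search described above, just a guess at the general shape), the sketch below starts from Taylor coefficients rounded to float32 and nudges each one to a neighboring float32 value whenever that lowers the maximum error on [-pi/4, pi/4]. The helper names, the starting coefficients, and the pass limit are all assumptions. The polynomial is evaluated in double precision so that only coefficient rounding is being measured.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def f32_step(x, step):
    """Move a nonzero float32 value `step` representable values away from zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + step))[0]

GRID = [(-1.0 + i / 100.0) * math.pi / 4 for i in range(201)]

def max_error(c):
    """Max |p(x) - sin(x)| on the grid for p(x) = x*(c0 + c1*x^2 + c2*x^4 + c3*x^6)."""
    worst = 0.0
    for x in GRID:
        x2 = x * x
        p = x * (c[0] + x2 * (c[1] + x2 * (c[2] + x2 * c[3])))
        worst = max(worst, abs(p - math.sin(x)))
    return worst

def refine(c, passes=20):
    """Greedy search over float32 neighbors; only accepts strict improvements."""
    c = list(c)
    best = max_error(c)
    for _ in range(passes):
        improved = False
        for i in range(len(c)):
            for step in (-1, 1):
                trial = list(c)
                trial[i] = f32_step(c[i], step)
                err = max_error(trial)
                if err < best:
                    c, best, improved = trial, err, True
        if not improved:
            break
    return c, best

start = [to_f32(v) for v in (1.0, -1.0 / 6, 1.0 / 120, -1.0 / 5040)]
refined, refined_err = refine(start)
```

Stepping one neighbor at a time is far too slow to reach a true optimum from a Taylor starting point; a real search would start from a Remez or Chebyshev fit and only explore the last few bits.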