We've been experimenting with combining durable execution with debugging tasks, and it's working incredibly well! With the added context of actual execution data, where the developer defines which functions are important (instead of individual calls), it gives the LLM the data it needs.
I know there are AI SRE companies that have discovered the same -- that you can't just throw a bunch of data at a regular LLM and have it "do SRE things". It needs more structured context, and their value add is knowing what context and what structure is necessary.
You just reminded me of my time working at Sendmail, where I often had to telnet to port 25 of some machine, and pretend to be a mail server sending email.
I used to be able to send all the commands without having to look them up. Not sure I could still do that today.
I think I can still do it, 30 years after I last had to. The trauma of debugging sendmail m4 config issues for hours while the company e-mail remained dysfunctional has permanently etched it into my mind.
EHLO example.com
MAIL FROM:<foo@example.com>
RCPT TO:<bar@example.com>
DATA
Subject: Hello, World
I have crawled through the depths of hell to deliver unto you this message.
.
I haven't worked at Sendmail or even anything e-mail related, and I can do that… I've just done enough e-mail fixing as side work. Let's call it sysadmin calluses.
What made me stumble recently was having to talk LMTP to fix a mailman setup. Cheeky fuckers changed EHLO into LHLO for LMTP. (To avoid any mixups between the protocols, which is fair.)
Also, the To doesn't need to match. When you send to a group via BCC, the envelope RCPT TO has to specify the exact recipient, but the headers inside DATA don't. Similar with the envelope MAIL FROM and the From in DATA -- also useful to control bounces or who gets a reply.
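If anyone wants to poke at the envelope-vs-header split without typing at port 25, here's a minimal Python sketch (assuming some SMTP server is listening on localhost:1025 for testing; the addresses are just illustrative). The headers the recipient sees and the envelope the server routes on are passed separately:

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "Newsletter <news@example.com>"  # the From people see in their mail client
    msg["To"] = "bar@example.com"                  # the To people see; doesn't have to list everyone
    msg["Subject"] = "Hello, World"
    msg.set_content("I have crawled through the depths of hell to deliver unto you this message.")

    with smtplib.SMTP("localhost", 1025) as smtp:
        smtp.send_message(
            msg,
            from_addr="bounces@example.com",                  # envelope MAIL FROM: where bounces go
            to_addrs=["bar@example.com", "bcc@example.com"],  # envelope RCPT TO: actual delivery
        )

Bounces go to the envelope sender, and bcc@example.com receives the message without appearing anywhere in the headers -- the same thing you'd do by hand with MAIL FROM / RCPT TO over telnet.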
Yeah, I know the protocol and can do that manually, because I had to debug it often enough.
As is the case with most vibe coded software, it wasn't polished, didn't work very well, had lots of edge cases, and was pretty much bespoke to my one use case. :)
It answered the question "what the heck is this software sending to the LLM" but that was about all it was good for.
> You realize that stamina is a core bottleneck to work
There has been a lot of research that shows that grit is far more correlated to success than intelligence. This is an interesting way to show something similar.
AIs have endless grit (or at least as endless as your budget). They may outperform us simply because they don't ever get tired and give up.
Full quote for context:
Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.
>They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day.
"Listen, and understand! That Terminator is out there! It can't be bargained with. It can't be reasoned with. It doesn't feel pity, or remorse, or fear. And it absolutely will not stop... ever, until you are dead!"
If I tell it to implement something it will sometimes declare its work done before it's done. But if I give Claude Code a verifiable goal like making the unit tests pass, it will work tirelessly until that goal is achieved. I don't always like the solution, but the tenacity everyone is talking about is there.
> but the tenacity everyone is talking about is there
I always double-check that it didn't simply exclude the failing test.
The last time I had this, I discovered it later in the process. When I pointed it out to the LLM, it responded that it had acknowledged the fact of ignoring the test in CLAUDE.md, and that this was justified because [...]. In other words, "known issue, fuck off".
> If you ever work with LLMs you know that they quite frequently give up.
If you try to single shot something perhaps. But with multiple shots, or an agent swarm where one agent tells another to try again, it'll keep going until it has a working solution.
Yeah, exactly. This is a scope problem; actual input/output size is always limited. I am 100% sure CC etc. are using multiple LLM calls for each response, even though from the response streaming it looks like just one.
Context matters for an LLM, just like for a person. When I write code I add TODOs because we can't context-switch to every other problem we notice along the way.
You can keep the agent fixated on the task AND have it create these TODOs, but ultimately it is your responsibility to find them and fix them (with another agent).
Using LLMs to clean those up is part of the workflow that you're responsible for (... for now). If you're hoping to get ideal results in a single inference, forget it.
I realized a long time ago that I’m better at computer stuff not because I’m smarter but because I will sit there all day and night to figure something out while others will give up. I always thought that was my superpower in the job industry but now I’m not so sure if it will transfer to getting AI to do what I need done…
Same, I barely made it through Engineering school, but would stay up all night figuring out everything a computer could do (before the internet).
I did it because I enjoyed it, and still do. I just do it with LLMs now. There is more to figure out than ever before and things get created faster than I have time to understand them.
LLM should be enabling this, not making it more depressing.
Me three. I was not as smart as many of my peers in uni but I freakin LOVE the subject matter and I also love studying and feeling that progress of learning, which led me to put in the huge number of hours necessary to be successful and have a positive attitude the whole time.
But even tenacity is not enough. You also need an internal timer. "Wait a minute, this is taking too long, it shouldn't be this hard. Is my overall approach completely wrong?"
I'm not sure AIs have that. Humans do, or at least the good ones do. They don't quit on the problem, but they know when it's time to consider quitting on the approach.
> AIs have endless grit (or at least as endless as your budget).
That is the only thing he doesn't address: the money it costs to run the AI. If you let the agents loose, they easily burn north of 100M tokens per hour, and at $25 per 1M tokens that's $2,500 an hour, which gets expensive quickly. At some point, when we are all drug^W AI dependent, the VCs will start to cash in on their investments.
LLMs do not have grit or tenacity. Tenacity doesn't describe a machine that doesn't need sleep or experience tiredness or stress. Grit doesn't describe a chatbot that will tirelessly spew out answers and code because it has no stake or interest in the result, never perceives that it doesn't know something, and never reflects on its shortcomings.
> because they’re trying to normalise the AI’s writing style,
AIs use em dashes because competent writers have been using em dashes for a long time. I really hate the fact that we assume em dash == AI written. I've had to stop using em dashes because of it.
Likewise, I’m now reluctant to use any em dashes these days because unenlightened people immediately assume that it’s AI. I used em dashes way before AI decided these were cool
I don't know about other areas, but here in the Bay Area (or at least Silicon Valley) our Whole Foods has subsumed all the services provided by Amazon Fresh (and Go really never worked). So we're not really losing any services, just the brand name.
This is something I've been lamenting for a long time. The lack of shared culture. Sometimes a mega-hit briefly coalesces us, but for the most part everyone has their own thing.
I miss the days when everyone had seen the same thing I had.
I found this the other day: https://www.youtube.com/watch?v=ksFhXFuRblg "NBC Nightly News, June 24, 1975" I strongly urge people to watch this, it's 30 minutes but there are many very illuminating insights within. One word for you: Exxon.
While I was young in 1975, I did watch ABC's version of the news with my grandparents, and continued up through high school. Then in the late 1980s I got on the Internet and well you know the rest.
"Back Then", a high percentage of everybody I or my grandparents or my friends came into contact with watched one of ABC, NBC, or CBS news most nights. These three networks were a bit different, but they generally they all told the same basic stories as each other.
This was effectively our shared reality. Later in high school as I became more politically focused, I could still talk to anybody, even people who had completely opposite political views as myself. That's because we had a shared view of reality.
Today, tens of millions of people see the exact same footage of an officer-involved shooting, from many angles, and draw entirely different 'factual' conclusions.
So yes, 50 years ago, we in the United States generally had a shared view of reality. That was good in a lot of ways, but it did essentially allow a small set of people in power to convince a non-trivial percentage of the US population that Exxon was a friendly, family-oriented company that was really on your side.
Worth the trade-off? Hard to say, but at least 'back then' it was possible, and even common, to have grounded political discussions with people 'on the other side', and that's pretty valuable.
> 'back then' it was possible, and even common, to have grounded political discussions with people 'on the other side'
As long as that common ground falls within acceptable parameters; couldn't talk too much about anything remotely socialist or being anti-war.
"The smart way to keep people passive and obedient is to strictly limit the spectrum of acceptable opinion, but allow very lively debate within that spectrum."
I don't know if it's good or bad but, outside of some megahit films, people mostly don't regularly watch the same TV series. I don't even have live TV myself.
France kicked the US military out of France in 1966 and left NATO's military command structure.
They largely rejoined in 2009 (and very deliberately never rejoined NATO's Nuclear Planning Group), but if any NATO member is capable of going it alone on this one, it's probably France.
240 nukes on subs is plenty to wave around as a stick, too.
For what it’s worth, Philo Farnsworth and John Logie Baird were friendly with each other. I was lucky to know Philo’s wife Pem very well in the last part of her life, and she spoke highly of Baird as a person.
David Sarnoff and RCA was an entirely different matter, of course…
One of his electro-mechanical units was on display in Victoria, Australia. Most amazing assemblage; you can sort of get the idea of how it worked just from looking at it.
I read online that towards the end, Baird was proposing a TV scan rate we'd class as HD quality, which lost out to the 405-line standard (which preceded 625-line colour).
There is also a quality of persistence in his approach to things, he was the kind of inventor who doesn't stop inventing.
Whatever we all call television now, television then was literally "vision at a distance", which Baird was the first to demonstrate (AFAIK).
The TV I have now in my living room is closer to a computer than a television from when I grew up (born 1975) anyway, so the word could mean all sorts of things. I mean, we still call our pocket computers "phones" even though they are mainly used for viewing cats at a distance.
You should read about the invention of color television. There were two competing methods, one of which depended on a spinning wheel with colored filters in it. If I remember correctly, you needed something like a 10-foot wheel to have a 27-inch TV.
Sure enough, this was the system selected as the winner by the U.S. standard-setting body at the time. Needless to say, it failed and was replaced by what we ended up with... which still sucked because of the horrible decision to go to a non-integer frame rate. Incredibly, we are for some reason still plagued by 29.97 FPS long after the analog system that required it was shut off.
Originally you had 30fps, it was the addition of colour with the NTSC system that dropped it to 30000/1001fps. That wasn't a decision taken lightly -- it was a consequence of retrofitting colour onto a black and white system while maintaining backward compatibility.
When the UK (and Europe) went colour it changed to a whole new system and didn't have to worry too much about backward compatibility. It had a higher bandwidth (8 MHz, so 33% more than NTSC), and was broadcasting on new channels separate from the original 405 lines. It also had features like alternating the phase of every other line to reduce the "tint" or "never twice the same color" problem that NTSC had.
America chose 30fps but then had to slow it by a factor of 1000/1001 to avoid interference.
Of course, by the 90s and the growth of digital there was already far too much stuff expecting "29.97" Hz, so it remained, again for backward compatibility.
In the UK the two earliest channels (BBC1 and ITV) continued to broadcast in the 405 line format (in addition to PAL) until 1985. Owners of ancient televisions had 20 years to upgrade. That doesn't seem unreasonable.
An engineer at RCA in New Jersey told me that at the first (early) NTSC color demo, the interference was corrected by hand-tweaking the color sub-carrier oscillator from which the vertical and horizontal intervals were derived, and the final result was what we got.
The interference was caused when the spectrum of the color sub-carrier overlapped the spectrum of the horizontal interval in the broadcast signal.
Tweaking the frequencies allowed the two spectra to interleave in the frequency domain.
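For anyone who wants to see where the 1/1001 comes from numerically, here's a back-of-the-envelope sketch (my understanding of the usual derivation, not an official spec): the line rate was nudged to an exact submultiple of the 4.5 MHz sound carrier so the chroma and sound spectra would interleave, and the odd frame rate simply falls out of that.

    # How 29.97 falls out of retrofitting colour onto the B&W NTSC signal
    sound = 4_500_000                  # Hz: sound intercarrier, fixed for backward compatibility
    line_rate = sound / 286            # line rate nudged to an exact submultiple -> 15734.27 Hz
    subcarrier = 455 / 2 * line_rate   # colour subcarrier, odd half-multiple of the line rate -> ~3.579545 MHz
    frame_rate = line_rate / 525       # 525 lines per frame -> 29.97003 fps, i.e. 30 * 1000/1001

    print(f"{line_rate:.2f} Hz, {subcarrier / 1e6:.6f} MHz, {frame_rate:.5f} fps")
    # 15734.27 Hz, 3.579545 MHz, 29.97003 fps (vs. the original 15750 Hz / 30 fps)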
Understanding the effect of the 1.001 fix has given me tons of job security. That understanding came not just from book learning, but from OJT working in a film/video post house that had engineers, colorists, and editors who were all willing to entertain a young college kid's constant use of "why?". Then being present for the transition from editing film on flatbeds to editing film transfers to video. Part of that came from having to transfer audio from tape reels to video by changing to the proper 59.94 Hz or 60 Hz crystal needed to control the player's speed. We also had a studio DAT deck that could slow down 24fps audio recorded in the field to play back at 23.976.
Literally, to this day, I am dealing with all of these decisions made ~100 years ago. The 1.001 math is a bit younger, from when color was rolled out, but what's a little rounding between friends?
For one thing, it's much easier to measure spans of time when you have an integer frame rate. For example, 1 hour at 30fps is exactly 108,000 frames, but at 29.97 it's only 107,892 frames. Since time code can only count whole frames, "drop-frame" time code is used, where certain frame numbers are skipped so that by the end of each measured hour the elapsed time syncs back up with the time code, i.e. "01:00:00;00" falls after exactly one hour has passed. This is of course crucial when scheduling programs, advertisements, and so on. It's a confusing mess and historically has caused all kinds of headaches for the TV industry over the years.
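For the curious, here's a rough sketch of how the usual drop-frame counting works (a hypothetical helper, not any broadcast library's API): frame numbers 00 and 01 are skipped at the start of every minute except minutes divisible by 10, which drops 2 × 54 = 108 numbers per hour, so the label "01:00:00;00" lands on frame 107,892 -- one real-time hour.

    def frames_to_df_timecode(frame_count: int) -> str:
        # 29.97 drop-frame: skip frame numbers 00 and 01 at the start of
        # every minute, except minutes divisible by 10.
        fps, drop = 30, 2
        per_10min = 10 * 60 * fps - 9 * drop   # 17982 frames per 10-minute block
        per_min = 60 * fps - drop              # 1798 frames in each "dropped" minute

        blocks, rem = divmod(frame_count, per_10min)
        if rem < 60 * fps:                     # minute 0 of the block keeps all 1800 numbers
            minute, frame = 0, rem
        else:
            extra, frame = divmod(rem - 60 * fps, per_min)
            minute, frame = extra + 1, frame + drop   # numbers 00/01 were skipped this minute

        hh, mm = divmod(blocks * 10 + minute, 60)
        ss, ff = divmod(frame, fps)
        return f"{hh:02}:{mm:02}:{ss:02};{ff:02}"

    print(frames_to_df_timecode(107_892))   # 01:00:00;00 -- exactly one real-time hour
    print(frames_to_df_timecode(1_800))     # 00:01:00;02 -- frames 00/01 of minute 1 skipped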
I had a communications theory class in college that addressed "vestigial sideband modulation," which I believe was implemented by Farnsworth. I think this is a critical aspect of the introduction of television technology.
In the United States in 1935, the Radio Corporation of America demonstrated a 343-line television system. In 1936, two committees of the Radio Manufacturers Association (RMA), which is now known as the Consumer Electronics Association, proposed that U.S. television channels be standardized at a bandwidth of 6 MHz, and recommended a 441-line, interlaced, 30 frame-per-second television system. The RF modulation system proposed in this recommendation used double-sideband, amplitude-modulated transmission, limiting the video bandwidth it was capable of carrying to 2.5 MHz. In 1938, this RMA proposal was amended to employ vestigial-sideband (VSB) transmission instead of double sideband. In the vestigial-sideband approach, only the upper sidebands (those above the carrier frequency), plus a small segment or vestige of the lower sidebands, are transmitted. VSB raised the transmitted video bandwidth capability to 4.2 MHz. Subsequently, in 1941, the first National Television Systems Committee adopted the vestigial sideband system using a total line rate of 525 lines that is used in the United States today.
The thing is that "television" seemed like a thing but really it was a system that required a variety of connected, compatible parts, like the Internet.
Different pieces of what became TV existed in 1900, the challenge was putting them together. And that required a consensus among powerful players.
I think it would be pretty uncontroversial from the technological point of view, but then, the first "real" TV broadcast would be the 1936 Olympic games...
You're skipping a few steps (like the Altair 8800) if you say that Apple invented the PC as we know it. Apple didn't even invent the GUI as we know it.
Philo Farnsworth invented the cathode ray tube. Unless you're writing this from the year 2009 or before, I'm going to have to push back on the idea that TVs TODAY are based on his technology. They most certainly are not.
'Although he failed to gain much recognition in the West, he built the world's first all-electronic television receiver, and is referred to as "the father of Japanese television"'
How much AI is too much? Are you allowed to use Photoshop to create your digital art? Almost every tool there is now powered by AI in some way (some a lot more than others). Can you use its auto-fill button? What percent of the image can you use it for?
Can you generate something with AI and then manually edit it in Photoshop? How much manual editing is required before it's not considered AI anymore?
My point is, AI is another tool in the toolbox, it can be used well or poorly. How much is too much? Just like back in the day, using Photoshop wasn't allowed, until it was.
This isn't a "gotcha" experiment, it's a real question. And the problem with "I know it when I see it" is that right now people are biasing towards "if it's good it must be AI" and accusing legit artists and writers of being AI.
It happened to me. I spent 10 minutes writing a reddit comment. I researched it, I sourced it. It had sections with headlines, bullets, and even em dashes. 100% written by me.
As soon as I posted it, it was downvoted and I got PMs saying "don't post this AI slop!".
The problem is the AI has been trained on well executed material, and when you execute well, you look like an AI.
I mean, the art community has always been contentious. Artists fight over everything: copying style or composition, claiming people are tracing others' works. It's honestly not new, same thing in new clothes.
Clair Obscur: Expedition 33 (recently game of the year) got caught leaving a small amount of placeholder AI content in their game and everyone lost their mind
I’m sure reasonable artists agree with you, but many today do not
The first is that all “AI” is not equal. It’s specifically generative AI that most take issue with, mostly due to questionable ethics in training. Image editors have employed techniques marketed as “AI” for many years that are mostly or entirely unrelated to modern generative AI.
The second is that whether something is “AI art” is a spectrum, not binary. On one end you have creations in which generative AI played no role and on the other you have images that were generated off of nothing but a prompt or vague scribbles. In the middle you have things like images where the artist traced over an AI image or used bits and pieces of generated imagery. Probably the closest shorthand for where an image lands on the spectrum is to what degree the creator engaged their artistic skills.
A great many digital artists would be happy to use Photoshop 7/CS1/CS2, all long predating generative AI, if those ran on modern operating systems. Some prefer modern simplistic (and without AI) tools like Paint Tool SAI.
That might be part of the equation, but for many it’s a strong gut reaction to having the work of themselves and others taken without consent, turned into the visual equivalent of pink slime and press-formed into other shapes, and sold as a service. It just feels wrong. Even I get a little squeamish thinking about it, and I only do art in a minor/hobby capacity — it’s something I’ve put time into, but it doesn’t pay my bills.
If training were purely ethical, the creative community probably still wouldn’t love generative AI, but it probably wouldn’t hate it nearly as much either. It’s the cavalier attitude towards violation of consent for the sake of profit that really seals the deal.
For many of us, even if drawing that line exactly is debatable, a prompt-generated image, where the "artist" didn't interact with any of the pixels is across the line for "too much AI".
It can definitely take creativity and fortitude to get an AI model to draw what you want it to. But if you worked at a fantasy publishing house and commissioned a cover painting, it might take a fair amount of work for you to get the artist to create something in line with what you envisioned. But you wouldn't get artistic credit for the resultant painting; the artist would! If AI is creating the piece, it is the artist; and you're merely the commissioner of the work.
> But if you worked at a fantasy publishing house and commissioned a cover painting, it might take a fair amount of work for you to get the artist to create something in line with what you envisioned.
If you do this infrequently, you're a commissioner of work.
If you do it daily, in-house, for your own products... you might just have the title "Art Director."
And the best Art Directors today almost all have a background in creating art themselves, in some fashion. I suspect that will remain true in the AI world as well, at least for the foreseeable future.
It's not that deep, if someone thinks it's AI it loses value to them. If you're able to utilize AI tools in a way that doesn't make the output look like AI to the average person you'll be fine.
Eventually no one will be able to really tell the difference and all of this will go away (though likely at the expense of more people's livelihoods).
I see that as being partially true. There will be people like Walter Keane that take the art of others and state it is their own work. [0] AI will assist those individuals.
AI will have a home on people's desktops for those that accept it.
Art that has value will not be AI art. Artists like Margaret Keane will continue to be viewed as exceptional, along with their works. [1]
Personally, I view AI art as lacking passion and an attempt to short circuit the path to profit / greed. I wish to not fund that circuit.
A lot of prolific artists historically have passed their own work off to apprentices (for hundreds of years), and there's probably not much different here.
This frees the lead artist up for more conceptual work. AI can potentially carry on this tradition, and at less of an expense.
The truth of many great artists is that they needed to produce a lot more work than they'd have liked to make a living for themselves.
And as I learn about such individuals taking credit for others' work I have nothing but disgust for them. This also applies to STEM.
Ivan Pavlov's assistants realized the dogs were salivating. Ronald Hare figured out how to mass-produce penicillin and saved thousands of lives, but received no Nobel recognition for the triumph, succeeding where those that got the prize had failed. [0]
I personally will never fund AI art / AI artists. There is no personality in such works, only slop. Profit is not why artists create the art they do, and if it is, their works lack soul.
I think you're being too idealistic in these stances, it's just not how it's ever worked.
> And as I learn about such individuals taking credit for others' work I have nothing but disgust for them
Historically, this wasn’t considered "taking credit." Leonardo da Vinci’s studio for example, involved apprentices completing significant portions of works under him. Buyers were aware they were commissioning a studio work and not a pair of hands. Koons or Hirst do it today, and it's no secret... it's how prolific artists have operated for centuries, they just can't keep up with the demand for their work solo.
The real soloists, like Van Gogh for example, were mostly obscure while they were alive... didn't really have any patrons, and didn't produce enough work to live off of.
To call the destitute failures "true artists" and someone like da Vinci "soulless" is pointless, because we don't decide these things, history does.
> Ronald Hare figured out how to mass-produce penicillin and saved thousands of lives, but received no Nobel recognition for the triumph, succeeding where those that got the prize had failed.
I don't think this is true? Hare found success from trying to replicate Fleming's results. Fleming, Florey and Chain were jointly awarded the Nobel for the original discovery.
> I personally will never fund AI art / AI artists. There is no personality in such works, only slop
There will come a point in the near future where you won't be able to tell the difference. I lived through the same argument with Photoshop in photography.
Well I think we're just describing taste and craft. AI tools will get better, more granular, and become better integrated into the actual workflows of people over time. A good tool shouldn't take over my sense for taste and craft.
It's a good thing people are pushing back against the slop if we want there to be any incentives for AI tools to not be geared towards helping make slop.
This argument resonates with me - but it's the same argument that has been made and artists have ignored or put up the same (unconvincing, in my opinion) arguments against the whole time. As you pointed out, this same discussion has been had every step of the way with digital art - from things like photoshop, to the tools that have been gradually introduced inside of photoshop and similar, to even things like brush packs, painting over kitbashes, etc. The traditionalist viewpoint holds strong, until the people arguing blink and realize everyone else eventually stopped caring and did what worked best for them.
At this point, I believe it's not a matter of intellectual honesty or actually disagreeing with any of it - it's just about outcomes. They don't want to see their work devalued, their sources of income drying up. It's an understandable fear. No one who enjoys the work they do enjoys the prospect of potentially having to change careers to keep making a living. Hell, most people that don't enjoy what they do have no desire to have to try and find a new career.
But humans are selfish. The same artists who are worried about technology taking their job will lavish praise on technology in other areas that has eliminated jobs, with my recurring example being how happy they are that they no longer have to pay a web dev to build them a portfolio site and can instead just go to Squarespace and pay a fraction of the cost. No one laments how there are basically no independent web designers building small sites anymore - it's just not a viable career. It's all been consolidated into shops working for big clients or pumping out themes for Wordpress, Squarespace, and Shopify. And of course, there are countless examples of this throughout history.
I'm not sure AI is going to be the great job destroyer we fear it is. I'm not sure it isn't, either. So I get it. This has a chance to force an issue on a massive scale that usually is much more limited in blast radius.
But to answer the question - I don't think it actually matters to them what the line is from any sort of rational perspective. It will move and shift based on the conversation to wherever they think it needs to be to protect themselves.
The main argument artists use isn't that it is taking their job. The problem is that it was trained on their work without their consent and without compensation. This is fundamentally different from a WordPress or Squarespace, and arguably different from models trained on open source software only.
A result of a prompt you can't, I believe. You can't trace over a copyrighted work and claim it as your own either, so I'd say that tracing over an AI-generated image would not fly either. But IANAL, so the details are to be fleshed out. It would also probably break down if one uses a model that is not trained on any copyrighted data.
AI generated images themselves can't be copyrighted, but if you modify them they can be considered copyrightable, that's the current landscape, though it's a pretty new legal standard so we'll see how it plays out
It's funny you chose that as your example, because there are very strict definitions of when day becomes night. I think what you were looking for was "when does someone become bald" or "when does an acorn become a tree".
There is a reason those are classic philosophical questions. Because they highlight the fact that while it is easy to identify the ends of the spectrum, it's impossible to find the midpoint, because everyone has a different lived experience.