All I can say is, I asked Siri today (verbatim): What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit — and it offered a web search about fahrenheit. The "and" completely disabled its most basic ability to do metric conversions.

So, it's nice to see Apple doing research and talking about it, but we're out here waiting, still waiting, for anything useful to come of it all on our thousand-dollar devices that literally connect us to the world and contain our entire life's data. Something useful is what I would've expected from one of the most valuable companies in the world.



You asked 2 questions in a system made for 1 question at a time. Split these up and Siri answers them fine. You’re holding it wrong.


A tool that can handle more than one question at a time is useful. Modern LLMs handle that with ease. So it's completely reasonable to be critical of that limitation.


Sure, what’s not reasonable is expecting Siri to be a modern LLM, when they know it’s not. They asked a question they knew Siri couldn’t handle just to slam it. I’m not critical of a 5-function calculator for not one-shotting complex equations like a computer.

While Siri only does one thing at a time, I trust the answer more, because it’s doing the actual math and not just guessing the most likely answer, like an LLM. We need to pick the right tool for the right job. Frankly, I don’t think an LLM is the right tool for conversations like this. Jumbling multiple questions into a single prompt is something people do with LLMs to get more use out of them during the day; it’s an adaptation to the limits of the free tier (and sometimes the speed) of the LLM.


On an Android phone, the equivalent voice assistant (Gemini) handles the question gracefully. Regardless of what you think about Google, having a single-button, LLM-powered voice assistant deeply integrated into the phone's OS is a very useful feature, and Apple is quite far away from developing a competing version of this. They'll have to buy it or go without.


It’s not unreasonable

Amazon already reworked Alexa to be backed by an LLM months ago, even though they were delayed in doing it.

You’re telling me that Apple isn’t capable of doing the same with Siri?


The unreasonable part is acting like Siri got its big LLM update, when they know it didn’t. Just like it would be unreasonable to expect any famously delayed, or unannounced, feature to magically start happening.

Amazon just needs a generic LLM. Apple, from the sound of it, is trying to create deep integration with the OS and on-device data. That’s a different problem to solve. They also seem to be trying to do it while respecting user privacy, which is something most other companies ignore.

I don’t see what the big deal is. I’d rather wait for something good than have them rush out a half-assed “me too” chatbot that is indistinguishable from the dozens of other chatbots I can simply download as an app.

If we believe what Craig Federighi said, they had something, it just wasn’t up to their standards when talking about rolling it out to a billion devices. Which is fair, I run into bad data from ChatGPT and other LLMs all the time. Letting it mature a little more is not a bad thing.

ChatGPT spent a couple of months getting my dad pumped up for an elective open heart surgery; he was almost arrogant going into it about how the recovery would go, thinking ChatGPT had given him all the info he could possibly need and a balanced view of reality. Reality hit him pretty hard in the ICU. He sent me some of the chats he had; it was a lot of mutual ego stroking. He was using ChatGPT to downplay the negatives from the doctors and boost the positives. While it’s good to feel confident, I think it went too far. I spent the whole week in the hospital trying to pull him out of his depression and recalibrate the unrealistic expectations that ChatGPT reinforced. I hope Apple finds a way to be more responsible. If that takes time, great.


Why is Siri being discussed in the context of LLMs and Apple Intelligence? Have they already released Siri 2.0 or am I missing something?


The OP is making a point that Apple is behind. They might be publishing research, but it’s completely useless to the end user buying their products.


A plethora of LLMs are available on Apple platforms. If someone wants a chatbot, they can get a chatbot on Apple products. It’s not hard.

Are all Android users using Gemini exclusively? Are all Windows users only using Copilot? Where is the native Linux desktop LLM?

I really don’t understand this criticism. Would it be nice if Siri could do more? Sure. Do I have tolerance for Siri to start hallucinating on simple problems it used to use real math for? No. Do I have other options to use in the meantime to get the best of both worlds? Absolutely. Where is the hardship?


Siri is the default and only voice assistant that has access to all the data on your phone. It doesn't matter if I have ChatGPT, Claude, Gemini, or another SOTA model on my iPhone—I can't easily activate them in the car or in another hands-free situation, or use them with any other app or data on my iPhone.


Replace "LLMs" with "competitors" and maybe you'll see the point..


The LLMs aren't necessarily competitors. Apple doesn't need to have the best all-around LLM. They need to create an AI with excellent integration into their OS and the data users store on those systems. Beyond that, they need a good system for plugging into whatever other generic LLM a person might want or need. Having something decent out of the box is nice for basic questions, but being able to easily switch to whatever specialist company is in the lead, or best suited to a user's needs, is a lot better than being stuck with one first-party option. Based on how ChatGPT looks in Apple's Settings, I wouldn't be surprised if this is the plan.

Much like with the internet, Apple didn't need to re-invent every website to own it all. From Apple platforms a user can access Amazon, Google, or whatever else. Apple didn't create the internet, they sold a gateway to it. AI could be done largely the same way. This way it doesn't matter who wins, Apple can support it. At the end of the day, an LLM doesn't exist on its own, it needs to be accessed through hardware/software people enjoy using, and not be yet another device to charge and carry. Apple has a very popular phone and the most popular wearable. This positions them very well. They are often late to the party, but tend to be best dressed. The first iPhone didn't even have video, and people clowned them for it, and now iPhone video is largely considered one of the best in the smartphone world.


Never mind that Infocom games running on my Apple ][+ could handle that sort of command in 1983.

(Well, with multiple direct objects, anyway.)


"holding it wrong" was exactly the right phrase given how that phrase was used with the iPhone antenna bridging problem. This is an Apple product failing.


"You haven't contorted your comically simple query enough to make the brittle tool work. Throw the chicken bones better next time."


It’s been this way for over a decade. If someone hasn’t figured it out by now, that’s kind of on them.

I’m not even sure why those two things would be asked as a single question. It seems like a very unnatural way to pose those two questions. Most humans would trip on that, especially if it was asked verbally.


> It seems like a very unnatural way to pose those two questions. Most humans would trip on that

I'd assume GP only gave an example. As a pretty frequent user, I can unfortunately only confirm that Siri trips over almost every multi-part question.

This would be forgivable if there weren't multiple voice-based AI consumer products available that can handle these kinds of requests perfectly.


And Apple has integrated one of them, ChatGPT, to do just that.

If they wanted an LLM answer they could have got one. They went out of their way just to take shots at Apple.


I can’t talk to ChatGPT hands-free on my Apple devices, but I can to Siri.

Besides that, many people don’t install any apps, and Apple not pre-installing a reasonable LLM to cater to that market just seems incredibly out of character.

And there’s enough credible reporting and personnel reshuffling happening to suggest that it’s not available yet because they failed to make it work, not because they didn’t try.


OP isn't asking how to use Siri to do his contrived task. OP is saying that Siri in 2025 should be able to handle that relatively simple albeit contrived task.


> What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit

Err, what? As a human native English speaker, that's a pretty confusing question to me, too!


First, most of the English-speaking world are not native speakers.

"As of 2022, there were about 400 million native speakers of English. Including people who speak English as a second language, estimates of the total number of Anglophones vary from 1.5 billion to 2 billion."

Second, all the popular models I tested did well with that query, including Gemini on Android (aka "OK Google"); Apple's was the only exception.

https://en.m.wikipedia.org/wiki/English-speaking_world


I am not sure why you went off on the subject of the English-speaking world, etc. Anyway, the models you tested with that query (and I am not sure why we think it is a good benchmark): are they local models running on a wireless device, or do they use a datacenter and only convey the text back and forth?


I'm fairly sure Siri still sends user voice samples to a data center. At least for a while, it used to use Multipath TCP to decrease latency over multiple available network connections, if I'm not misremembering.

Some modern Apple devices support "local Siri", but it's a limited subset of both voice recognition performance and capabilities.
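
Incidentally, Multipath TCP is a public API on iOS these days. Here's a minimal sketch of how any app can opt into it (the endpoint is a placeholder, and whether Siri still uses MPTCP under the hood is just my guess):

    import Foundation

    // Ask URLSession to use Multipath TCP where available (iOS 11+).
    // .interactive tells MPTCP to prefer the lowest-latency interface.
    let config = URLSessionConfiguration.default
    config.multipathServiceType = .interactive
    let session = URLSession(configuration: config)

    // Placeholder endpoint, not an actual Apple service.
    let task = session.dataTask(with: URL(string: "https://example.com/voice-query")!) { data, _, error in
        print(data?.count ?? 0, error?.localizedDescription ?? "ok")
    }
    task.resume()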


>> What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit

Probably wouldn't have made a difference but the second half of that statement isn't exactly clear. 85 degrees what?

I also think when you're chaining these two separate calculations together you get a problem when it comes to displaying the results.


That exact phrase "What is 75 degrees fahrenheit in celsius, and what is 85 degrees in fahrenheit" given to ChatGPT produces the correct result (it infers that the second degrees must be Celsius) and ChatGPT gives me a nicely laid out formula for the math of the conversion.

So yeah, Apple is way behind on this stuff.


The fact is that Gemini responds with this: 75 degrees Fahrenheit is 23.89 degrees Celsius, and 85 degrees Celsius is 185.00 degrees Fahrenheit.
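
For anyone double-checking those numbers, the math itself is trivial; a quick Swift sketch:

    // The two conversions behind the answers above.
    func fahrenheitToCelsius(_ f: Double) -> Double { (f - 32) * 5 / 9 }
    func celsiusToFahrenheit(_ c: Double) -> Double { c * 9 / 5 + 32 }

    print(fahrenheitToCelsius(75))   // 23.888..., rounds to 23.89 °C
    print(celsiusToFahrenheit(85))   // 185.0 °F (Gemini read the second "degrees" as Celsius)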


Meanwhile, users have been conditioned to expect a system that understands multiple queries and answers them appropriately.


True. But for most of us, only in the past year. I have a few friends/relatives who have still never conversed with an LLM.


I just tried this on my phone and got two pop-ups with the conversions appearing in quick succession.


Your usage of Siri today (probably on an old version of iOS) frankly has nothing to do with the article we are discussing. Sorry to say this, but it is going to take time. Comparing the performance of ChatGPT running in a big data center with a model running locally on a phone... give it a few years.


People have been giving Siri a few years for a decade now. Siri used to run in a data center (and still does for older hardware and things like HomePods) and it has never supported compound queries.

Siri needs to be taken out back and shot. The problem with “upgrading” it is the pull to maintain backwards compatibility for every little thing Siri did, which leads them to try to incorporate existing Siri functionality (and existing Siri engineers) alongside any LLM. That leads to disaster: none of it works, and it all just gets slower. They’ve been trying to do an LLM-assisted Siri for years now, and it’s the most public-facing disaster the company has had in a while. Time to start over.


As a user, I'd gladly opt into a slightly less deeply integrated Siri that understands what I want from it.

Build a crude router in front of it, if you must, or give it access to "the old Siri" as a tool it can call, and let the LLM decide whether to return its own or a Siri-generated response!

I bet even smaller LLMs would be able to figure out, given a user input and Siri response pair, whether the request was reasonably answered or whether the model itself could do better, or at least explain that the request is out of its capabilities for now.
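
Roughly what I mean, as a toy sketch (every type and name below is made up for illustration, not anything Apple actually ships):

    // Hypothetical "LLM router in front of legacy Siri".
    protocol LegacyAssistant {
        func respond(to request: String) -> String?   // nil if the old intent system can't handle it
    }

    protocol AssistantLLM {
        // Judge whether the legacy response reasonably answers the request.
        func isAdequate(request: String, response: String) -> Bool
        func answer(_ request: String) -> String
    }

    func route(_ request: String, siri: LegacyAssistant, llm: AssistantLLM) -> String {
        if let canned = siri.respond(to: request),
           llm.isAdequate(request: request, response: canned) {
            return canned            // keep the deterministic, on-device answer
        }
        return llm.answer(request)   // otherwise the model answers or explains its limits
    }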


> Build a crude router in front of it, if you must, or give it access to "the old Siri" as a tool it can call, and let the LLM decide whether to return its own or a Siri-generated response!

Both of these approaches were tried internally, including even the ability for the LLM to rewrite siri-as-a-tool's response, and none of them shipped, because they all suck. Putting a router in front of it makes multi-turn conversation (when Siri asks for confirmation or disambiguation) a nightmare to implement, and siri-as-a-tool suffers from the same problem. What happens when legacy siri disambiguates? Does the LLM try to guess at an option? Does it proxy the prompt back to the user? What about all the "smart UI" like having a countdown timer with Siri saying "I'll send this" when sending a text message? Does that just pass through? When does the LLM know how/when to intervene in the responses the Siri tool is giving?

This was all an integration nightmare and it's the main reason why none of it shipped. (Well, that and the LLM being underwhelming and the on-device models not being smart enough in the first place. It was just a slower, buggier siri without any new features.)

The answer is that they need to renege on the entire promise of a "private" siri and admit that the only way they can get the experience they want is a _huge_ LLM running with a _ton_ of user context, in the cloud, and don't hinder it all with backwards compatibility with Siri. Give it a toolbox of things it can do with MCP to your device, bake in the stock tools with LoRA or whatever, and let it figure out the best user experience. If it's a frontier-quality LLM it'll be better than Siri on day one, without Apple having to really do anything other than figure out a good system prompt.

The problem is, Apple doesn't want to admit the whole privacy story is a dead-end, so they're going to keep trying to pursue on-device models, and it's going to continue to be underwhelming and "not meeting our quality bar", for the foreseeable future.
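
To make the "toolbox" idea above concrete, here's a deliberately simplified sketch (not Apple's API and not the real MCP wire format, just the general shape, with made-up tool names):

    // Hypothetical device toolbox a cloud LLM could call into.
    struct DeviceTool {
        let name: String
        let description: String               // what the model sees when picking a tool
        let run: ([String: String]) -> String
    }

    let toolbox: [DeviceTool] = [
        DeviceTool(name: "set_timer",
                   description: "Start a countdown timer. Args: minutes",
                   run: { args in "Timer set for \(args["minutes"] ?? "?") minutes" }),
        DeviceTool(name: "send_message",
                   description: "Send a message. Args: to, body",
                   run: { args in "Queued message to \(args["to"] ?? "?")" })
    ]

    // The model picks a tool by name and supplies arguments; the device runs it
    // locally and feeds the string result back into the model's context.
    func execute(toolNamed name: String, args: [String: String]) -> String {
        toolbox.first { $0.name == name }?.run(args) ?? "No such tool: \(name)"
    }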


Very good details on why just bolting on an LLM isn't that trivial, which I hadn't really considered before. Thank you!

But regarding Apple not wanting to admit that client-side compute isn't enough: haven't they essentially already done that, with Private Cloud Compute and all that? I believe not even proofreading and Safari summarization work fully on-device, at least according to my private compute privacy logs.


Those little things have been broken for a while now; it's best to bite the bullet and integrate an LLM into Siri now.


> Your usage of Siri today (probably on an old version of iOS) frankly has nothing to do with the article we are discussing.

Yes, but isn't that infuriating? The technology exists! It even exists, as evidenced by this article, in the same company that provides Siri!

At least I feel that way every time I interact with it – or, for that matter, with my Google Home speaker, ironically made and operated by the company that invented transformer networks.



