
Damn, I hate to plug products on HN, but I'd say that the New Yorker is the one subscription I've loved maintaining throughout my life. First got it right out of college and appreciate it 20 years later.

Everyone is publishing think pieces about ChatGPT - yawn. But only the New Yorker said, hmm, how about if we get frickin' Ted Chiang to write a think piece? (It is predictably very well written.)



Certainly beats a Medium subscription, where you pay (more?) to read 95% garbage compared with what's published in The New Yorker.


I don't really like the New Yorker, but even I agree it's a way better deal than Medium. I can't remember the last time I read anything on Medium that wasn't mostly vacuous.


There are people out there that pay for Medium?


I hear you. A few articles a year, such as this one, make the 70-ish dollars I cough up every year worth it.


I've been tempted to subscribe. But I emailed them trying to find out what the price would be after the trial sub expired. They replied that they could not say.

So it seems obvious (to me at least) that they segment renewals and charge what the market will bear.


Wholly agree.

I worked there many years ago, leading the re-design and re-platform (fun dealing with 90 years of archival content with mixed usage-rights) and paywall implementation (don't hate me, it funds journalism).

When you see how the stories get made and how people work there, well, it's just amazing.


I notice that the extremists never use paywalls, meaning extremism is allowed to spread unchecked.

If respectable newspapers and magazines cared about society, they'd follow suit, and give the extremists some competition.


I think respectable papers would go out of business if they followed suit. They are usually respectable because they don't rely on an outside organisation for funding, but instead rely on creating good journalism people want to pay for.


They also then have less motivation to chase clicks, which outlets on both sides have devolved into doing.


The Guardian would surely qualify as somewhat respectable and still paywall free...(I pay for it anyway, given I read it pretty regularly. It's only publications I read articles from maybe 3 or 4 times a year I have an issue with signing up for - would have no issue with a one-off payment if it was easy enough to do).


That's because the extremism is the advertisement. They get paid to inject garbage into your brain.


I assume you have multiple subscriptions to respectable newspapers and magazines because you care about society, right?

That said, the answer to your question is 'no'. If respectable newspapers and magazines followed suit, they'd disappear.

The cost of production of extremism/misinformation is much lower than it is to do investigative journalism, fund bureaus, or send teams out to physical locations to report. Fact-checking costs money, editors cost money. It all adds up.

The paywall model exists in many respects because advertising as a primary means of revenue fluctuates. A solid subscription base offers stability and predictability with which to run a viable organization - if you can pull it off.

That said, most of these paywalls are not 'hard-paywalls' where you need to subscribe immediately to read anything. They are typically 'soft-paywalls' where you can read a few articles before being asked to subscribe. From that perspective, your argument falls flat.


I have yet to see any evidence that “sunlight disinfects” or that mainstream media can out-shout disinformation.


You volunteering to pay for it?


In his short story "Understand," he describes two superintelligent individuals having high-bandwidth conversations. Maybe ChatGPT and Bard are those bespoke intelligent agents.

https://web.archive.org/web/20140527121332/http://www.infini...

We continue. We are like two BARDs, each cueing the other to extemporize another stanza, jointly composing an epic poem of knowledge. Within moments we accelerate, talking over each other's words but hearing every nuance, until we are absorbing, concluding, and responding, continuously, simultaneously, synergistically.


For anybody who does not want to pay, your taxes likely already pay for a subscription you can use from your local library on the Libby app. King and Snohomish counties in Washington provide unlimited digital copies of The New Yorker and The Economist as an example.


They also have weekly cryptic crosswords that are cryptic enough to be interesting but as easy to completely solve as a regular crossword. (With cryptics, very good ones are also very hard.)


Just don't try to use ChatGPT to help! (To be fair, it can be useful at general knowledge based clues, but certainly not ones relying on word play/anagrams etc.)


Damn too, actually. Reading the piece, I've been thinking this publication really deserves to be subscribed to!

Amazing write up. ChatGPT is a blurry JPEG of the internet.


It is certainly one of the best and most concise ELI5 explanations I've seen.


Too bad that this kind of writing is married to 31(!) ads/trackers on that page. Can journalism like this really not survive without all that crap?


They sell the perfect solution to this: a magazine.


It's not perfect because it's one-time paper that takes up space or quickly goes to waste. But it's indeed a good solution nonetheless.


The magazine still has ads in it, doesn't it? I'd rather they know my IP address than my mailing address.


Agreed. I've subscribed to many a magazine in my long life. The New Yorker is the last one standing as it is excellent from cover to cover.


Interesting that it's such a conservative, opinion-less, air-tight piece. Guess it's his technical writing background coming through.


If this article is the best the New Yorker offers now, I'm glad I don't subscribe.

It used to have high-quality articles, certainly.


I thought the author was uncharacteristically perceptive for a reporter. Yann LeCun or Geoff Hinton couldn't have come up with a better analogy.


The author is not a random reporter but Ted Chiang, a well-known science fiction author. The movie "Arrival" is based on a story by him.


Which explains why this is being promoted:

He paid for an advertisement, wrote this article as that advertisement or had it ghostwritten, and now it's being hyped.


Analogy of?


"Blurry JPEG" for how ChatGPT "compresses" character-based knowledge into vectors. That "compression" process gives ChatGPT an ability to generalize because it learns statistics (unlike JPEG) but like JPEG it is a lossy process.


It's a terrible analogy because the entire point of ML systems is to generalize well to new data, not to reproduce the original data as accurately as possible with a space/time tradeoff.


I don't think you can describe the math in this context as "generalize well to new data."

ChatGPT certainly can't generate new data. It's not gonna correctly tell you today who won the World Series in 2030. It's not going to write a poem in the style of someone who hasn't been born yet.

But it can interpolate between and through a bunch of existing data that's on the web to produce novel mixes of it. I find the "blurring those things together" analogy pretty compelling there, in the same way that blurring or JPEG-compressing something isn't going to give you a picture of a new event but it might change what you appear to see in the data you already had.

(Obviously it's not exactly the same, that's why it's an analogy and not a definition. As an analogy, it works much better if you ignore much of what you know about the implementation details of both of them. It's not trying to teach someone how to build it, but to teach a lay person how to think about the output.)


It absolutely can generate new data, it does so all the time. If you are claiming otherwise I think we need a more formal definition of what you mean by new data.

Are you suggesting because it can't predict the future it can't generate novel data?


It's not just the future, though the examples I gave were future oriented.

But it's all very interpolation/summarization-focused.

A "song lyrics in the style of Taylor Swift" isn't an actual song by Taylor Swift.

A summary of the history of Texas isn't actually vetted by any historian to ensure accuracy.

The answer to a math problem may not be correct.

To me, those things don't qualify as "new data." They aren't suitable for future training as-is. Sometimes for a simple reason: they aren't facts, using the dictionary "facts and statistics collected together for reference or analysis" definition of data. So very simply "not new data."

Sometimes in a blurrier way - the song lyrics, for instance, could be touching, or poignant, or "true" in a Keats sense[0] - but if the internet gets full of GPT-dreams and future models are trained on that, you could slide down further and further into an uncanny valley, especially since most of the time you don't get one of those amazing poignant ones. Most of the time I've gotten something bland.

[0] "What the imagination seizes as beauty must be truth"


One way to think about prompting is as a conditional probability distribution. There is a particular song by Taylor Swift or the set of all songs by Taylor Swift but ChatGPT is particularly talented at sampling the "set of all songs in the style of Taylor Swift".
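
A minimal sketch of that framing, with a made-up two-word vocabulary (the distribution here is invented, and a real model conditions on the entire prompt, but the sampling idea is the same):

  import random

  # Toy conditional distribution P(next_word | prompt_word).
  cond_dist = {
      "love":  {"story": 0.5, "song": 0.3, "letter": 0.2},
      "blank": {"space": 0.7, "page": 0.2, "stare": 0.1},
  }

  def sample_next(prompt_word):
      words, probs = zip(*cond_dist[prompt_word].items())
      return random.choices(words, weights=probs, k=1)[0]

  print(sample_next("blank"))  # usually "space"

The prompt doesn't retrieve a stored answer; it selects which conditional distribution gets sampled.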

One of the worst problems in the "Expert Systems" age of A.I. was reasoning over uncertainty, for instance this system

https://en.wikipedia.org/wiki/Mycin

had a half-baked approach that worked well enough for a particular range of medical diagnosis. In general it is an awful problem because it involves sampling over a joint probability distribution. If you have 1000 variables you have to sample a 1000-dimensional space; to do it the brute force way you'd have to sample the data in an outrageous number of hypercubes.
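
To put a number on "outrageous": even with only two bins per variable, the brute-force grid over 1000 variables has 2^1000 cells, far more than any dataset could ever cover:

  # Brute-force density estimation over n binary variables needs 2**n cells.
  n = 1000
  print(f"{2 ** n:.2e} hypercubes for {n} variables")  # ~1.07e+301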

Insofar as machine learning is successful it is that we have algorithms that take a comparatively sparse sample and make a good guess of what the joint p.d. is. The success of deep learning is particularly miraculous in that respect.


The thing is that generalization is good enough to make people squee and not notice that the output is wrong but not good enough to get the right answer.

If it were going to produce 'explainable' correct answers for most of what it does, that would be a matter of looking up the original sources to make sure they really say what it thinks they do. I mean, I can say, "there's this paper that backs up my point," but I have to go look it up to get the exact citation at the very least.


There is definitely a misconception about how to use a tool like ChatGPT.

If you give it an analytic prompt like "turn this baseball box score into an entertaining outline" it will reliably act as a translator because all of the facts about the game are contained in the prompt.

If you give it a synthetic prompt like "give me quotes from the broadcasters" it will reliably act as a synthesizer because none of the facts of the transcript are in the prompt.

This ability to perform as a synthesizer is what you are identifying here as "good enough to make people squee and not notice that the output is wrong but not good enough to get the right answer", which is correct, but sometimes fiction is useful!

If all web pages were embedded in ChatGPT's 1536-dimensional vector space and used for analytic augmentation, then a tool would more reliably be able to translate a given prompt. The UI could also display the URLs of the nearest-neighbor source material that was used to augment the prompt. That seems to be what Bing/Edge has in store.
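
A minimal sketch of that nearest-neighbor lookup, assuming the pages have already been embedded (the URLs, vectors, and function name here are stand-ins; 1536 matches the dimensionality of OpenAI's ada-002 embeddings):

  import numpy as np

  page_urls = ["https://example.com/a", "https://example.com/b"]
  page_vecs = np.random.randn(len(page_urls), 1536)  # stand-in embeddings
  page_vecs /= np.linalg.norm(page_vecs, axis=1, keepdims=True)

  def nearest_pages(prompt_vec, k=1):
      # Cosine similarity against every page, highest first.
      prompt_vec = prompt_vec / np.linalg.norm(prompt_vec)
      sims = page_vecs @ prompt_vec
      return [page_urls[i] for i in np.argsort(-sims)[:k]]

  print(nearest_pages(np.random.randn(1536)))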


That's a touch beyond state of the art but we might get there.

If there was one big problem w/ today's LLMs it is that the attention window is too short to hold a "complete" document. I can put the headline of an HN submission through BERT and expect BERT to capture it, but there is (as of yet) no way to cut a document up into 512 (BERT) or 4096 (ChatGPT) token slices and then mash those embeddings together to make an embedding that can do all the things the model is trained to do on a smaller data set. I'm sure we will see larger models, but it seems a scalable embedding that grows with the input text would be necessary to move to the next level.
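
For what it's worth, the usual stopgap is to embed fixed-size chunks and mean-pool them, which produces an embedding but not one that preserves what the model can do with a short input. A sketch under that assumption (the chunk size and pooling choice are arbitrary, and embed_chunk is a hypothetical stand-in for, e.g., one BERT forward pass):

  import numpy as np

  def naive_document_embedding(tokens, embed_chunk, chunk_size=512):
      # Embed each fixed-size slice, then average; cross-chunk
      # structure is lost, which is exactly the limitation above.
      chunks = [tokens[i:i + chunk_size]
                for i in range(0, len(tokens), chunk_size)]
      return np.mean([embed_chunk(c) for c in chunks], axis=0)

  doc = ["token"] * 1200  # three chunks
  vec = naive_document_embedding(doc, lambda c: np.random.randn(768))
  print(vec.shape)  # (768,)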


No, this is the current state of the art: https://supabase.com/blog/chatgpt-supabase-docs

  It's built with Supabase/Postgres, and consists of several key parts:
  
  Parsing the Supabase docs into sections.
  Creating embeddings for each section using OpenAI's embeddings API.
  Storing the embeddings in Postgres using the pgvector extension.
  Getting a user's question.
  Querying the Postgres database for the most relevant documents related to the question.
  Injecting these documents as context for GPT-3 to reference in its answer.
  Streaming the results back to the user in realtime.
The same thing could be done with search engine results, and from recent demos it looks like this is the kind of analytic augmentation that MS and OpenAI have added to Bing.
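
The retrieval step in that list boils down to a single pgvector query. A sketch with psycopg2 (the table and column names are hypothetical; <=> is pgvector's cosine-distance operator):

  import psycopg2

  def most_relevant_sections(conn, question_embedding, k=5):
      # Fetch the k doc sections nearest to the question's embedding.
      vec_literal = "[" + ",".join(map(str, question_embedding)) + "]"
      with conn.cursor() as cur:
          cur.execute(
              """
              SELECT content
              FROM doc_sections
              ORDER BY embedding <=> %s::vector
              LIMIT %s
              """,
              (vec_literal, k),
          )
          return [row[0] for row in cur.fetchall()]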


So they can't afford an actual subject-matter expert for their articles?


In a world where supposedly more-tech-industry-aware writers are talking about what "ChatGPT believes" and other such personification... show me a better article.


Any article on how ChatGPT works would be much better.


After reading the article, it is obviously a publicity piece for the author, and not to be taken seriously.

Is that the best the New Yorker can offer?



