
Until we started to see LLMs, and the tools that can be built with them, I doubted the possibility of Star Trek's voice command system. Asking the computer to clarify some concept, or to filter and reduce data sets based on arbitrary criteria, was pure science fiction.

Seeing something like this makes me think that an arbitrary holodeck command like "Paris, 1950s, rainy afternoon" is suddenly not the challenging part of the equation. It's really exciting.
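
As a rough illustration of the data-filtering part: an untested sketch using OpenAI's chat API. The model name, records, and criterion are placeholders I made up, not anything from the article:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    rows = ["USS Enterprise, 2363", "Paris bistro, 1954", "Mars colony, 2103"]

    def matches(row: str, criterion: str) -> bool:
        # Ask the model a yes/no question about one record.
        reply = client.chat.completions.create(
            model="gpt-4",
            messages=[{
                "role": "user",
                "content": f"Does this record match the criterion "
                           f"'{criterion}'? Answer only yes or no.\n{row}",
            }],
        )
        return reply.choices[0].message.content.strip().lower().startswith("yes")

    # "Computer, show me only the 20th-century entries."
    print([r for r in rows if matches(r, "set in the 20th century")])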



Here's an image result from Midjourney for "Paris, 1950s, rainy afternoon". No additional editing, and I intentionally avoided adding any text to the prompt beyond your own words.

https://i.imgur.com/IYuh29H.png

Not perfect but man, we're getting pretty close.


Interesting. I apparently opened that image, then forgot about it, and later saw it again without context. I looked at it, thought "wow, that's a cool picture from back in the day", looked at the people in it, and left.

I just ran into your comment again (with a purple link) and did some reflection. Upon reexamination, it's clear that the picture is fake (because I'm looking for it), but when I wasn't looking for it, it's interesting how all the "hot spots", the interesting pieces of the picture, are pretty good, while the (IMO) lackluster parts are the "less interesting" pieces, like the ends of the roads where it blurs out. I wonder if that bias is inherently ingrained in the system.


The focus blur is repulsive. I think convincing focus blur will be the milestone at which this starts replacing stock photography.


It can get a bit better if the prompt is made more detailed. For instance, here are the four results I got for "Professional black and white photo of Paris in the 1950s, on a rainy afternoon. Leica 35mm lens. --s 1000" (--s 1000 lets it 'stylize' a bit more).

https://i.imgur.com/pPU7K0c.png

Things still get a little weird in the distance (particularly in photo 3), but I think overall it's a bit better. People who are really good at writing prompts could probably do even better, although one of the strengths of MidJourney V4 and V5 is that it can give good results without the traditional paragraph of "incredible, award winning, photo of the year" etc.


Very interesting. Photo 4 is a significant step in the right direction. It's refreshing that it doesn't veer towards a Gaussian look either. Thanks for sharing.


It's a nice image, but I find it's easy to spot AI-generated images when they try to depict very specific existing hardware. Here you can see all the cars are generic with a 1950s design; none are models that actually existed. Try asking an AI to draw you a Boeing 747-400, for example, and you'll see what I mean. Btw, have you noticed all the YouTube video thumbnails made by AI now? Easy to spot.


These systems sure hallucinate text.


Yes, because models like Midjourney are tiny compared to LLMs like GPT. I'm pretty sure there was a good Hacker News discussion on this recently, but with all the AI talk I can't find it. Really, we need a lot less information to make a reasonable-looking city than the amount of information we need to make billboards and signs make sense. I don't think Midjourney wants to pay $10+ million to train a model that large.


> than the amount of information we need to make billboards and signs make sense.

The same applies to posters, letters, newspapers, and other text-heavy images, at which point the language modeling problem effectively reduces to an image generation problem.


If you ever find it, I'd love to read it!


I wonder how feasible it would be to have Midjourney mark the regions where text should be, then pass it off to GPT to propose the text to write.
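
Midjourney has no public API for this, so most of the sketch below is hypothetical: the generated image file, the marked region, and the scene description are all stand-ins. Only the GPT call and the PIL drawing are real APIs. Untested:

    from PIL import Image, ImageDraw
    from openai import OpenAI

    client = OpenAI()

    def propose_sign_text(scene: str) -> str:
        # Real API call; the prompt is just an example.
        reply = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user",
                       "content": "Suggest short, period-appropriate shop-sign "
                                  f"text for this scene: {scene}"}],
        )
        return reply.choices[0].message.content.strip()

    scene = "Paris, 1950s, rainy afternoon"
    img = Image.open("midjourney_output.png")  # stand-in: no Midjourney API exists
    box = (120, 80, 360, 130)                  # stand-in for a model-marked text region
    draw = ImageDraw.Draw(img)
    draw.rectangle(box, fill="white")
    draw.text((box[0] + 8, box[1] + 8), propose_sign_text(scene), fill="black")
    img.save("with_legible_sign.png")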


Midjourney -- a window into the past and the future


A prompt like that can already generate something great using Stable Diffusion or Midjourney. Very exciting indeed that LLMs are now similarly capable.
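
For anyone who wants to try it locally, an untested sketch with Hugging Face's diffusers library; the checkpoint is one common public Stable Diffusion model, not necessarily what anyone in this thread used:

    import torch
    from diffusers import StableDiffusionPipeline

    # Load a public Stable Diffusion checkpoint onto the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # The holodeck-style prompt from upthread.
    image = pipe("Paris, 1950s, rainy afternoon").images[0]
    image.save("paris_rainy.png")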


Using Stable Diffusion suddenly makes Picard telling the computer his tea has to be hot every time seem reasonable.


When Siri, Alexa, and Google Home came out, I was convinced voice would be the next paradigm shift in human-computer interaction, comparable to mobile, but the voice assistants fell short and I was disappointed.

Now it's clear that the shift is coming and it will revolutionize the way we interface with machines.


The image AIs are much more capable too, and I find it interesting that every technically inclined person always used to make fun of those "enhance" moments in TV shows and movies, where they would zoom into some area of a photograph or security footage. The fact that this is now actually possible (to some extent, at least) is pretty wild.
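
Today's "enhance" is essentially diffusion-based super-resolution. A rough, untested sketch with the diffusers library; the checkpoint is a real public upscaler, but the file names and prompt are made up:

    import torch
    from PIL import Image
    from diffusers import StableDiffusionUpscalePipeline

    # Load a public 4x upscaler checkpoint onto the GPU.
    pipe = StableDiffusionUpscalePipeline.from_pretrained(
        "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
    ).to("cuda")

    low_res = Image.open("security_frame.png").convert("RGB")
    low_res = low_res.resize((128, 128))  # the upscaler expects a small input

    # The prompt steers what detail gets invented during 4x upscaling.
    enhanced = pipe(prompt="a grainy security camera photo", image=low_res).images[0]
    enhanced.save("enhanced.png")

Worth remembering that the added detail is synthesized, not recovered from the original pixels.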


On a serious note, this is where these models get dangerous - when someone "zooms in" on the technology, finds something the computer created from nothing, and then takes that as irrefutable fact.



