> 1) software that makes it easy to do for the layman (browser extensions etc.), and
It's already a given that this only works on a locked-down device. Making it a simple binary "is this device owned by a minor" switch means parents will actually be able to understand it.
> 2) scams and malware that target children offering a "bypass" to access adult websites
And advertising to children should also be banned, so they won't be exposed to such scams, among other things. Thankfully this header lets the site know if they're breaking the law by showing scam ads, which makes prosecution super easy.
> I think this would end up similar to the Do Not Track header which ultimately no one cared about or took seriously.
Oh, of course none of this works unless it has the teeth of law to back it up.
Most of the tricks seem like they'd work in the cooling direction too. They amount to insulation and thermal mass. You would need AC to actually lower the temperature, but those improvements would let you run the cooling on a lower duty cycle.
I don't think Trump is explicitly an asset. He just likes oligarchs and dictatorial strongmen, and is usually the dumbest man in the room. There's no doubt that there are ties there, but Trump is not a loyal man. These things combined mean he can swing back and forth between doing Putin favors and being genuinely upset by perceived slights, then back to friends when Putin gets his ear to smooth things over.
Rather than specifically being a Russian asset, he's an asset to the last charismatic man he remembers speaking to.
1) You don't deport them, you don't ignore them, you document them. Then you let them live their lives. They're people, not a mold outgrowth that needs culling.
2) Check those stats a bit more closely. The vast majority of "deportations" were people turned away at the border.
Would you support deporting people who are criminals? Or have no intention of ever working and just want to live off various welfare programs? Trying to find some common ground here.
Nope. Access to food, water, shelter, and freedom of movement are fundamental human rights. I'm not a proponent of executing useless eaters. If you commit a crime with a prison sentence then you serve that sentence where you committed the crime.
Thanks for taking the time to clarify your position.
So if China or some other country decided to send 10 million people here for whatever reason, you think our official policy should be to welcome them in and provide them food, shelter, etc.?
What about 100 million people?
Should they also be given citizenship and right to vote in addition to food/shelter?
The only issue would be logistics. Getting supporting infrastructure and housing set up. But yeah, ultimately. More hands, more consumers. Why wouldn't we want as many citizens as possible, we certainly have the land for it.
I wonder in such a case if more populous countries like India or China could in theory send over 100 million+ people to our country over the course of a decade, and then once those people are citizens, legally vote for the US to be annexed by China, etc..
You could conquer a country without a single shot fired.
"These people are akin the mold growing upon a rotting city-state economy. They have to be removed." --our poster
"humanity suffers today under Jewish parasitism" --Adolf Hitler
It is this fake injury or mis-assignment of blame for real harm that serves as justification for actual crimes against humanity, be they at CECOT or Dachau.
This is disgusting hyperbole. Nazis killed millions of innocent people; a nation enforcing border laws by asking illegals to leave or removing them when they don't is not that.
We sent people who committed no crimes to a foreign concentration camp in a country that they aren't from and have killed several including citizens.
Our present admin holds that it can detain anyone it merely asserts is illegal, without trial or any due process, and ship them to such camps or hold them domestically indefinitely in fetid slums that, if we fill them with the millions they want picked up, will become death camps due to illness, climate, privation, and lack of medical care.
They have variously called for imprisoning and even executing law makers who speak up, shooting protesters, killing them and shutting down journalists who run negative press.
They did commit a crime by crossing the border illegally. Illegals are free to leave the country on their own and not deal with any of this; in fact, they are paid to do so. The idea that removing people who entered America illegally and sending them back is the same as systematically exterminating an entire race of humans is so dangerous and makes any discussion with people who think like you such a waste of time. It's rhetoric like yours that encourages people like the Tyler Robinsons or that sniper who attacked the ICE facility.
A crime against the citizens like robbery not a civil wrong like overstaying their visa. We have a different interest in enforcing one vs the other which I think you know.
You are suggesting we can't call out what is actually happening in case the proud boys running around kidnapping and murdering people get hurt as if people will be inspired to hurt them because of online rhetoric and not because of the kidnapping and murder.
Want to keep ICE from getting hurt? Roll back enforcement to 2010 norms and start rolling in greater penalties, from hurtful to ruinous, for employing illegals, with 5 years in jail for all management/HR/accounting who lie about it.
Start at 10% of payroll paid to illegal labor increasing to 1000% over whatever timeline would allow companies to transition from illegal labor.
Making it economical to hire illegal labor with a slap on the wrist or no penalty then punishing laborers for adapting over decades to this situation is insane.
What was the reasoning Hitler used to deport Jews and other "undesirables" to Polish concentration camps? Was it legal?
If so, maybe we shouldn't try to equate "What is legal/possible" with "what is moral/good". It can be legal and possible, and still very inhumane and evil. The Nazis prove that, don't they?
> and sending them back
We didn't "send them back". We sent them to a third place. A very bad place. Why are you ignoring that when the person you are replying to was specifically mentioning it?
> It's rhetoric like yours that encourages people like the Tyler Robinsons or that sniper who attacked the ICE facility.
There is absolutely zero evidence of this. Tyler could have a very specific grievance with Charlie Kirk's rhetoric without being motivated by other people calling Trump and MAGA Nazis or Fascists.
> 1) You don't deport them, you don't ignore them, you document them. Then you let them live their lives. They're people, not a mold outgrowth that needs culling.
I don't think that's a policy that would get majoritarian support in the US. The only people who can and should get deported are those who are not authorized to be here. If you don't deport them, it's functionally equivalent to an open-borders policy. Do you want more MAGA? Because open-borders is how you get more MAGA.
What you're proposing is also roughly analogous to a policy of not evicting squatters. If someone breaks into your house and decides to start living in one of your bedrooms, are you going to want them out or give them a key? The squatter is a person too, not a mold outgrowth that needs culling.
> Pretending that immigrants are the underlying cause of every societal failure is how you get MAGA. Enabling that big lie bolsters it.
What are you going to do, win elections by lecturing everyone about how they're wrong and they need to think just like you? People thought the Biden administration's immigration policy was too lax, and that was a major contributing cause to the second Trump term.
Deporting people who are in the country illegally is a no brainer. If you don't want that, get the law changed. Until then, it's not wrong to deport them.
Now, that doesn't mean deportation should be the only or even the main method of immigration enforcement (personally, I like the idea of putting more burden on employers).
> And I don't think I can enumerate the ways in which an occupied house are different from a country and unsuitable for the metaphor you're trying.
Oh of course, it's always too different if you want it to be. That way, you can continue to feel righteous.
> Deporting people who are in the country illegally is a no brainer. If you don't want that, get the law changed. Until then, it's not wrong to deport them.
Enjoy this little cognitive dissonance:
You could also change the law to make them legal, e.g. after X years of work, no criminal record, and citizenship tests.
This would completely disable the current ICE gestapo and would have prevented so much suffering. But I can imagine what you must be thinking now: "But they came here illegally, and that is harm enough to our society."
> What are you going to do, win elections by lecturing everyone about how they're wrong and they need to think just like you?
I'm partial to the strategy of selling voters on a set of policies that will improve their lives and address their problems. Unfortunately neither party in my country is keen on that idea.
> People thought the Biden administration's immigration policy was too lax, and that was a major contributing cause to the second Trump term.
People thought that once they were told to think that. It's an easy sell to blame everything wrong on the scary dirty foreigners. When people are dissatisfied populism wins, regardless of whether the talking points are rooted in reality. The responsible thing to do is try to get people on board with populist ideas that help rather than hurt.
> I'm partial to the strategy of selling voters on a set of policies that will improve their lives and address their problems.
It's a seductive idea, but it's the attitude of an authoritarian technocrat. However, the US is supposed to be a representative democracy, which requires being sensitive to the problems voters have, as voters see them. And that's probably a big part of Trump's actual appeal. My understanding is at his rallies and in his rhetoric, he gave the appearance of being responsive to many concerns that had been willfully ignored or denied for a long time (for instance: free trade dogma, which destroyed a lot of things and insisted people be satisfied with the easily-quantified cheap junk they were being given).
> People thought that once they were told to think that.
Don't pretend your thoughts are any more independent than those of the people you're othering.
> There is broad support for Dreamers. It's not as simple as deport everyone here illegally and the public seems to understand that.
What the GGP was advocating was much broader than that. What's sympathetic about the Dreamers is the non-consensual nature of their position (their parents took them here) and many of them have little to no connection to the country they'd be deported to.
1. Entering a country without proper documentation is a crime. Therefore all "undocumented immigration" is by definition criminal.
2. Removing criminals is paramount to a safe society and a justice system that is respected.
3. "Documenting them and letting them live" undermines legal immigrants who likely worked very hard to integrate culturally, establish themselves, and do the proper LEGAL paperwork. These legal immigrants have stringent reporting requirements, need to be careful about even minor crimes (excessive speeding tickets even!) etc. How is your proposal remotely fair to them?
I don't understand why this is a controversial opinion at all. I have yet to meet a legal immigrant that isn't okay with booting anyone that isn't legal out. A country without border control is NOT a country.
> "Documenting them and letting them live" undermines legal immigrants who likely worked very hard to integrate culturally, establish themselves, and do the proper LEGAL paperwork.
It's a shame those people had to work so hard to be treated like their neighbors. That's not a reason to deny others that treatment though.
> I have yet to meet a legal immigrant that isn't okay with booting anyone that isn't legal out.
Yeah they tend to skew pretty reactionary. That tends to sort itself out after a generation or two.
> A country without border control is NOT a country.
I didn't say we shouldn't have border security. In what universe is a goon squad going door to door checking for undesirables "border control"?
The term you're looking for is discoverability, and in my experience it's the most discussed concept when it comes to critiques of text based user interfaces.
Thinking about it: for traditional text based interfaces like a Unix shell, perhaps I'd argue that with Stack Overflow and Google search they became more discoverable than GUIs.
And perhaps even more with LLMs.
I.e., it's easier to find out how to do X in bash and cut and paste the solution than to watch a video on which series of things to click.
Not sure how that extends to specific chat interfaces. Can you ask the general models how best to use specific chat front ends over specific tools?
Got to say, I like the current Android versions.
In the early days I flashed my Motorola Defy every second month with some cool new ROM.
Always rooted and Xposed, always enabling something new.
Now I run a S23 Ultra and after two years it still does everything I need.
OneUI 8.0 and Android 16.
For work (app dev) I also have a Pixel 7a, always with the newest Android Beta.
Also works well.
Even the entry level phones work OK to pretty good now.
My Samsung A16 5G (also for work) functions surprisingly well for 150€.
> Now I run a S23 Ultra and after two years it still does everything I need.
Maybe, but it is fully under Google and Samsung's control, and is chock-full of spyware. You couldn't pay me to use a stock (Googled) Android phone for this reason alone.
Back when I used Android phones, tweaking was pretty important to me too. I still remember when I installed CyanogenMod on a Motorola XT1565, those were the days... Eventually, LineageOS, and then some new phones happened, not all of which were rootable, though I eventually ended up with a OnePlus 7 Pro which was pretty tweakable and even opened the possibility of bootloader re-locking, until a TWRP bug wiped my device and I pretty much stopped tweaking. Was never quite able to get EdXposed working right again...
How well is rooting supported on these newer Android versions/devices? If I install LineageOS on my device, for example, I can be reasonably sure that Magisk will work fine. But how well does it work on a stock, locked-down ROM?
Most devices don't have unlockable bootloaders now, so you can't even root them unless it was a popular device and a temporary/finicky hack was found.
I am asking out of curiosity and nothing else: what use cases do you have that motivate you to get a new phone every year? Do iPhones get notably better with every release? I'm guessing camera or storage would be big ones?
Well, with this last one they finally made the telephoto 48MP. Also, vapor chamber is nice. I don't know if the 18 will have enough for me to upgrade, and it might even have a reason for me not to upgrade (removing gestures from Camera Control). But so far it's been every year, because I've only been using iPhone for a couple years, and my first was a refurbished 15 Pro Max.
The 17 Pro (non-Max) only comes with up to 1TB of storage, but that's still more than my 15 of before.
I'm not parent but a counter perspective - the only three motivations I have are:
* phone dies
* camera vastly improves (imo it's been on a decline since the Nexus 6)
* phone is too slow to use
I'm on year 5 of my Samsung S21U, on which I can replace the Samsung UX slop with AOSP ports.
It is not for anyone but Apple, because they control the source code and have full remote code execution access to your device at a higher privilege level than you, the supposed owner, have.
Including custom ROM devs like the GrapheneOS team or the LineageOS team? That's a lot of trust you're putting in a company that only has their own profit at heart.
So you believe dictatorships are a good idea when it comes to technology control.
My question is then the same as for anyone who prefers to give up freedoms to centralized, seemingly benevolent dictators: What happens when you are told you can no longer do something you were previously allowed to do, and the change is only in the interest of the centralized power?
The Linux ecosystem is a peaceful and effective system of anarchy with no central authority. Pretty much the exact opposite of the Apple dictatorship.
I am a Linux distro maintainer and my team and I do whatever we think is best in our distro, even including patches and defaults Torvalds did not approve of, because our goal is security first and his is compatibility first. That is what we mean when we say "free" in free open source software. Torvalds can do whatever he wants in his branch, and we can do whatever we want in ours, selectively taking the bits we want.
Want to modify the operating system on your iPhone? Want to use Tor globally for privacy? Want to use an external NFC/USB smartcard for secret management or authentication? Want to use a browser with an engine other than last gen crippled webkit? Good luck. Apple did not extend those freedoms to you.
You have no freedom on that device but to install binaries Apple blesses and use it the way they intend. Apple does not produce free software or give their users freedom over their devices because they want maximum profit and control.
After Trump's re-election, I figured that there's not much difference between using a cheap Android from a Chinese OEM and using an iPhone. Both will give away my information if the totalitarian government (Chinese or American) requests it. I don't really have a particular preference on whether it's the Chinese or the Americans spying on me, so in the end it all boils down to price. Chinese Android devices deliver the same level of performance and features as Apple for 1/4 of the price.
Of course if I really cared about privacy, I would just install GrapheneOS or LineageOS on a supported Android device, so no Apple in that case either.
I think this is meant to show that moving the responsibility this way would be absurd because we don't do it for cars but... yeah, we probably should've done that for cars? Maybe then we'd have safe roads that don't encourage reckless driving.
But I think you're missing their "like bank robberies" point. Punishing the avenue of transport for illegal activity that's unrelated to the transport itself is problematic. I.e. people that are driving safely, but using the roads to carry out bad non-driving-related activities.
It's a stretched metaphor at this point, but I hope that makes sense (:
It is definitely getting stretchy at this point, but there is the point to be made that a lot of roads are built in a way which not only enables but encourages driving much faster than may be desired in the area where they're located. This, among other things, makes these roads more interesting as getaway routes for bank robbers.
If these roads had been designed differently, to naturally enforce the desired speeds, it would be a safer road in general and as a side effect be a less desirable getaway route.
Again I agree we're really stretching here, but there is a real common problem where badly designed roads don't just enable but encourage illegal and potentially unsafe driving. Wide, straight, flat roads are fast roads, no matter what the posted speed limit is. If you want low traffic speeds you need roads to be designed to be hostile to high speeds.
I think you are imagining a high-speed chase, and I agree with you in that case.
But what I was trying to describe is a "mild mannered" getaway driver. Not fleeing from cops, not speeding. Just calmly driving to and from crimes. Should we punish the road makers for enabling such nefarious activity?
(it's a rhetorical question; I'm just trying to clarify the point)
Which in case of digital replicas that can feign real people, may be worth considering. Not a blanket legislation as proposed here, but something that signals the downstream risks to the developer to prevent undesired uses.
Then only foreign developers will be able to work with these kinds of technologies... the tools will still be made, they'll just be made by those outside jurisdiction.
Unless they released a model named "Tom Cruise-inator 3000," I don't see any way to legislate that intent that would provide any assurances to a developer that their misused model couldn't result in them facing significant legal peril. So anything in this ballpark has a huge chilling effect in my view. I think it's far too early in the AI game to even be putting pen to paper on new laws (the first AI bubble hasn't even popped, after all) but I understand that view is not universal.
I would say a text-based model carries a different risk profile compared to video-based ones. At some point (now?) we'd probably need to have the difficult conversation of what level of media-impersonation we are comfortable with.
It's messy because media impersonation has been a problem since the advent of communication. In the extreme, we're sort of asking "should we make lying illegal?"
The model (pardon) in my mind is like this:
* The forger of the banknote is punished, not the maker of the quill
* The author of the libelous pamphlet is punished, not the maker of the press
* The creep pasting heads onto scandalous bodies is punished, not the author of Photoshop
In this world view, how do we handle users of the magic bag of math? We've scarcely thought before that a tool should police its own use. Maybe, we can say, because it's too easy to do bad things with, it's crossed some nebulous line. But it's hard to argue for that on principle, as it doesn't sit consistently with the more tangible and well-trodden examples.
With respect to the above, all the harms are clearly articulated in the law as specific crimes (forgery, libel, defamation). The square I can't circle with proposals like the one under discussion is that they open the door for authors of tools to be responsible for whatever arbitrary and undiscovered harms await from some unknown future use of their work. That seems like a regressive way of crafting law.
> The creep pasting heads onto scandalous bodies is punished, not the author of Photoshop
In this case the guy making the images isn't doing anything wrong either.
Why would we punish him for pasting heads onto images, but not punish the artist who supplied the mannequin of Taylor Swift for the music video to Famous?†
Why would we punish someone for drawing us a picture of Jerry Falwell having sex with his mother when it's fine to describe him doing it?
(Note that this video, like the recent SNL "Home Alone" sketch, has been censored by YouTube and cannot be viewed anonymously. Do we know why YouTube has recently kicked censorship up to these levels?)
> then we'd have safe roads that don't encourage reckless driving.
You mean like speed limits, drivers licenses, seat belts, vehicle fitness and specific police for the roads?
I still can't see a legitimate use for anyone cloning anyone else's voice. Yes, satire and fun, but also a bunch of malicious uses as well. The same goes for non-fingerprinted video gen. It's already having a corrosive effect on public trust. Great memes, don't get me wrong, but I'm not sure that's worth it.
Creative work has obvious applications. e.g. AISIS - The Lost Tapes[0] was a sort of Oasis AI tribute album (the songs are all human written and performed, and then the band used a model of Liam Gallagher's mid 90s voice. Liam approved of the album after hearing it, saying he sounded "mega"). Some people have really unique voices and energy, and even the same artist might lose it over time (e.g. 90s vs 00s Oasis), so you could imagine voice cloning becoming just a standard part of media production.
As a former VFX person, I know that a couple of shows are testing out how/where it can be used. (Currently it's still more expensive than trad VFX, unless you are using it to make base models.)
Productivity gains in the VFX industry over the last 20 years have been immense. (I.e., a mid budget TV show has more, and more complex, VFX work than most movies that are 10 years old, and looks better.)
But does that mean we should allow any bad actor to flood the floor with fake clips of whatever agenda they want to push? No. If I, as a VFX enthusiast, get fooled by GenAI videos (pictures are a done deal; it's super hard to stop reliably), then we are super fucked.
You said you can't see a legitimate use, but clearly there are legitimate uses (the "no legitimate use" idea is used to justify bad drug policy for example, so we should be skeptical of it). As to whether we should allow it, I don't see how we have a choice. The models are already out there. Even if they weren't, it becomes cheaper every year to train new ones, and eventually today's training supercomputers will be tomorrow's commodity. The whole idea of AI "fingerprinting" is bad anyway; you don't fingerprint that something is inauthentic. You sign that it is authentic.
> The models are already out there. Even if they weren't, it becomes cheaper every year to train new ones,
Yes, let's just give up as bad actors undermine society, scam everyone and generally profit from us.
> You sign that it is authentic.
Signing means you denote ownership. A signed message means you can prove where it comes from. A service should own the shit it generates.
Which is the point, because if I cannot reliably see what is generated, how is a normal person able to tell? Being able to provide a mechanism for the normal person to verify is a reasonable ask.
You put the bad actors in prison, or if they're outside your jurisdiction, and they're harming your citizens, and you're America, you go murder them. This has to be the solution anyway because the technology is already widely available. You can't make everyone in the world delete the models.
Yes, signing is the way you show something is authentic. Like when the Hunter Biden email thing happened I didn't understand (well, I did) why the news was pretending we have no way to check whether they're real or whether the laptop was tampered with. It was a Gmail account; they're signed by Google. Check the signatures! If that's his email address (presumably easy enough to corroborate), done. Missed opportunity to educate the public about the fact that there's all sorts of infrastructure to prove you made/sent something on a computer.
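To make the "check the signatures" idea concrete, here is a toy sketch of sign-then-verify with a public/private key pair. It is textbook RSA over a SHA-256 digest with hard-coded primes, which is an assumption made purely for illustration: it is not secure, it is not how DKIM or Gmail actually format or publish signatures, and the names `sign`, `verify`, and `digest_as_int` are made up for this example.

```python
import hashlib

# Toy key material, NOT secure and NOT a real mail-signing setup:
# two known Mersenne primes stand in for a proper RSA key pair.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of D (the sender's side) can produce this value."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can run this check."""
    return pow(signature, E, N) == digest_as_int(message)


original = b"From: someone@example.com\n\nMeet at noon."
signature = sign(original)
tampered = original.replace(b"noon", b"midnight")

print(verify(original, signature))   # signature matches the untouched message
print(verify(tampered, signature))   # edited message no longer matches
```

The point of the sketch is only the asymmetry: producing the signature needs the private exponent, while checking it needs nothing secret, so a third party can confirm that a message was not altered after it was signed.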
If you want to spread this idea, it would probably help your cause if you pointed to what you think Rust does particularly poorly. Steep learning curve, npm-esque packaging?
The article gave many good reasons where Rust is lacking. The points you raise are also clearly issues. My main additional issues are the high complexity, the syntax, the lack of stability, compilation times (and monomorphization), and lack of alternate implementations. Ignoring the language itself, the aggressive and partially misleading negative and positive marketing.
What do you mean by useful dynamic linking? Dynamic linking with C ABI is supported natively in Rust and is very widely used (just checked GitHub), especially for FFI like in Python modules (PyO3). If you mean an ABI that supports all the Rust features (without extern C), then it's a problem faced by every language that has more features than C - C++, Haskell, Go, Zig, etc included. To solve that problem, somebody will have to design a new stable standard common ABI that natively supports all the features from these languages, or at least makes it possible to express these features in some way.
C++ has reasonable dynamic linking (there are ABI breaks, but I remember only one really bad one, with std::basic_string and C++11), obviously excluding a lot of metaprogramming features, but a lot of C++ dynamic libraries exist that are widely used. Supposedly Swift does a better job, though I'm not familiar. Yes, Rust can dynamically link with a C ABI (as can any language) but it loses a lot of the expressiveness (I imagine, I'm not a Rust developer) to have to use the C ABI.
Dynamic languages of course support dynamic linking.
Because the size information for an instance of foo is only known at compile-time, so the clients aren't allocating enough space on the stack for it (ditto the heap if you're using `new foo();`)
The way around this is awkward and involves the pimpl pattern and moving all your constructors out-of-line... but you also need to freeze all virtual methods (even just adding a new one breaks ABI), and avoid using any template-heavy std:: types (not even std::string), since those are often fragile.
Most people just give up and offer an extern "C" API, because that has the added benefit that it's compatible across compilers.
"True" C++ shared libraries are crazy difficult to maintain. It's the reason Microsoft invented COM.
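The size problem described above can be sketched without a compiler, using Python's `ctypes` to mimic two versions of the same struct layout. `FooV1` and `FooV2` are hypothetical stand-ins for `foo` before and after a library update; the field names are invented for the example.

```python
import ctypes


class FooV1(ctypes.Structure):
    """Layout the client was compiled against (hypothetical)."""
    _fields_ = [("id", ctypes.c_int32)]


class FooV2(ctypes.Structure):
    """Layout after the library added a member (hypothetical)."""
    _fields_ = [("id", ctypes.c_int32), ("flags", ctypes.c_int32)]


# The client baked the old size into its own binary at compile time,
# but the updated library now reads and writes the larger layout.
client_allocates = ctypes.sizeof(FooV1)
library_expects = ctypes.sizeof(FooV2)
print(client_allocates, library_expects)

# With an opaque handle (pimpl, or an extern "C" API returning a pointer),
# the client only ever stores a pointer, whose size does not depend on
# which version of the struct sits behind it.
handle_size = ctypes.sizeof(ctypes.c_void_p)
print(handle_size)
```

This only illustrates the allocation mismatch; the vtable and template issues mentioned above are separate ways to break ABI that a size check would not catch.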
It is a problem of all languages whose extensions to C are badly designed. That it works with the C FFI just shows that the C part properly supports it while Rust does not. And yes, C++ has similar problems to Rust.
Both Pin Projection and the handling of async are pretty bad in my opinion and I write a lot of Rust code. The syntax is also slowly getting very funky with horrible additions like the recent `+ use <'a>` syntax. I also agree with the orphan rule but it makes a lot of code very ugly, fast.
> Everyone goes to the "simplest" target - Google in this case - to blame for the status quo, but Google is in this position because everybody else
Eh, I think we ought to dole out our ire in accordance with the damage. All are responsible to varying degrees, but Google is the most powerful, and has the greatest ability to curb bad behavior if they wanted to, so they get and deserve the most blame second only to the governments that let them become that powerful.
Dole out the ire, but it won't fix the problem until you realize that everyone's dismissal of ownership and responsibility in exchange for convenience is what creates the googles and apples of the world.
Google will argue they are enforcing good behaviour: if you want to rely on their technical guarantees you follow their rules/specs.
> Dole out the ire, but it won't fix the problem until you realize that everyone's dismissal of ownership and responsibility in exchange for convenience is what creates the googles and apples of the world
This is the opposite of true. Blaming normal human behavior for our problems is a distraction from effective action. Humans are a near-constant; you have to look to incentive structures to make any changes to the world.
The die is certainly not multi-terabyte. A more realistic number would be 32k-sided to 50k-sided if we want to go with a pretty average token vocabulary size.
Really, it comes down to encoding. Arbitrarily short utf-8 encoded strings can be generated using a coin flip.
Of course, it's random and by chance - tokens are literally sampled from a predicted probability distribution. If you mean chance=uniform probability you have to articulate that.
It's trivially true that arbitrarily short reconstructions can be reproduced by virtually any random process and reconstruction length scales with the similarity in output distribution to that of the target. This really shouldn't be controversial.
My point is that matching sequence length and distributional similarity are both quantifiable. Where do you draw the line?
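For a rough sense of scale on the uniform-chance baseline, here is a small sketch that computes how likely a fair die with a given number of sides is to reproduce a fixed sequence. The function name and the example lengths are assumptions picked for illustration; a real model is far from uniform, so this is only the "pure chance" end of the comparison.

```python
import math


def log10_uniform_match(sides: int, length: int) -> float:
    """log10 of the chance that `length` independent rolls of a fair
    `sides`-sided die all match one fixed target sequence."""
    return -length * math.log10(sides)


# A coin flip for a one-symbol match, then vocabulary-sized dice
# for progressively longer token sequences.
for sides, length in [(2, 1), (32_000, 1), (32_000, 10), (50_000, 50)]:
    exponent = log10_uniform_match(sides, length)
    print(f"{sides}-sided, {length} tokens: about 10^{exponent:.1f}")
```

Working in log10 keeps the numbers printable: the probability shrinks exponentially with sequence length, which is why length and distributional similarity both matter when asking where a line could be drawn.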
> Of course, it's random and by chance - tokens are literally sampled from a predicted probability distribution.
Picking randomly out of a non-random distribution doesn't give you a random result.
And you don't have to use randomness to pick tokens.
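A minimal sketch of both claims, using a made-up next-token distribution (the tokens and probabilities are invented, not taken from any model): greedy decoding picks the top token with no randomness at all, and sampling from a sharply peaked distribution still lands on the same token almost every time.

```python
import random

# Hypothetical next-token probabilities for one position.
next_token_probs = {"return": 0.97, "raise": 0.02, "banana": 0.01}


def greedy_pick(probs: dict) -> str:
    """Deterministic decoding: always take the most probable token."""
    return max(probs, key=probs.get)


def sampled_pick(probs: dict, rng: random.Random) -> str:
    """Stochastic decoding: draw one token according to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sampled_pick(next_token_probs, rng) for _ in range(10_000)]
top_share = samples.count("return") / len(samples)

print(greedy_pick(next_token_probs))
print(top_share)
```

Real decoders add temperature, top-k, and similar knobs, but the shape of the argument is the same: the randomness is a thin layer on top of a distribution that has already excluded nearly everything.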
> If you mean chance=uniform probability you have to articulate that.
Don't be a pain. This isn't about uniform distribution versus other generic distribution. This is about the very elaborate calculations that exist on a per-token basis specifically to make the next token plausible and exclude the vast majority of tokens.
> My point is that matching sequence length and distributional similarity are both quantifiable. Where do you draw the line?
Any reasonable line has examples that cross it from many models. Very long segments that can be reproduced. Because many models were trained in a way that overfits certain pieces of code and basically causes them to be memorized.
Right, and very short segments can also be reproduced. Let's say that "//" is an arbitrarily short segment that matches some source code. This is trivially true. I could write "//" on a coin and half the time it's going to land "//". Let's agree that's a lower bound.
I don't even disagree that there is an upper bound. Surely reproducing a repo in its entirety is a match.
So there must exist a line between the two that divides too short and too long.
Again, by what basis do you draw a line between a 1 token reproduction and a 1,000 token reproduction? 5, 10, 20, 50? How is it justified? Purely "reasonableness"?
There are very very long examples that are clearly memorization.
Like, if a model was trained on all the code in the world except that specific example, the chance of it producing that snippet is less than a billionth of a billionth of a percent. But that snippet got fed in so many times it gets treated like a standard idiom and memorized in full.
Is that a clear enough threshold for you?
I don't know where the exact line is, but I know it's somewhere inside this big ballpark, and there are examples that go past the entire ballpark.
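For a sense of scale on "a billionth of a billionth of a percent" (1e-9 × 1e-9 × 1e-2 = 1e-20), here is a back-of-the-envelope sketch. It assumes a model that has never seen the snippet guesses each token independently with a fixed probability, which is a simplification; the per-token figures and function names are hypothetical.

```python
import math

# "A billionth of a billionth of a percent": 1e-9 * 1e-9 * 1e-2
THRESHOLD = 1e-20


def chance_of_exact_match(p_per_token: float, n_tokens: int) -> float:
    """Chance of reproducing n tokens exactly, ASSUMING each token is guessed
    independently with probability p_per_token."""
    return p_per_token ** n_tokens


def tokens_needed(p_per_token: float, threshold: float) -> int:
    """Smallest length n at which the chance of an exact match drops below
    the threshold under the same independence assumption."""
    return math.ceil(math.log(threshold) / math.log(p_per_token))


# Even with a generous (hypothetical) 50% chance of guessing every single
# token, the chance falls under the threshold after a few dozen tokens.
n = tokens_needed(0.5, THRESHOLD)
print(f"p=0.5 per token: below 1e-20 after {n} tokens")
print(f"chance at that length: {chance_of_exact_match(0.5, n):.2e}")
```

So under these assumptions the "big ballpark" is not exotic: a verbatim match much longer than that is hard to attribute to chance plus general code knowledge.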
I care that it's within the ballpark I spent considerable detail explaining. I don't care where inside the ballpark it is.
You gave an exaggerated upper limit, so extreme there's no ambiguity, of "entire repo".
I gave my own exaggerated upper limit, so extreme there's no ambiguity. And mine has examples of it actually happening. Incidents so extreme they're clear violations.
Maybe an analogy will help: The point at which a collection of sand grains becomes a heap is ambiguous. But when we have documented incidents involving a kilogram or more of sand in a conical shape, we can skip refining the threshold and simply declare that yes heaps are real. Incidents of major LLMs copying code, in a way that is full-on memorization and not just recreating things via chance and general code knowledge, are real.
You're the only person I've seen ever imply that true copying incidents are a statistical illusion, akin to a random die. Normally the debate is over how often and impactful they are, who is going to be held responsible, and what to do about them.
To recap, the original statement was, "Llm's do not verbatim disgorge chunks of the code they were trained on." We obviously both disagree with it.
While you keep trying to drag this toward an upper bound, I'm trying to illustrate that a coin with "//" reproduces a chunk of code. Again. I don't see much of a disagreement on that point either. What I continue to fail to elicit from you is the salient difference between the two.
I'm trying to find a scissor that distills your vibes into a consistent rule, and each time it's rebutted like I'm trying to make an argument. If your system doesn't have consistency, just say so.
I have a consistent rule. The rule is that if an LLM meets the threshold I set then it definitely violated copyright, and if it doesn't meet the threshold then we need more investigation.
We have proof of LLMs going over the threshold. So that answers the question.
Your illustrations are all in the "needs more investigation" area and they don't affect the conclusion.
We both agree that 1 token by itself is fine, and that some number is too many.
So why do you keep asking about that, as if it makes my argument inconsistent in some way? We both say the same thing!
We don't need to know the exact cutoff, or calculate how it varies. We only need to find violators that are over the cutoff.
How about you tell me what you want me to say? Do you want me to say my system is inconsistent? It's not. Having an area where the answer is unclear means the system is not able to answer every question, but it doesn't need to answer every question.
If you're accusing me of using "vibes" in a way that ruins things, then I counter that no, I give nice, specific, super-rare probabilities that are no more "vibes"-based than your suggestion of an entire repo.
> What I continue to fail to elicit from you is the salient difference between the two.
Between what, "//" and the threshold I said?
The salient difference between the two is that one is too short to be copyright infringement and the other is so long and specific that it's definitely copyright infringement (when the source is an existing file under copyright without permission to copy). What more do you want?
Just like 1 grain of sand is definitely not a heap and 1kg of sand is definitely a heap.
If you ask me about 2, 3, 20 tokens my answer is I don't care and it doesn't matter and don't pretend it's relevant to the question of whether LLMs have been infringing copyright or not ("verbatim disgorge chunks").