no more technical debt

not_IO@lemmy.blahaj.zone · 5 days ago

Logi@lemmy.world · 5 days ago

Thanks.

Surprised it described itself as a war crime and didn’t try to justify that.

It is not sentient. It has no sense of self or guilt or any of that. It’s just writing back the most likely next word given the question and whatever context it has. It seems to have the leaked code in context? Or it could be bullshitting.
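To make "most likely next word" concrete, here is a toy sketch. It is only an illustration under loud assumptions: a real LLM is a neural network working on tokens with a long context, not a word-pair counter, and the function names and the tiny corpus below are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never followed."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical miniature "training data", invented for illustration.
corpus = "the code is a mess the code is leaked the code works"
model = train_bigrams(corpus)

print(most_likely_next(model, "code"))
```

The counter has no idea what "code" means. It returns whichever word followed it most often in the text it saw, which is the point being made about guilt and self-awareness.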

dustyData@lemmy.world · 5 days ago

It’s probably just plagiarizing an article that used that turn of phrase to refer to spaghetti code. Or, just as likely, directly copying an article about the code leak.

Wispy2891@lemmy.world · 5 days ago

Right, I asked for the source of those figures (I didn’t give the source in context, it just did a web search) and I got this: https://read.engineerscodex.com/p/diving-into-claude-codes-source-code

jj4211@lemmy.world · 4 days ago

I find the conundrum of the ‘Claw Code’ scenario amusing.

LLM companies have argued they should get to ignore all copyright, and now that one of them has had its code leaked, suddenly they care greatly about copyright.

But fine, their argument is that an LLM’s digest of copyrighted material is ‘fair use’. Now that has been turned against them, with an LLM being used to launder their copyrighted material in precisely the way they declared doesn’t count as infringement. So either they let it ride or they undermine their own argument…

GamingChairModel@lemmy.world · 3 days ago

LLM companies have argued they should get to ignore all copyright, and now that one of them has had its code leaked, suddenly they care greatly about copyright.

Anthropic itself has argued that digitizing and using the digitized copies to train models is fair use, so long as:

- They don’t redistribute the physical copies they bought
- They don’t allow an end user to retrieve the contents of any one specific work at the user interface (if you ask Claude to spit out the entire text of a copyrighted work used to train it, it is designed to resist copying too much out of a single work)

So they don’t argue that copyright doesn’t count, exactly. They argue that copyright doesn’t prevent model training from ingesting an entire copyrighted work, as long as it’s done with so many other copyrighted works that any given original isn’t a huge contributor to the model or its outputs.

There’s tension in their positions, but not so much that it would totally fall apart.

ErmahgherdDavid@lemmy.dbzer0.com · 4 days ago

Or, and in my view this is the most likely option, they will get away with ripping off the little guy AND claiming “but bruh mah copyright”, because they have very well-paid lawyers and most Western countries, especially the US, have a two-tiered legal system designed to protect the rich and punish the poor.