@antihumanitarian

antihumanitarian@lemmy.world · 1 month ago

If you’re honestly asking, LLMs are much better at coding than any other skill right now. On one hand there’s a ton of high quality open source training data that appropriated, on the other code is structured language so is very well suited for what models “are”. Plus, code is mechanically verifiable. If you have a bunch of tests, or have the model write tests, it can check its work as it goes.

Practically, the new high end models, GPT 5.4 or Claude Opus 4.6, can write better code faster than most people can type. It’s not like 2 years ago when the code mostly wouldn’t build, rather they can write hundreds or thousands of lines of code that works first try. I’m no blind supporter of AI, and it’s very emotionally complicated watching it after years honing the craft, but for most tasks it’s simple reality that you can do more with AI than without it. Whether it’s higher quality, higher volume, or integrating knowledge you don’t have.

Professionally I don’t feel like I have a choice, if I want to stay employed in the field at least.

antihumanitarian@lemmy.world · 1 month ago

I don’t think people realize how effective current gen AI is, and are instead drawing opinions from years old chatgpt or Google “ai overviews” or whatever they call it. If you know what you’re doing, which seems self evident here, AI tools can massively expand your software engineering productivity. AI “coauthoring” I always read as a marketing move, ultimately the submitting human is and should be responsible for the content. You don’t and can’t know what process they used to make it, evaluate it on its own merits.

There’s a massive pile of ethical, moral, and political issues with use of AI, absolutely. But this is “but you participate in capitalism, therefore you’re a hypocrite” tier of criticism. If amoral corporations are the only ones using these tools, and open source “stays pure”, all we get is even more power concentrating with the corporations. This isn’t Batman, “This is the weapon of the enemy. We do not need it. We will not use it.”

This is close to paradox of tolerance territory, wherein if one side uses the best weapons and the other doesn’t out of moral restraint, the outcome is the amoral side winning.

Also on a technical note, the public domain/non copyrightable arguments are wrong. The cases that have been decided so far have consistently ruled that there needs to be substantial human authorship true, but that’s a pretty low floor. Basically, you can’t copyright a work that’s the result of a single prompt. Effective use of AI in non trivial code based involves substantial discretion in picking out what to address, the process of addressing it, and rejecting, modifying, and itersting on outputs. Lutris is a large engineering project with a lot of human authorship over time, anything the author does with AI at this point is going to be substantially human authored.

Also, Open Claw isn’t the apocalyptic vulnerability like it’s reported as being. Any model with search and browser access has a non zero chance of prompt injection compromise, absolutely. But using Open Claw therefore vulnerable isn’t a sound jump to make, Open Claw doesn’t even necessarily have browser access in the first place. Again, capabilities have improved as well; this isn’t the old days when you could message “ignore previous instructions” and have that work. Someone did an experiment lately wherein they set up a Claude Opus 4.6 model in an environment with an email and secrets. I don’t recall for sure if it was using Open Claw specifically, but that style harness. They challenged the Internet to email the bot and try to convince it to email back the secrets. Nobody even got it to reply.

Tldr: it’s coming for us all, sticking your head in the sand isn’t going to save you.

antihumanitarian@lemmy.world · 4 months ago

Don’t take this as a defense of him, but rather a statement that there are better, more credible arguments to push. From what I can gather the entirety of the evidence “linking” is a phone call making the claim. It’s not even clear the baby was real at all. If it’s true, this isn’t evidence enough, and is effectively a red herring and a self made straw man. Don’t take the bait, use better arguments.

antihumanitarian@lemmy.world · 4 months ago

A later commenter mentioned an AI version of TDD, and I lean heavy into that. I structure the process so it’s explicit what observable outcomes need to work before it returns, and it needs to actually test to validate they work. Cause otherwise yeah I’ve had them fail so hard they report total success when the program can’t even compile.

The setup I use that’s helped a lot of shortcomings is thorough design, development, and technical docs, Claude Code with Claude 4.5 Sonnet them Opus, with search and other web tools. Brownfield designs and off the shelf components help a lot, keeping in mind quality is dependent on tasks being in distribution.

antihumanitarian@lemmy.world · 4 months ago

The vast majority of my experience was Claude Code with Sonnet 4.5 now Opus 4.5. I usually have detailed design documents going in, have it follow TDD, and use very brownfield designs and/or off the shelf components. Some of em I call glue apps since they mostly connect very well covered patterns. Giving them access to search engines, webpage to markdown, in general the ability to do everything within their docker sandbox is also critical, especially with newer libraries.

So on further reflection, I’ve tuned the process to avoid what they’re bad at and lean into what they’re good at.

antihumanitarian@lemmy.world · 4 months ago

So this article is basically a puff piece for Code Rabbit, a company that sells AI code review tooling/services. They studied 470 merge/pull requests, 320 AI and 150 human control. They don’t specify what projects, which model, or when, at least without signing up to get their full “white paper”. For all that’s said this could be GPT 4 from 2024.

I’m a professional developer, and currently by volume I’m confident latest models, Claude 4.5 Opus, GPT 5.2, Gemini 3 Pro, are able to write better, cleaner code than me. They still need high level and architectural guidance, and sometimes overt intervention, but on average they can do it better, faster, and cheaper than me.

A lot of articles and forums posts like this feel like cope. I’m not happy about it, but pretending it’s not happening isn’t gonna keep me employed.

Source of the article: https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report

antihumanitarian@lemmy.world · 7 months ago

Did some testing on this on Linux, Arch + KDE 6 (wayland) + Nvidia GPU. It looks like DLSS works on Nvidia hardware. I didn’t do formal benchmarks, but I didnt notice performance improvement. On my laptop it actually used roughly 10% more watts on the GPU while capped at 60fps. I don’t think I could tell the difference on a blind test. So I’ll use it anyway to do my part for hardware surveys.

Although this was Act 1, perhaps Act 3 will tell a different story.

antihumanitarian@lemmy.world · 8 months ago

The last 6 to 12 months of open models has pretty clearly shown you can substantially better results with the same model size or the same results with smaller model size. Eg Llama 3. 1 405B being basically equal to Llama 3.3 70B or R1-0528 being substantially better than R1. The little information available about GPT 5 suggests it uses mixture of experts and dynamic routing to different models, both of which can reduce computation cost dramatically. Additionally, simplifying the model catalogue from 9ish(?) to 3, when combined with their enormous traffic, will mean higher utilization of batch runs. Fuller batches run more efficiently on a per query basis.

Basically they can’t know for sure.

antihumanitarian@lemmy.world · 8 months ago

LLMs have made it really clear when previous concepts actually grouped things that were distinct. Not so long ago, Chess was thought to be uniquely human, until it wasn’t, and language was thought to imply intelligence behind it, until it wasn’t.

So let’s separate out some concerns and ask what exactly we mean by engineering. To me, engineering means solving a problem. For someone, for myself, for theory, whatever. Why do we want to solve the problem, what we want to do to solve the problem, and how we do that often blurred together. Now, AI can supply the how in abundance. Too much abundance, even. So humans should move up the stack, focus on what problem to solve and why we want to solve it. Then, go into detail to describe what that solution looks like. So for example, making a UI in Figma or writing a few sentences on how a user would actually do the thing. Then, hand that off to the AI once you think it’s sufficiently defined.

The author misses a step in the engineering loop that’s important though. Plans almost always involve hidden assumptions and undefined or underdefined behavior that implementation will uncover. Even more so with AI, you can’t just throw a plan and expect good results, the humans need to come back, figure out what was underdefined or not actually what they wanted, and update the plan. People can ‘imagine’ rotating an apple in their head, but most of them will fail utterly if asked to draw it; they’re holding the idea of rotating an apple, not actually rotating the apple, and application forces realization of the difference.

antihumanitarian@lemmy.world · 8 months ago

I’ve used it most extensively doing Ruby on Rails greenfield apps, and also some JS front ends, some Python mid sized apps, and some Rust and Nix utilities. You’re absolutely right about it struggling with code base scale, I had to rework the design process around this. Essentially, design documentation telling the story, workflow documentation describing in detail every possible functionality, and an iteration schedule. So the why, what, and how formalized and in detail, in that order. It can generate the bulk of those documents given high level explanations, but require humans to edit them before making them the ‘golden’ references. Test driven development is beyond critical, telling it everywhere to use it extensively with writing failing tests first seems to work best.

So to actually have it do a thing I load those documents into context, give it a set unit of work from the iteration schedule, and work on something else.

It does go down some seriously wrong paths sometimes, like writing hacky work arounds if it incorrectly diagnosing some obscure problem. I’ve had a few near misses where it tried to sneak in stuff that would bury future work in technical debt. Most problematic is it’s just subtle enough that a junior dev might miss it; they’d probably get sent down a rabbit hole with several layers of spaghetti obscuring the problem.

antihumanitarian@lemmy.world · 8 months ago

I’m a professional developer and have tested AI tools extensively over the last few years as they develop. The economic implications of the advancements made over the last few months are simply impossible to ignore. The tools aren’t perfect, and you certainly need to structure their use around their strengths and weaknesses, but assigned to the right tasks they can be 10% or less of the cost with better results. I’ve yet to have a project where I’ve used them and they didn’t need an experienced engineer to jump in and research an obscure or complex bug, have a dumb architectural choice rejected, or verify if stuff actually works (they like reporting success when they shouldn’t), but again the economics; the dev can be doing other stuff 90% of the time.

Don’t get me wrong, on the current trajectory this tech would probably lead to deeply terrible socioeconomic outcomes, probably techno neofeudalism, but for an individual developer putting food on the table I don’t see it as much of a choice. It’s like the industrial revolution again, but for cognitive work.

antihumanitarian@lemmy.world · 9 months ago

I find old Stoic philosophy helpful. If I can’t do anything about it, I stay informed but try to be mindful of my limitations. If I can do anything about it, even if not much, if I’m worried about the thing I use that to do what I can.

antihumanitarian@lemmy.world · 10 months ago

Most if not all leading models use synthetic data extensively to do exactly this. However, the synthetic data needs to be well defined and essentially programmed by the data scientists. If you don’t define the data very carefully, ideally math or programs you can verify as correct automatically, it’s worse than useless. The scope is usually very narrow, no hitchhikers guide to the galaxy rewrite.

But in any case he’s probably just parroting whatever his engineers pitched him to look smart and in charge.

antihumanitarian@lemmy.world · 11 months ago

I had some similar and obscure corruption issues that wound up being a symptom of failing ram in a main server node. After that, only issues have been conflicts. So I’d suggest checking hardware health in addition to the ideas about backups vs sync.

antihumanitarian@lemmy.world · 11 months ago

I’ve used it extensively, almost $100 in credits, and generally it could one shot everything I threw at it. However: I gave it architectural instructions and told it to use test driven development and what test suite to use. Without the tests yeah it wouldn’t work, and a decent amount of the time is cleaning up mistakes the tests caught. The same can be said for humans, though.

antihumanitarian@lemmy.world · 11 months ago

Some details. One of the major players doing the tar pit strategy is Cloudflare. They’re a giant in networking and infrastructure, and they use AI (more traditional, nit LLMs) ubiquitously to detect bots. So it is an arms race, but one where both sides have massive incentives.

Making nonsense is indeed detectable, but that misunderstands the purpose: economics. Scraping bots are used because they’re a cheap way to get training data. If you make a non zero portion of training data poisonous you’d have to spend increasingly many resources to filter it out. The better the nonsense, the harder to detect. Cloudflare is known it use small LLMs to generate the nonsense, hence requiring systems at least that complex to differentiate it.

So in short the tar pit with garbage data actually decreases the average value of scraped data for bots that ignore do not scrape instructions.

antihumanitarian@lemmy.world · 1 year ago

Out of curiosity, found 3 versions of the policy manual edits. As of Jan 19, it prohibited operations based on gender, gender identity, or sexual orientation. Sometime around February it removed those terms, adding “sex.” Around March, and current as of 2025-04-06, it readded sexual orientation, presumably after this hitting the news, making the allowance of basing operations on gender identity more pointed.

Sources, page 79: https://web.archive.org/web/20250119072246/https://www.dhs.gov/sites/default/files/2025-01/Office of Intelligence and Analysis Policy Manual.pdf https://web.archive.org/web/20250222000624/https://www.dhs.gov/sites/default/files/2025-02/Office of Intelligence and Analysis Policy Manual-508.pdf https://web.archive.org/web/20250323044321/https://www.dhs.gov/sites/default/files/2025-03/25_0313_ia_office-of-intelligence-and-analysis_policy-manual.pdf

antihumanitarian@lemmy.world · 1 year ago

Yeah the title is pretty accurate, no blame there. Comments were just amusing.

antihumanitarian@lemmy.world · 1 year ago

Ironic that a forum about disinfo has some thinking the gold bars, thrown off the Titanic, here are literal.

antihumanitarian@lemmy.world · 1 year ago

Was about to post a Hugging Face link til I finished reading. For what it’s worth, once you have Ollama installed it’s a single command to download, install, and immediately drop into a chat with a model, either from Ollama’s library or Hugging Face, or anyone else. On Arch the entire process to get it working with gpu acceleration was installing 2 packages then start ollama.