Hacker News | vidarh's comments

It's an article that tries to be literature rather than just the information it conveys, and some people don't like that whether it is successful or not.

I'm Norwegian, and the Norwegian stereotype of Finnish people used to be that they are dour and introverted. And we're by and large culturally a lot less outwardly cheerful to people we don't know than the Danes.

Sometimes Norwegian TV would show Finnish dramas while I was growing up in the '80s, and the standing joke was that the typical Finnish drama had two guys hiking through the forest, one of them saying something, and then half an hour more of hiking before the other would reply. I don't remember whether that was accurate (it's not as if I'd have kept watching), but I suspect not.


Really, this. You still need to check its work, but it is also pretty good at checking its work if told to look at specific things.

Make it stop. Tell it to review whether the code is cohesive. Tell it to review it for security issues. Tell it to review it for common problems you've seen in just your codebase.

Tell it to write a todo list for everything it finds, and tell it to fix it.

And only review the code once it's worked through a checklist of its own reviews.

We wouldn't waste time reviewing a first draft from another developer if they hadn't bothered looking it over and testing it properly, so why would we do that for an AI agent that is far cheaper?
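The loop described above could be sketched roughly like this. It is a minimal sketch, not any tool's real API: `ask_model`, `REVIEW_PROMPTS` and `self_review` are hypothetical names, and `ask_model` stands in for whatever call sends a prompt to your agent and returns its text reply.

```python
from typing import Callable, List

# Hypothetical review prompts; swap in the problems you see in your own codebase.
REVIEW_PROMPTS = [
    "Review whether the code is cohesive.",
    "Review the code for security issues.",
    "Review the code for problems common in this codebase.",
]


def self_review(code: str,
                ask_model: Callable[[str], str],
                max_rounds: int = 3) -> str:
    """Have the agent review and fix its own draft before a human looks at it.

    ask_model is an assumed stand-in: it takes a prompt and returns text.
    """
    for _ in range(max_rounds):
        todo: List[str] = []
        for prompt in REVIEW_PROMPTS:
            reply = ask_model(
                f"{prompt}\nList one issue per line, or reply NONE.\n\n{code}"
            )
            # Collect every non-empty line that is not the NONE marker.
            todo += [line.strip() for line in reply.splitlines()
                     if line.strip() and line.strip().upper() != "NONE"]
        if not todo:
            break  # every review came back clean
        checklist = "\n".join(f"- {item}" for item in todo)
        code = ask_model(
            "Fix every item on this todo list and return the full code.\n"
            f"{checklist}\n\n{code}"
        )
    return code
```

The cap on rounds is there because nothing guarantees the reviews ever come back clean; the human review still happens afterwards, just on a later draft.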


I wouldn't mind seeing a collection of objectives and the emitted output. My experience with LLM output is that it is very often over-engineered for no good reason, which is taxing on me to review.

I want to see this code written to some objective, to compare with what I would have written to the same objective. What I've seen so far are specs so detailed that very little is left to the discretion of the LLM.

What I want to see are cases where the LLM is asked for something and provides it, because I am curious to compare it to my proposed solution.

(This sounds like a great idea for a site that shows users the user-submitted task, and only after they submit their attempt does it show them the LLM's attempt. Someone please vibe code this up, TIA)


So why can't the deterministic part of the agent program embed in all these checks?

It absolutely can; I'm building things to do this for me. Claude Code has hooks that are supposed to trigger upon certain states, and so far they don't trigger reliably enough to be useful. What we need are the primitives to build code-based development cycles where each step is executed by a model but the flow is dictated by code. Everything today relies too heavily on prompt engineering, and with long context windows instruction following goes lax. I ask my model "What did you do wrong?" and it comes back clearly with "I didn't follow instructions" and then gives clear and detailed correct reasons about how it didn't follow instructions... but that's not supremely helpful because it still doesn't follow instructions afterwards.
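For what "each step is executed by a model but the flow is dictated by code" might look like, here is a minimal sketch. Everything in it is an assumption for illustration: `run_cycle`, `Step` and `run_model` are made-up names, and `run_model` stands in for whatever function sends a prompt to a model and returns text. The point is only that the order of steps, the checks and the retries live in ordinary code rather than in the prompt.

```python
from typing import Callable, List, Tuple

# One step of the cycle: a name, a prompt for the model, and a deterministic
# check that ordinary code runs on the model's output.
Step = Tuple[str, str, Callable[[str], bool]]


def run_cycle(task: str,
              steps: List[Step],
              run_model: Callable[[str], str],
              max_attempts: int = 3) -> List[Tuple[str, str]]:
    """Run steps in a fixed order; code, not the prompt, decides what happens next."""
    results: List[Tuple[str, str]] = []
    context = task
    for name, prompt, check in steps:
        for _ in range(max_attempts):
            output = run_model(f"{prompt}\n\n{context}")
            if check(output):
                break  # the gate passed, move on to the next step
        else:
            # The model never satisfied the check; stop instead of drifting on.
            raise RuntimeError(
                f"step {name!r} failed its check {max_attempts} times"
            )
        results.append((name, output))
        context = output  # the next step works on this step's output
    return results
```

A check here could be anything deterministic: running the test suite, a linter, or just confirming that a required file changed. The model cannot skip a step by "forgetting" an instruction, because it is never asked to remember the flow.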

It increasingly is. E.g. if you use Claude Code, you'll notice it "likes" to produce todo lists that are rendered specially via the TodoWrite tool that's built in.

But it's also a balance of avoiding being over-prescriptive in tools that need to support very different workflows, and it's easy to add more specific checks via plugins.

We're bound to see more packaged up workflows over time, but the tooling here is still in very early stages.


Tell it to grade its work in various categories and that you'll only accept B+ or greater work. Focusing on how well it's doing is an important distinction.

It's very funny that I can't tell if this is sarcasm or not. "Just tell it to do better."

Oh I'm not at all joking. It's better at evaluating quality than producing it blindly. Tell it to grade its work and it can tell you most of the stuff it did wrong. Tell it to grade its work again. Keep going through the cycle and you'll get significantly better code.

The thinking should probably include this kind of introspection (give me a million dollars for training and I'll write a paper) but if it doesn't you can just prompt it to.
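The grade-and-repeat cycle can be sketched like this, assuming the model's grades come back as plain letter grades per category. `grade_work` and `revise` are hypothetical callables that would wrap your model calls; nothing here is a real API, and the grade scale is just one plausible choice.

```python
from typing import Callable, Dict

# Letter grades from best to worst; the position in the list is the rank.
GRADES = ["A+", "A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D", "F"]


def meets_bar(grade: str, minimum: str = "B+") -> bool:
    """True if grade is at least as good as minimum (unknown grades raise ValueError)."""
    return GRADES.index(grade) <= GRADES.index(minimum)


def improve_until_graded(work: str,
                         grade_work: Callable[[str], Dict[str, str]],
                         revise: Callable[[str, Dict[str, str]], str],
                         minimum: str = "B+",
                         max_rounds: int = 5) -> str:
    """Ask for grades per category and revise until all meet the bar."""
    for _ in range(max_rounds):
        grades = grade_work(work)
        # Keep only the categories that fall short of the minimum grade.
        failing = {cat: g for cat, g in grades.items()
                   if not meets_bar(g, minimum)}
        if not failing:
            break
        work = revise(work, failing)
    return work
```

Passing only the failing categories to `revise` keeps each round focused on what the grading actually flagged.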


An experiment on that from a year ago: https://news.ycombinator.com/item?id=42584400

Think of it as a "you should - and are allowed to - spend more time on this" command, because that is pretty much what it is. The model only gets so much "thinking time" to produce the initial output. By asking it to iterate you're giving it more time to think and iterate.

Same here. I've picked up projects that have languished for years because the boring tasks no longer make me put them aside.

If done in chat, it's just an alternative to talking to you freeform. Consider Claude Code's multiple-choice questions, which you can trigger by asking it to invoke the right tool, for example.

None of the issues go away just because it's in chat?

Freeform looks and acts like text, except for a set of things that someone vetted and made work.

If the interactive diagram or UI you click on now owns you, it doesn't matter if it was inside the chat window or outside the chat window.

Now, in this case, it's not arbitrary UI, but if you believe that the parsing/validation/rendering/two way data binding/incremental composition (the spec requires that you be able to build up UI incrementally) of these components: https://a2ui.org/specification/v0.9-a2ui/#standard-component...

as transported/rendered/etc by NxM combinations of implementations (there are 4 renderers and a bunch of transports right now), is not going to have security issues, i've got a bridge to sell you.

Here, i'll sell it to you in gemini, just click a few times on the "totally safe text box" for me before you sign your name.

My friend once called something a babydoggle - something you know will be a boondoggle, but is still in its small formative stages.

This feels like a babydoggle to me.


> None of the issues go away just because it's in chat?

There is a vast difference in risk between me clicking a button provided by Claude in my Claude chat, on the basis of conversations I have had with Claude, and clicking a random button on a random website. Both can contain something malicious. One is substantially higher risk. Separately, linking a UI constructed this way up to an agent and letting third parties interact with it is much riskier to you than to them.

> If the interactive diagram or UI you click on now owns you, it doesn't matter if it was inside the chat window or outside the chat window.

In that scenario, the UI elements are irrelevant barring a buggy implementation (yes, I've read the rest, see below), as you can achieve the same thing by just presenting the user with a basic link and telling them to press it.

> as transported/rendered/etc by NxM combinations of implementations (there are 4 renderers and a bunch of transports right now), is not going to have security issues, i've got a bridge to sell you.

I very much doubt we'll see many implementations that won't just use a web view for this, and I very much doubt these issues will even fall in the top 10 security issues people will run into with AI tooling. Sure, there will be bugs. You can use this argument against anything that requires changes to client software.

But if you're concerned about the security of clients, MCP and hooks are a far bigger rat's nest of things that are inherently risky due to the way they are designed.


We can't even tell for certain that we have existence in time beyond just this moment - our only source of that is a memory of time passing, which we can't validate.

A shell company complete with directors of your preferred nationality is trivial to procure for relatively small amounts of money.

> - Jon Richelieu-Booth for posting a picture of himself with a gun in the US

A quick search suggests that the photo with the gun wasn't the sole cause of the arrest, given there were stalking allegations "involving serious alarm or distress" from someone he had a conflict with, where the gun was one part of what caused the complainant to (claim to) feel threatened. Police may well have overreacted due to the gun post, but your framing leaves out rather relevant details.

> - Jordan Parlour for Facebook posts that were deemed ‘hateful.’

Appears to have incited violence by advocating an attack on a hotel, something he pleaded guilty to.

> - Bernadette Spofforth for a post with a “mild inaccuracy”

Was arrested for posting a fake name for an attacker, but released and faced no further action.

Calling potentially putting a target on the back of someone innocent by connecting them to a violent crime a "mild inaccuracy" is at best wildly misleading.

> Maxie Allen and Rosalind Levine

These people did get a wrongful arrest payout, but the claim was most certainly not just raising concerns in a private parents' WhatsApp group. The claims included harassment, and causing a nuisance on the school premises. The claim was still wrong, and the payout reflects that the police should not have been so quick to believe the allegations before making an arrest. But your claim is still hyperbole.

> - Lucy Connolly, for a post calling for mass deportation and to set fire to hotels housing immigrants

At least in this one you admitted the arrest was over incitement to violence.

> - Norbert Gyurcsik, for having “extreme right wing music”

No, for buying and distributing albums whose lyrics breach terrorism legislation and were intended to incite racial hatred.

I have plenty of issues with UK terror legislation, which I believe is being abused to shut down legitimate speech at times, but framing this the way you did is again wildly misleading and hyperbolic.

But even if none of your claims were wildly misleading, none of them support your initial claim:

> You are allowed to say it. Unlike UK, you won’t be arrested. But you won’t be allowed in.

... about a comment referring to criticism of the government.

None of the cases above were relevant to that. Most of them relate to classes of speech that are not protected in the US either.


Yes, but few other countries are as draconian about this as the US seems to want to be, and it is relevant to discuss how making itself a less attractive place to travel to will affect the US.

Try travelling to Europe on an African passport…

I don't doubt there are worse countries/scenarios. We're dealing specifically though with the downward slide of the U.S.

I have in-laws who do that regularly. I'm aware there are plenty of complications with that. I still stand by what I wrote.

> there are plenty of complications with that

What are some of the complications?


Do you have an actual point that doesn't involve me divulging private information of people who are not part of this conversation? My identity is on my profile; identifying the people in question would be rather easy.

If what you're suggesting is that the US is not being more draconian than most, you're free to make an actual claim about how.

I'll note that this article is about people eligible for the visa waiver program, which does not include any African countries - travelling to the US from African countries is also far more draconian than what is outlined in the article, so it's unclear why you think the comparison is relevant.


For one: the US is way more permissive than the EU when it comes to visa duration.

Common to get a 10 year US visa. Schengen visa? For the duration of your visit (for which you have to have bought plane tickets and accommodation before showing up for a visa appointment). The EU also charges pretty hefty fees for a Schengen visa, which I view as a racket and/or xenophobia.

Don’t even get me started on the requirement to hand over your passport at hotels in Europe!

My point is that characterizing the US as “more draconian than most” is quite far from reality, which is a lot more nuanced.


> Common to get a 10 year US visa. Schengen visa? For the duration of your visit

Both of these are possible. Neither is nearly that simple.

For starters the validity period depends on the country, and the type of visa, and since you mentioned Africa, applicants from the vast majority of African states are limited to single entry visas with 3 months validity for B-visas. A few can get 4-5 years, and a handful (I think Morocco, Botswana, South Africa) can get 10 years.

Given that, it's rather odd that you used specifically African countries as the basis for comparison and then pulled out 10 year duration.

On the other side, it is reasonably uncommon to be limited to just the stay for Schengen visas, though it can certainly happen, especially for applicants from poorer countries. And validity can be up to 5 years. But you certainly can

> The EU also charges pretty hefty fees for a Schengen visa, which I view as a racket and/or xenophobia.

The standard cost for a Schengen visa is 90 euros or 105 USD. If you've paid more, that has been service fees to application centres, not the EU fees.

The application fee for a US B-visa is 185 USD; in addition, there is an issuance fee for some countries, most of them African.


Or try to travel to an Islamic country with an Israeli stamp on your passport or an Israeli passport.

I've quoted Marx on HN on more than one occasion. I'm not sure they'd like my social media profile, despite having also been consistent in arguing for liberal freedoms that the US used to like to claim to favour.

I've visited the US many times, but I have no intention of going back under the current regime.

I transited through China earlier this year, and I frankly felt less concerned doing that - despite having criticised the Chinese government online many times over the years - than I would feel about entering the US at this point.

