Hacker News | nkoren's comments

For the past 18 months, I've been creating an in-house GUI application which is starting to approach Photoshop-level complexity. By which I mean: it's still probably a solid order of magnitude less complex than Photoshop -- but it's not two orders of magnitude less complex. It's several orders of magnitude more complex than the examples of vibecoded apps I typically see.

(The domain, FWIW, is a geospatial transport-planning tool, including a completely custom microsimulation engine, with loads of options for visualization, analytics, etc.)

At the start of this development process, LLMs were capable of assisting with little more than the framework boilerplate stuff. That was very useful, but was well under 50% of the LOC. They were particularly bad at understanding the microsimulator, where they would routinely forget which end of a FIFO queue was the front. LLMs are routinely and correctly criticized for their lack of a true world model, and when it came to modelling real-world physical/spatial/geographic systems, the fact that they see the world as nothing but text was a huge limitation. Not just in terms of having a pretty hazy grasp on concepts like "spatial direction", but even more critically, being unable to reason about the "world-within-a-world" which the simulator is attempting to model. They were fully unable to do that.

That was 18 months ago. Now, Claude writes > 99% of my code. It demonstrates a far better grasp of first-order world-model phenomena (like "spatial orientation"), and a decent (but not fantastic) ability to reason about the second-order "world-within-a-world" that the simulator is creating. It's a huge improvement. For some areas of the code, I still need to spell things out very explicitly, giving precise instructions for how a method will work. That's definitely not vibe-coding. But for other areas of the code, I can just say "add this analysis or visualization feature", without specifying how, and Claude will one-shot a result that's somewhere between good and great.

So where we're at now is that Claude often needs hand-holding for some of the most complex areas of the code, and it definitely doesn't understand how the whole application hangs together -- I have to keep reminding it of that, and am constantly taking steps to ensure that it remains well-architected and doesn't devolve into a collection of warring patches.

And yet -- in the past 18 months, the boundary between what the LLM is capable of and what I need to exercise control over has shifted MASSIVELY, and it has shifted in the direction of LLMs being more able to reason about meta-models and higher-order architectures.

I've got two small children. When they say they can't do something, I always remind them that they can't do that thing -- YET. What they can do today is very far from the ultimate limits of their capabilities. I feel similarly about the capabilities of LLMs. No, they definitely can't vibecode a Photoshop-class application. YET.
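For anyone who wants the FIFO convention mentioned above pinned down, here's a minimal sketch, assuming Python's `collections.deque`. The vehicle names are hypothetical and nothing here is taken from the actual simulator: arrivals join at the back, departures leave from the front.

```python
from collections import deque

# Hypothetical vehicles queuing at a junction; names are illustrative only.
queue = deque()

# Arrivals join at the back (the right end of the deque).
for vehicle in ["car_a", "car_b", "car_c"]:
    queue.append(vehicle)

# Departures leave from the front (the left end), so the earliest
# arrival is always the first one out.
first_out = queue.popleft()
second_out = queue.popleft()

print(first_out, second_out, list(queue))
```

Mixing up `pop()` and `popleft()` silently turns the queue into a stack, which is the kind of front/back confusion described above.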


I had an interesting experience with a bird brain today.

There's a robin who often sits in the fig tree in my back yard, giving friendly little chirps whenever I'm near. (I have no way of knowing whether it's the same robin from day to day, but if it's different robins then they all seem to be on the same wavelength.)

Anyhow, today a neighborhood cat came to the back door, and was aggressively friendly when I opened it. Clearly offering affection in exchange for... what? I've never given this cat anything before, apart from a friendly pat. Meanwhile the robin was overhead in the fig tree, giving totally different chirps than I'm used to. Clearly "warning!" "danger!" chirps. It was amazing how unambiguous they were.

I was puzzled who the robin's audience for this was, however. I'd never noticed it freaking out about cats before. Was it trying to warn me for some reason? Trying to warn other nearby birds? I couldn't see any. I thought that maybe it was just shouting at the cat out of general pique.

Then the cat led me to the answer. Turns out it had trapped an (uninjured) baby squirrel behind a planter box near my door. It couldn't reach the squirrel, and the squirrel couldn't escape. The cat seemed to be under the impression that since we were now friends, I could move the planter box and help it to get the baby squirrel. Sadly I had to disappoint it, and after unexpectedly acrobatic shenanigans, I facilitated the squirrel's escape instead.

The robin, meanwhile, ceased its warning chirps the moment it saw that I was aware of the baby squirrel. Then it watched the ensuing affair unfold, from the safety of the fig tree. Once the squirrel was safe and the cat had left disappointed, the robin looked at me, gave a few of its usual happy chirps, and flew away.


crows warn about stuff.

If I go outside and the crows are going crazy, something interesting is happening.

Mostly it is hawks, and the crows will chase and dive bomb them.

Once I came outside and the crows were going nuts, but not flying. And right in the middle of the driveway was a bobcat. no wonder.


There are many ways that America could be more democratic, and simultaneously produce less stupid results:

1. Eliminate / work around the electoral college system, which makes it so that people in the most diverse, educated, and economically-productive parts of the country have dramatically less voting power than a small minority of people who live in more homogeneous, less educated, and less economically-productive areas. This would significantly change the messaging needed to win.

2. Eliminate first-past-the-post voting, which encourages candidates with extreme views, eliminates anything other than (largely false) political binaries, makes it possible to win elections while receiving a minority of the votes, and makes it so that the only viable strategy is to vote for the lesser evil rather than somebody you actually want.

3. Get the money out of politics. Make untraceably-funded super-PACs illegal.

4. Gerrymandering should be super fucking illegal.

Other places do this. They're more democratic than the US, and while they still frequently elect stupid politicians, none of those are as bottom-of-the-barrel as what the US is able to scrape together.
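To illustrate point 2, here's a minimal sketch of a single-winner plurality count. The ballot numbers are made up purely for illustration, and `plurality_winner` is a hypothetical helper, not any real election library. It shows how a candidate can win with well under half of the votes once three candidates split the field.

```python
from collections import Counter

def plurality_winner(ballots):
    """Return (winner, vote_share) under a simple plurality rule."""
    counts = Counter(ballots)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(ballots)

# Invented example: 100 ballots split across three candidates.
ballots = ["A"] * 40 + ["B"] * 35 + ["C"] * 25

winner, share = plurality_winner(ballots)
print(winner, share)
```

With these invented numbers, "A" wins even though 60% of voters chose someone else.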


re:2, proportional representation systems oftentimes have more extremist parties elected, they’re just severely kneecapped by not having enough votes to do anything extremist


Except that they can hold your coalition government hostage by making you concede on their pet issue or leave the coalition and force an election.


...while correctly spelling "strawberry"...


I could describe the electrical and chemical signals within your neurons and synapses as proof that you are merely a series of electrochemical reactions, and can only mimic genuine thought.


You could do that if you wanted to ignore reality and be reductive to score points in an argument by purposefully conflating mimicry with intention, yes.


That is, by definition, genuine thought.


And that is dogma. It's unthinking circular reasoning.

It wasn't very long ago that scientists were certain that animals did not possess thoughts or feelings. Any behaviour which appeared to resemble thinking or feeling was simply unconscious autonomic responses, with no more thought behind them than a sunflower turning towards the sun. Animals, by definition, lack Immortal Souls and Free Will, and therefore they are empty inside. Biological automata.

Of course this dogma was unfalsifiable, because any apparent evidence of animal cognition could be refuted as simply not being cognition, by definition.

Look, either cognition is magic, or it's math. There really isn't a middle ground. If you want to believe that wetware is fundamentally irreducible to math, then you believe it's magic. If that's what you want to believe, then fine. But it's dogma, and maintaining that dogma will require increasingly willful acts of blindness.


You are using the word "math" in a magical way. Current LLM programs are reducible to math and human cognition is reducible to math (which is a reasonable hypothesis). What you are implying is that just because the word "math" is used in both sentences, it actually means the same thing. And that is magical thinking. Just because human cognition is reducible to math (let's assume that for the sake of discussion) doesn't mean it's the same math as in the LLM programs, or even close enough. Or maybe it is, but we don't have any proof yet.


I agree with this. I'm not arguing that LLMs are conscious. We don't understand the math behind how our brains work; we don't know how close or far LLMs are to that; and we don't know how many different pathways to consciousness there are within math.

All I'm saying is that the argument that "It's not consciousness, it's just <insert any tangentially mathematical claim here>", is dogma. Given everything that we don't know, agnosticism is the appropriate response.


> It wasn't very long ago that scientists were certain that animals did not possess thoughts or feelings. Any behaviour which appeared to resemble thinking or feeling was simply unconscious autonomic responses, with no more thought behind them than a sunflower turning towards the sun. Animals, by definition, lack Immortal Souls and Free Will, and therefore they are empty inside. Biological automata.

It's cool that you can decide to take half-remembered incorrect anecdotes about what "scientists" are certain of at some indeterminate time in the past, sans citation, and use that to underpin your argument about a totally different thing.

> Of course this dogma was unfalsifiable...

...like your post's anecdata.

> Look, either cognition is magic, or it's math.

Yes, when you decide to draw a convoluted imaginary bounding box around the argument, anything can be whatever you want it to be.

LLMs have no mind and no intention. They are programmed to mimic human language. Read some Grice and learn exactly how dependent humans are on the cooperative principle, and exactly how vulnerable we are to seeing intent where none exists in LLM communication that mimics the outputs our inputs expect to receive.

Your cries of "dogma dogma dogma" are unpersuasive and lack grounding in practical reality.


On the off-chance that psychosomatic suggestibility is on this list, I'm not even looking at it.


Good intuition. This document isn’t real medical research from medical professionals. It’s from a group of people soliciting donations and trying to sell neurodiversity trainings for organizations.

This comes from a part of social media where medical conditions are redefined in ways that make them sound more broad, generic, vague, and common than they really are. This is why this corner of the internet diagnoses themselves with many of these conditions together: The definitions they believe are like horoscopes where anyone who has non-specific symptoms or even psychosomatic conditions can match the definitions given. The result is groups of people on TikTok or Facebook groups who all think they have a list of 5 different medical conditions and who are all frustrated that actual specialists for those conditions won’t agree with their self diagnoses.


Psychosomatic suggestibility is not on the list. These are also all chronic conditions, so if the descriptions don't match your history, you have nothing to worry about.


Unfortunately this document and the linked YouTube videos use the social media definitions of these conditions, so readers could be misled into thinking they have conditions which they don’t.

The article is really bad. I do not suggest anyone take it seriously.


Yeah… I skipped the first page, in order to quickly answer the question. Shouldn't have done that. (While one should never judge a book by its cover in the metaphorical sense, it's actually quite sensible to make such judgements in the literal sense.)


Though statistically if you’ve previously been told you have something psychosomatic the odds are pretty good that you have something on this list.


What's the context for that, I don't get it


Hypochondria is really common among autistic people. To the point that it's an in-joke and meme among the community.


A most logical and rational reaction.


As someone who does both development and design, I agree. With some caveats.

At this point, Claude now writes > 99% of my code. I wasn't an enthusiastic early adopter; it took me a while to be willing to let go of the reins. But in tandem with LLMs getting better, I also began to realize that what happens inside the code is very rarely important enough for me to care about. Like, I care that it's secure, and performant where it needs to be, etc. -- but mostly I just care about its outputs. If it does what I want it to do, then how it does this doesn't really matter to me or my clients or my users. On the development side, my attention has shifted to writing specifications and monitoring the correctness of the test suite, and > 99% of the time that's good enough. It's been a lesson in non-attachment to let go of lovingly crafting every single line of code, but the tradeoff in terms of productivity has absolutely been worth it.

What makes this viable is the fact that there's essentially a "hidden layer" (the code) upon which Claude can operate. My clients don't actually care about the code, and within certain bounds (correctness, security, performance, extensibility, etc.) it turns out that neither do I. This gives Claude a lot of latitude to solve things in its own way, and I think that's a big part of its effectiveness.

On the other hand, with design there is no hidden layer. Every single aspect of the design is visible to the user and the customer. So the design reflects upon my work in ways that code does not. This means that the conditions which allow me to relax my grip on coding just don't exist for design. It's very difficult for me to imagine delegating design in the same way that I've become comfortable delegating coding.

That said: I suspect that the use-case for Claude Design will be for applications which today receive very little design attention. There are loads of applications where design is less than an afterthought, and the product suffers terribly for it. Delegating to Claude, in those contexts, would likely be a very big win. But for applications which already have designers obsessing over every pixel, I see a very limited role for this. Figma's market is mostly the latter -- the former, by definition, is not part of the market for design tools -- so I don't see them being threatened by this for a long time.


As a person doing design, yes, you feel like you cannot let go.

But as a person employing designers, I have already accepted that I will let go.

We did a marketing website redesign for our b2b saas product with a 3rd party design firm, we gave a lot of input, but the thing isn't perfect, at some point we had to call it done. It was still a significant improvement over what we previously had, but I am under no illusions that it is a masterpiece.

Now, coding tools do have some clear shortcomings for design atm, but how long they will be like that is not clear.


Similar path, I was very skeptical but Claude touches over 90% of my code now. I review almost all of it though. I did not have high expectations for Claude Design but yesterday I tried it for a workflow tool I'm building and in one shot it made something much better suited than standard component libraries. Some more back and forth interspersed with errands etc used up my CD quota, and the result looks better than most of the software I use (it helps that I value information density and clear affordances for interactivity). I haven't tried applying the design to the existing code yet.


Are there goals for an app design? can they be measured? specified? constrained?

For example, in the world of e-commerce, one goal is improving conversion rate, as long as we get that and the design looks nice, that's OK.


Sure there are goals -- but the problem is, you can't make automated tests for them in the same way as you can for (many) software engineering outputs. So you can A/B test something for conversion rate, and find that instead of getting more conversions, it damages your brand. Or it gets more conversions AND damages your brand. And maybe brand damage is frankly not the worst thing in the world with some demographics, but is catastrophic for other demographics. And even if you were okay with doing this kind of A/B testing in the wild, how do you even instrument for everything that matters, anyhow? Your first port of call for security wouldn't be to do an A/B test on how hackable you are.

These sorts of issues are what you trust the judgement of a good designer to navigate through. I have no doubt that Claude Design can be better than no designer, and probably better than a bad designer, too. But better than a good designer? I'm more skeptical of that than I am of software engineering.


Maybe machines alone, or with little help from human feedback, can estimate possible brand damage.


I thought it was about replicating a mossad supply chain attack.


This makes me a very happy Claude Max subscriber.

Finally, someone of consequence not kissing the ring. I hope this gives others courage to do the same.


As a European user, I’m not happy at all. I can’t fail to notice that non-domestic mass surveillance is not excluded here. I won’t cancel my account just yet because Opus is the best at computer use. But as soon as Mistral catches up and works reasonably well, I’ll switch.


If you don't cancel your account now, I don't see what your problem is. Isn't it standard practice for allies to spy on each other? No reason to wait for Mistral to catch up when EU foreign policy already sealed the deal.


Is your argument I should use a shitty model while my coworkers feed the US-based models with the same data? Where would be the sense in that?

> Isn't it standard practice for allies to spy on each other?

Allies? The US is on the brink of breaking up with the EU.

> EU foreign policy already sealed the deal

Not sure what you mean.


Go Mistral!


They already kissed the ring, just not the asshole. They have a little dignity left.


Better than the rest. Here's $200, Dario!


This is how we bought Tim Cook the gold trophy. Today's fundraising buys tomorrow's tithe.


The whole article reads as virtue signaling to me. Anthropic already has large defense contracts. Their models are already being used by the military. There's really no statement here.


The notion that it's bad to signal virtue is one of the crazier propaganda efforts I've seen over the last 20 years or so.


It’s a manipulative tactic. Businesses have no soul and no conscience.


It's arguable that businesses are subject to the same morality-inducing processes that humans are. For example, as a human (with a soul?) what is at risk when we do something immoral? I see it to be a reputational cost at the highest level. Morality could be viewed from the perspective that it increases predictability/coherence in society (generates less heat).


If societal feedback is the only thing keeping a human from deviating in catastrophic ways, that’s what we call a sociopath.


The humans working there do. To state otherwise is to absolve those humans of any responsibility.


Did I state otherwise though?


Did I say you stated otherwise?


How is it virtue signalling when sticking by these principles risks their entire business being destroyed by either being declared a supply chain risk or nationalized?


A company being asked to violate their virtues refuses, and then communicates that to reestablish their commitment to said virtues?

Tell me more about what they should do if a virtue signal in such a situation is a nothing statement.


Isn't it nice to have virtues to signal though? In saying that, you're saying you don't have any worth signaling over.


Not when your actions don’t align with your professed virtues.


I read the statement twice. I can't understand how you landed on "take my money".

Looks like an optics dance to me. I've noticed a lot of simultaneous positions lately, everyone from politicians and protesters, to celebrities and corporations. They make statements both in support of a thing, and against that same thing. Switching up emphasis based on who the audience is in what context. A way to please everyone.

To me the statement reads like Anthropic wants to be at the table, ready to talk and negotiate, to work things out. Don't expect updated bullet-point lists about how things are worked out. Expect the occasional "we are the goodies" statements, however.


I wonder if this might be a setup by competition. Certainly looks like one.


This article is _about_ kissing the ring and damage control. Are you seriously taking it at face value? You're OK with spying on peaceful non-US citizens?


Zubrin's "Hydrogen Hoax" from 2007[1] is basically an ironclad critique. The physics are inescapably poor, and always will be. (Zubrin makes other points in that article which should probably be taken with more salt, but his critique of hydrogen stands).

1: https://www.thenewatlantis.com/publications/the-hydrogen-hoa...

