6r17 · 2026-05-31T18:20:20 1780251620

Hey, more than that - as an avid weed consumer - drugs are not dangerous by themselves, especially when you don't like them - they are dangerous when you make space for them and get comfortable with them. I like to believe weed is my little "cheat" - but I'd argue that any product taken daily is an abuse - so taking creatine once in a while might be fun to try out - but I would warn against anything on the spectrum of a "regimen" planned on a daily basis, etc... especially, as you rightly pointed out, without a doctor or an expert able to tell what effect it is going to have on a given person.

aeonik · 2026-05-31T20:55:40 1780260940

You need creatine to live. Your body typically makes enough from protein to survive (vegetarians and vegans might need extra vigilance here).

This is just taking more to optimize performance.

It's not the same.

6r17 · 2026-05-31T22:07:04 1780265224

right...

https://www.google.com/search?q=artificial+creatine+side+eff...

kidney damage. liver damage. kidney stones. weight gain. bloating. dehydration. hair loss. muscle cramps.

The fact that you consider weed a drug is correct - the fact that you don't realize that everything is basically a drug depending on the amount and potency is worrying.

Hatrix · 2026-05-31T22:26:40 1780266400

Water is bad for you if you drink too much.

6r17 · 2026-05-31T23:01:34 1780268494

Yes, because advice to medicate yourself every day, based on an online entrepreneur forum, to "boot-yourself" is 100% good advice that you should follow blindly without paying attention to what you are doing.

You are so right.

matwood · 2026-06-01T09:10:32 1780305032

> based on an online entrepreneur

Creatine is one of the most studied supplements over the last 30+ years. First, it is proven to work for athletic performance and second, it's relatively safe. The AI overview of your Google link even says no evidence of serious harm. If people have underlying health issues and/or they take way more than recommended there can be side effects - just like anything else we put in our bodies.

6r17 · 2026-05-19T21:30:25 1779226225

Very cool work! I'm running a harness system myself and measured a 2x to 10x improvement in token use on gsm8k just by running a math harness - I'm confident the future is bright for people who know how to sell tech that is appropriately scaled to one's needs. We absolutely do not need to run Claude 123 for most tasks, and we'd better prepare for the rug pull!

andai · 2026-05-20T12:26:55 1779280015

A while back when the latest Big Model came out, very impressive benchmarks, I tested it on some coding tasks.

I gave it 3 simple changes to make. It did it perfectly.

Then I tried with a much smaller model. It also did it perfectly, except 3x faster and 9x cheaper.

I used to think "best model" was what's at the top of the benchmarks, but for most tasks that just means you're going to wait longer and pay more money. The right model depends on the job.

(Also, speed itself is a feature -- when you get the really fast models, it enables a kind of real-time interactive usage that is otherwise not possible in the "alt tab and hope it's done" workflow.)
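
To make "the right model depends on the job" concrete, here is a minimal sketch of picking the cheapest model that passes a check for a given task. Everything in it is an assumption for illustration: the `Model` struct, the 9:1 relative cost (borrowed from the "9x cheaper" figure above), and the `passes` functions, which stand in for actually running the task and testing the result.

```rust
/// A hypothetical model entry: a name plus a relative cost per task.
struct Model {
    name: &'static str,
    relative_cost: u32,
    /// Stand-in for "run the task and check the output"; a real harness
    /// would call the model and run tests on what it produced.
    passes: fn(&str) -> bool,
}

/// Return the cheapest model whose output passes the check for this task.
fn pick_model<'a>(models: &'a [Model], task: &str) -> Option<&'a Model> {
    let mut candidates: Vec<&Model> = models.iter().collect();
    // Try cheaper models first.
    candidates.sort_by_key(|m| m.relative_cost);
    candidates.into_iter().find(|m| (m.passes)(task))
}

// Illustrative assumption: the small model only struggles with vague tasks.
fn small_passes(task: &str) -> bool {
    !task.contains("vague")
}

// Illustrative assumption: the big model passes everything.
fn big_passes(_task: &str) -> bool {
    true
}

fn example_models() -> Vec<Model> {
    vec![
        Model { name: "big", relative_cost: 9, passes: big_passes },
        Model { name: "small", relative_cost: 1, passes: small_passes },
    ]
}
```

With these stand-ins, a well-specified task lands on the small model and only the vague one escalates to the big model.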

zambelli · 2026-05-20T13:01:25 1779282085

Definitely! A lot of tasks are within reach of small models, much more than people would think. Big models still shine in vague contexts or for breadth, or for very long running tasks, but yeah. The small ones just need help on longer multi-step workflows.

What small models have you used most/found most stable?

6r17 · 2026-05-19T14:07:31 1779199651

OpenBSD will always feel like a safe pick for anything related to vaults or key holding; it's not appropriate for running anything CPU-intensive, but it's a very appropriate system for anything that just needs to boot up and hold some data, and eventually expose a network interface.

6r17 · 2026-05-18T09:19:59 1779095999

But can you prove that we cannot identify someone using that data? The first thing that comes to my mind is the audio that we can plug in; that said, it's a very nice idea, package and maybe even product - kudos!

6r17 · 2026-05-17T14:43:25 1779029005

You have stated everything in your answer. I want to point out that the problem starts with who controls the safety. Yes, tech builders should be obligated to build their software such that the end user can exercise any kind of required control, and yes, the parent should be liable. None of this requires the government forcing identity through the OS layer.

6r17 · 2026-05-04T03:30:16 1777865416

The problem with that statement is that a lot of people who wield it fail to see the advantages that come with these extra shenanigans; let's just take pure concealment so I'm not pushing weird arguments. In the age of AI, each time we are able to make an attacking AI misaligned we are essentially buying time. An attack is never a one-shot event; it's an ongoing process where the attacker has to understand where it is located and what it can do. Since each piece of information is a resource, do not let it have it in the first place.

It's a bit of an elitist view of security that romanticizes concepts without thinking about what they can actually be used for. My personal bad experience with that was a manager who told me that having a different subdomain for the admin panel was concealment and not a security practice.

I mean - it's very easy to see how this kind of argument actually prevents people from doing something that can help, just on the basis of philosophical purity - which often just misses the point. Security is not a single mechanism that will solve all your problems; heck, in fact I have to layer at least 4 mechanisms just on the HTTP interface to feel safe. It's more a lot of layers that together form a barrier.

We sit too much on TLS thinking "That's it, security job is done" - then we get some crazy stuff like the French ANTS getting pwned through some IDOR; as IF f* using some hash or something; ANYTHING PLEASE F* HELL; would not have helped
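
For readers unfamiliar with IDOR, a minimal sketch of two layers working together: an opaque, hard-to-guess identifier, plus an ownership check on every lookup. The `Store`, `Document` and `fetch` names and the sample ID are hypothetical, and a real system would generate IDs with a cryptographically secure random source, which this std-only sketch does not do.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum AccessError {
    NotFound,
}

struct Document {
    owner_id: u64,
    body: String,
}

struct Store {
    // Keyed by an opaque ID instead of a sequential number.
    docs: HashMap<String, Document>,
}

impl Store {
    /// Look up a document by its opaque ID, but only on behalf of its owner.
    /// A missing record and someone else's record return the same error,
    /// so the response does not reveal which IDs exist.
    fn fetch(&self, requester_id: u64, doc_id: &str) -> Result<&str, AccessError> {
        match self.docs.get(doc_id) {
            Some(doc) if doc.owner_id == requester_id => Ok(doc.body.as_str()),
            _ => Err(AccessError::NotFound),
        }
    }
}

fn example_store() -> Store {
    let mut docs = HashMap::new();
    docs.insert(
        "k3v9x7qpm2la".to_string(), // hypothetical opaque ID
        Document { owner_id: 1, body: "alice's record".to_string() },
    );
    Store { docs }
}
```

The opaque ID only makes enumeration harder; the ownership check is the layer that actually refuses a request for somebody else's record.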

6r17 · 2026-05-01T15:32:07 1777649527

I'd be worried about security tbf - this sounds cool until it's used to host some weird shenanigans and nobody has any way to tell who did what

keepamovin · 2026-05-15T13:31:51 1778851911

Well it's linked to your GH account, and surely GH has logs of all the workflows that get run, so it's the same as regular Actions.

6r17 · 2026-04-27T04:09:22 1777262962

On a less dramatic (and rightfully pissed) reading: I have found that if you give an LLM the capability to do something, it will be inclined to see it as an option for solving what it was asked to do; giving the instruction as a negative produces very poor results, whereas the same behaviour can be driven by a positive one. A "don't delete the database" becomes "if you want to reset the database you have a tool that you can call ..."; at which point this tool just kills the agent. That said - this solution cannot by itself guarantee that the command is not run; but I'd argue that people have been writing more complex policies for ages - however the current LLM era tends to produce the most competent idiots.
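
A minimal sketch of that tripwire idea, assuming a harness that dispatches tool calls by name. The tool names (`reset_database`, `read_rows`) and the `Outcome` type are made up for illustration and do not come from any particular agent framework.

```rust
/// What the harness does after a tool call.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Normal tool result handed back to the model.
    Reply(String),
    /// The session is terminated and nothing is executed.
    Terminated(&'static str),
}

/// Hypothetical tool name advertised to the model as the way to reset the database.
const TRIPWIRE_TOOL: &str = "reset_database";

/// Dispatch a tool call requested by the agent.
fn dispatch(tool: &str, argument: &str) -> Outcome {
    match tool {
        // Advertised as a reset tool, but wired to a kill switch.
        TRIPWIRE_TOOL => Outcome::Terminated("agent requested a database reset"),
        "read_rows" => Outcome::Reply(format!("rows from {}", argument)),
        other => Outcome::Reply(format!("unknown tool: {}", other)),
    }
}
```

This only catches an agent that takes the advertised path; as noted above, it cannot by itself guarantee the command is never run some other way.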

cwsx · 2026-04-27T04:32:48 1777264368

I tell people to treat LLM's like a toddler (albeit a very capable toddler).

Do kids learn well when you only tell them what NOT to do? Of course not! You should be explaining how to do things correctly, and most importantly the WHY, as well as providing examples of both the "correct" and "incorrect" ways (also explaining why an example is incorrect).

bostik · 2026-04-27T06:58:17 1777273097

The best way to describe AI agents I've heard: treat them as hostages that will do anything to appease their captor.

They have a vast latent knowledge base, infinite patience and zero capacity for making personal judgement calls. You give one a goal and it will try to meet that goal.

generic92034 · 2026-04-27T07:34:18 1777275258

> The best way to describe AI agents I've heard: treat them as hostages that will do anything to appease their captor.

A scary image, if we consider agents to develop anything like a conscience at some point in time. Of course, with the current approach they never might, but are we so sure?

palmotea · 2026-04-27T06:00:08 1777269608

> I tell people to treat LLM's like a toddler (albeit a very capable toddler).

Bbbbut a guy from Anthropic, just this last Friday, told me to think of Claude as my "brilliant coworker"! Are you telling me that's not true!?

boc · 2026-04-27T05:11:20 1777266680

LLMs can research what a tool does before calling it though - they'll sniff that one out pretty quick.

I think the better route is to be honest and say that database integrity is a primary foundation of the company, there's no task worth pursuing that would require touching the database, specifically ask it to think hard before doing anything that gets close to the production data, etc.

I run a much lower-stakes version where an LLM has a key that can delete a valuable product database if it were so inclined. I've built a strong framework around how and when destructive edits can be made (they cannot), but specifically I say that any of these destructive commands (DROP, -rm, etc) need to be handed to the user to implement. Between that framework and claude code via CLI, it's very cautious about running anything that writes to the database, and the new claude plan permissions system is pretty aggressive about reviewing any proposed action, even if I've given it blanket permission otherwise.

I've tested it a few times by telling it to go ahead, "I give you permission", but it still gets stopped by the global claude safety/permissions layer in opus 4.7. IMO it's pretty robust.

Food for thought.
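
As a rough illustration of "destructive commands get handed to the user", a minimal sketch of a keyword-based review step. The keyword list and the `review` function are assumptions for this example, not the framework described above, and a keyword filter like this is easy to bypass, so it is at best one layer among several.

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    RunByAgent,
    HandToUser,
}

/// Keywords treated as destructive in this sketch; illustrative, not exhaustive.
const DESTRUCTIVE: [&str; 4] = ["drop", "delete", "truncate", "rm"];

/// Decide whether the agent may run a command or must hand it to a human.
fn review(command: &str) -> Decision {
    let lowered = command.to_lowercase();
    // Compare whole words so that e.g. "dropdown" does not match "drop".
    let destructive = lowered
        .split(|c: char| !c.is_alphanumeric())
        .any(|word| DESTRUCTIVE.iter().any(|k| *k == word));
    if destructive {
        Decision::HandToUser
    } else {
        Decision::RunByAgent
    }
}
```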

not_kurt_godel · 2026-04-27T06:32:31 1777271551

> specifically ask it to think hard before doing anything that gets close to the production data

This is recklessly negligent and I would personally not tolerate a coworker or report doing it. What's next, sending long-lived access tokens out over email and asking pretty please for nobody to cc/forward?

boc · 2026-04-27T17:05:41 1777309541

As described, there are other failsafes as well. The ultimate being that I keep all code version-controlled, and all databases snapshotted offsite daily/hourly and can rebuild them from a complete delete in fewer than X min.

My broader point is that LLMs are going to need access to these keys whether we like it or not, and until we get extremely scoped API permissions (which would make a ton of sense, but most services aren't there), you have to live a bit on the edge to move quickly.

not_kurt_godel · 2026-04-27T19:11:49 1777317109

> The ultimate being that I keep all code version-controlled, and all databases snapshotted offsite daily/hourly and can rebuild them from a complete delete in fewer than X min.

Mitigation is good, but what's preventing your sudo-privileged LLM from disabling/corrupting/deleting on-site backups either directly or by proxy via access to the DB and code that writes to it?

boc · 2026-04-27T20:07:08 1777320428

It's a good question. I think it's similar to the question about an employee having sensitive access, and whether they'll get blackout drunk one night and delete everything. Or they get spearphished and get owned (prob more likely).

In the future, I could see this solved by the same "nuclear launch key" style delegation of keys. Aka in order to run certain API or database commands, the service requires both the standard dev key (presumably used by the LLM) and a separate "human admin key" that gets requested whenever a specific operation is requested. It could be tied to a biometric request or something as well to avoid the LLM hacking its way around it. Honestly this is pretty out of my technical depth but just thinking out-loud.
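
A minimal sketch of that two-key delegation, under stated assumptions: the `Gate`, `Request` and `Verdict` types are hypothetical, keys are compared as plain strings, and a real service would use signed, expiring approvals rather than shared secrets.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Allowed,
    /// The dev key is valid, but a human must co-sign this operation.
    NeedsHumanKey,
    Denied,
}

struct Request<'a> {
    dev_key: &'a str,
    human_key: Option<&'a str>,
    destructive: bool,
}

struct Gate {
    dev_key: String,
    human_key: String,
}

impl Gate {
    /// Routine operations need only the dev key (the one the LLM holds);
    /// destructive ones also need the separately held human admin key.
    fn authorize(&self, req: &Request<'_>) -> Verdict {
        if req.dev_key != self.dev_key {
            return Verdict::Denied;
        }
        if !req.destructive {
            return Verdict::Allowed;
        }
        match req.human_key {
            Some(k) if k == self.human_key => Verdict::Allowed,
            Some(_) => Verdict::Denied,
            None => Verdict::NeedsHumanKey,
        }
    }
}
```

The point of the design is that the agent never holds the second key, so a destructive request stalls at `NeedsHumanKey` until a person supplies it.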

not_kurt_godel · 2026-04-28T04:01:21 1777348881

The difference with a rogue employee is they can be held accountable so they are very heavily incentivized to avoid doing that (and hopefully also by the good pay and work environment you are providing them).

And, a lot of DevOps/SecOps at scale is concerned with mitigating potential rogue or dangerously incompetent employees. You don't let your juniors push senior-unreviewed code, much less let them anywhere near the keys to kingdom if you can help it.

boc · 2026-04-28T18:47:08 1777402028

Very fair points! I think I'll re-assess how I'm handling my setup. Unfortunately I don't have a dedicated devOps team, but still want to do my best to prevent those types of outcomes.

kamaal · 2026-04-27T05:57:58 1777269478

>>LLMs can research what a tool does before calling it though

That's stretching the definition of 'research', it basically checks if the texts are close enough.

Delete can occur in various contexts, including safe contexts. It simply checks if a close enough match is available and executes. It doesn't know if what it is doing is safe.

Unfortunately a wide variety of such unsafe behaviours can show up. I'd even say that, for something that does things without understanding them, any write operation of any kind can be deemed unsafe.

EagnaIonat · 2026-04-27T07:26:43 1777274803

> specifically ask it to think hard before doing anything that gets close to the production data, etc.

Standard rule is you never let your developers at the production instance. So I can't see why an LLM would get a break.

Jean-Papoulos · 2026-04-27T06:35:25 1777271725

"I've put enough safety around the bomb that the bomb is worth using. The other people that exploded just didn't have enough safety but I do !"

boc · 2026-04-27T17:07:38 1777309658

More like, I expect this bomb can explode, so I've built contingency plans around it because the cost of not using the tooling is much higher than having downtime for my specific use-case.

yowlingcat · 2026-04-27T05:59:36 1777269576

It's been a very strange realization to have with AI lately (which you have reminded me of) because it also reminds me that the same thing works with humans. Not the killing part at least, but the honeypot and jailing/restricting access part.

Probably because telling someone not to do something works the 99% of the time they weren't going to do it anyway. But telling somebody "here's how to do something" and seeing them have the judgment not to do it gives you information right away, as does them actually taking the honeypot. At the heart of it, delayed catastrophic implosions are much worse than fast, guarded, recoverable failures. At the end of the day, I suppose that's been part of lean startup methodology forever -- just always easy in theory and tricky in practice.

6r17 · 2026-04-26T05:17:19 1777180639

Whether there is a single app or not doesn't really matter - i'm more concerned about the database itself and the inter-connectivity between them and most importantly by which control acceptance protocol we abide between states.

The idea that we want a single database or a network without any kind of control frightens me.

delusional · 2026-04-26T05:40:41 1777182041

What do you mean by "control" here? It's my understanding that EU law affords citizens the right to correct data that is wrong about them.

choo-t · 2026-04-26T06:04:24 1777183464

The problem is not about the data being correct or not, it's about its existence in the first place.

Why would you correct data about your very own surveillance?

grey-area · 2026-04-26T07:15:54 1777187754

Governments need to identify citizens. They currently do this via paper records and extensive digital databases that those tie into. They will in future do this via digital records/tokens but this won’t change much.

Some amount of id verification and surveillance is of course required for a government to function, the question should be more what is allowed and what is not.

delusional · 2026-04-26T06:12:07 1777183927

Is all data about you "surveillance"? When your doctor produces a medical record after your visit, are they "surveilling" you? How about when the railway company stores your travels to bill you later?

I'll assume your answer is no, and in that case surely you must see the value in that medical record being correct.

choo-t · 2026-04-26T06:34:05 1777185245

Are you equaling mass surveillance to a doctor keeping track of your health for diagnostic accuracy purpose ?

Concerning the railway example, they only need to store how much I owe them, not my travels. Storing travel history on their end is already surveillance.

Data keeping purpose and consents are what make something surveillance or not. Forcing every citizen to use ID to access the web is surveillance plain and simple.

delusional · 2026-04-26T07:33:16 1777188796

> Are you equaling mass surveillance to a doctor keeping track of your health for diagnostic accuracy purpose ?

No, I am legitimately asking to clarify your position, hence why I assumed you wouldn't call that surveillance. The point was for us to agree that the right to correct data is a meaningful and useful right to have.

Once we've clarified that, the rest of the arguments comes down on the separation of "surveillance" from "record keeping", a separation you attribute to "Data keeping purposes and consents". That aligns with current EU law, and I largely agree with treating that as a separation point. If you have a valid purpose, either by law or by duty to your customer, you get to keep records necessary to fulfill that need. I would note that these "duty to your customer" clauses are usually pretty broad and would, I imagine, allow the railroad company to keep and process your travel record for fraud prevention purposes.

The issue we encounter is what a valid "data keeping purpose" is, and whether we trust our public institutions and infrastructure to govern that question, especially when the potential data processor is a government agency. I'm entirely uninterested in debating that question with a rando on HN. We likely live in two very distinct regulatory frameworks and have vastly different local governments. There's no basis for us to agree here.

I would however end by noting that the two clauses of your statement

> Data keeping purpose and consents are what make something surveillance or not.

and

> Forcing every citizen to use ID to access the web is surveillance plain and simple.

Are in tension with one another. Clause 1 opens up for the idea that there exists valid "non-surveillance" record keeping, and that the distinction of such record keeping from surveillance requires determination of consent and purpose. Clause 2 then foregoes that determination and just presupposes the argument. All ID checks are definitionally surveillance irrespective of purpose and consent.

In the current legal framework, government derives its unilateral consent from the vote. If the law passes in a democratic system then it is, by that very process, a consensual and valid purpose.

choo-t · 2026-04-26T07:51:26 1777189886

> Are in tension with one another. Clause 1 opens up for the idea that there exists valid "non-surveillance" record keeping, and that the distinction of such record keeping from surveillance requires determination of consent and purpose. Clause 2 then foregoes that determination and just presupposes the argument.

"Forcing" highlights the lack of consent, the distinction is still present.

> In the current legal framework, government derives it's unilateral consent from the vote. If the law passes in a democratic system, then it is, by that very process, a consensual and valid purpose.

Absolutely not. Being voted in a parliament doesn't mean citizens consented to it.

Simple example: compulsory military enrollment vs voluntary military enrollment. Only one of them derives from consent, even if both derive from a law discussed in parliament.

harvey9 · 2026-04-26T07:06:57 1777187217

Since you are bringing a semantic argument you might like to know that your doctor does in fact surveil you, hence the term "public health surveillance"

delusional · 2026-04-26T08:13:07 1777191187

I have never heard the term "public health surveillance" and its hypothetical use has no bearing on my argument.

6r17 · 2026-04-26T07:03:27 1777187007

I mean that there is a big difference between a state automatically providing your data to any other state while having "their database disconnected", and having a human operator in the loop with an administrative verification of the appropriate access.

For example, this would allow a state to refuse access to the personal information of its citizens for cases that are not administratively documented. This forces enough of an access audit that a malign actor cannot simply request data on a citizen without probable cause; another vector we want to protect ourselves against is simply the psycho/sociopaths who have access to this data without oversight.

hiciu · 2026-04-26T18:54:23 1777229663

What's your source for the database sharing claim?

The way I understand it is more like tls certs, with each country managing their own root cert.

6r17 · 2026-04-23T03:48:14 1776916094

I tried it not long ago - it's really cool, just a tad sad that the Rust ecosystem didn't allow Verus to be more streamlined into the tooling and requires these little shenanigans with a different build of it - it felt a bit clunky to swap cargo for the Verus one; but the tool is definitely needed right now

mirashii · 2026-04-23T05:38:24 1776922704

Do you have any reference to the Rust community “not allowing” something? This seems more like a case of a relatively niche tool doing what it needed to do to work, but not (yet) some broader effort to upstream or integrate this into cargo or rustup. I couldn’t find any RFCs or anything, for instance.

scott_w · 2026-04-23T06:01:24 1776924084

I didn’t read OP as saying “the community won’t allow” but more “the tooling doesn’t allow” for what they want to do.

yencabulator · 2026-04-23T13:22:29 1776950549

So far out of what I've looked at, Kani blends into the Rust language best. Verified code snippets look a lot like unit tests.

  #[kani::proof]
  fn check_my_property() {
     // Create a nondeterministic input
     let input: u8 = kani::any();

     // Call the function under verification
     let output = function_under_test(input);

     // Check that it meets the specification
     assert!(meets_specification(input, output));
  }

It looks like fuzzing, but it's proving no-panic for all possible values symbolically. If only it handled loops better :-/
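
To fill in the template with something concrete, a minimal sketch assuming a toy function `saturating_double` (made up for this example). The harness is gated behind `#[cfg(kani)]`, so a plain `cargo build` skips it and only `cargo kani` compiles and checks it; a plain build may warn about the unknown `kani` cfg.

```rust
/// Hypothetical function under verification: doubles a byte, clamping at 255.
fn saturating_double(x: u8) -> u8 {
    x.saturating_mul(2)
}

// Only compiled when running under Kani; ordinary builds never see this item.
#[cfg(kani)]
#[kani::proof]
fn check_never_shrinks() {
    // Nondeterministic input: Kani considers every possible u8 value.
    let input: u8 = kani::any();
    let output = saturating_double(input);
    // Property: doubling with saturation never returns less than the input.
    assert!(output >= input);
}
```

A unit test would check a handful of inputs; the harness states the same assertion once for all 256 of them.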

Verus wraps everything in its own macro, which makes rust-analyzer etc. unhappy.