Hacker News | macNchz's comments

Yeah there's a weird thing where people would get really focused on whether something is "actually doing RAG" when it's pulling in all sorts of outside information, just not using some kind of purpose built RAG tooling or embeddings.

Now, the pendulum on that general concept seems to be swinging in the opposite direction: a lot of those people have figured out that you don't need embeddings. That's true, but I'd suggest people not overindex on that and conclude that embeddings aren't actually useful or valuable. Embeddings can be downright magical in what you can build with them; they're just one more tool at your disposal.

You can mix and match these things, too! Indexing your documents into semantically nested folders for agents to peruse? Try chunking and/or summarizing each one and putting the vectors in sidecar files, or even YAML frontmatter. Disks are fast these days; you can rip through a lot of files indexed like that before you come close to needing something more sophisticated.
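A minimal sketch of that sidecar-file pattern. Everything here is illustrative: the file names are made up, and `toy_embed` is a stand-in where a real system would call an embedding model (the character hashing below has none of the semantic power of actual embeddings).

```python
import json
import math
import tempfile
from pathlib import Path

def toy_embed(text: str) -> list[float]:
    # Placeholder for a real embedding-model call: a tiny character-hash
    # vector, normalized to unit length.
    vec = [0.0] * 8
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch)
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-norm, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def write_sidecar(doc: Path) -> Path:
    # One vector per document here; a real setup might chunk and/or
    # summarize first and store several vectors.
    sidecar = doc.parent / (doc.name + ".vec.json")
    sidecar.write_text(json.dumps({"embedding": toy_embed(doc.read_text())}))
    return sidecar

def search(root: Path, query: str) -> list[tuple[float, str]]:
    # Brute-force scan of every sidecar file under root, best match first.
    q = toy_embed(query)
    hits = []
    for sidecar in root.rglob("*.vec.json"):
        vec = json.loads(sidecar.read_text())["embedding"]
        hits.append((cosine(q, vec), sidecar.name[: -len(".vec.json")]))
    return sorted(hits, reverse=True)

root = Path(tempfile.mkdtemp())
(root / "dogs.md").write_text("golden retrievers are friendly dogs")
(root / "dbs.md").write_text("postgres indexing and query planning")
for doc in root.glob("*.md"):
    write_sidecar(doc)
print(search(root, "golden retrievers are friendly dogs")[0][1])  # dogs.md
```

Swap `toy_embed` for a real embedding API and the shape stays the same: a flat scan over a few thousand small JSON sidecars finishes in well under a second on a modern SSD, which is the "disks are fast" point above.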


I’ve created a bunch of fresh Azure accounts over the past few years and each time I’ve found myself sitting there dumbfounded anew at how garbage the experience is.

There has been weird broken jank at just about every step of the process at one point or another. Like, I’m a serious person trying to set something up for a production workload, and multiple times along the way to just having a working account that I can log into with billing configured, I’ll get baffling error messages like [ServiceKeyDepartureException: Insufficient validation expectancy. Sfhtjitgfxswinbvgtt-33-664322888], and the whole thing will simply not work until several hours later. Who knows why!?

I evaluated some Azure + Copilot Studio functionality for a project recently, which required more engagement with their whole 365 ecosystem than I’d had in a long time, and it had many of the same problems, but worse. Just unbelievably low-quality software for the price and how popular it is. Every step of the way I hit some stupid issue. The people using this stuff are clearly not the people buying it.


I've joked that on some services, when you're clicking buttons, you're actually opening tickets that a human needs to action.

That scenario is an example. You complete an action on a web page and nothing works. You make no further changes and hours later it works perfectly. Your human wasn't fast enough that day.


That's the "digital escort" process mentioned in the very long OP. Understandably, the US government got mad when they found out that cheap Chinese tech support staff were being used for direct intervention on "secure" VMs.

That's not what the "problem" was. It's that cheap American support people were "escorting" foreign Microsoft SWEs so those engineers could manage and fix the services they wrote and were the subject matter experts for, in the sovereign cloud instances they otherwise would have no access to.

And this was NOT for the government clouds we have that hold classified data. Those are air-gapped clouds that physically cannot be accessed by anyone who doesn't have a TS clearance and doesn't physically go into a SCIF.

source: I work on a team very closely related to the team who designed digital escort.


I would definitely fight against calling anything I work on "digital escort".

Yeah, it’s not a great name. But it originates from the government. When somebody without a security clearance needs to go to a secure area, they must be escorted by somebody.

When the blog post mentioned Hegseth and “digital escort” in the same sentence, I was surprised to learn it wasn’t about his OnlyFans habit at his work desktop.

Yes, but this misses the underlying point: this is the same software. It suffers from the same defects. If your management stack keeps crashing and leaking VMs, you are seeing a reduction in the operational capacity of the fleet. If you are still there, just tour Azure Watson and tell me whether you'd want the military to rely on that system in wartime. Don't forget things like IVAS and God knows what else that are used during operations while Azure node agents happily crash and restart on the hosts.

The system should be no-touch and run like an appliance, which is predicated on zero crashes or 100% crash resiliency. In Windows Core we pursued a single Watson bucket with a single hit until it was fixed. Different standards.

I'm only commenting on the parent comment's understanding of what the digital escort process is, specifically. Escort is used by all kinds of teams that are just doing day-to-day crap for various resource providers across Azure. I've never worked anywhere close to Azure Core, so I don't know about these more low-level concerns. Overall I agree and sympathize with your assessment of the engineering culture.

You also make it sound like getting a JIT approved is getting the keys to the kingdom. It's not -- every team has its own JIT policies for their resources. Should there be far fewer manual touches? Ideally. But JIT is better than persistent access at least, and JIT policies should be scoped according to the principle of least privilege. If that is not happening, it's a failure at the level of that specific org.

Policies vary. The node folks get access to the nodes and the fabric controller by necessity.

I guess we agree on the point where it should not be necessary, which echoes Cutler’s original intent of “no operational intervention.”

This is not an impossible task, after all it’s just user-mode code calling into platform APIs.


200 requests a day, lol

on average :)

> I've joked that on some services, when you're clicking buttons, you're actually opening tickets that a human needs to action.

I just experienced one startup where the buttons just happen to only work during business hours on the US west coast.


Infrastructure-as-a-ServiceNow Ticket

> when you're clicking buttons, you're actually opening tickets that a human needs to action

I had one public cloud vendor's salesperson literally admit this was the case with their platform. But they were now selling "the new one," which was supposed to be better.

It was, a lot. But only compared to the old one.


Beyond just invasive/annoying, ad networks explicitly spread malware and scams/fraud. There's not much incentive for them to clamp down on it, though, as that would cost them money both in lost revenue and in paying for more thorough review.

It wouldn't even be hard for them to stop it, but they chose to be annoying instead.

When I first started out on the internet, ads were banners. Literally just images and a link that you could click on to go see some product. That was just fine.

However, that wasn't good enough for advertisers. They needed animations, they needed sounds, they needed popups, they needed some way to stop the user from just skimming past and ignoring the ad. They wanted an assurance that the user was staring at their ad for a minimum amount of time.

And, to get all those awful, annoying capabilities, they needed the ability to run code in the browser. That is what opened the floodgates of malware in advertising.

Take away the ability for ads to be bundled with executable code and they become fine again. Turn them back into just images, even GIFs, and all of a sudden I'd be much more amenable to leaving my ad blocker off.


It’s kind of baffling to me that laptops in classrooms took off the way they did, as it seemed like a distraction machine to me even 25+ years ago, as a kid myself! My school got some carts of laptops that would move from classroom to classroom in ~2000—they were heavily used for flash games and other nonsense, and were strictly worse for that than in the dedicated computer lab classroom, where all of the monitors faced into the center of the room where the teacher could see them.

When I got to college a few years later I’d sit in the back of classrooms and see that a majority of students who’d brought a laptop (ostensibly for notes) were consistently distracted and doing something else, be it games or StumbleUpon. I can only imagine these decisions were made by groups of adults sitting around conference rooms, each staring at their own laptop and paying 20% attention to the meeting at hand.


I find that all of Google’s ad products are under-moderated for malicious ads. It’s a choice on their part not to tightly control this—they certainly could, though applying more scrutiny to the ads they show would harm their incredible profitability. I personally don’t especially care to pay a premium not to see deepfakes of celebrities promoting crypto scams.

I've been wondering if we'd see a cyber campaign emerge in this conflict. To my knowledge Iran has pretty advanced cyber capabilities and increasingly fewer reasons to hold back. Gloves-off cyber war doesn't sound good to me. The US CISA has already been cut back, has lost "virtually all of its top officials"^, doesn't have a permanent director, and is operating at a further reduced capacity because of the DHS shutdown.

^ https://www.cybersecuritydive.com/news/cisa-senior-official-...


> To my knowledge Iran seems to have pretty advanced cyber capabilities and increasingly fewer reasons to hold back.

Iran isn’t alone!! They are a quad along with China, Russia, and North Korea.


That's the thing people overlook the most in regard to this war: Iran isn't doing this on its own. Russia, China, and North Korea have been backing it from the start. They're the ones helping with intel on US base locations across the Middle East, supplying drones, and working out strategies to drag things into a stalemate, plus whatever else Iran needs along the way.


Can you blame them? Iran is fighting for its own survival and has to find help where it can.

If the US had an educated administration not composed of lapdogs, they would've known that attacking Iran was going to be a terrible idea.

Saddam made the same mistake in 1980.

He thought that the Iranian Kurds, the political opponents, the Iranian Arabs, and civilians were going to rise up against the regime.

None of this happened. None. In fact, hundreds of thousands of people, even kids, rallied around the banner. There are documented stories of 13-year-olds jumping on barbed wire to use their bodies as bridges for infantry. Disgusting, yet telling of the fact that the Persians will do everything to defend their land even if they don't like its leadership.

It's very difficult to convince the people you're bombing that you're helping them get rid of a regime (and you never know for sure how popular or unpopular it is).

Iranians, yet again, are rallying around the flag for what is effectively a foreign aggression.


Iran has been preparing for this war for 40 years. So has Israel. They will engage in a battle of supremacy over the Middle East. Both want the USA knocked out so that the Americans can't use their influence there anymore (both consider the USA a nuisance).

As soon as ground troops land in Iran, it's over for the USA. As it is, oil and goods shipping via the Persian Gulf and the Red Sea will be controlled by Iran for a very long time to come. All Iran has to do is withstand the pummeling, which it very likely will do. And they'll get plenty of support from China, since this plays into the South China Seas plan quite nicely as the USA moves carrier after carrier out of Asia.


The corpses of Irans’s leadership have us right where they want us


It's relative. We're in a pretty bad spot relative to where we were before the attack, and so is the world economy.

The Iranian regime is doing much better so far, relative to where they should be after a joint military attack from the US/Israel and maybe even relative to where they were just a few months ago.

The previous Ayatollah was 86 and had multiple bouts of pancreatic cancer. He was at death's door, Iran was destabilizing with bouts of protest and repression, the regime itself had suffered major military blows, and a potentially rocky and fractured transition was imminent.

Thanks to the war, the regime survived a transition and seems consolidated around the son of the former Ayatollah, whose entire family was killed by our strikes, and the US seems largely impotent as Iran chokes off a large portion of the world's oil supply and strikes at energy assets in the ME.


The son seems to have a distinct lack of “vitality” lol


The thing getting overlooked is all of the recent moves by Trump all lead back to China. Venezuela, Cuba, now Iran. These are all tentacles of China. The aggression against these 3 countries is not a coincidence. It’s a concerted and indirect attack on China in an attempt to weaken their subsidiaries. In the eyes of this administration, this is unpleasant, but necessary housekeeping that should have been done decades ago but no one was willing to spend the political capital to do it.

In Iran, Trump was clearly hoping for (and verbally requested) the same thing you say about Saddam. I think we actually do know how unpopular the regime is; the mass protests demonstrated that. But the religious hardliners are the ones with the guns, and they clearly aren’t afraid to use them. So while there was some momentum, after everyone got gunned down in the streets by the IRGC it quickly deflated. Asking unarmed protesters to step up again is kind of a big ask, without any material support.


Iranian protesters were not calling for US interference. Let's be very clear about that. They were doing it for their own regime change, not some US imposition. What they think of the US or whether they are for this war or supposed regime change by the US is a totally different consideration.


> The thing getting overlooked is all of the recent moves by Trump all lead back to China.

Are you trying to frame the twice accidental president as some sort of visionary? He doesn’t even remember what he said 5 mins ago. If he had planned or even had any clue about wars, we’d not be in this mess. He insulted Zelenskyy last year but ended up asking for his help.

Do you recall the orange phenomenon asking for China’s help just last week (wait for it) to act against their friends, which you called their subsidiaries? :-) You can’t script this horror show, even if you wanted to.


Also, he's pushing the world towards China.

And rightfully so. China isn't killing and kidnapping world leaders, supporting genocides in Gaza, launching military operations, threatening its allies of annexation or overtly interfering in their democratic process.


Russia and North Korea are obviously doing so, but I haven't seen any direct evidence that China is providing intelligence support to Iran, do you have any links? It is certainly plausible, China would love to see Russia tied up in Ukraine and the US tied up in Iran.


There has been speculation that China is letting Iran use their satellites for targeting but it’s not confirmed.

China is for sure providing material for drone and rocket manufacturing as well as air defense systems.

https://moderndiplomacy.eu/2026/01/28/how-china-aims-to-bloc...


I forget all the details but a hacker group associated with Iran already hacked the infrastructure of a major US health care tech company


Stryker. FWIW a friend in ER medicine said it had very very limited effect.


That’s right, thanks. The same hacker group as in this story. I didn’t hear much after the initial breach, so I assumed it was minor.

Edit: apparently 80,000 employee workstations got remotely wiped, so I guess I wouldn’t call that minor.

Also, that’s what I get for commenting before reading the story—they mention the Stryker incident in the story, lol


They got rid of all the trans people (and presumably with them a lot of the furries)... if there's 2 groups you want on your side in a cyber-conflict it's trans women and furries


Opus 4.5 to 4.6 was pretty incremental, I didn't see much of a difference.

The big coding model moments in recent recollection, IMO, were something like:

- Sonnet 3.5 update in October 2024: ability to generate actually-working code using context from a codebase became genuinely feasible.

- Claude 4 release in May 2025: big tool calling improvements meant that agentic editors like Claude Code could operate on a noticeably longer leash without falling apart.

- Gemini 3 Pro, Claude 4.5, GPT 5.2 in Nov/Dec 2025: with some caveats these were a pretty major jump in the difficulty and scale of tasks that coding assistants are able to handle, working on much more complex projects over longer time scales without supervision, and testing their own work effectively.


Maybe they're like me, who didn't spend a lot of time investigating Claude until 4.6 launched and the hype was enough to be the tipping point to invest energy. I do know that I've been having good/great results with Opus 4.6 and the CLI, but after an hour or so, it'll suddenly forget that the codebase has tab-formatted files and burn up my quota trying to figure out how to read text files. And apparently this snafu has been around since at least late last year [0]. Again, I can't complain about the overall speed and quality for my relatively light projects, I'm just fascinated by people who say their agents can get through a whole weekend without supervision, when even 4.6 appears to randomly get tripped up in a very rookie way?

[0] https://github.com/anthropics/claude-code/issues/11447


There's definitely a productivity curve element to getting it to behave effectively within a given codebase. Certainly in the codebases I work with most frequently I find Claude will forget certain key aspects (how to run the tests or something) after a while and need a reminder, otherwise it gets into a loop like that trying to figure out how to do it from first principles with slightly incorrect commands.

I think a lot of the noise about letting Claude run for very extended periods involves relatively greenfield projects where the AI is going to be using tools and patterns and choices that are heavily represented in training data (unless you tell it not to), which I think are more likely to result in a codebase that lends itself to ongoing AI work. People also just exaggerate and talk about the one time doing that actually worked vs the 37 times Claude required more handholding.

The bigger problem I see with the "leave it running for the weekend" type work is that, even if it doesn't get caught up on something trivial like tabs vs spaces (glad we're keeping that one alive in the AI era, lol), it will accumulate bad decisions about project structure/architecture/design that become really annoying to untie, and that amount to a flavor of technical debt that makes it harder for agents themselves to continue to make forward progress.

Lots of insidious little things: creating giant files that eventually create context problems, duplicating important methods willy-nilly and modifying them independently so their implementations drift apart, writing tests that are..."designed to pass" in a way that creates a false sense of confidence when they're passing, and "forest for the trees" kind of issues where the AI gets the logic right inside a crucial method so it looks good at a glance, but misses some bigger-picture flaw in the way the rest of the code actually uses that method.


Yes, for me I think it was around Nov/Dec 2025, along with harness improvements and hearing about lots of successes with agentic programming: having the agent manage its own context and do the full software engineering loop of writing code, running it, and seeing if it works. That was already there before February 9th.


This is also supported by the Opus degradation tracker [1]. The dotted line is when they switched from Opus 4.5 to 4.6. There's no statistically significant difference on the tested benchmark.

1: https://marginlab.ai/trackers/claude-code-historical-perform...


Watching over shoulders as elderly people watch YouTube with ads and engage with clips of deepfake celebrities selling fraudulent nonsense is both enlightening and painful.


This is a really cool implementation—embeddings still often feel like magic to me. That said, this exact use case is sort of also my biggest point of concern with where AI takes us, much more so than most of the common AI risks you hear lots of chatter about. We live in a world absolutely loaded with cameras now but ultimately retain some semblance of semi-anonymity/privacy in public by virtue of the fact that nobody can actually watch or review all of the video from those cameras except when there is a compelling reason to do so, but these technologies are making that a much more realistic proposition.

The presence of cameras everywhere is considerably more concerning than the status quo, to me at least, when there is an AI watching and indexing every second of every feed—where camera owners or manufacturers or governments could set simple natural-language parameters for highly specific people or activities to be notified about. There are obviously compelling and easy-to-sell cases here that will surely drive adoption as it becomes cost effective: get an alert to a crime in progress, get an alert when a neighbor doesn't clean up after his dog, get an alert when someone has fallen...but the potential implications of living in a panopticon like this, if not well regulated, are pretty ugly.


It's being built as we speak. I attended a city council meeting yesterday discussing approval of a contract for ALPR cameras. I learned about a product from the camera vendor called Fusus[0], a dashboard that integrates various camera systems, ALPRs, alerts, etc. Two things stood out to me: natural-language querying of video feeds, and planned future integration with civilian-deployed cameras. The city only had budget for 50 ALPRs, and they stressed that they're only deploying them on main streets, but it seems like only a matter of time before your neighbor is able to install a camera that feeds right into the local PD's AI-enabled systems. One council member raised concerns about integrations with the Citizen app[1] specifically (and a few others I didn't catch the names of). I'm very worried about where all this is heading.

[0]: https://www.axon.com/products/axon-fusus [1]: https://citizen.com/


I live in Oxford, UK and walked past a police van that said "automatic facial recognition in use". Not exactly a good sign without any caveats. I imagine they recorded me staring at their van.


Totally valid concern. Right now the cost ($2.50/hr) and latency make continuous real-time indexing impractical, but that won't always be the case. This is one of the reasons I'd want to see open-weight local models for this, keeps the indexing on your own hardware with no footage leaving your machine. But you're right that the broader trajectory here is worth thinking carefully about.


It's $2.50 an hour because Google has margins. A nation state could do it at cost, and even if it's not a huge difference, the price of a year's worth of embeddings is just $21,900. That's a rounding error, especially considering it's a one-time cost for footage.


Right? $2.50 an hour is trivial to a government that can vote to invent a trillion dollars. Even just $1 million is the cost of monitoring 45 real-time feeds for a year. I'm sure many very rich people would pay that for the safety of their compound.
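Spelling out the arithmetic behind those figures, using the $2.50/hour number quoted upthread:

```python
hourly = 2.50                      # $/hour of footage, per the thread above
per_feed_year = hourly * 24 * 365  # one continuous feed, running for a year
feeds_per_million = 1_000_000 / per_feed_year

print(per_feed_year)               # 21900.0 -- the "$21,900/year" figure
print(int(feeds_per_million))      # 45 -- feeds covered by $1M/year
```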


How are you getting to $2.50/hr? The price sheet says it's $0.00079 per frame.

https://ai.google.dev/gemini-api/docs/pricing#gemini-embeddi...


From what I see the code downsamples video to 5 fps, so 1 hour of video is 3600 seconds * 5 fps = 18,000 frames. 18,000 frames * $0.00079/frame = $14.22. A couple dollars more with the overlap.

(The code also tries to skip "still" frames, but if your video is dynamic you're looking at the cost above.)
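That estimate spelled out, taking the rates as quoted in this thread (note the reply below: the API itself samples at 1 fps, so this is the 5 fps worst case):

```python
fps = 5                         # the code's local downsample rate
seconds = 3600                  # one hour of video
price_per_frame = 0.00079       # $/frame, per the quoted price sheet

frames = fps * seconds          # frames per hour of footage
cost_per_hour = frames * price_per_frame

print(frames)                   # 18000
print(round(cost_per_hour, 2))  # 14.22
```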


You're right that the code uses ffmpeg to downsample the chunks to 5 fps before sending them, but that's only a local/bandwidth optimization, not what the API actually processes.

Regardless of the file's frame rate, the Gemini API natively extracts and tokenizes exactly 1 fps. The 5 fps downsampling just keeps the payload sizes small so the API requests are fast and don't time out.

I'll update the README to make this clearer. Thanks for bringing this up.


Thanks for the details and correction.


Most cameras are also not queryable by any one person or organization. They are owned by different companies and if the government wants access they have to subpoena them after the fact.

The problems start cropping up when you get things like Flock where governments start deploying cameras on a massive scale, or Ring where a single company has unrestricted access to everyone's private cameras.


I think Flock is just a symptom of the underlying tech becoming so cheap that "just blanket the city in cameras" starts to sound like a viable solution when police rely so heavily on camera footage.

I don't think it's a good thing but it seems the limiting factor has been technological feasibility instead of any kind of principle against it.


Yeah, the panopticon is now technically very feasible it's just expensive to implement (for now).


It's very cheap to target an individual, though, so they don't need to look everywhere.


Once the hardware to run inference for something like the vision-understanding module of this can run on a low/medium-power ASIC, drones are going to be absolutely horrifying weapons.



All the major cloud providers offer some form of face detection and number-plate reading, and many cameras support object detection (e.g. package, vehicle, person) on the camera itself.


It's definitely creeping into things, though most of the features I've seen are fairly simplistic compared to what would be possible if the video was being reviewed + indexed by current SoTA multimodal LLMs.


> this exact use case is sort of also my biggest point of concern with where AI takes us, much more so than most of the common AI risks you hear lots of chatter about.

I've been hearing warnings that AI would be used for this since well before it seemed feasible.


Not claiming to have hit on something unique here, but I think it’s realistic and often drowned out in favor of sci-fi nonsense.


For specific people they probably wouldn’t use general embeddings. These embeddings can let you search for “tall man in a trenchcoat” but if you want a specific person you would use facial recognition.


I think a general description is better for surveillance/tracking like this, no? If they're at a weird angle or intentionally concealing their face then facial recognition falls apart but being able to describe them naturally would result in better tracking IMO.


Presumably the ideal is some kind of a fusion. Upload or tag some images/videos and link someone's social profiles and the system can look out for them based on facial recognition, gait recognition, vehicle/pets/common wardrobe items in combination.


Was curious—a good number of projects out there have an un-pinned LiteLLM dependency in their requirements.txt (628 matches): https://github.com/search?q=path%3A*%2Frequirements.txt%20%2...

or pyproject.toml (not possible to filter based on absence of a uv.lock, but at a glance it's missing from many of these): https://github.com/search?q=path%3A*%2Fpyproject.toml+%22%5C...

or setup.py: https://github.com/search?q=path%3A*%2Fsetup.py+%22%5C%22lit...
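For anyone cleaning this up in their own project: replacing the bare name with an exact pin means a malicious new release can't be pulled in automatically on the next install. A sketch of the two common options (version numbers below are illustrative, not recommendations):

```text
# requirements.txt
litellm==1.40.14      # exact pin; bump deliberately, after review
# or, less strictly, a bounded range (rely on a lock file for exactness):
litellm>=1.40,<2.0
```

A lock file (uv.lock, pip-compile output, etc.) gets you the same protection for transitive dependencies, which a top-level pin alone does not.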

