Every morning I would check AWS billing just out of habit. I'm just thankful I d...

epiphanitus · on March 29, 2020

SRE here. I feel for your situation. Here's some advice. One simple thing you could do is set up AWS billing alarms and have them delivered to a notification app like PagerDuty.

https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitori....
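
For a rough idea of what that setup involves, here is a minimal sketch using boto3. It assumes billing alerts are enabled on the account and that an SNS topic (which PagerDuty or email can subscribe to) already exists. The alarm name, threshold and topic ARN are placeholders, not anything from this thread.

```python
def billing_alarm_params(threshold_usd, sns_topic_arn, name="monthly-spend-alarm"):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": name,  # placeholder name
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # evaluate over 6-hour windows
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # where the alert gets delivered
    }


def create_billing_alarm(threshold_usd, sns_topic_arn):
    """Create the alarm; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helper above works without it

    # Billing metrics are published in us-east-1 only.
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    cloudwatch.put_metric_alarm(**billing_alarm_params(threshold_usd, sns_topic_arn))
```

Note this only notifies; it does not stop anything from running.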

If you don't want to pay for PD, you can patch together any number of ways to get your phone to scream and holler when it gets an email from ohshit@amazonasws.com. It's also good to have clear expectations as to whose responsibility it is to deal with problem x between the hours of y and z and exactly what they are supposed to do.

Keep the alerts restricted to the really important stuff, because if your team becomes overloaded with useless alerts they will 1) dislike you and 2) be more prone to accidentally mistaking a five alarm fire for a burnt casserole.

There are more complex systems you could build, but that's a start.

HenryBemis · on March 29, 2020

Thank you for this. How can anyone run ANY service with ANY company and not add a clause in the contract (and then have the alerts up and running) for controlling costs?

I remember PagerDuty was advertising (a lot) on Leo Laporte's podcasts a few years back.

A clause in the contract: if monthly bill reaches $Xk amount then:

(a) seek written approval by client, and

(b) continue until $Yk or approval is given with a new ceiling price.
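
Spelled out as logic, the clause might look like the sketch below. The names are hypothetical, and treating "bill reaches $Yk with no approval" as "stop" is an assumption, since the clause itself does not say what happens then.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    SEEK_APPROVAL = "seek written approval, keep going"
    STOP = "stop"  # assumed outcome once the ceiling is hit


def clause_action(monthly_bill, x_limit, y_limit, approved_ceiling=None):
    """Decide what the clause requires for the current monthly bill.

    x_limit: amount ($Xk) at which written approval must be sought.
    y_limit: amount ($Yk) work may continue up to without approval.
    approved_ceiling: new ceiling from the client, if approval was given.
    """
    ceiling = approved_ceiling if approved_ceiling is not None else y_limit
    if monthly_bill >= ceiling:
        return Action.STOP
    if monthly_bill >= x_limit and approved_ceiling is None:
        return Action.SEEK_APPROVAL
    return Action.CONTINUE
```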

MattSayar · on March 30, 2020

I was just playing around with AWS a while ago and was surprised that I could not find any option to put a cap on the amount I'd spend in a month. Only thing I could do was set up alerts.

I imagine AWS would have 0 problems suspending all my services if I can't pay, so why can't it do the same thing when it reaches my arbitrary cap?

lowercased · on March 29, 2020

> I'm never getting into another startup which has financial risk like that without being a core expert in that risk/tech

This may be something that is 'unstated', but unless you actually had access to fix something that was wrong, as well, being an expert in that wouldn't really help all that much. I've been in situations where I have explicit/expert knowledge of XYZ, but when the people responsible for XYZ do not take your input, and/or don't provide you the ability to fix a problem, expert knowledge is useless (or worse, it's like having to watch a train wreck happen when you know you could have stopped it).

Aeolun · on March 29, 2020

This. But on the other hand, you can be ready with the popcorn when shit eventually does hit the fan.

bluecmd · on March 30, 2020

And then have to live with asking yourself "could I have done more?"

dmos62 · on March 30, 2020

As in beer and crisps? /s

amiga · on March 30, 2020

"...could I have saved the day if I were willing to loudly complain until someone listened?"

justinclift · on March 29, 2020

On the other hand, it sounds like you hired someone who wasn't really up for the level of responsibility given. :(

In theory ;), you shouldn't have to be a core expert in everything. But yeah... in the real world, things aren't so cut and dry. :/

nitely · on March 29, 2020

TBH, the real problem is AWS bills cannot be capped in any way (you can set up an alarm, though). It's unreasonable to expect a programmer won't make mistakes.

manigandham · on March 29, 2020

Of course they can be capped, you just turn off the services. If you're asking them to automate that for you, then the counterpoint would be people accidentally setting a budget that wipes out their resources and complaining about that.

Easier for both sides to just ask AWS for a refund if there's a reasonable case.

nicoburns · on March 29, 2020

> the counterpoint would be people accidentally setting a budget that wipes out their resources and complaining about that.

This wouldn't be an issue if it was configurable.

manigandham · on March 29, 2020

Mistakes will always be an issue. How you recover is more important.

Would you rather make a mistake leading to a big bill with the possibility of a refund or set your max budget and have your resources permanently deleted?

nicoburns · on March 29, 2020

There would be no need to delete existing resources. Just prevent me from creating new ones until action is taken. For small projects in particular, I'd much rather have service taken offline and an email notification than even a $1000 bill. And $1000 is small in the scale of what you could end up with on AWS.

manigandham · on March 30, 2020

It's the existing resources that are a problem because most of them have a steady-state cost.

EC2 instances, EBS volumes, S3 data... should AWS delete those when you hit your budget? How do you stop the billing otherwise?

justinclift · on March 30, 2020

> How do you stop the billing otherwise?

With prioritisation, so the non-steady state services are stopped/killed with plenty of time to leave the needed foundations still running. :)
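
A rough sketch of that idea, assuming each resource has been given a priority and a known hourly cost (both are made-up fields for illustration; AWS exposes no such priority):

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    hourly_cost: float
    priority: int  # higher means more important to keep running


def resources_to_stop(resources, max_hourly_spend):
    """Pick resources to stop, lowest priority first, until the burn rate fits."""
    burn = sum(r.hourly_cost for r in resources)
    to_stop = []
    for r in sorted(resources, key=lambda r: r.priority):
        if burn <= max_hourly_spend:
            break
        to_stop.append(r.name)
        burn -= r.hourly_cost
    return to_stop
```

With a database at priority 10, a web tier at 5 and a batch job at 1, the batch job goes first and the database last.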

manigandham · on March 30, 2020

1) If you're AT the budget amount then everything must be deleted to avoid going over.

2) If it's a soft budget then it's no different than the alarms you already have.

3) If you want to stop it before it hits the budget, then you're asking for a forecasted model with a non-deterministic point in time where things will be shut down.

This just leads to neverending complexity and AWS doesn't want this liability. That's why they provide billing alarms and APIs so you can control what you spend.
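
To illustrate option 3, here is about the simplest forecast possible: a straight-line extrapolation of spend so far. This is only a sketch and not how AWS forecasts anything. The point is that the projected total, and so the shutdown moment, shifts with every new data point.

```python
import calendar
from datetime import date


def forecast_month_end(spend_to_date, today):
    """Linearly extrapolate month-to-date spend to the end of the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_rate = spend_to_date / today.day
    return daily_rate * days_in_month


# The same $100 reads very differently depending on the day it is observed.
early = forecast_month_end(100.0, date(2020, 3, 10))  # 10/day over 31 days
late = forecast_month_end(100.0, date(2020, 3, 20))   # 5/day over 31 days
```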

Dylan16807 · on March 30, 2020

> 2) If it's a soft budget then it's no different than the alarms you already have.

Not if I'm busy, or away from work, or asleep. There is a massive difference between getting an alarm (which is probably delayed because AWS is so bad at reporting spent money) versus having low priority servers immediately cut.

Even without a priority system, shutting down all active servers would be a huge improvement over just a warning in many situations.

manigandham · on March 30, 2020

That's not a soft budget then, so which option is it? 1 or 3?

You want it to selectively turn off only EC2? Does it matter which instance and in which order? What if you're not running EC2 and it's other services? Is there a global priority list of all AWS services? Is it ranked by what's costing you the most? Do you want to maintain your own priority of services?

And what if the budget was a mistake and now you lost customers because your service went down? Do you still blame AWS for that? Or would you rather have the extra bill?

There is no easy solution.

plorkyeran · on March 30, 2020

It's really not that complicated. "Stop paying for everything except for persistent storage" is sufficient for the majority of use-cases where a soft cap would be appropriate. When you need to do anything fancier, you can just continue to use alarms as you do now. A tool does not have to solve every problem that might ever exist to be useful.

manigandham · on March 30, 2020

It's really not that complicated... to watch your own spend. But yet everyone here keeps running into issues, and that's just with your own projects. I'm sure you can at least appreciate the complexities involved at the scale of AWS where even the minority use-cases matter.

"Everything except for persistent storage" is nowhere near useful enough to work and can cause catastrophic losses. Wipe local disks? What about bandwidth? Shut down CloudFront and Lambda? What about queues and SNS topics? What about costs that are inseparable from storage like Kinesis, Redshift, and RDS? Delete all those too? And as I said before, what happens if you set a budget and AWS takes your service down which affects your customers?

It's easy to say it's simple in an HN comment. It's entirely different when you need to implement it at massive scale and that's before even talking about legal and accounting issues. There's a reason why AWS doesn't offer it.

mewpmewp2 · on March 30, 2020

Just shut down everything, but don't delete existing data written to disks. That can cover a wide array of budget problems. If you set a budget like that you really do not want to go over it and any potential loss from customers is not as huge as going over that budget. At least have that option.

I sometimes for example fiddle with Google APIs. I do not even have customers so don't really care if things will stop working, but I have accidentally spent 100 euros or more. I have alerts, but those alerts arrived way too late.

I make a loop mistake in my code and now I suddenly owe 100 euros...

manigandham · on March 30, 2020

> "Just shut down everything, but don't delete existing data written to disks."

I literally just explained why this doesn't work with AWS services. You will have data loss.

And it creates a whole new class of mistakes. If people mistakenly overspend then they'll mistakenly delete their resources too. All these complaints that AWS should cover their billing will then be multiplied by complaints that AWS should recover their infrastructure. No cloud vendor wants that liability.

ghaff · on March 30, 2020

It's not an unreasonable use case to just nuke everything if your spend exceeds some level. (I'm just playing around and want to set some minimal budget.) But, yes, implement that and you will see a post on here at some point about how my startup had a usage spike/we made a simple mistake and AWS wiped out everything so we had to close up shop.

ADDED: A lot of people seem to think it's a simple matter of a spending limit. Which implies that a cloud provider can easily decide:

1.) How badly you care about not exceeding a spending threshold at all

2.) How much you care about persistent storage and services directly related to persistent storage

3.) What is reasonable from a user's perspective to simply shut down on short notice

Dylan16807 · on March 30, 2020

Don't let the perfect be the enemy of the good. In so many use cases, shutting off everything except storage would do a good job. And the cloud provider doesn't have to decide anything. It's a simple matter of setting a spending limit with specified semantics. A magic "do what I want" spending limit is not necessary.

manigandham · on March 30, 2020

> "shutting off everything except storage would do a good job"

Except it wouldn't. This is the 3rd time in this thread explaining that. Edge cases matter, especially when they lead to new mistakes like setting a budget and deleting data or shutting off service when customers need it most.

If it's not a hard budget but a complex set of rules to disable services... then you already have that today. Use the alarms and APIs to turn off what you don't need.

Dylan16807 · on March 30, 2020

Edge cases are the difference between good job and perfect job. It makes no sense to use edge cases to say it qualifies as neither.

> If it's not a hard budget but a complex set of rules to disable services... then you already have that today. Use the alarms and APIs to turn off what you don't need.

I have been describing a simple set of rules, not a complex one.

It used to be extremely difficult to get accurate usage data on all their services. Has that been fixed? If not, then the alarms aren't good enough. If the alarms can automate enough right now, in a non-buggy way, then that should be the answer to people "hey, the alarms do more than alarm, use them to trigger shutdowns". Don't say "it can't be done, sorry". If the alarms aren't good enough for that automation, then the argument stands.

And using the APIs means that each company that wants safety is duplicating effort in an almost untested way, a recipe for so many bugs it makes the problem worse. No, this needs to be a feature of AWS itself.

jfkebwjsbx · on March 30, 2020

Keeping those resources for a week but completely inaccessible would not be a huge cost for AWS yet a very big relief for startups.

manigandham · on March 30, 2020

And this happens every time you go over budget? So it's a constant monthly emergency credit? Or extended free tier? Is there a dollar cap on that? What happens if you go over that?

Not so simple.

dragonwriter · on March 30, 2020

> Of course they can be capped, you just turn off the services.

That's not a hard cap, since turning off services isn't instant and costs continue to accrue. But, yes, there are ways to mitigate the risk of uncapped costs and they are subject to automation.

manigandham · on March 30, 2020

See the sibling comment thread. It's just not that simple. It creates a lot of liability, could lead to permanent data loss, and doesn't really prevent any mistakes either (just swaps them for mistakes in budget caps).

AWS would rather lose some billings than deal with the fallout of losing data or critical service for customers (and in turn their customers).

thayne · on March 30, 2020

It depends on the use case. For example, I would like to have developer accounts with a fixed budget that developers can use to experiment with AWS services, but there isn't a great way to enforce that budget in AWS. In this case I don't really care about data loss, since it's all ephemeral testing infrastructure.

In theory I could build something using budget alarms, APIs, and IAM permissions to make sure everything gets shut down if a developer exceeds their budget, but if I made a mistake it could end up being very expensive. Not that I don't trust developers at my company to use such an account responsibly, but it is very easy to accidentally spend a lot of money on AWS, especially if you aren't an expert in it.
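
One piece of such a setup could look like the sketch below: a function, meant to be run from whatever the budget alarm triggers, that stops running EC2 instances carrying a given tag. The `env=sandbox` tag is a made-up convention, and the client is passed in (for example `boto3.client("ec2")`) so the logic can be exercised without touching AWS. It covers EC2 only, ignores pagination of large result sets, and stopped instances still accrue EBS storage charges, so this is nowhere near a complete kill switch.

```python
def stop_tagged_instances(ec2, tag_key="env", tag_value="sandbox"):
    """Stop every running EC2 instance with the given tag; return their IDs."""
    response = ec2.describe_instances(
        Filters=[
            {"Name": f"tag:{tag_key}", "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    instance_ids = [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]
    if instance_ids:  # stop_instances rejects an empty ID list
        ec2.stop_instances(InstanceIds=instance_ids)
    return instance_ids
```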

manigandham · on March 30, 2020

So now we have another potential mistake - you set up a "delete everything/hard budget" for a production account instead of a developer account. What then?

It's impossible for AWS to know how to handle hard caps because there are too many ways to alter what's running and it's too contextual to your business at that moment. That's why they give you tools and calculators and pricing tables so that it's your responsibility (or a potential startup opportunity).

Money is easy to deal with. Alarms work. Bills can be negotiated. But you can't get back lost data, lost service, or lost customers.

ngcc_hk · on March 30, 2020

There should be a cap so you have a check. If your system does not allow thresholds or assertions, please do not use it. If your cloud system does not offer a capped budget that you can work within and that alerts you when you are about to run out, do not use it.

EpicEng · on March 29, 2020

>In theory ;), you shouldn't have to be a core expert in everything. But yeah... in the real world, things aren't so cut and dry. :/

Right. In my experience, if you don't understand what's going on beneath your abstractions, you're always in for a world of hurt as soon as something goes sideways.

glenngillen · on March 29, 2020

Did you reach out to AWS support or your account manager? They’d definitely have worked something out.

shawabawa3 · on March 30, 2020

Did you contact AWS and let them know it was a mistake?

They have a good track record of cancelling huge bills the first time they happen

seibelj · on March 29, 2020

Assuming you were incorporated and had a business account - declare bankruptcy and the bill goes away. I don’t understand why you would still pay the bill if you were going out of business anyway.

appstorelottery · on March 29, 2020

Why didn't I file bankruptcy? This happened in Australia and declaring bankruptcy was not the right thing to do - for many reasons, not the least of which is that it makes it much harder to operate as a director of a previously bankrupt company, but in the worst case my bank would have just gone after me as I'd given a personal guarantee.

garmaine · on March 29, 2020

There is no concept of limited liability in Australia?

jkaplowitz · on March 29, 2020

Even in the United States, most small business loans require personal guarantees which narrowly override the corporate limited liability to make that guarantor liable for that debt if the company doesn't pay. There are some rare exceptions, and possibly more for startups funded by big-name VCs, but I don't know.

Scoundreller · on March 29, 2020

But this isn't a small business loan: it's a debt to Amazon.

dragonwriter · on March 30, 2020

I read that as the business owner had a preexisting business loan with a personal guarantee.

namdnay · on March 29, 2020

Except the loan money will go straight to Amazon, and you are now unable to repay the loan to the bank

vetinari · on March 29, 2020

Where exactly does the bank enter the picture?

Scenario 1: Amazon will ask for the payment (if using cc); the bank will respond there are no funds in the account; Amazon then deals directly with the company, not with the bank, eventually getting a payment order from the court. If the company went bankrupt meanwhile, Amazon might not get their money.

Scenario 2: Amazon will send the invoice; the invoice will not get paid. After the due date, Amazon will contact the company directly; the bank doesn't even enter the picture until a collection order comes from the court. If the company went bankrupt meanwhile, Amazon might not get their money.

There's no scenario where some hypothetical loan would go straight to Amazon, unless Amazon has some instrument that instructs the bank to pay them, something like a bank guarantee or promissory note, and uses it before the company declares bankruptcy.

Talanes · on March 30, 2020

I think they were referring to a scenario where Amazon is draining the funds that have already been loaned. Thus Amazon already has their money, and the bank is the one coming after you during bankruptcy.

vetinari · on March 31, 2020

Not sure how it works in OP's country, but where I live, when you get a loan, you will get a new account. As you draw the loan, you are getting into negative balance; how far you can go is the limit of your loan. As you pay back the principal, you are getting back to zero balance.

So for Amazon to drain loaned money, the borrower would have to transfer it to a normal account and pay with a debit card paired to that account, with no limit set.

It is not wise to transfer it to a normal account; you pay interest on the balance of the loan account, so if you move the money to your normal account, you are paying interest on money that is just sitting there.

namdnay · on March 30, 2020

Wouldn't Amazon be draining a credit card directly? Tied to the account you received the loan on?

vetinari · on March 31, 2020

If they used CC (not debit), then any payment would mean creating a debt, so yes, they would have to pay the bank, because the bank already paid in their name.

That's why you don't pay large sums with CC, but with invoice + bank transfer, and have a limit set on your cards, when you do.

trogdor · on April 2, 2020

Can you explain that more clearly? What is the reason to not pay large sums with a credit card?

vetinari · on April 5, 2020

Several factors:

- control: you are in control of when you make the payment. You can plan your cash flow.

- additional advantages: You also have payment terms, and some vendors offer discounts for earlier payments; if your cash flow can handle that, why would you give that up?

- liability: with CC, you are getting credit that is drawn at the other party's leisure. It's you who is liable for this credit line, even if the other party made a mistake. You are always liable to the bank, never to the vendors. With bank transfers, every single payment was authorized by you (where by 'you' I mean an authorized person at your company) and the liability is towards the vendor, who is not likely to have such a strong position (see Porter's five forces).

- leverage: if the other party makes a mistake, they have motivation to correct it. Every company in existence has at some point received incorrect invoices. Withholding payment until they are corrected is a strong motivator. Without that, you could be left without invoices that can be put into accounting AND without money that you have to account for.

- setting up processes: when you grow beyond a certain size, you are going to want to formalize procurement, accounts payable and treasury. Having purchasing and payment discipline compatible with that already in place will mean less pain from the growth and fewer things to change.

When we need people in the field purchasing small supplies, we don't want them to handle cash, so they get debit (not credit) cards, with relatively small limits. It is enough for them to get by, but not enough to do any significant damage. (The exception is fuel and that's what fuel cards are for - basically it has the form factor of a credit or debit card, but works only for fuel, is paired to a license plate and the vendor sends an invoice at the end of the month).

Another scenario where CCs are useful is when you need to pay for something right now and don't want to, or can't, wait for the order->delivery+invoice->payment cycle. That's fine for consumer impulse purchases, but it should not be the normal way for company purchases.

Of course, if you start a new business relationship, some companies will not trust you to pay the invoice; sending an advance invoice and paying it is fine. In practice, it is quite a rare occurrence.

Scoundreller · on March 29, 2020

Depends where Amazon ranks in seniority in bankruptcy (protection). You don't have to run out of money to file for it. Purdue Pharma sure didn't.

garmaine · on March 29, 2020

I’ve worked in many early startups and I’ve never seen anyone use such a loan.

jkaplowitz · on March 30, 2020

Were they in the US and funded by VCs? That kind of startup probably doesn't need to do this. Unsure about VC-funded businesses elsewhere. Many or even most small businesses without VC funding do take that kind of loan.

IG_Semmelweiss · on March 31, 2020

You work at the 1%

The real world is filled with barbershops, daycares, bars, clinics, PVC manufacturers etc

None of them get VC money.

When they need money, they go to a bank and usually have to place a PG in order to get funds.

Tech startups have it easy. It's all equity. You are not pledging your lifetime earnings on a business idea.

Once tech startups lose their upside potential (prob not anytime soon if ever), you will be sitting with the regular folk, those that pledge their skin and life to their business.

hnick · on March 30, 2020

If a director becomes personally bankrupt (such as trying to be the good guy and using personal guarantees to take on company debts in an effort to scrape through) then they're banned from running a company until it clears. If they're the director of a company that goes bankrupt, I believe they get 2 chances (companies) before there's a chance of being banned from running more for a time.

Either way it might be nice to keep your options open, depending on your plans.

scarface74 · on March 29, 2020

Or you could just send an email to support and ask them to waive the charges.

lostlogin · on March 29, 2020

If that got to the right person on the right day and they knew it was going to kill the company, it seems likely to help. And combined with the fact that it would probably guarantee future revenue way off into the future...

scarface74 · on March 29, 2020

I have never heard of a case where they wouldn’t give refunds. AWS is competing with the 95% of compute that is not running in the cloud (their own statistics). The last thing they want is a reputation that one mistake will bankrupt a business.

manigandham · on March 29, 2020

We had spot instances with a mistakenly high bid that incurred thousands overnight when the prices spiked. No refund offered.

I know several other companies that had expensive mistakes without refunds. There's probably a complex decision tree for these issues and I doubt anyone really knows outside of AWS.

Supermancho · on March 29, 2020

> I have never heard of a case where they wouldn’t give refunds.

Really? Working in Southern California a few years ago, refund requests were refused ALL THE TIME. This is why there's a common belief that what you are charged you simply owe them, period.

It may be more progressive now, but let's not be revisionist.

aledalgrande · on March 29, 2020

Once I got something like a year of EC2 charges retroactively reimbursed for a few instances I hadn't used.

staticassertion · on March 29, 2020

I've repeatedly seen requests of this nature handled by AWS - 75% cuts to billing, 90% cuts even.

aojdoiasjdasd · on March 29, 2020

This. I work at Amazon and this is more common than you'd expect. "Customer obsession" and all that.

hnick · on March 30, 2020

I'm not the type to 'want to speak to the manager' for my self-imposed problems but the more I hear about people coming out ahead the more I think I need to change my ways.

gwd · on March 30, 2020

I think you have to think of it a bit more from Amazon's perspective. If you accidentally burn through your entire startup capital and shut down, they lose. If the risk of this sort of thing becomes well-known, then startups will start using other services rather than AWS, and the small fraction that grow big will be less likely to use AWS.

Being an entitled jerk who blames other people for your own negligence is bad, and you shouldn't change that. But openly giving companies the opportunity to be kind (while admitting that it was entirely your fault) potentially helps both them and you.

cowsandmilk · on March 29, 2020

Yep, and an opportunity to educate on things like budgets and billing alarms to try to prevent this in the future.

teddyuk · on March 29, 2020

Yeah, every time I’ve heard this story support have always fixed it, at least the first time per account

101404 · on March 29, 2020

AWS should have a cost cap. Set a max spend value and shut down all servers if you spent it.

dragonwriter · on March 30, 2020

> AWS should have a cost cap. Set a max spend value and shut down all servers if you spent it.

That might make sense for some particular services (e.g., capping the cost on active EC2 instances) but lots of AWS costs are data storage costs, and you probably don't want all your data deleted because you ran too many EC2 instances and hit your budget cap.

Where exactly you are willing to shut off to avoid excess spend and what you don't want to sacrifice automatically varies from customer to customer, so there's no good one-size-fits-all automated solution.

JamesBarney · on March 30, 2020

I think if resources had an option of "At cap: Do nothing, Shut down, Shut down and erase data" that would cover most of the use cases.
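
As a sketch of what that per-resource setting could look like (purely hypothetical, AWS offers no such option), with the three choices as an enum and a helper that groups resources by what should happen to them at the cap:

```python
from enum import Enum


class AtCap(Enum):
    DO_NOTHING = "do nothing"
    SHUT_DOWN = "shut down"
    SHUT_DOWN_AND_ERASE = "shut down and erase data"


def plan_at_cap(policies):
    """Group resource names by the action to take once the cap is reached.

    policies: mapping of resource name -> AtCap choice made by the owner.
    """
    plan = {action: [] for action in AtCap}
    for name, action in policies.items():
        plan[action].append(name)
    return plan
```

The owner, not the provider, decides per resource, which sidesteps the "one size fits all" objection but still leaves room for picking the wrong option on the wrong resource.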

jfkebwjsbx · on March 30, 2020

Keeping the data for a week but completely inaccessible would not be a huge cost for AWS yet a big relief for startups.

samstave · on March 30, 2020

We used to have a bunch of billing graphs in Stackdriver with alerting thresholds to PagerDuty to capture exactly situations like this.