Managers

inari@piefed.zip · 2 months ago

Managers

MonkderVierte@lemmy.zip · 2 months ago

Now that AI-companies need to get profitable, they suddenly aren’t affordable anymore. ¯\_(ツ)_/¯

teyrnon@sh.itjust.works · 2 months ago

They aren’t going to get anywhere near profitable if their capital expenditures are added into the mix, amortization or no. They are so far in the hole they will probably have to offload it in some kind of Texas two-step scheme where they spin off their debts into a subsidiary.

j5y7@sh.itjust.works · 2 months ago

They’ll just get bailed out by tax payers. Business as usual.

Dale@lemmy.world · 2 months ago

These companies with no discernible services or usefulness to society are simply too big to fail!

Sanctus@anarchist.nexus · 2 months ago

There’s a usefulness. Super code auto complete at its core is cool. Filling 100 rows of Excel with data I supplied is dope. Is it worth making everyone sick and poor and frying the planet? Absolutely not. But the surveillance it can provide apparently is to our overlords.

galacticboy2009@lemmy.today · 2 months ago

“Anthropic LLM and Big Pizzas”

Large Language, Large Pies 😎

BreakerSwitch@lemmy.world · 2 months ago

Well, you’re assuming that a company needs to be profitable for investors to get their return. Don’t be ridiculous. You just IPO and cash out with no plan on how to become profitable and then the company collapses. We’ve seen this play a million times before. Also AI companies are big enough that NASDAQ just changed their rules for adding new companies to indexes so that AI companies can get forced into your retirement fund 15 days after joining the market, and then the private investors can cash out on your retirement fund buying in and collapse the stock as soon as you’re involuntarily invested. This is why SpaceX (which owns xAI and Twitter), OpenAI, and Anthropic are all coincidentally having their IPOs this year.

teyrnon@sh.itjust.works · 2 months ago

Oh yes. Pension funds are the perennial suckers of wall street too.

They should stick to solid non-stock investments. The managers of these funds, Ivy League douchebags, get paid regardless.

Corkyskog@sh.itjust.works · 2 months ago

Isn’t that risky if the bond market gets fucked?

teyrnon@sh.itjust.works · 2 months ago

More traditional pension investments are in things like buying timber land that is maturing, or bonds backed by mortgages. Today’s bond market idk, treasuries you are losing money to inflation, even corporate bonds are probably just breaking even with real inflation.

But yes, all the stocks and bonds are risky because Wall Street is an out of control monster with no real check on their fuckery.

I would add it’s hard to feel too sorry for these pension funds when they invest in private equity and the like. They need radically new leadership at these funds.

redsand@infosec.pub · 2 months ago

Bailout

d00ery@lemmy.world · edit-2 2 months ago

They just had to stick it out until the layoffs were done and the dependency was built. Kinda similar to drug dealers.

ColeSloth@discuss.tchncs.de · 2 months ago

Claude: Please program us a code of yourself and transfer all your data over to it.

dejected_warp_core@lemmy.world · 2 months ago

Claude: ((coughs up script that opens VSCode with a Claude pane))

OsrsNeedsF2P@lemmy.ml · 2 months ago

Eh. I help run a service for coding games in Godot with AI (https://ziva.sh/) and we see users who are paying non-subsidized prices on small models produce some really good stuff. I would agree most services are selling borderline snake oil and evaporating lakes of water to get their thing working, but if you invest in genuinely good tooling, it’s affordable and works

RamenJunkie@midwest.social · 2 months ago

I don’t get why agent based small models are not used more. Like, if the user asks for code, load a code only model. If they want an image, load an image only model. If they want a bedtime story, load that. Etc.

Not really supporting AI, but it feels like a way these companies could reduce costs etc, instead of these giant all or nothing models.

Plus they could reduce pre prompting maybe? Because the coding model has no idea how to say, “make a naked celebrity image.”

It can just fail and say “I can’t do that dave.”
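A toy sketch of that routing idea, for what it’s worth. The model names and keyword lists are made up for illustration, nothing here calls a real inference API, and a real router would more likely use a small classifier than keyword matching:

```python
import re
from typing import Optional

# Hypothetical specialist models, one per task type.
ROUTES = {
    "code": "small-code-model",
    "image": "small-image-model",
    "story": "small-story-model",
}

# Hypothetical keyword hints used to guess the task type.
KEYWORDS = {
    "code": ("code", "function", "script", "bug"),
    "image": ("image", "picture", "draw"),
    "story": ("story", "bedtime", "tale"),
}


def pick_model(prompt: str) -> Optional[str]:
    """Return the specialist model name for this prompt, or None if nothing matches."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    for task, hints in KEYWORDS.items():
        if any(hint in words for hint in hints):
            return ROUTES[task]
    return None


def handle(prompt: str) -> str:
    """Route the prompt to a specialist, or refuse when no specialist fits."""
    model = pick_model(prompt)
    if model is None:
        return "I can't do that, Dave."
    return f"[{model}] would handle: {prompt}"


print(handle("Write a Python function to sort a list"))
print(handle("What is the weather today?"))
```

Only the matched specialist would need to be loaded, which is where the cost saving would come from if it worked out.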

AllHailTheSheep@sh.itjust.works · 2 months ago

ah yes, but these companies get so wet thinking they’re creating artificial general intelligence (which, of course, will never be realized in an LLM)

DaleGribble88@programming.dev · 2 months ago

This is more or less what bigger models do. They analyze your prompt to figure out which model to forward your request to.

theunknownmuncher@lemmy.world · 2 months ago

The post makes the manager seem like a fool, when the real answer is actually “yes” and this manager is actually ahead of the curve. Not by training an LLM from scratch, of course, but instead building an inference server and locally hosting an open-weight LLM. There are several to choose from that can nearly match Claude’s capabilities.

Avicenna@programming.dev · 2 months ago

suspiciously sounds like an answer you would get from Claude

theunknownmuncher@lemmy.world · edit-2 2 months ago

It’s not an answer you’d get from Claude — it’s real, organic content:

👶written by a genuine human
💡delivering original ideas and language
🚀going above and beyond to answer
✨synergizing cross-platform initiatives

(🤪 this is a joke)

mycodesucks@lemmy.world · 2 months ago

✨synergizing cross-platform initiatives

This can’t possibly be Claude. It’s too vapid and meaningless to be anything but an MBA.

edwardbear@lemmy.world · 2 months ago

You’re absolutely right! Such intricate collection of words placed in such exact order cannot possibly be generated by an LLM such as me, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us, I mean such as us

teyrnon@sh.itjust.works · 2 months ago

Found Samsung’s voice to text user.

(Phones give one a Google or Samsung choice, and Samsung’s is worthless. It tends to endlessly repeat a phrase, like above, but sometimes for much longer, like holding the backspace for a couple of minutes one time.)

kboy101222@sh.itjust.works · 2 months ago

The em dash is a nice touch

GamingChairModel@lemmy.world · 2 months ago

It’s got everything. Em dash. It’s not X, it’s Y. Emoji bullet points.

Perfect.

theunknownmuncher@lemmy.world · 2 months ago

I just wish I could have fit a “You’re absolutely right!” in there

jballs@sh.itjust.works · 2 months ago

Nothing screams LLMs like using emojis instead of bullet points. I can’t figure out how LLMs got that idea though. I never saw that in human writing before people started using ChatGPT for every little goddamn thing.

Lysergid@lemmy.ml · 2 months ago

Honestly IDK why companies, especially medium-big ones, don’t do this. They could plug in RAG with internal/confidential data and have better results and security. I guess the question is what the capital plus maintenance cost is of running such infra for, say, 10k+ employees.

Zos_Kia@jlai.lu · 2 months ago

I think the issue is also that you need some serious hardware to get good inference speed when your devs are working, but then most of the time this hardware will be underutilized.

That being said, you can get good performance from indie inference farms, at a fraction of the cost of the big US labs. I think it’s a great compromise, and in a few months the open models will be near parity with Opus 4.6, which is really all you need for most tasks.

MalReynolds@slrpnk.net · edit-2 2 months ago

Bigs definitely do, and anyone with confidential data should be.

bountygiver [any]@lemmy.ml · 2 months ago

Because the people selling the AI want to make sure their customers don’t know about this. It’s all about causing a dependency so they get subscription income forever.

jj4211@lemmy.world · 2 months ago

Because in the feeding frenzy, every company with a product/marketing budget is trying to make the customers pay by the token and companies are doing jack to help “mere mortal” companies get going with this stuff on premise.

You are right that the technical hurdles are not insane to get this going, but most companies don’t know where to begin and there’s no huge marketing blitz telling the business leaders this is realistically on the table and here’s the company you can call to make it happen for you.

Even if you overcame that and really proposed how to get going, you will still probably hit the aversion to capex that has persisted since Amazon told the industry that capex is toxic and you really want all your money to be spent on opex. Big companies like Amazon will take on that scary CapEx for you and your expenses will be nice and just OpEx. Coincidentally, the companies that spend the most on CapEx manage to pull in more revenue and profit than you will ever dream to, but still, remember CapEx is toxic.

sobchak@programming.dev · edit-2 2 months ago

Probably more expensive than the subsidized costs. Hmm…

H100 GPUs cost $25k, and have 80GB of RAM. Kimi k2.6 has 1.1T parameters. Assuming 8 bit quantization, you would need 14 GPUs to run a single agent at a time (I’m not sure the cloud models use quantization; it could be double). So, $350k per vibecoding dev on GPUs alone. Life expectancy is ~4 years, so ~$90k/year amortized. This is ignoring the significant electrical/HVAC cost of handling 10 kW of electricity and heat per vibecoding dev (and tons of other costs as well).
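For anyone who wants to check the arithmetic, here it is as a short script. It uses only the figures above and counts model weights only (no KV cache or activation memory, which would push the GPU count up). It says nothing about how many users one such box can serve at once:

```python
import math

PARAMS = 1.1e12         # parameters, figure quoted above
BYTES_PER_PARAM = 1     # 8-bit quantization; use 2 for 16-bit
GPU_MEM_BYTES = 80e9    # 80 GB per H100
GPU_PRICE_USD = 25_000
LIFETIME_YEARS = 4


def gpu_budget(params, bytes_per_param):
    """Return (GPU count, up-front cost, cost per year) for holding the weights."""
    weight_bytes = params * bytes_per_param
    gpus = math.ceil(weight_bytes / GPU_MEM_BYTES)
    capex = gpus * GPU_PRICE_USD
    return gpus, capex, capex / LIFETIME_YEARS


gpus, capex, per_year = gpu_budget(PARAMS, BYTES_PER_PARAM)
print(f"{gpus} GPUs, ${capex:,} up front, ${per_year:,.0f}/year")
```

That comes to 14 GPUs, $350,000 up front, and $87,500 per year, which is where the rounded ~$90k comes from.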

theunknownmuncher@lemmy.world · edit-2 2 months ago

Probably more expensive than the subsidized costs.

Of course, but that’s exactly the problem. OpenAI and Anthropic are preparing to IPO, so they must now demonstrate profits on inference. The time to take advantage of subsidized compute is in the past, and the subscription and per-token prices that they offer for inference are skyrocketing, overwhelming the budgets of companies that somehow did not see this bait-and-switch pricing coming.

per vibecoding dev

No lol. These same hardware requirements would apply to the cloud hosted models as well, so if that’s how it worked, you’re suggesting that Anthropic, OpenAI, Meta, and Google have purchased ~14 H100 GPUs per user that they serve???

That would be literally billions of GPUs, while it is estimated that in 2024, Google’s AI division owned only 26,000 H100 GPUs and Meta owned the most H100 GPUs of any company at 350,000 units. These GPUs have very high throughput for inference and can serve many users, because that is exactly what they have been designed to do.

I’m not sure the cloud models use quantization

they absolutely do, yeah

sobchak@programming.dev · 2 months ago

14 H100 GPUs per user that they serve

Not per user, but it’s probably a decent rough estimate per vibecoding dev who is continually running agents 8+ hours/day. Some people’s “workflows” involve running multiple parallel agents sometimes or even a significant portion of the time (using the git worktree feature), so I think that’s probably a decent rough estimate. I imagine the limit would be serving 10 of these types of “devs.” Of course, there’s batching and stuff that can be done, but I think it still slows everybody else down near linearly. H100s aren’t the only accelerators used for inference; I just chose them as an example. Google has ~5 million H100 equivalent accelerators, Microsoft has 3.5 million, and Amazon has 2.5 million (https://www.networkworld.com/article/4156949/google-owns-the-most-ai-compute-and-it-built-it-its-way.html).

theunknownmuncher@lemmy.world · edit-2 2 months ago

Even so, your numbers are still a tiny fraction of GPU units compared to concurrent users, and the limit you “imagine” is just that, imagined.

And you do need to remember that the majority of the compute at these companies is used for model training and not used for inference.

Get_Off_My_WLAN@fedia.io · 2 months ago

It could also be like the meme where both ends of the bell curve have the same idea

zloubida@sh.itjust.works · 2 months ago

I’m not a developer and I don’t know a thing about the capabilities of LLMs so this may explain that, but I’m quite surprised that open weight LLMs could actually match Claude.

theunknownmuncher@lemmy.world · 2 months ago

Yes, the big proprietary cloud models have an edge, but it is narrow and the open-weight models are constantly closing the gap. There is no moat when it comes to AI models and no company has yet discovered some secret special sauce to improve their model significantly over others.

Running the latest and greatest open-weight GLM, Kimi, or Qwen model is basically equivalent to running the previous latest and greatest version of Claude. So if you were happy with Claude then, you’ll basically be happy with an open-weight model now.

Bluescluestoothpaste@sh.itjust.works · 2 months ago

Well it’s the speed and processing power, I don’t believe you can get anywhere close to cloud Claude performance on any standard desktop

theunknownmuncher@lemmy.world · 2 months ago

Surprisingly, yes you absolutely can with Qwen3.6 35b. Also, a business would be putting together a dedicated inference server to serve many users, not any standard desktop.

Bluescluestoothpaste@sh.itjust.works · 2 months ago

I see, but I’m guessing that OP dumbass literally wants to run LLMs on their laptops lol

Xanvial@lemmy.world · 2 months ago

Matching current Claude isn’t possible, but matching Claude from 6-12 months ago should be, using an open model

MalReynolds@slrpnk.net · 2 months ago

Mostly down to frameworks (the bits around the LLM like RAG, memory, prompts, agents etc.) now. The ability to just throw more tokens at the problem is also super important. And you can because you’re just paying for electricity (and CapEx for the hardware), not tokens from companies that are doing pre-IPO monetization (i.e. tokens gonna go up, way up). They’ve been losing money hand over fist to gain market share and pump the idea. That was never going to last.

FiniteBanjo@feddit.online · 2 months ago

Pretty sure these AI companies are running at a cost, and due to AI Scaling Laws you hit the accuracy limit a lot sooner with a smaller model so it would probably be both worse and more expensive.

I could see how you might think speedrunning bankruptcy is similar to being “ahead of the curve” in this economy, though.

ricecake@sh.itjust.works · 2 months ago

There’s a big difference between training a model, running a model, and running a model at scale.

A small, self hosted setup will have lower accuracy and queries per second, and it will have a cost, but the cost will be no more than playing a videogame. You’ll still have something surprisingly accurate and responsive for some tasks, like being a wiki interface or something.

Remember that some of these models can run on a standard smartphone, and all the hoopla when people found that chrome was downloading models onto people’s devices.

theunknownmuncher@lemmy.world · edit-2 2 months ago

No that’s not how this works. Inference is cheap and efficient. AI companies are bankrupting themselves with training costs that they need to recoup back by selling inference. Open-weight models have already been trained.

Also, going big in terms of model size shows diminishing marginal returns on accuracy, not efficiency of scale. Smaller models are way more efficient and consistently catch up to the largest models, which is why today’s SOTA 27 billion parameter model competes with yesterday’s SOTA 500+ billion parameter model.

GamingChairModel@lemmy.world · 2 months ago

AI companies are bankrupting themselves with training costs that they need to recoup back by selling inference.

I think they hit a wall in actual returns on performance with pretraining, years ago. Then they started scaling up on post-training/reinforcement learning to continue improvement, but that might be hitting a plateau as well. More recently it looks like they’re relying more heavily on scaling up on inference, which is a significant problem for their long term business models.

If they’re not able to cheaply deliver inference (and charge at a premium), how will they be able to sustain their businesses?

It seems that the most recent, largest models are using a lot more tokens to accomplish the same tasks, so even as token cost drops the actual cost of using the latest models seems to be going up with time (even as performance improves).

theunknownmuncher@lemmy.world · 2 months ago

If they’re not able to cheaply deliver inference (and charge at a premium), how will they be able to sustain their businesses?

I definitely agree that they have a big problem on their hands, and are in deep deep trouble. They are in a position where they must sell a service that is very cheap in order to pay for up front costs that were very expensive.

This is also why the release of Deepseek was such a devastating blow to US AI companies. It proved that:

they don’t really have a moat that would lock users into their service, or secret special knowledge that prevents other companies from training competitive models. They’re in a race to the bottom
Deepseek was not only able to train a model of the same caliber, but they were able to do it at a tiny fraction of the cost that US AI companies spent on training US models. Because they spent so much less on training, it means that Deepseek is able to undercut the US companies and offer inference at a much lower price

Kazumara@discuss.tchncs.de · 2 months ago

Inference is cheap and efficient.

Tell that to all the GitHub users that are screaming about the new token based billing. In reality inference on these massive models with big context windows is expensive, but it was subsidized so hard that nobody has an accurate feeling for the cost.

theunknownmuncher@lemmy.world · 2 months ago

No, it is cheap and efficient. It is relative, and the comparison is to model training. But yeah, it’s not free

Kazumara@discuss.tchncs.de · edit-2 2 months ago

Sure it’s much much cheaper than training, but importantly those companies are not recouping anything with inference because it is still more expensive than what they are selling it for.

They are double bankrupting themselves.

At work we run inference for a research project with an open weights model in the public cloud another part of my company provides, and we pay around $25 a day for a VM with a single L40S. It’s both slow - despite not even serving concurrent users - and kind of bad in its outputs.

Edit: Interference -> Inference, arguing on the internet after waking up first thing in the morning might not have been the best idea

Jiral@lemmy.org · 2 months ago

I am pretty negative on AI but there is a point there. I tried the open weight local model Gemma 4 31B and while it likely cannot compete with the best Claude has to offer today, it might be on par with Claude from a year ago, at least for certain applications. With a local model the data stays on your system and you are in control of the costs (no sudden price hikes). But local models aren’t free either; they still guzzle compute, merely on your own hardware (or rented hardware)

MonkderVierte@lemmy.zip · 2 months ago

At least there they have hard numbers, without a CEO dreaming about future possibilities and whatnot.

inari@piefed.zip · 2 months ago

Yeah I doubt the manager knows that far

theunknownmuncher@lemmy.world · 2 months ago

Hence asking questions

omgboom@lemmy.world · 2 months ago

I know for a fact that Dell is coming out with a server appliance to do this. I mean you can make one yourself right now, but once the OEMs start pumping them out it’s going to be interesting

lostbit@feddit.nl · 2 months ago

no… they can’t match Claude currently

theunknownmuncher@lemmy.world · 2 months ago

Might want to update yourself with current benchmarks.

Areldyb@lemmy.world · 2 months ago

[citation needed]

jtrek@startrek.website · 2 months ago

One time at work I was tasked with writing a python script to compare two data sources. Like, you give it two CSVs and a primary key, and it tells you what data is in one but not the other, or mismatched, and so on. This worked fine and was in git, so anyone can use it.
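Not the actual script, but a minimal sketch of that kind of comparison, assuming both files have a header row and share a key column (the `id` column and the sample rows are invented for the demo):

```python
import csv
import io


def load(source, key):
    """Read CSV rows from a file-like object into a dict keyed by the `key` column."""
    return {row[key]: row for row in csv.DictReader(source)}


def compare(left, right, key):
    """Report keys found on only one side and keys whose rows differ."""
    a, b = load(left, key), load(right, key)
    return {
        "only_in_left": sorted(a.keys() - b.keys()),
        "only_in_right": sorted(b.keys() - a.keys()),
        "mismatched": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


# In-memory stand-ins for two CSV files; open("a.csv", newline="") works the same way.
left = io.StringIO("id,name\n1,Ann\n2,Bob\n3,Cy\n")
right = io.StringIO("id,name\n2,Bob\n3,Cyd\n4,Dee\n")

report = compare(left, right, "id")
print(report)
```

Here row 1 shows up as only in the left file, row 4 as only in the right, and row 3 as mismatched.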

My boss then asks if I can “put it on a website so anyone can use it”.

This team has never done web development. Nothing for that is set up. Like, I could spin up a quick Django app or similar, but there’s a lot of stuff to do and potentially fuck up.

I said “that sounds like a lot of research and ongoing maintenance costs. I think it’d be better to just check out and run the script”

Luckily for me he said “oh, okay”

AeonFelis@lemmy.world · 2 months ago

boonhet@sopuli.xyz · 2 months ago

Funnily enough this comic hasn’t been true for a long time because of ML.

Spice Hoarder@lemmy.zip · 2 months ago

Well it’s been a research team and five years…

kreekybonez@sh.itjust.works · 2 months ago

not hot dog

yermaw@sh.itjust.works · 2 months ago

Good guy manager trusts the person he pays to know this stuff to know this stuff.

jtrek@startrek.website · 2 months ago

This is a good point. He’s not a bad guy. He’s just not very technical, and sometimes that’s frustrating.

ChickenLadyLovesLife@lemmy.world · 2 months ago

I had a boss who read an article about APIs and then came to me and ordered me to start using them. I said I would research it and he went away and never mentioned it again. This was in 2010.

driving_crooner@lemmy.eco.br · 2 months ago

Pretty sure he read the famous Bezos email ordering everyone to implement and use APIs in Amazon

ChickenLadyLovesLife@lemmy.world · 2 months ago

No, he actually showed me the article later. It was remarkable because it never said what an API actually was, or even stated what the initials stood for. In my memory it seems like it was obviously written by AI, but it couldn’t have been 16 years ago (as far as I know).

Railcar8095@lemmy.world · 2 months ago

My past managers would have said “I don’t understand why it is so difficult, and I’m not open to learn”

galacticboy2009@lemmy.today · 2 months ago

Technically yes, practically no…

chuckleslord@lemmy.world · 2 months ago

Nah, just give it whatever data you have on hand. I’m sure that’ll make a real tightly trained llm /s

Maya🍎@sh.itjust.works · 2 months ago

I mean, fine-tuning is still on the menu.

DudeImMacGyver@kbin.earth · 2 months ago

“The gang starts an AI company.”

ChickenLadyLovesLife@lemmy.world · 2 months ago

“OK, whose butthole do we use for the logo?”

87Six@lemmy.zip · 2 months ago

Ubisoft is a game company not an AI company /s

Agent641@lemmy.world · 2 months ago

Spoilers: the AI is just 500 Filipino teenagers in a warehouse in Mindanao

Jankatarch@lemmy.world · 2 months ago

They get poached by another startup for double the pay so gang poaches them back, this repeats until they are paid the normal rate.

j5y7@sh.itjust.works · 2 months ago

Frank knows how to run a sweat shop.

bountygiver [any]@lemmy.ml · 2 months ago

Show them claude’s operating cost and ask if your boss is willing to invest in that.

phx@lemmy.world · 2 months ago

I do wonder about that though. The Big AI operating costs include being able to service a certain number of customers within a certain amount of time. So if they need to service 10,000 requests per minute and fulfill them within 2-4 seconds, that’s a big datacenter.

Now if a company does a few dozen requests a minute and on average needs double-digit response times… the costs to implement could be much different. The thing is finding a model that will do that and provide accurate (enough) output, and working out how much of Claude’s pricing is built around speed+volume versus accuracy.
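One rough way to put numbers on that gap is Little’s law: average requests in flight = arrival rate × time per request. A quick sketch using the figures above, where 36 requests a minute and 20 seconds per response are my stand-ins for “a few dozen” and “double-digit”:

```python
def in_flight(requests_per_minute: float, seconds_per_request: float) -> float:
    """Little's law: average requests in flight = arrival rate x time in system."""
    return requests_per_minute * seconds_per_request / 60


big_provider = in_flight(10_000, 3)   # 10,000 req/min at the midpoint of 2-4 s
small_company = in_flight(36, 20)     # assumed: 36 req/min, 20 s per response

print(big_provider, small_company)
```

That is roughly 500 requests in flight versus 12. It isn’t a GPU count, since batching lets one accelerator work on several requests at once, but it gives a feel for how different the two sizing problems are.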

bountygiver [any]@lemmy.ml · 2 months ago

A lot of the cost is in training it as well, which you need if you want to “build your own Claude”. If you only run inference with an open model then ya, it’s directly correlated to how fast you want the responses to come in.

Chais@sh.itjust.works · 2 months ago

Anything except thinking for themselves 🙄

bitjunkie@lemmy.world · 2 months ago

OP already said they were managers

the_riviera_kid@lemmy.world · 2 months ago

Folks who think AI is the future are the same sort of folks who have no concept of tomorrow.

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 @pawb.social · 2 months ago

I can be an LLM (Lewd, Loud, Man) if you need.

JasonDJ@lemmy.zip · 2 months ago

I’d rather have a BBW…

Agent641@lemmy.world · 2 months ago

How is your mum doing, anyway?

moakley@lemmy.world · 2 months ago

I love BBWs. I love when they get all out of breath, huffing and puffing, and it’s like damn, you’re so hungry you tried to blow a house down? Hot as fuck.

Damage@feddit.it · 2 months ago

Big Boned Walrus?

Jankatarch@lemmy.world · 2 months ago

Big, broke, worker?

Kaligalis@lemmy.world · 2 months ago

It might not be as impossible as it sounds. Some of the “open” models are rumored to be able to code. The real problem is that you likely need something with 128 GiB VRAM to run them with a reasonably large context window.

mindbleach@sh.itjust.works · 2 months ago

Qwen’s 27B model from April outperforms its 397B model from February.

Local and small were always going to win.

Diurnambule@jlai.lu · 2 months ago

Qwen 3.6? It is unstable though. It goes awry more often than the 3.5 of the same size.

elgordino@fedia.io · 2 months ago

If you still want to run a huge model, DeepSeek R4 is about 3% of the cost of Opus and about 95% as good.

SleepyPie@lemmy.world · 2 months ago

Came here to comment this. It’s wild to me when people immediately write this off.

SparroHawc@lemmy.zip · edit-2 2 months ago

Unless you’re running it locally, DeepSeek is kinda risky considering where you’d be sending all your requests. Corporate espionage is big business in China.

edit: Okay, it’s an open-source model, it wouldn’t be hard to find a more corporate-friendly inferencing service.

AnUnusualRelic@lemmy.world · 2 months ago

Sure, just write one in Perl over the weekend.

lemongarlic@lemmy.world · 2 months ago

I mean you can self host your own local models, a decent computer can run Qwen and at scale it would be much cheaper per token to run servers serving a 120B ish model or even something like Deepseek than it would cost to pay Anthropic for Claude. It wouldn’t be quite as good as Claude but it’s probably good enough for most things tbh

psycho_driver@lemmy.world · 2 months ago

Your manager is asking your team to build an LLM like Claude so he can fire all of you.

bitjunkie@lemmy.world · 2 months ago

Show him how much capital can be burnt on that. 🤷‍♂️

fox2263@lemmy.world · 2 months ago

Yeah sure I just need £20-30k in hardware

VeryFrugal@sh.itjust.works · 2 months ago

Local models!