Managers

inari@piefed.zip · 4 days ago

Managers

Kazumara@discuss.tchncs.de · 4 days ago

> Inference is cheap and efficient.

Tell that to all the GitHub users who are screaming about the new token-based billing. In reality, inference on these massive models with big context windows is expensive, but it was subsidized so heavily that nobody has an accurate feel for the cost.

theunknownmuncher@lemmy.world · 3 days ago

No, it is cheap and efficient. It is relative, and the comparison is to model training. But yeah, it's not free.

Kazumara@discuss.tchncs.de · edit-2 3 days ago

Sure, it’s much, much cheaper than training, but importantly those companies are not recouping anything on inference, because it still costs them more than what they sell it for.

They are double bankrupting themselves.

At work we run inference for a research project with an open-weights model in the public cloud that another part of my company provides, and we pay around $25 a day for a VM with a single L40S. It’s both slow - despite not even serving concurrent users - and kind of bad in its outputs.
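To put a rough number on that, here is a back-of-the-envelope sketch that turns a flat daily VM price into a cost per million generated tokens. Only the $25 a day comes from the setup above; the throughput of 20 tokens per second and the utilization values are purely hypothetical placeholders, so plug in your own measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def cost_per_million_tokens(
    daily_cost_usd: float,
    tokens_per_second: float,
    utilization: float = 1.0,
) -> float:
    """Cost in USD per one million generated tokens on a flat-rate VM.

    utilization is the fraction of the day the GPU is actually generating
    (1.0 means it never idles, which is an optimistic assumption).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_day = tokens_per_second * utilization * SECONDS_PER_DAY
    return daily_cost_usd / tokens_per_day * 1_000_000


# $25 a day is from the post; 20 tokens/s is an ASSUMED throughput.
always_busy = cost_per_million_tokens(25.0, 20.0)
# Same VM, but ASSUMED to be generating only 10% of the time.
mostly_idle = cost_per_million_tokens(25.0, 20.0, utilization=0.1)

print(f"fully utilized: ${always_busy:.2f} per 1M tokens")
print(f"10% utilized:   ${mostly_idle:.2f} per 1M tokens")
```

The point of the utilization parameter is that a rented VM bills for idle hours too, so a single-user setup that sits idle most of the day pays far more per token than the raw throughput suggests.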

Edit: Interference -> Inference, arguing on the internet after waking up first thing in the morning might not have been the best idea