Honestly, good for them. US tech CEOs deserve to have their lunch eaten for ducking the industry into stagnation with their short-sighted greed.
In one story they’re using PTX on Nvidia H800s. In another they’re on Huawei chips.
Which is it? Are we all just hypothesising?
An unknown quantization of R1 is running on the 3rd iteration of outdated 7nm hardware taken from Sophgo’s work with TSMC last year?
Is this meant to be impressive or alarming? Because I’m neither.
I’m not going to parse this shit article. What does interference mean here? Please and thank you.
That’s a very toxic attitude.
Inference is, in principle, the process of generating the AI's response. So when you run an LLM locally, you are using your GPU only for inference.
Yeah, I misread because I’m stupid. Thanks for replying, non-toxic man.
Training: Creating the model
Inference: Using the model

Inference? It's the actual running of the AI when you use it, as opposed to training.
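The split can be sketched with a deliberately tiny toy model (plain Python, nothing to do with DeepSeek's actual stack): "training" adjusts a weight to fit data, "inference" just runs the already-fitted weight on new input.

```python
# Toy illustration of training vs inference (hypothetical numbers,
# a single-weight linear model, not a real LLM).

def train(data, lr=0.01, steps=500):
    """Training: repeatedly adjust the model's weight to fit the data."""
    w = 0.0
    for _ in range(steps):
        for x, y in data:
            pred = w * x
            grad = 2 * (pred - y) * x  # gradient of squared error
            w -= lr * grad
    return w  # the "model" is just this learned weight

def infer(w, x):
    """Inference: run the already-trained model on a new input."""
    return w * x

model = train([(1, 2), (2, 4), (3, 6)])  # learns roughly y = 2x
print(infer(model, 10))                  # close to 20
```

Training is the expensive, one-time part; inference is what your GPU does every time you chat with the model locally.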
Sorry. I forgot to mention that I’m dumb.