Avieshek@lemmy.world to Technology@lemmy.world · English · 19 days ago
Edward Snowden slams Nvidia's RTX 50-series 'F-tier value,' whistleblows on lackluster VRAM capacity (www.tomshardware.com) · 97 comments

The Hobbyist@lemmy.zip · English · 18 days ago
You can. I’m running a 14B deepseek model on mine. It achieves 28 t/s.

Jeena@piefed.jeena.net · English · 18 days ago
Oh nice, that’s faster than I imagined.

levzzz@lemmy.world · English · 18 days ago
You need a pretty large context window to fit all the reasoning. Ollama forces 2048 tokens by default, and a larger window uses more memory.
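
For anyone wanting to try the larger context window mentioned above, here is a rough sketch of one way to do it through Ollama's local REST API, by passing `num_ctx` in the request `options`. The model tag `deepseek-r1:14b`, the 8192 value, and the helper names are assumptions for illustration, not something stated in the thread. The `kv_cache_bytes` helper is a generic back-of-the-envelope estimate of why a bigger window costs more memory; the layer and head counts you pass in are placeholders, so check the real values for whatever model you run.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str, num_ctx: int) -> dict:
    """Build a non-streaming generate request with an explicit context size."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # num_ctx overrides the default context window for this request.
        "options": {"num_ctx": num_ctx},
    }


def generate(payload: dict, url: str = OLLAMA_URL) -> str:
    """POST the payload to Ollama and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   num_ctx: int, bytes_per_value: int = 2) -> int:
    """Rough KV-cache size: keys and values for every layer and token.

    All architecture numbers are inputs; none are taken from a real model here.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * num_ctx


if __name__ == "__main__":
    # Hypothetical model tag and context size; needs a running Ollama server.
    payload = build_payload("deepseek-r1:14b", "Why is the sky blue?", 8192)
    print(generate(payload))
```

Under this estimate the cache grows linearly with `num_ctx`, so going from 2048 to 8192 tokens roughly quadruples that part of the memory use, on top of the model weights.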