We will use Grok 3.5 (maybe we should call it 4), which has advanced reasoning, to rewrite the entire corpus of human knowledge, adding missing information and deleting errors.
Then retrain on that.
Far too much garbage in any foundation model trained on uncorrected data.
You literally called it borderline magic.
Don’t do that? They’re pattern-recognition engines. They can produce some neat results, are good for niche tasks, and are interesting as toys, but they really aren’t that impressive. This “borderline magic” line is why companies are trying to shove these chatbots into literally everything, even though they aren’t good at most tasks.
It’s clear you don’t really understand the wider context and how historically hard these tasks have been. I’ve been doing this for a decade, and the fact that these foundation models can be pretrained on unrelated things and then jump that generalization gap so easily (within reason) is amazing. You just see the end result of corporate uses in the news, but this technology is used in every aspect of science and life in general (source: I do this for many important applications).