I recently discovered a project that is similar to GPT4All, except you can throw multiple GPUs at the workload (and yes, it can use Vulkan): https://github.com/LostRuins/koboldcpp
I haven’t messed with it much, but it builds and works fine on Linux. The only thing I don’t like is that the source tree has a bunch of Windows binaries in it.
I don’t use Gentoo, but I still frequent the Gentoo Wiki and pick apart packages because it’s such a great resource for OpenRC.