What can I use for an offline, self-hosted LLM client, preferably with images, charts, and Python code execution?
-
I was looking back at some old Lemmy posts and came across GPT4All. Didn't get much sleep last night as it's awesome, even on my old (10-year-old) laptop with a Compute 5.0 Nvidia card.
Still, I'm after more. I'd like image generation that I can view in the conversation, and if it generates Python code, I'd like to be able to run it (I'm using Debian and have a default Python env set up). Local file analysis would also be useful. CUDA Compute 5.0 / Vulkan compatibility is needed too, with the option to use some of the smaller models (1-3B, for example). A local API would also be nice for my own Python experiments.
Is there anything that can tick the boxes, even if I have to scoot across models for some of the features? I'd prefer more of a desktop client application than a Docker container running in the background.
Maybe LocalAI? It doesn't do python code execution, but pretty much all of the rest.
-
I was looking back at some old Lemmy posts and came across GPT4All. Didn't get much sleep last night as it's awesome, even on my old (10-year-old) laptop with a Compute 5.0 Nvidia card.
Still, I'm after more. I'd like image generation that I can view in the conversation, and if it generates Python code, I'd like to be able to run it (I'm using Debian and have a default Python env set up). Local file analysis would also be useful. CUDA Compute 5.0 / Vulkan compatibility is needed too, with the option to use some of the smaller models (1-3B, for example). A local API would also be nice for my own Python experiments.
Is there anything that can tick the boxes, even if I have to scoot across models for some of the features? I'd prefer more of a desktop client application than a Docker container running in the background.
You can tell Open Interpreter to run commands based on your human-language input. If you want a local-only LLM, you can pair it with Ollama. It works for "interactive" use, where you're asked for confirmation before a command is run.
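If you want to try that pairing, here's a minimal sketch assuming the open-interpreter Python package is installed and you've already pulled a model with Ollama (the model name is just an example, and attribute names can shift between versions):

```python
from interpreter import interpreter

# Stay fully offline and point Open Interpreter at a local Ollama model.
interpreter.offline = True
interpreter.llm.model = "ollama/llama3.2"  # example; any model pulled with `ollama pull`
interpreter.auto_run = False               # ask for confirmation before running code

# Interactive session: generated code only runs after you approve it.
interpreter.chat("Plot a sine wave with matplotlib and save it to sine.png")
```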
I set this up in a VM because I wanted a fully automatic coding "agent" that can run commands without my intervention, and I did not want it to blow up my main system. It did not really work though, because as far as I know Open Interpreter has no way to "pipe" a command's output back into the LLM, so it can't create a feedback loop with linters and such.
Another issue was that StarCoder2, the only LLM trained on permissively licensed code that I could find, only has a 15B "human-like" model. The smaller models only speak code, so I don't know how that would work for agentic usage, and the 15B is really slow running on a CPU with DDR4 RAM. I think agents are cool though, so I would like to try Aider, which is a supposedly good open-source agent and, unlike Open Interpreter, is not abandonware.
Thanks for coming to my blabbering talk; hope this might be useful for someone.
-
I've discovered jan.ai, which is far faster than GPT4All and visually a little nicer.
EDIT: After using it for an hour or so, it seems to crash all the time. I keep having to reset it, and right now it's freezing for no reason.
-
Try the beta on the GitHub repo, and use a smaller model!
-
Maybe LocalAI? It doesn't do python code execution, but pretty much all of the rest.
This looks interesting - do you have experience of it? How reliable / efficient is it?
-
I was looking back at some old Lemmy posts and came across GPT4All. Didn't get much sleep last night as it's awesome, even on my old (10-year-old) laptop with a Compute 5.0 Nvidia card.
Still, I'm after more. I'd like image generation that I can view in the conversation, and if it generates Python code, I'd like to be able to run it (I'm using Debian and have a default Python env set up). Local file analysis would also be useful. CUDA Compute 5.0 / Vulkan compatibility is needed too, with the option to use some of the smaller models (1-3B, for example). A local API would also be nice for my own Python experiments.
Is there anything that can tick the boxes, even if I have to scoot across models for some of the features? I'd prefer more of a desktop client application than a Docker container running in the background.
You should try https://cherry-ai.com/ . It's the most advanced client out there. I personally use Ollama for running the models and the Mistral API for advanced tasks.
-
You should try https://cherry-ai.com/ . It's the most advanced client out there. I personally use Ollama for running the models and the Mistral API for advanced tasks.
It's fully open source and free (as in beer).
-
This looks interesting - do you have experience of it? How reliable / efficient is it?
LocalAI is pretty good but resource-intensive. I ran it on a VPS in the past.
-
Ollama for the API, which you can integrate into Open WebUI. You can also integrate image generation with ComfyUI, I believe.
It's less of a hassle to use Docker for Open WebUI, but Ollama works as a regular CLI tool.
This is what I do; it's excellent.
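On the "local API for my own Python experiments" part of the original question: Ollama serves an HTTP API on localhost:11434 by default, which you can call from any script. A minimal sketch (the model name is just an example and assumes you've already pulled it):

```python
import requests

# Ollama listens on port 11434 by default; /api/generate returns a single
# JSON object when streaming is disabled.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2:1b",  # example; any model pulled with `ollama pull`
        "prompt": "Explain CUDA compute capability in one sentence.",
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])
```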
-
You should try https://cherry-ai.com/ . It's the most advanced client out there. I personally use Ollama for running the models and the Mistral API for advanced tasks.
But its website is in Chinese. Also, what's the GitHub?
-
Ollama for the API, which you can integrate into Open WebUI. You can also integrate image generation with ComfyUI, I believe.
It's less of a hassle to use Docker for Open WebUI, but Ollama works as a regular CLI tool.
But won't this be a mish-mash of different Docker containers and projects, creating an installation, dependency, and upgrade nightmare?
-
But won't this be a mish-mash of different docker containers and projects creating an installation, dependency, upgrade nightmare?
All the ones I mentioned can be installed with pip or uv, if I am not mistaken. It would probably be more finicky than containers that you can put behind a reverse proxy, but it is possible if you wish to go that route. Ollama also runs system-wide, so any project can use its API without you having to create a separate environment and download the same model twice in order to use it.
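As a sketch of that last point (assuming the `ollama` Python package is installed and a small model has already been pulled once): any script on the machine talks to the same system-wide service, so nothing has to be downloaded per project.

```python
import ollama  # pip install ollama

# This script and, say, Open WebUI both use the same system-wide Ollama
# service, so the model only exists once on disk.
reply = ollama.chat(
    model="llama3.2:1b",  # example model, pulled once with `ollama pull llama3.2:1b`
    messages=[{"role": "user", "content": "What are 1-3B parameter models good for?"}],
)
print(reply["message"]["content"])
```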