NodeBB · Selfhosted

I've just created c/Ollama!

29 Posts · 12 Posters · 29 Views
  • C [email protected]

I've just rediscovered Ollama, and it's come a long way: it has reduced the very difficult task of locally hosting your own LLM (and getting it running on a GPU) to simply installing a deb. It also works on Windows and Mac, so it can help everyone.

    I'd like to see Lemmy become useful for specific technical sub-branches, instead of everyone hunting for the best existing community, which is subjective and makes information difficult to find. So I created [email protected] for everyone to discuss, ask questions, and help each other out with Ollama!

    So please: join, subscribe, and feel free to post questions, tips, and projects, and help out where you can!

    Thanks!
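A minimal sketch of what getting started looks like on Linux, going by Ollama's published instructions (the model name here is just an example, and you should always inspect install scripts before running them):

```shell
# Download and run the official install script (review it first!)
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model and chat with it in the terminal
ollama run llama3.2

# Ollama also listens on localhost:11434 with a simple HTTP API
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Why is the sky blue?", "stream": false}'
```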

• [email protected] (#2) wrote:

Cool! I'll subscribe. I've got about a dozen projects I'd like to build with Ollama; whether I'll find the motivation and free time, who knows?

• [email protected] (#3) wrote, in reply to [email protected]:

Start now! Install it, get a Python environment up and running if you haven't already, and get that first play-around project working that you can build outwards from!

• [email protected] (#4) wrote, in reply to [email protected]:

        Instance independent link: [email protected]

        Share links to communities this way, so everyone can subscribe easily.

        You should also post about this in [email protected] and [email protected] for better discoverability!

• [email protected] (#5) wrote, in reply to [email protected]:

Already set up! I think the first thing I want to do is set up retrieval-augmented generation (RAG); several of my hobby ideas will require it, I think. I started trying to read up on it a couple of days ago, but I had a serious lack of focus going on.

          I've been kind of hoping to come across a super simple way to implement it, but I haven't exactly looked much yet.
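At its core, RAG is just "retrieve the most relevant text, then add it to the prompt." Here is a toy, dependency-free sketch of the retrieval step using bag-of-words cosine similarity; a real setup would use an embedding model (e.g. via Ollama's embeddings endpoint) and a vector store, and all the names and documents here are purely illustrative:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Ollama serves local LLMs over an HTTP API",
    "RAG retrieves relevant documents and adds them to the prompt",
    "The cat sat on the mat",
]

def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query and return the top k."""
    ranked = sorted(docs, key=lambda d: cosine(embed(query), embed(d)), reverse=True)
    return ranked[:k]

# The retrieved passage becomes context for the LLM prompt.
context = retrieve("how does RAG use documents?", documents)[0]
prompt = f"Context: {context}\n\nQuestion: how does RAG use documents?"
print(context)
```

In practice the only moving parts that change are the embedding function and the store; the retrieve-then-prompt shape stays the same.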

• [email protected] (#6) wrote, in reply to [email protected]:

Thanks, will do all that!

• [email protected] (#7) wrote, in reply to [email protected]:

              Sounds like a great first question! Go for it!

• [email protected] (#8) wrote, in reply to [email protected]:

                TBH you should fold this into localllama? Or open source AI?

                I have very mixed (mostly bad) feelings on ollama. In a nutshell, they're kinda Twitter attention grabbers that give zero credit/contribution to the underlying framework (llama.cpp). And that's just the tip of the iceberg, they've made lots of controversial moves, and it seems like they're headed for commercial enshittification.

                They're... slimy.

                They like to pretend they're the only way to run local LLMs and blot out any other discussion, which is why I feel kinda bad about a dedicated ollama community.

                It's also a highly suboptimal way for most people to run LLMs, especially if you're willing to tweak.

                I would always recommend Kobold.cpp, tabbyAPI, ik_llama.cpp, Aphrodite, LM Studio, the llama.cpp server, sglang, the AMD lemonade server, any number of backends over them. Literally anything but ollama.


...TL;DR: I don't like the idea of focusing on Ollama at the expense of other backends. Running LLMs locally should be the community, not Ollama specifically.

• [email protected] (#9) wrote, in reply to [email protected]:

What would you recommend to hook up to my Home Assistant?

• [email protected] (#10) wrote, in reply to [email protected]:

                    Totally depends on your hardware, and what you tend to ask it. What are you running? What do you use it for? Do you prefer speed over accuracy?

• [email protected] (#11) wrote, in reply to [email protected]:

                      There is also [email protected] 🙂

Crossposting between the communities can help grow both.

• [email protected] (#12) wrote, in reply to [email protected]:

                        While I don't think that llama.cpp is specifically a special risk, I think that running generative AI software in a container is probably a good idea. It's a rapidly-moving field with a lot of people contributing a lot of code that very quickly gets run on a lot of systems by a lot of people. There's been malware that's shown up in extensions for (for example) ComfyUI. And the software really doesn't need to poke around at outside data.

                        Also, because the software has to touch the GPU, it needs a certain amount of outside access. Containerizing that takes some extra effort.

                        https://old.reddit.com/r/comfyui/comments/1hjnf8s/psa_please_secure_your_comfyui_instance/

ComfyUI users have been hit time and time again with malware from custom nodes or their dependencies. If you're just using the vanilla nodes, or nodes you've personally developed or vet yourself every update, then you're fine. But you're probably using custom nodes. They're the great thing about ComfyUI, but also its great security weakness.

                        Half a year ago the LLMVISION node was found to contain an info stealer. Just this month the ultralytics library, used in custom nodes like the Impact nodes, was compromised, and a cryptominer was shipped to thousands of users.

                        Granted, the developers have been doing their best to help all involved by spreading awareness of the malware and by setting up an automated scanner to inform users if they've been affected, but what's better than knowing how to get rid of the malware is not getting the malware at all.

                        Why Containerization is a solution

                        So what can you do to secure ComfyUI, which has a main selling point of being able to use nodes with arbitrary code in them? I propose a band-aid solution that, I think, isn't horribly difficult to implement that significantly reduces your attack surface for malicious nodes or their dependencies: containerization.

                        Ollama means sticking llama.cpp in a Docker container, and that is, I think, a positive thing.

                        If there were a close analog to ollama, like some software package that could take a given LLM model and run in podman or Docker or something, I think that that'd be great. But I think that putting the software in a container is probably a good move relative to running it uncontainerized.
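On the containerization point: llama.cpp itself publishes server container images, so running it contained can look roughly like the following. This is a sketch, not a recipe — the image tag, flags, and model path are assumptions to adapt, and GPU passthrough assumes the NVIDIA container toolkit is installed:

```shell
# Run the llama.cpp HTTP server in a container, with a host model directory
# mounted read-only so the container can't modify your models.
docker run --rm --gpus all -p 8080:8080 \
  -v "$HOME/models:/models:ro" \
  ghcr.io/ggml-org/llama.cpp:server-cuda \
  -m /models/your-model.gguf --host 0.0.0.0 --port 8080 -c 4096
```

Dropping `--gpus all` and using the plain `:server` tag gives a CPU-only variant with an even smaller attack surface.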

• [email protected] (#13) wrote, in reply to [email protected]:

                          I’m going to go out on a limb and say they probably just want a comparable solution to Ollama.

• [email protected] (#14) wrote, in reply to [email protected]:

                            OK.

                            Then LM Studio. With Qwen3 30B IQ4_XS, low temperature MinP sampling.

That’s what I’m trying to say, though: there is no one-click solution; that’s kind of a lie. LLMs work a bajillion times better with just a little personal configuration. They are not magic boxes; they are specialized tools.

                            Random example: on a Mac? Grab an MLX distillation, it’ll be way faster and better.

Nvidia gaming PC? TabbyAPI with an exl3. Small GPU laptop? ik_llama.cpp. APU? Lemonade. Raspberry Pi? That’s important to know!

                            What do you ask it to do? Set timers? Look at pictures? Cooking recipes? Search the web? Look at documents? Do you need stuff faster or accurate?

This is one reason why Ollama is so suboptimal, the other being just bad defaults (Q4_0 quants, 2048 context, no imatrix or anything outside GGUF, bad sampling last I checked, chat template errors, bugs with certain models; I can go on). A lot of people just try “ollama run,” I guess, then assume local LLMs are bad when it doesn’t work right.
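For anyone wondering what “MinP sampling” means above: it keeps only tokens whose probability is at least some fraction of the most likely token’s probability, then renormalizes and samples from what’s left. A toy illustration of the idea (not any particular backend’s implementation):

```python
import random

def min_p_sample(probs, min_p=0.1, rng=random):
    """Toy min-p sampling: drop tokens whose probability is below
    min_p * max(probs), renormalize, then sample from the survivors."""
    top = max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= min_p * top}
    total = sum(kept.values())
    r = rng.random() * total
    acc = 0.0
    for tok, p in kept.items():
        acc += p
        if r <= acc:
            return tok
    return tok  # floating-point rounding fallback: return the last token

# Example next-token distribution: "zebra" (0.01 < 0.1 * 0.5) is filtered
# out before sampling, so only "the" or "a" can ever be chosen.
probs = {"the": 0.5, "a": 0.3, "zebra": 0.01}
print(min_p_sample(probs, min_p=0.1))
```

Unlike a fixed top-k cutoff, the threshold scales with the model’s confidence: when the top token is very likely, more of the tail is pruned.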

• [email protected] (#15) wrote, in reply to [email protected]:

                              I don’t understand.

                              Ollama is not actually docker, right? It’s running the same llama.cpp engine, it’s just embedded inside the wrapper app, not containerized. It has a docker preset you can use, yeah.

                              And basically every LLM project ships a docker container. I know for a fact llama.cpp, TabbyAPI, Aphrodite, Lemonade, vllm and sglang do. It’s basically standard. There’s all sorts of wrappers around them too.

                              You are 100% right about security though, in fact there’s a huge concern with compromised Python packages. This one almost got me: https://pytorch.org/blog/compromised-nightly-dependency/

This is actually a huge advantage for llama.cpp, as it’s free of Python and external dependencies by design. This is very unlike ComfyUI, which pulls in a gazillion external repos. Theoretically the main llama.cpp git could be compromised, but it’s a single, very well monitored point of failure, and literally every outside architecture and feature is implemented from scratch, making it harder to sneak stuff in.

• [email protected] (#16) wrote, in reply to [email protected]:

I'm sorry, you are correct. The syntax and interface mirror Docker's, and one can run Ollama in Docker, so I'd thought that it was a thin wrapper around Docker, but I just went to check, and you are right: it's not running in Docker by default. Sorry, folks! Guess now I've got one more thing to look into getting inside a container myself.

• [email protected] (#17) wrote, in reply to [email protected]:

Indeed, Ollama is going down a shady route.
                                  https://github.com/ggml-org/llama.cpp/pull/11016#issuecomment-2599740463

                                  I started playing with RamaLama (the name is a mouthful) and it works great. There are one or two extra steps in the setup, but I've achieved great performance, and the project makes good use of standards (OCI, Jinja, unmodified llama.cpp, from what I understand).

                                  Go and check it out, they are compatible with models from HF and Ollama too.

                                  https://github.com/containers/ramalama
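Going by the RamaLama README at the time of writing, basic usage looks roughly like this (the model name is just an example; check the repo for current syntax):

```shell
ramalama pull tinyllama    # fetch a model (Ollama and Hugging Face registries supported)
ramalama run tinyllama     # chat in the terminal; inference runs inside an OCI container
ramalama serve tinyllama   # expose the model over a local HTTP API
```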

• [email protected] (#18) wrote, in reply to [email protected]:

                                    Perhaps give Ramalama a try?

                                    https://github.com/containers/ramalama

• [email protected] (#19) wrote, in reply to [email protected]:

Try RamaLama; it's designed to run models inside OCI containers.

• [email protected] (#20) wrote, in reply to [email protected]:

I have an M2 MacBook Pro (Apple silicon) and would kind of like to replace Google's Gemini as my go-to LLM. I think I'd like to run something like Mistral, probably. Currently I do have Ollama and some version of Mistral running, but I almost never use it, as it's on my laptop, not my phone.

                                        I'm not big on LLMs, but if I can find one that runs locally and helps me get off of Google Search and Gemini, that could be awesome. Currently I use a combo of Firefox, Qwant, Google Search, and Gemini for my daily needs. I'm not big on the direction Firefox is headed, I've heard there are arguments against Qwant, and using Gemini feels like the wrong answer for my beliefs and opinions.

                                        I'm looking for something better without too much time sunk into something I may only sort of like. Tall order, I know, but I figured I'd give you as much info as I can.

• [email protected] (#21) wrote, in reply to [email protected]:

Honestly, Perplexity, the online service, is pretty good.

                                          As for local running, one question first: how much RAM does your Mac have? This is basically the factor for what model you can and should run.
