Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse

NodeBB

  1. Home
  2. Selfhosted
  3. What is a self-hosted small LLM actually good for (<= 3B)

What is a self-hosted small LLM actually good for (<= 3B)

Scheduled Pinned Locked Moved Selfhosted
selfhosted
76 Posts 34 Posters 22 Views
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • C [email protected]

    I've tried coding and every one I've tried fails unless really, really basic small functions like what you learn as a newbie compared to say 4o mini that can spit out more sensible stuff that works.

    I've tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

    So. what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can't really handle anything above 4B in a timely manner. 8B is about 1 t/s!

    S This user is from outside of this forum
    S This user is from outside of this forum
    [email protected]
    wrote last edited by
    #21

    I installed Llama. I've not found any use for it. I mean, I've asked it for a recipe because recipe websites suck, but that's about it.

    G 1 Reply Last reply
    18
    • E [email protected]

      I've integrated mine into Home Assistant, which makes it easier to use their voice commands.

      I haven't done a ton with it yet besides set it up, though, since I'm still getting proxmox configured on my gaming rig.

      P This user is from outside of this forum
      P This user is from outside of this forum
      [email protected]
      wrote last edited by
      #22

      What are you using for voice integration? I really don't want to buy and assemble their solution if I don't have to

      E 1 Reply Last reply
      2
      • C [email protected]

        I've tried coding and every one I've tried fails unless really, really basic small functions like what you learn as a newbie compared to say 4o mini that can spit out more sensible stuff that works.

        I've tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

        So. what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can't really handle anything above 4B in a timely manner. 8B is about 1 t/s!

        R This user is from outside of this forum
        R This user is from outside of this forum
        [email protected]
        wrote last edited by
        #23

        I've run a few models that I could on my GPU. I don't think the smaller models are really good enough. They can do stuff, sure, but to get anything out of it, I think you need the larger models.

        They can be used for basic things, though. There are coder specific models you can look at. Deepseek and qwen coder are some popular ones

        S C 2 Replies Last reply
        3
        • shnizmuffin@lemmy.inbutts.lolS [email protected]

          Hey, you're treating that data with the respect it demands, right? And you definitely collected consent from those chat participants before you Hoover'd up their [re-reads example] extremely Personal Identification Information AND Personal Health Information, right? Because if you didn't, you're in violation of a bunch of laws and the Twitch TOS.

          C This user is from outside of this forum
          C This user is from outside of this forum
          [email protected]
          wrote last edited by [email protected]
          #24

          If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing. If you wanna spew your personal life on Twitch, there are bots that listen to all of the channels everywhere on twitch. They aren't violating any laws, or Twitch TOS. So, *buzzer* WRONG.

          Right now, the same thing is being done to you on Lemmy. And Reddit. And Facebook. And everywhere else.

          Look at a bot called "FrostyTools" for Twitch. Reads Twitch chat, Uses an AI to provide summaries of chat every 30 minutes or so. If that's not violating TOS, then neither am I. And thousands upon thousands of people use FrostyTools.

          I have the consent of the streamer, I have the consent of Twitch (through their developer API), and upon using Twitch, you give the right to them to collect, distribute, and use that data at their whim.

          A C shnizmuffin@lemmy.inbutts.lolS 3 Replies Last reply
          5
          • C [email protected]

            Surely none of that uses a small LLM <= 3B?

            C This user is from outside of this forum
            C This user is from outside of this forum
            [email protected]
            wrote last edited by [email protected]
            #25

            Yes. The small LLM isn't retrieving data, it's just understanding context of text enough to know what "Facts" need to be written to a file. I'm using the publicly released Deepseek models from a couple of months ago.

            C 1 Reply Last reply
            1
            • S [email protected]

              I installed Llama. I've not found any use for it. I mean, I've asked it for a recipe because recipe websites suck, but that's about it.

              G This user is from outside of this forum
              G This user is from outside of this forum
              [email protected]
              wrote last edited by
              #26

              you can do a lot with it.

              I heated my office with it this past winter.

              1 Reply Last reply
              44
              • R [email protected]

                I've run a few models that I could on my GPU. I don't think the smaller models are really good enough. They can do stuff, sure, but to get anything out of it, I think you need the larger models.

                They can be used for basic things, though. There are coder specific models you can look at. Deepseek and qwen coder are some popular ones

                S This user is from outside of this forum
                S This user is from outside of this forum
                [email protected]
                wrote last edited by
                #27

                Been coming to similar conclusions with some local adventures. It's decent but not as able to process larger contexts.

                1 Reply Last reply
                0
                • P [email protected]

                  What are you using for voice integration? I really don't want to buy and assemble their solution if I don't have to

                  E This user is from outside of this forum
                  E This user is from outside of this forum
                  [email protected]
                  wrote last edited by
                  #28

                  I just use the companion app for now. But I am designing a HAL9000 system for my home.

                  shnizmuffin@lemmy.inbutts.lolS 1 Reply Last reply
                  2
                  • M [email protected]

                    RAG is basically like telling an LLM "look here for more info before you answer" so it can check out local documents to give an answer that is more relevant to you.

                    You just search "open web ui rag" and find plenty kf explanations and tutorials

                    I This user is from outside of this forum
                    I This user is from outside of this forum
                    [email protected]
                    wrote last edited by [email protected]
                    #29

                    I think RAG will be surpassed by LLMs in a loop with tool calling (aka agents), with search being one of the tools.

                    I 1 Reply Last reply
                    3
                    • C [email protected]

                      I've tried coding and every one I've tried fails unless really, really basic small functions like what you learn as a newbie compared to say 4o mini that can spit out more sensible stuff that works.

                      I've tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

                      So. what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can't really handle anything above 4B in a timely manner. 8B is about 1 t/s!

                      ikidd@lemmy.worldI This user is from outside of this forum
                      ikidd@lemmy.worldI This user is from outside of this forum
                      [email protected]
                      wrote last edited by
                      #30

                      It'll work for quick bash scripts and one-off things like that. But there's not usually enough context window unless you're using a 24G GPU or such.

                      S C 2 Replies Last reply
                      10
                      • I [email protected]

                        I think RAG will be surpassed by LLMs in a loop with tool calling (aka agents), with search being one of the tools.

                        I This user is from outside of this forum
                        I This user is from outside of this forum
                        [email protected]
                        wrote last edited by
                        #31

                        LLMs that train LoRas on the fly then query themselves with the LoRa applied

                        1 Reply Last reply
                        4
                        • C [email protected]

                          Most US states are single party consent. https://recordinglaw.com/united-states-recording-laws/one-party-consent-states/

                          I This user is from outside of this forum
                          I This user is from outside of this forum
                          [email protected]
                          wrote last edited by
                          #32

                          There is no expectation of privacy in public spaces. Participants to these streams which are open to all do not have a prohibition on repeating what they have heard.

                          C K 2 Replies Last reply
                          1
                          • C [email protected]

                            If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing. If you wanna spew your personal life on Twitch, there are bots that listen to all of the channels everywhere on twitch. They aren't violating any laws, or Twitch TOS. So, *buzzer* WRONG.

                            Right now, the same thing is being done to you on Lemmy. And Reddit. And Facebook. And everywhere else.

                            Look at a bot called "FrostyTools" for Twitch. Reads Twitch chat, Uses an AI to provide summaries of chat every 30 minutes or so. If that's not violating TOS, then neither am I. And thousands upon thousands of people use FrostyTools.

                            I have the consent of the streamer, I have the consent of Twitch (through their developer API), and upon using Twitch, you give the right to them to collect, distribute, and use that data at their whim.

                            A This user is from outside of this forum
                            A This user is from outside of this forum
                            [email protected]
                            wrote last edited by
                            #33

                            So, buzzer WRONG.

                            Quite arrogant after you just constructed a faulty comparison.

                            If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing.

                            That's absolutely not the same thing. Overhearing something that is in the background is fundamentally different from actively recording everything going on in a public space. You film yourself or some performance in a park and someone happens to be in the background? No problem. You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction. The intent of the recording(s) and the reasonable expectations of the people recorded are factored in in many jurisdictions, and being in public doesn't automatically entail consent to being recorded.

                            See for example https://www.freedomforum.org/recording-in-public/

                            (And just to clarify: I am not arguing against your explanation of Twitch's TOS, only against the bad comparison you brought.)

                            C K 2 Replies Last reply
                            9
                            • ikidd@lemmy.worldI [email protected]

                              It'll work for quick bash scripts and one-off things like that. But there's not usually enough context window unless you're using a 24G GPU or such.

                              S This user is from outside of this forum
                              S This user is from outside of this forum
                              [email protected]
                              wrote last edited by
                              #34

                              Snippets are a great use.

                              I use StableCode on my phone as a programming tutor for learning Python. It is outstanding in both speed and in accuracy for this task. I have it generate definitions which I copy and paste into Anki the flashcard app. Whenever I'm on a bus or airplane I just start studying. Wish that it could also quiz me interactively.

                              C 1 Reply Last reply
                              2
                              • C [email protected]

                                I've tried coding and every one I've tried fails unless really, really basic small functions like what you learn as a newbie compared to say 4o mini that can spit out more sensible stuff that works.

                                I've tried explanations and they just regurgitate sentences that can be irrelevant, wrong, or get stuck in a loop.

                                So. what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can't really handle anything above 4B in a timely manner. 8B is about 1 t/s!

                                swelter_spark@reddthat.comS This user is from outside of this forum
                                swelter_spark@reddthat.comS This user is from outside of this forum
                                [email protected]
                                wrote last edited by
                                #35

                                7b is the smallest I've found useful. I'd try a smaller quant before going lower, if I had super small vram.

                                1 Reply Last reply
                                7
                                • C [email protected]

                                  If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing. If you wanna spew your personal life on Twitch, there are bots that listen to all of the channels everywhere on twitch. They aren't violating any laws, or Twitch TOS. So, *buzzer* WRONG.

                                  Right now, the same thing is being done to you on Lemmy. And Reddit. And Facebook. And everywhere else.

                                  Look at a bot called "FrostyTools" for Twitch. Reads Twitch chat, Uses an AI to provide summaries of chat every 30 minutes or so. If that's not violating TOS, then neither am I. And thousands upon thousands of people use FrostyTools.

                                  I have the consent of the streamer, I have the consent of Twitch (through their developer API), and upon using Twitch, you give the right to them to collect, distribute, and use that data at their whim.

                                  C This user is from outside of this forum
                                  C This user is from outside of this forum
                                  [email protected]
                                  wrote last edited by
                                  #36

                                  Doesn't Twitch own all data that is written and their TOS will state something like you can't store data yourself locally.

                                  C 1 Reply Last reply
                                  2
                                  • I [email protected]

                                    There is no expectation of privacy in public spaces. Participants to these streams which are open to all do not have a prohibition on repeating what they have heard.

                                    C This user is from outside of this forum
                                    C This user is from outside of this forum
                                    [email protected]
                                    wrote last edited by
                                    #37

                                    Right and what I was saying was even if it wasnt “public”, single party consent means the person recording can be that single party- so still a non-issue.

                                    1 Reply Last reply
                                    0
                                    • C [email protected]

                                      Doesn't Twitch own all data that is written and their TOS will state something like you can't store data yourself locally.

                                      C This user is from outside of this forum
                                      C This user is from outside of this forum
                                      [email protected]
                                      wrote last edited by [email protected]
                                      #38

                                      I'm not storing their data. I'm feeding it to an LLM which infers things and storing that data. Other Twitch bots store twitch data too. Everything from birthdays to imaginary internet points.

                                      C 2 Replies Last reply
                                      2
                                      • A [email protected]

                                        So, buzzer WRONG.

                                        Quite arrogant after you just constructed a faulty comparison.

                                        If I say my name is Doo doo head, in a public park, and someone happens to overhear it - they can do with that information whatever they want. Same thing.

                                        That's absolutely not the same thing. Overhearing something that is in the background is fundamentally different from actively recording everything going on in a public space. You film yourself or some performance in a park and someone happens to be in the background? No problem. You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction. The intent of the recording(s) and the reasonable expectations of the people recorded are factored in in many jurisdictions, and being in public doesn't automatically entail consent to being recorded.

                                        See for example https://www.freedomforum.org/recording-in-public/

                                        (And just to clarify: I am not arguing against your explanation of Twitch's TOS, only against the bad comparison you brought.)

                                        C This user is from outside of this forum
                                        C This user is from outside of this forum
                                        [email protected]
                                        wrote last edited by [email protected]
                                        #39

                                        You build a system to identify everyone in the park and collect recordings of their conversations? Absolutely a problem, depending on the jurisdiction.

                                        Literally not. The police use this right now to record your location and time seen using license plates all over the nation - with private corporations providing the service.

                                        and being in public doesn't automatically entail consent to being recorded.

                                        And yes, it's called 'expectation to the right of privacy'. Public venues are not 'private' locations, and thus do not need consent. You can, quite literally, record anyone in public.

                                        Even the link you provided agrees.

                                        T 1 Reply Last reply
                                        1
                                        • M [email protected]

                                          I've used smollm2:135m for projects in DBeaver building larger queries. The box it runs on is Intel HD graphics with an old Ryzen processor. Doesn't seem to really stress the CPU.

                                          UPDATE: I apologize to the downvoter for not masochistically wanting to build a 1000 line bulk insert statement by hand.

                                          H This user is from outside of this forum
                                          H This user is from outside of this forum
                                          [email protected]
                                          wrote last edited by
                                          #40

                                          How, exactly, do you have Intel HD graphics, found on Intel APUs, on a Ryzen AMD system?

                                          1 Reply Last reply
                                          2
                                          Reply
                                          • Reply as topic
                                          Log in to reply
                                          • Oldest to Newest
                                          • Newest to Oldest
                                          • Most Votes


                                          • Login

                                          • Login or register to search.
                                          Powered by NodeBB Contributors
                                          • First post
                                            Last post
                                          0
                                          • Categories
                                          • Recent
                                          • Tags
                                          • Popular
                                          • World
                                          • Users
                                          • Groups