What is a self-hosted small LLM actually good for (<= 3B)

  • C [email protected]

    I've tried coding, and every model I've tried fails at anything beyond really basic small functions, the kind you learn as a newbie, whereas something like 4o mini can spit out more sensible stuff that works.

    I've tried explanations, and they just regurgitate sentences that can be irrelevant or wrong, or they get stuck in a loop.

    So, what can I actually use a small LLM for? Which ones? I ask because I have an old laptop and the GPU can't really handle anything above 4B in a timely manner. 8B is about 1 t/s!

    30p87@feddit.org3 This user is from outside of this forum
    [email protected]
    wrote last edited by
    #2

    Nothing.

      I This user is from outside of this forum
      [email protected]
      wrote last edited by
      #3

      Converting free text to standardized forms such as json
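
A minimal sketch of what that can look like in practice, assuming the model is reachable through some `generate(prompt) -> str` callable (hypothetical; wire it to whatever local runtime you use). The field names in `SCHEMA` are made up for illustration. Small models often wrap the JSON in chatter or drop keys, so the sketch validates every reply and retries instead of trusting the first answer.

```python
import json
from typing import Callable

# Hypothetical target schema: field name -> expected Python type.
SCHEMA = {"name": str, "date": str, "amount": float}


def build_prompt(text: str) -> str:
    """Ask for JSON only, naming the exact keys we expect."""
    keys = ", ".join(f'"{k}"' for k in SCHEMA)
    return (
        f"Extract the fields {keys} from the text below. "
        "Reply with a single JSON object and nothing else.\n\n"
        f"Text: {text}"
    )


def parse_reply(reply: str) -> dict:
    """Cut the outermost {...} span out of the reply and validate it."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in reply")
    data = json.loads(reply[start : end + 1])
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    for key, expected in SCHEMA.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        value = data[key]
        # Accept integers where a float is expected (but not booleans).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            data[key] = float(value)
        elif not isinstance(value, expected):
            raise ValueError(f"wrong type for key: {key}")
    return data


def extract(text: str, generate: Callable[[str], str], retries: int = 3) -> dict:
    """Call the model until a reply passes validation or retries run out."""
    prompt = build_prompt(text)
    last_error = None
    for _ in range(retries):
        try:
            return parse_reply(generate(prompt))
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            last_error = err
    raise RuntimeError(f"no valid JSON after {retries} attempts") from last_error
```

The validation step is the important part at this model size: it turns an unreliable generator into something a script can depend on, at the cost of a few extra calls.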

        H This user is from outside of this forum
        [email protected]
        wrote last edited by
        #4

        Sorry, I'm just going to dump some links from my bookmarks that were related and interesting to read, because I'm traveling and have to get up in a minute, but I've been interested in this topic for a while. All of the links discuss at least some use cases. For some reason Microsoft is really into tiny models and has made big breakthroughs there.

        https://reddit.com/r/LocalLLaMA/comments/1cdrw7p/what_are_the_potential_uses_of_small_less_than_3b/

        https://github.com/microsoft/BitNet

        https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/

        https://news.microsoft.com/source/features/ai/the-phi-3-small-language-models-with-big-potential/

        https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft’s-newest-small-language-model-specializing-in-comple/4357090

        • I [email protected]

          Converting free text to standardized forms such as json

          M This user is from outside of this forum
          [email protected]
          wrote last edited by
          #5

          Oh—do you happen to have any recommendations for that?

            E This user is from outside of this forum
            [email protected]
            wrote last edited by [email protected]
            #6

            I've integrated mine into Home Assistant, which makes it easier to use their voice commands.

            I haven't done a ton with it yet besides set it up, though, since I'm still getting proxmox configured on my gaming rig.

              H This user is from outside of this forum
              [email protected]
              wrote last edited by [email protected]
              #7

              I think that's a size where it's a bit more than a good autocomplete. Could be part of a chain for retrieval augmented generation. Maybe some specific tasks. And there are small machine learning models that can do translation or sentiment analysis, though I don't think those are your regular LLM chatbots... And well, you can ask basic questions and write dialogue. Something like "What is an Alpaca?" will work. But they don't have much knowledge under 8B parameters and they regularly struggle to apply their knowledge to a given task at smaller sizes. At least that's my experience. They've become way better at smaller sizes during the last year or so. But they're very limited.

              I'm not sure what you intend to do. If you have some specific thing you'd like an LLM to do, you need to pick the correct one. If you don't have any use-case... just run an arbitrary one and tinker around?

                I This user is from outside of this forum
                [email protected]
                wrote last edited by
                #8

                absolutely nothing

                  C This user is from outside of this forum
                  [email protected]
                  wrote last edited by [email protected]
                  #9

                  Currently I'm using local AI (a couple of different kinds) in two steps. The first takes the audio from a Twitch stream and converts it to text, so that I have context about the conversation. The second is an LLM that is fed the first AI's transcription plus Twitch chat and stores 'facts' about specific users, so that they can be referenced quickly by a streamer who has ADHD and wants to be more personable.

                  That way, the guy can ask User X how their mother's surgery went. Or he can remember that User K has a birthday coming up. Or remember that User G's son just got a PS5 for Christmas and wants a specific game.

                  It allows him to be more personable because he has issues remembering details about his users. It's still kind of a big alpha test at the moment, because we don't know the best way to display the 'data', but it functions as an aid.
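
The post does not say how the 'facts' are stored, so the following is only a sketch of one possible shape for that last stage, assuming the LLM step upstream already produces (user, fact) pairs. `Fact` and `FactStore` are hypothetical names. It keeps facts per user, skips repeats, and returns the newest ones first for a quick on-screen reminder.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Fact:
    text: str
    noted_at: datetime


@dataclass
class FactStore:
    # Maps a lowercased user name to the facts noted about that user.
    facts: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, user: str, text: str, when: Optional[datetime] = None) -> bool:
        """Store a fact unless the same text is already recorded for this user."""
        key = user.lower()
        normalized = " ".join(text.split())  # collapse stray whitespace
        if any(f.text.lower() == normalized.lower() for f in self.facts[key]):
            return False
        self.facts[key].append(Fact(normalized, when or datetime.now(timezone.utc)))
        return True

    def recent(self, user: str, limit: int = 3) -> list[str]:
        """Return up to `limit` facts for a user, newest first."""
        items = sorted(
            self.facts.get(user.lower(), []),
            key=lambda f: f.noted_at,
            reverse=True,
        )
        return [f.text for f in items[:limit]]
```

Keeping the store this dumb leaves the LLM with a single narrow job (extracting facts from text), which is the kind of task the thread suggests small models can handle.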

                    hadowenkiroast@piefed.socialH This user is from outside of this forum
                    [email protected]
                    wrote last edited by
                    #10

                    sounds like salesforce for a twitch setting. cool use case, must make fun moments when he mentions such things.

                      jlow@discuss.tchncs.deJ This user is from outside of this forum
                      [email protected]
                      wrote last edited by
                      #11

                      Esp. if the LLM just hallucinates 50% of the "facts" about the users 👌

                      • M [email protected]

                        Oh—do you happen to have any recommendations for that?

                        I This user is from outside of this forum
                        [email protected]
                        wrote last edited by
                        #12

                        DeepSeek-R1-Distill-Qwen-1.5B

                          M This user is from outside of this forum
                          [email protected]
                          wrote last edited by
                          #13

                          I've used smollm2:135m for projects in DBeaver building larger queries. The box it runs on is Intel HD graphics with an old Ryzen processor. Doesn't seem to really stress the CPU.

                          UPDATE: I apologize to the downvoter for not masochistically wanting to build a 1000 line bulk insert statement by hand.

                          • jlow@discuss.tchncs.deJ [email protected]

                            Esp. if the LLM just hallucinates 50% of the "facts" about the users 👌

                            C This user is from outside of this forum
                            [email protected]
                            wrote last edited by [email protected]
                            #14

                            That hasn't been a problem at all for the 200+ users it's tracking so far for about 4 months.

                            I don't know a human that could ever keep up with this kind of thing. People just think he's super personable, but in reality he's not. He's just got a really cool tool to use.

                            He's managed some really good numbers because being that personal with people brings them back and keeps them chatting. He'll be pushing for partner after streaming for only a year and he's just some guy I found playing Wild Hearts with 0 viewers one day... 😛

                            • C [email protected]

                              Currently I'm using local AI (a couple of different kinds) in two steps. The first takes the audio from a Twitch stream and converts it to text, so that I have context about the conversation. The second is an LLM that is fed the first AI's transcription plus Twitch chat and stores 'facts' about specific users, so that they can be referenced quickly by a streamer who has ADHD and wants to be more personable.

                              That way, the guy can ask User X how their mother's surgery went. Or he can remember that User K has a birthday coming up. Or remember that User G's son just got a PS5 for Christmas and wants a specific game.

                              It allows him to be more personable because he has issues remembering details about his users. It's still kind of a big alpha test at the moment, because we don't know the best way to display the 'data', but it functions as an aid.

                              shnizmuffin@lemmy.inbutts.lolS This user is from outside of this forum
                              [email protected]
                              wrote last edited by
                              #15

                              Hey, you're treating that data with the respect it demands, right? And you definitely collected consent from those chat participants before you Hoover'd up their [re-reads example] extremely Personal Identification Information AND Personal Health Information, right? Because if you didn't, you're in violation of a bunch of laws and the Twitch TOS.

                                C This user is from outside of this forum
                                [email protected]
                                wrote last edited by
                                #16

                                Most US states are single party consent. https://recordinglaw.com/united-states-recording-laws/one-party-consent-states/

                                • C [email protected]

                                  Currently I'm using local AI (a couple of different kinds) in two steps. The first takes the audio from a Twitch stream and converts it to text, so that I have context about the conversation. The second is an LLM that is fed the first AI's transcription plus Twitch chat and stores 'facts' about specific users, so that they can be referenced quickly by a streamer who has ADHD and wants to be more personable.

                                  That way, the guy can ask User X how their mother's surgery went. Or he can remember that User K has a birthday coming up. Or remember that User G's son just got a PS5 for Christmas and wants a specific game.

                                  It allows him to be more personable because he has issues remembering details about his users. It's still kind of a big alpha test at the moment, because we don't know the best way to display the 'data', but it functions as an aid.

                                  C This user is from outside of this forum
                                  [email protected]
                                  wrote last edited by
                                  #17

                                  Surely none of that uses a small LLM <= 3B?

                                    M This user is from outside of this forum
                                    [email protected]
                                    wrote last edited by
                                    #18

                                    Have you tried RAG? I believe small models are actually pretty good at searching and compiling content from RAG.

                                    So in theory you could have it connect to all of your local documents and use it for quick questions. Or maybe connect it to your Signal/WhatsApp/SMS chat history to ask questions about past conversations.

                                      C This user is from outside of this forum
                                      [email protected]
                                      wrote last edited by
                                      #19

                                      No, what is it? How do I try it?

                                        M This user is from outside of this forum
                                        [email protected]
                                        wrote last edited by
                                        #20

                                        RAG is basically like telling an LLM "look here for more info before you answer" so it can check out local documents to give an answer that is more relevant to you.

                                        Just search for "open web ui rag" and you'll find plenty of explanations and tutorials.
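
To make the "look here before you answer" idea concrete, here is a toy sketch of the retrieval half, using only the Python standard library. It is not how Open WebUI implements RAG: real setups usually rank chunks with an embedding model, and the word-overlap score below is a deliberately simple stand-in. All function names are made up for illustration. The resulting prompt is what you would hand to the small model.

```python
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: Counter, doc: Counter) -> int:
    """Count overlapping words; a stand-in for embedding similarity."""
    return sum(min(count, doc[word]) for word, count in query.items())


def retrieve(question: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k best-matching documents, best first."""
    q = tokenize(question)
    scores = {name: score(q, tokenize(body)) for name, body in documents.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Drop documents that share no words with the question at all.
    return [name for name in ranked[:k] if scores[name] > 0]


def build_prompt(question: str, documents: dict[str, str], k: int = 2) -> str:
    """Put the retrieved documents in front of the question."""
    names = retrieve(question, documents, k)
    context = "\n\n".join(f"[{name}]\n{documents[name]}" for name in names)
    return (
        "Answer using only the context below. "
        "If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
```

Because the answer is already sitting in the prompt, the model only has to read and rephrase rather than recall, which is why this setup tends to suit models with little built-in knowledge.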

                                          S This user is from outside of this forum
                                          [email protected]
                                          wrote last edited by
                                          #21

                                          I installed Llama. I've not found any use for it. I mean, I've asked it for a recipe because recipe websites suck, but that's about it.
