Wondering what services to test on either a 16 GB RAM “AI capable” ARM64 board or on a laptop with a modern RTX GPU. Only looking for open-source options, but curious to hear what people say. Cheers!

  • SmokeyDope@lemmy.world · 6 days ago

    I run kobold.cpp, a cutting-edge local model engine, on my gaming rig turned server. I like to play around with the latest models to see how they improve/change over time. The current chain-of-thought reasoning models, like the DeepSeek R1 distills and Qwen QwQ, are fun to poke at with advanced open-ended STEM questions.

    Questions like “What does Gödel’s incompleteness theorem imply about scientific theories of everything?” or “Could the speed of light be more accurately referred to as ‘the speed of causality’?”
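
    If you’d rather script against it than use the web UI, here’s a minimal sketch in Python. It assumes kobold.cpp is serving its OpenAI-compatible API on the default port 5001; the model field is a placeholder, since kobold.cpp answers with whichever model you launched it with:

    ```python
    # Minimal sketch: ask a loaded model a question through kobold.cpp's
    # OpenAI-compatible endpoint (default port 5001; adjust to your setup).
    import requests

    resp = requests.post(
        "http://localhost:5001/v1/chat/completions",
        json={
            "model": "local",  # placeholder; kobold.cpp uses the loaded model
            "messages": [{
                "role": "user",
                "content": "What does Gödel's incompleteness theorem imply "
                           "about scientific theories of everything?",
            }],
            "max_tokens": 1024,
        },
        timeout=600,
    )
    print(resp.json()["choices"][0]["message"]["content"])
    ```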

    As for actual daily use, I prefer Mistral Small 24B, treating it like a local search engine with the legitimacy of Wikipedia. It’s a starting point for questions about general things I don’t know about or want advice on, before doing further research through more legitimate sources.

    It’s important not to take the LLM too seriously, as there’s always a small statistical chance it hallucinates some bullshit, but most of the time it’s fairly accurate and a pretty good jumping-off point for further research.

    Let’s say I want an overview of how to repair small holes forming in concrete, general ideas on how to invest financially, how to change fluids in a car, how much fat and protein is in an egg, etc.

    If the LLM mentions a word or related concept I don’t recognize, I grill it for clarifying info and follow it through the infinite branching garden of related information.

    I’ve used an LLM to help me go through old declassified documents and speculate on internal government terminology I was unfamiliar with.

    I’ve hooked up a speech model to get it to speak, just for fun. I’ve used a multimodal model to have it see/scan documents for info.

    I’ve used web search to have the model retrieve information it didn’t know via a DuckDuckGo search, again mostly for fun.

    Feel free to ask me anything; I’m glad to help get newbies started.

    • gdog05@lemmy.world · 5 days ago
      Once I changed the default model, Immich search became amazing. I want to show it off to people, but alas, way too many NSFW pics in my library. I would create a second “clean” version to show off, but I’ve been too lazy.

  • kata1yst@sh.itjust.works · 6 days ago

    I use Ollama and Open WebUI: Ollama on my gaming rig and Open WebUI as a frontend on my server.

    It’s been a really powerful combo!
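
    For anyone curious what the plumbing looks like, Open WebUI just talks to Ollama’s HTTP API (port 11434 by default), which you can also hit directly. A quick sketch; the model name is only an example, use whatever you’ve pulled:

    ```python
    # Quick sketch: query Ollama's REST API directly (default port 11434).
    # "llama3.2" is an example; substitute any model pulled via `ollama pull`.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.2",
            "messages": [{"role": "user", "content": "Why is the sky blue?"}],
            "stream": False,  # one JSON response instead of a token stream
        },
        timeout=300,
    )
    print(resp.json()["message"]["content"])
    ```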

    • kiol@lemmy.world (OP) · 6 days ago

      Would you please talk more about it? I had forgotten about Open WebUI but intend to start playing with it. Honestly, what do you actually do with it?

      • afk_strats@lemmy.world · 3 days ago

        I have this exact same setup. Open WebUI has more features than I’ve been able to use, such as functions and pipelines.

        I use it to share my LLMs across my network. It has really good user management, so I can set up accounts for my wife or brother-in-law and give them a general-use LLM, while my dad and I take advantage of coding-tuned models.

        The code formatting and code execution functions are great. It’s overall a great UI.

        I’ve used LLMs to rewrite code, help format PowerPoint slides, summarize my notes from work, create D&D characters, plan lessons, etc.
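
        Since each user gets their own account, they can also generate a personal API key and hit the server programmatically. A rough sketch; the host, port (the common Docker mapping), and model name are all assumptions to adapt to your install:

        ```python
        # Rough sketch: call Open WebUI's OpenAI-compatible chat endpoint with
        # a per-user API key. Port 3000 assumes the common Docker mapping, and
        # the model name is whatever your admin has exposed.
        import requests

        API_KEY = "sk-..."  # per-user key from Open WebUI account settings

        resp = requests.post(
            "http://my-server:3000/api/chat/completions",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={
                "model": "qwen2.5-coder",  # example coding-tuned model
                "messages": [{"role": "user", "content": "Explain this diff."}],
            },
            timeout=300,
        )
        print(resp.json()["choices"][0]["message"]["content"])
        ```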

      • Oisteink@feddit.nl · 6 days ago

        I have the same setup, but it’s not very usable as my graphics card only has 6 GB of VRAM. I want one with 20 or 24 GB, as the ~6B models are a pain and the tiny ones don’t give me much.

        Ollama was pretty easy to set up on Windows, and it’s easy to download and test the models Ollama has available.

          • Oisteink@feddit.nl · 6 days ago

            Possibly. I’ve been running it since last summer, but like I say, the small models don’t do much good for me. I have tried Llama 3.1, OLMo 2, DeepSeek R1 in a few variants, Qwen2, Qwen2.5-Coder, Mistral, CodeLlama, StarCoder2, Nemotron-Mini, Llama 3.2, Gemma 2, and LLaVA.

            I use Perplexity and Mistral as paid services, with much better quality. Open WebUI is great, though; my hardware is just lacking.

            Edit: saw that my mate is still using it a bit, so I’ll update Open WebUI from 0.4 to 0.5.20 for him. He’s a bit anxious about sending data to the cloud, so he doesn’t mind the quality.

            • Oisteink@feddit.nl · 6 days ago

              Scrap that: after upgrading, it went bonkers and will always use one of my «knowledge» collections no matter what I try. The web search fails even with DDG as the engine. It’s always seemed like the UI was made by unskilled labour, but this is just horrible. 2/10, not recommended.

  • ikidd@lemmy.world · 6 days ago

    LM Studio is pretty much the standard. I think it’s open source except for the UI. Even if you don’t end up using it long-term, it’s great for getting used to a lot of the models.

    Otherwise there’s Open WebUI, which I’d imagine would work via Docker Compose, as I think there are ARM images for OWUI and Ollama.

    • L_Acacia@lemmy.ml · 2 days ago

      Well, it’s fully closed source except for the open-source project it wraps. The open-source part is llama.cpp.

      • ikidd@lemmy.world · 2 days ago

        Fair enough, but it’s damn handy and simple to use. And I don’t know how to do speculative decoding with Ollama, which massively speeds up the models for me.
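
        Since LM Studio is a wrapper around llama.cpp anyway, one way to get speculative decoding outside of it is through the llama-cpp-python bindings, which ship a prompt-lookup variant that needs no separate draft model. A sketch; the model path is a placeholder:

        ```python
        # Sketch: speculative decoding via llama-cpp-python's prompt-lookup
        # implementation. The GGUF path is a placeholder for any local model.
        from llama_cpp import Llama
        from llama_cpp.llama_speculative import LlamaPromptLookupDecoding

        llm = Llama(
            model_path="models/my-model-q4_k_m.gguf",  # placeholder path
            draft_model=LlamaPromptLookupDecoding(num_pred_tokens=10),
            n_ctx=8192,
        )
        out = llm("Write a haiku about VRAM.", max_tokens=64)
        print(out["choices"][0]["text"])
        ```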

    • L_Acacia@lemmy.ml · 2 days ago

      Qwen Coder or the new Gemma 3.

      But at this size, using a privacy-respecting API might be both cheaper and lead to better results.

  • truxnell@infosec.pub · 6 days ago

    I run Ollama and AUTOMATIC1111 on my desktop when it’s powered on. Open WebUI runs always-on in my homelab, also connected to OpenRouter. This way I can always use Open WebUI with OpenRouter models; it’s pretty cheap per query and a little more private than using a big-tech chatbot. And if I want local, I turn on the desktop and have local Llama and Stable Diffusion.
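
    The OpenRouter side is just an OpenAI-compatible API, so anything that speaks that protocol can use it directly too. A minimal sketch with the openai Python client; the model ID is only an example, check OpenRouter’s model list:

    ```python
    # Minimal sketch: query OpenRouter with the standard openai client.
    # The model ID is an example; pick any from openrouter.ai/models.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="sk-or-...",  # your OpenRouter API key
    )
    resp = client.chat.completions.create(
        model="mistralai/mistral-small",  # example model ID
        messages=[{"role": "user", "content": "One-line summary of RAG?"}],
    )
    print(resp.choices[0].message.content)
    ```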

    I also get bugger-all benefit out of it; it’s a cute toy.

      • L_Acacia@lemmy.ml · 2 days ago

        The project is a bit out of date for newer models, though older ones work great.

        I recommend ComfyUI if you want fine-grained control over the generation and you like to tinker.

        Swarm / Reforge / Invoke if you want a neat, up-to-date UI.

  • colourlesspony@pawb.social · 6 days ago

    I messed around with Home Assistant and the Ollama integration. I’ve passed on it and just use the default assistant with voice commands I set up. I couldn’t really get Ollama to do or say anything useful. Like I asked it what’s a good time to run on a treadmill for beginners and it told me it’s not a doctor.

    • metoosalem@feddit.org · 6 days ago

      Like I asked it what’s a good time to run on a treadmill for beginners and it told me it’s not a doctor.

      Kirkland-brand Meeseeks energy.

      • psmgx@lemmy.world · 5 days ago

        Hey now, Kirkland brand is respectable: usually premium brands repackaged, such as how Costco vodka was secretly (“secretly”) Grey Goose.

    • state_electrician@discuss.tchncs.de · 5 days ago

      Yeah. I have a mini PC with an AMD GPU. Even if I were to buy a big GPU I couldn’t use it. That frustrates me, because I’d love to play around with some models locally. I refuse to use anything hosted by other people.