Let’s talk about our experiences working with different models, whether well-known or lesser-known.

Which locally run language models have you tried out? Share your insights, challenges, or anything you found interesting during your encounters with those models.

  • Kerfuffle@sh.itjust.works
    1 year ago

    guanaco-65B is my favorite. It’s pretty hard to go back to 33B models after you’ve tried a 65B.

    It’s slow and requires a lot of resources to run, though. Also, it’s not like there are many 65B models to choose from.

      • Kerfuffle@sh.itjust.works
        1 year ago

        With a quantized GGML version you can just run it on CPU if you have 64GB RAM. It is fairly slow though; I get about 800ms/token on a 5900X. Basically you start it generating something and come back in 30 minutes or so. Can’t really carry on a conversation.
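
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch. The 800 ms/token and 30 minute figures come from the comment; the 4.5 bits per weight is an assumption (roughly what a 4-bit block quantization with per-block scales costs), and the function names are made up for illustration. It only estimates weight storage, not the extra memory for context or the runtime itself.

```python
def tokens_generated(minutes: float, ms_per_token: float) -> int:
    """Number of tokens produced in `minutes` at a fixed per-token latency."""
    return int(minutes * 60 * 1000 / ms_per_token)


def quantized_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB (weights only)."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    # Figures quoted in the comment: ~800 ms/token, ~30 minutes of waiting.
    print("tokens per second:", 1000 / 800)
    print("tokens in 30 min: ", tokens_generated(30, 800))

    # ASSUMPTION: 65e9 parameters at ~4.5 bits per weight.
    print("weights (GiB):    ", round(quantized_weight_gib(65e9, 4.5), 1))
```

Under those assumptions the weights fit within 64GB of RAM with room to spare, and a 30 minute run yields a couple of thousand tokens, which matches the "start it and come back later" workflow described.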

        • planish@sh.itjust.works
          1 year ago

          Is it smart enough to pick up the thread of what you are looking for with less rerolling or handholding, so that it still comes out ahead?