Hey everyone, I’ve been searching for a bit on how to get local LLM inference to process legal paperwork (I am not a lawyer, I just have trouble getting through large documents to figure out my rights). This would help me have conversations with my landlord and various other people who withhold crucial information, such as your rights during a unit inspection, or accuse you of things you did not do.

Given that there are thousands of pre-trained models, would it be better to train a small model myself on an RTX 4090 or a daisy chain of other GPUs? Is there a legal archive somewhere that I’m just not seeing, or where should I direct my energy? I think lots of us could benefit from a pocket law reference that can serve as an aid for figuring out what to do next.

  • dartos@reddthat.com
    10 months ago

    Generally, training an LLM is a bad way to provide it with information. “In-context learning” is probably what you’re looking for: basically, you paste the relevant info and documents into your prompt.

    You might try fine-tuning an existing model on a large dataset of legalese, but then it’ll be more likely to generate responses that sound like legalese, which defeats the purpose.

    TL;DR: Use in-context learning to provide information to an LLM. Use training and fine-tuning to change how the language the LLM generates sounds.
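
    A minimal sketch of what “pasting relevant info into your prompt” could look like in practice, assuming you already have the document as plain text. The function names (`split_into_chunks`, `keyword_score`, `build_prompt`) and the prompt wording are made up for illustration, and the keyword overlap is only a crude stand-in for a real retrieval step:

```python
import re


def split_into_chunks(text: str, max_words: int = 150) -> list[str]:
    """Split a long document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def keyword_score(chunk: str, question: str) -> int:
    """Count distinct words shared by chunk and question (crude relevance guess)."""
    def tokens(s: str) -> set[str]:
        return set(re.findall(r"[a-z]+", s.lower()))
    return len(tokens(chunk) & tokens(question))


def build_prompt(document: str, question: str, top_k: int = 3, max_words: int = 150) -> str:
    """Pick the top_k most relevant chunks and paste them into the prompt."""
    chunks = split_into_chunks(document, max_words)
    # Highest keyword overlap first; ties keep their original document order.
    best = sorted(chunks, key=lambda c: keyword_score(c, question), reverse=True)[:top_k]
    context = "\n\n".join(best)
    # Hypothetical instruction wording; adjust for whatever model you run.
    return (
        "Answer the question using only the excerpts below. "
        "Quote the passage you rely on, and say so if the excerpts do not cover it.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}\n"
    )
```

    The string that `build_prompt` returns is what you would hand to whatever local model you are running. Asking the model to quote the passage it relies on makes it easier to check the answer against the actual document.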

    • gronjo45@lemm.eeOP
      10 months ago

      I’ll read more into “in context learning” and see if I can figure out something useful from the vast corpora of datasets out there.

      I guess I can’t delegate my thinking entirely to a mathematically optimized black box, but one can hope that it could help point me in the right direction to understand my rights in my housing complex.