cross-posted from: https://lemmy.world/post/76533

One of the arguments made for Reddit’s API changes is that Reddit is now the go-to place for LLM training data (e.g. for ChatGPT).

https://www.reddit.com/r/reddit/comments/145bram/addressing_the_community_about_changes_to_our_api/jnk9izp/?context=3

I haven’t seen much discussion of this and would like to hear people’s opinions. Are you concerned about your posts being used for LLM training? Do you not care? Would you prefer that your comments be available to train open source LLMs?

(I will post my personal opinion in a comment so it can be upvoted or downvoted separately.)

  • FearTheCron@lemmy.world (OP) · 1 upvote · 1 year ago

    I hope cross-posts are OK, but I am curious about Experienced Dev’s perspective on this as well, since the question is rather technical.

    Copying my opinion from the other thread in case you don’t want to look at it:

    My personal opinion is that high API usage fees hurt open source LLMs (e.g. GPT4All). I would rather not see this new technology monopolized by those who can pay API fees.

    • Clifspeare@programming.dev · 2 upvotes · 1 year ago

      I’d tend to agree. There are enough barriers to training large models without artificially raising them just because the largest players can afford it.