Large language models (LLMs) like GPT-4 can identify a person’s age, location, gender and income with up to 85 per cent accuracy simply by analysing their posts on social media.

But the AIs also picked up on subtler cues, like location-specific slang, and could estimate a salary range from a user’s profession and location.

Reference:

arXiv DOI: 10.48550/arXiv.2310.07298

    • AbouBenAdhem@lemmy.world
      8 months ago

      It sounds like the reason they used reddit was so they could easily find users who had expressly revealed the information in question, and use it to verify that the AI was accurately deducing the same info from style alone.

      • imgprojts@lemmy.ml
        8 months ago

        They used reddit because it has corralled dumb users. Those users are no longer around anywhere else on the Internet, just here on social media. And yes, what better place to find dumb users than on reddit!

    • 👍Maximum Derek👍@discuss.tchncs.de
      8 months ago

      Yeah, even if I didn’t belong to a local community and a bunch of communities surrounding my profession, the amount of intrigue and fascination emanating from my comments would cause anyone to guess that I’m the Dos Equis guy.

    • chatokun@lemmy.dbzer0.com
      8 months ago

      Same. I’m sure I’ve posted about my location, my job, my race, my history, my real first name, general details of my family makeup etc. I also have a pretty unique name so searching just my first and last name will find stuff about me anyway. I’m even listed by name in books (I was young and dumb and answered some questions about work life).