• kerrigan778@lemmy.world
    link
    fedilink
    English
    arrow-up
    43
    arrow-down
    1
    ·
    5 months ago

    That license would require chatgpt to provide attribution every time it used training data of anyone there and also would require every output using that training data to be placed under the same license. This would actually legally prevent anything chatgpt created even in part using this training data from being closed source. Assuming they obviously aren’t planning on doing that this is massively shitting on the concept of licensing.

    • JohnEdwa@sopuli.xyz
      link
      fedilink
      English
      arrow-up
      11
      ·
      edit-2
      5 months ago

      CC attribution doesn’t require you to necessarily have the credits immediately with the content, but it would result in one of the world’s longest web pages as it would need to have the name of the poster and a link to every single comment they used as training data, and stack overflow has roughly 60 million questions and answers combined.

    • theherk@lemmy.world
      link
      fedilink
      English
      arrow-up
      2
      arrow-down
      2
      ·
      5 months ago

      Maybe but I don’t think that is well tested legally yet. For instance, I’ve learned things from there, but when I share some knowledge I don’t attribute it to all the underlying sources of my knowledge. If, on the other hand, I shared a quote or copypasta from there I’d be compelled to do so I suppose.

      I’m just not sure how neural networks will be treated in this regard. I assume they’ll conveniently claim that they can’t tie answers directly to underpinning training data.