• mushroomman_toad@lemmy.dbzer0.com
    3 days ago

    Meanwhile, we have people making the web worse by not linking to source & giving us images of text instead of proper, accessible, searchable, failure tolerant text.

    - OpenAI Text Crawler

    • lmmarsano@lemmynsfw.com
      3 days ago

      You don’t think the disabled use technology? Or that search engine optimization existed before LLMs? Or that text sticks around when images break?

      Lack of accessibility wouldn’t stop LLMs: they could probably process images into text the hard way & waste more energy in the process. That’d be great, right?

      1. A hyphen isn’t a quotation dash.

      2. Are we playing the AI game? Let’s pretend we’re AI. Here’s some fun punctuation: ‒−–—―…:

        Beep bip boop.

      • mushroomman_toad@lemmy.dbzer0.com
        2 days ago

        Yeah, that’s definitely fair. Accessibility is important. It is unfortunate, though, that AI companies abuse accessibility and organization tags to train their LLMs.

        See how Stable Diffusion porn uses danbooru tags, and situations like this:

        https://youtube.com/watch?v=NEDFUjqA1s8

        Decentralized, media-based communities have the rare ability to hide their data from scraping.

        • lmmarsano@lemmynsfw.com
          2 days ago

          https://youtube.com/watch?v=NEDFUjqA1s8

          I didn’t have the patience to sit through 19 minutes of video, so I tried to read through the transcript. Then I saw the stuttering & weird, verbose fuckery going on there. Copilot, however, summarized the video, which revealed it was about deliberate obfuscation of subtitle files to attempt to thwart scrapers.

          This seems hostile to the user, and doesn’t seem to work as intended, so I’m not sure what to think of it. I know people who have trouble sequencing information and rely on transcripts. Good accessibility benefits nondisabled users, too (an additional incentive for it).

          Not trying to be overly critical. I’ll have to look into danbooru tags: unfamiliar with those. Thanks.

            • lmmarsano@lemmynsfw.com
              2 days ago

              I’m not sure this is so much virtues becoming rarer as inconvenient demands emerging: a video that could have been an article is a problem of the modern age.

              Articles can be read quickly & processed structurally by jumping around sections. Videos, however, can be agonizing, because they resist that sort of processing. Transcripts can alleviate the problem somewhat, but obfuscating them undoes that. And we’ve got things to do.