• nieceandtows@programming.dev
    11 days ago

    How are subtitles usually created? Are they provided by the source material team, by some professional third party that manually transcribes the video, or just by fans doing it for free?

    • megopie@beehaw.org
      11 days ago

      See, that’s the kicker: for the longest time it was basically all fan-translated subtitles, and only recently has paid translation become the norm.

      So it’s really quite pathetic for them to try to save a few bucks by replacing a proper translator with an LLM, given that there are still plenty of passionate fans who would have done it for free. Especially given that translating between Japanese and English in a culturally context-heavy situation is something these LLMs are really bad at.

      • Unboxious@ani.social
        11 days ago

        “given that there are still plenty of passionate fans who would have done it for free”

        I’d imagine this is a non-starter from a corporate standpoint. I know if I were in charge I’d be terrified of the idea of just trusting community-submitted subtitles to not have random slurs or something inserted. That said, I still think it would be super cool if they’d let people source and use their own subtitle files; I know it’s possible because I have a Tampermonkey script that lets me do just that.

        • megopie@beehaw.org
          11 days ago

          That’s the core of the issue: Crunchyroll has set itself up as a corporate middleman, buying the rights to distribute shows and then charging consumers a subscription for access.

          But they can’t be bothered to do the only actual damn work their position would realistically demand, beyond renting server space: providing translations for the foreign media they’re distributing.

          That’s without even discussing the fact that not a single penny users give them will end up in the hands of any of the exploited artists who actually made the shows. The industry doesn’t work on residuals or any other kind of profit sharing, so the licensing fees Crunchyroll pays essentially go straight to financiers.

    • sabreW4K3@lazysoci.al (OP)
      11 days ago

      In terms of anime fansubs, it’s normally just great folks in the community. Some got hired by studios. But the studio is meant to provide the subs.

    • handsoffmydata@lemmy.zip
      11 days ago

      I maintain my own media library and I ensure every file has English and German subtitles. There are a variety of ways to source SRT files, but when all else fails, a machine with enough compute can transcribe video files using the open-source Whisper model. After I generate an English SRT file from the video, I send it to OpenAI to create the German translation.
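
A minimal sketch of the transcription-to-subtitle-file step described above, in Python. It assumes the transcriber hands back a list of segments as dicts with `start` and `end` times in seconds plus a `text` field, which is, to my knowledge, how the open-source `whisper` package reports its results. The function names `format_timestamp` and `segments_to_srt` are made up for illustration, and the Whisper and OpenAI calls themselves are left out.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp layout HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Render timestamped segments as SRT text.

    Each segment is assumed to carry "start", "end" (seconds) and "text".
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        text = seg["text"].strip()
        # An SRT cue is: counter, time range, text, then a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hey Yuji, let's go to the beach."},
        {"start": 2.5, "end": 4.0, "text": " Sure."},
    ]
    print(segments_to_srt(demo))
```

The resulting text can be written to a `.srt` file next to the video; translating it would then mean replacing only the text line of each cue while leaving the counters and timestamps alone.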

      • dubyakay@lemmy.ca
        11 days ago

        Is there something similar for manga? Something that can overlay Japanese text on images, similar to what we have on smartphones but for the PC?

    • James R Kirk@startrek.website
      11 days ago

      For YouTube tutorial videos I have no issue with relying on GPT, but I think it’s important to recognize that the translation of art is art. I don’t feel good about the idea of something without a soul or perspective interpolating a work of art from one culture and language into another that might be wildly different from where it started.

      All that said, I think Crunchyroll and anyone else using AI art without disclosing it absolutely should be honest about it.

  • Geodad@beehaw.org
    11 days ago

    As someone who is able to speak Japanese, I’d notice the drop in quality of translation almost instantly.

    I never turn on subs anyway when I watch my anime though.

    • t3rmit3@beehaw.org
      10 days ago

      I have to since my partner doesn’t speak Japanese, but half the time I end up having to correct lines for them once or twice, to make things make sense. The non-egregious stuff I don’t even bother with. It’s crazy how amateurish some of the mistakes are, or even what are clearly choices to omit entire sentences, for no reason.

      おい、ゆうじ君、海行こうぜ

      “Hi Yuji!”

      • MaggiWuerze@feddit.org
        9 days ago

        As someone who is learning Japanese: is that a kanji for an honorific? Probably kun? ゆうじ is the name, although it’s weird that it’s written in hiragana, I guess… But I fail at this one: 海行こうぜ

        The first kanji has the one for mother as part of it, I think… And the second one is pronounced ‘i’, so …iikouze? Let’s go somewhere?

        • t3rmit3@beehaw.org
          9 days ago

          Yes, 君 is ‘kun’ when used as an honorific.

          海 is ‘umi’, or sea/ocean. You are correct that the second half of the kanji (母) is the same as the standalone character for mother, but its base radical is ⽏, which also just means mother. The first radical, ⺡, means water/liquid, so you can sort of infer that “water mother” = ocean. Not all kanji work out this nicely with their radical structure, though.

          Last part is spot on: ikou (行こう) is the casual volitional form of iku, ‘to go’, which expresses a suggestion to do something, i.e. “let’s go”.

          • MaggiWuerze@feddit.org
            8 days ago

            Thanks for the feedback, seems my efforts weren’t entirely wasted :D Interesting that the kanji for water itself does not contain that radical (unless you squint heavily). What’s the difference from ikimashou? Isn’t that the suggestive form? As in ‘we should go’.

            • t3rmit3@beehaw.org
              7 days ago

              The radical for water is actually derived from the standalone kanji. It’s basically an extremely short-stroke version of the kanji.

              Ikimashou is just the ‘formal’, full-length version. No difference in meaning. Just as “iku” is the casual version of “ikimasu”.

              Ikimasu -> iku

              Ikimashou -> ikou

              • MaggiWuerze@feddit.org
                7 days ago

                Fascinating. That explains the similarity. Since watching that episode of Witch Watch I definitely feel bad about my formal “Duolingo” Japanese :D

                By the way, is there a rule to how these short forms are formed?

                • t3rmit3@beehaw.org
                  6 days ago

                  “By the way, is there a rule to how these short forms are formed?”

                  Yep! Most Japanese verbs (with a few exceptions, like ‘shimasu’ becoming ‘suru’) use one of the ‘i’ variants (‘i’, ‘ki’, ‘ni’, ‘mi’, or ‘ri’) after the kanji, which indicates that they are verbs.

                  Yakimasu (to burn/cook), shirimasu (to know), arukimasu (to walk), arimasu (to be), shinimasu (to die), yomimasu (to read).

                  Ki will become ku in the shortened version, ri will become ru, ni -> nu, etc:

                  yaku, shiru, aruku, aru, shinu, yomu

                  I believe the verbs that don’t end in one of those like tabemasu (to eat) will default to ‘ru’ (taberu), but I don’t know if that’s a rule off the top of my head, or if I just can’t think of any others right now.

                  In cases where the kana before ‘masu’ is voiced, such as oyogimasu (to swim), the final kana stays voiced, e.g. oyogu. Ki -> ku, gi -> gu.
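
The rule of thumb above can be sketched in Python, working on romanized verbs. This is only an illustration under loud assumptions: the function name `polite_to_plain` and the lookup tables are hypothetical, it handles just the two irregular verbs listed, and it treats every stem ending in an ‘i’ sound as following the ki -> ku pattern. Verbs like mimasu (miru) or okimasu (okiru) break that assumption and cannot be told apart from the spelling alone, so they come out wrong here.

```python
# Irregular polite -> plain pairs (an intentionally tiny, hand-picked list).
IRREGULAR = {"shimasu": "suru", "kimasu": "kuru"}

# Stem endings in romaji, longest first so "shi" and "chi" win over bare "i".
ENDINGS = [
    ("shi", "su"), ("chi", "tsu"),
    ("ki", "ku"), ("gi", "gu"), ("ni", "nu"),
    ("bi", "bu"), ("mi", "mu"), ("ri", "ru"),
    ("i", "u"),
]


def polite_to_plain(verb: str) -> str:
    """Turn a romanized -masu verb into its plain (dictionary) form."""
    if verb in IRREGULAR:
        return IRREGULAR[verb]
    if not verb.endswith("masu"):
        raise ValueError(f"expected a verb ending in 'masu', got {verb!r}")
    stem = verb[: -len("masu")]
    for polite_end, plain_end in ENDINGS:
        if stem.endswith(polite_end):
            # ki -> ku, gi -> gu, ri -> ru, and so on.
            return stem[: -len(polite_end)] + plain_end
    # No 'i' sound before masu (e.g. tabemasu): fall back to adding 'ru'.
    return stem + "ru"


if __name__ == "__main__":
    for v in ["yakimasu", "shirimasu", "arukimasu", "arimasu",
              "shinimasu", "yomimasu", "oyogimasu", "tabemasu", "shimasu"]:
        print(v, "->", polite_to_plain(v))
```

A real conjugator would need a dictionary of verb classes rather than spelling rules, since the class of a verb is not always visible from its polite form.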

  • SpectralPineapple@beehaw.org
    11 days ago

    Although it seems likely that Crunchyroll uses an LLM for translation in some way, I wouldn’t call that “confirmed” since that might be the result of an individual translator using it.

    • t3rmit3@beehaw.org
      10 days ago

      The actions of an employee, when reviewed and released by a company, are the actions of that company. A company is just the sum of its employees’ actions.

      • faercol@lemmy.blahaj.zone
        9 days ago

        Also, LLMs have been around for a while. So there are a few possible situations:

        • LLM use is authorized or even encouraged. In this case it’s on the company.
        • LLM use is controlled, and this falls into one of the authorized cases. Same thing really. Also their authorized use cases need review
        • LLM use is forbidden, or restricted and this is not an authorized use. In this case it falls on the company to review what’s being done. It’s their responsibility.

        So yeah, whatever the situation, it’s on Crunchyroll.

  • luciole (he/him)@beehaw.org
    11 days ago

    Both translation and subtitling have highly efficient tooling when in the hands of a professional. Translators nowadays use a mix and will build up a dynamic database as they go through a corpus that needs coherence. What’s bad in this instance is not the use of AI as such, but the use of a badly adapted AI and, ultimately, mediocre results that give an amateurish impression.