• EldritchFeminity@lemmy.blahaj.zone
      English
      arrow-up
      10
      ·
      16 days ago

      Yeah, the tweet clearly says that the subscribers they have are using it more than they expected, which is costing them more than $200 per month per subscriber just to run it.

      I could see an argument for an economies-of-scale situation where adding more users would lower the cost per user, but here it seems like that would just increase their overhead, making the problem worse.

    • BB84@mander.xyz
      English
      arrow-up
      3
      arrow-down
      8
      ·
      edit-2
      15 days ago

      LLM inference can be batched, reducing the cost per request. If you have too few customers, you can’t fill the optimal batch size.

      That said, the optimal batch size on today’s hardware is not big (<100). I would be very very surprised if they couldn’t fill it for any few-seconds window.
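A rough sketch of the batching argument above, under a deliberately simple assumption that is not from the thread: one batched forward pass costs the same whether the batch is full or nearly empty. The function name, `MAX_BATCH = 64`, and the cost unit are all hypothetical, chosen only to illustrate the shape of the curve.

```python
import math


def cost_per_request(requests_in_window: int, max_batch: int, cost_per_batch: float) -> float:
    """Amortized cost of one request, assuming every batch costs the same
    regardless of how full it is (a simplification, not a measured fact)."""
    if requests_in_window <= 0:
        raise ValueError("need at least one request")
    # Requests that arrive in the same window are packed into as few batches as possible.
    batches = math.ceil(requests_in_window / max_batch)
    return batches * cost_per_batch / requests_in_window


MAX_BATCH = 64        # hypothetical optimal batch size (the comment only says "<100")
COST_PER_BATCH = 1.0  # hypothetical cost units for one batched forward pass

for n in (1, 8, 64, 640):
    print(n, cost_per_request(n, MAX_BATCH, COST_PER_BATCH))
```

In this toy model the per-request cost falls until the batch is full and then stays flat, which is the point being made: more traffic stops helping once every window already fills a batch.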

      • David Gerard@awful.systemsM
        English
        arrow-up
        10
        ·
        16 days ago

        this sounds like an attempt to demand others disprove the assertion that they’re losing money, in a discussion of an article about Sam saying they’re losing money

        • BB84@mander.xyz
          English
          arrow-up
          2
          arrow-down
          7
          ·
          15 days ago

          What? I’m not doubting what he said. Just surprised. Look at this. I really hope Sam IPOs his company so I can short it.

            • BB84@mander.xyz
              English
              arrow-up
              2
              arrow-down
              5
              ·
              edit-2
              15 days ago

              Can someone explain why I am being downvoted and attacked in this thread? I swear I am not sealioning. Genuinely confused.

              @sc_griffith@awful.systems asked how request frequency might impact cost per request. Batch inference is a reason (ask anyone in the self-hosted LLM community). I noted that this reason only applies at very small scale, probably much smaller than what OpenAI is operating at.

              @dgerard@awful.systems why did you say I am demanding someone disprove the assertion? Are you misunderstanding “I would be very very surprised if they couldn’t fill [the optimal batch size] for any few-seconds window” to mean “I would be very very surprised if they are not profitable”?

              The tweet I linked shows that good LLMs can be much cheaper. I am saying that OpenAI is very inefficient and thus economically “cooked”, as the post title would have it. How does this make me FYGM? @froztbyte@awful.systems

              • self@awful.systems
                English
                arrow-up
                9
                ·
                15 days ago

                Can someone explain why I am being downvoted and attacked in this thread? I swear I am not sealioning. Genuinely confused.

                my god! let me fix that

      • flere-imsaho@awful.systems
        English
        arrow-up
        4
        ·
        15 days ago

        i would swear that in an earlier version of this message the optimal batch size was estimated to be as large as twenty.