• t3rmit3@beehaw.org
    7 hours ago

    One thing I’d push back on in the article is:

    That cost-per-user doesn’t decrease as you add more customers. You need more servers. More GPUs.

    This assumes constant use, which is not the case. Say I have a server handling LLM prompt requests and, for illustrative purposes, each request uses 100% of its single discrete GPU. If I only have 1 customer, and that customer only uses it 5% of the day (which would actually be pretty high in real terms), I can still add more customers without buying more servers. The question is whether the revenue from a single server outweighs its cost to run.
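
    As a rough back-of-the-envelope sketch of that arithmetic: the 5% duty cycle is the figure from the example above, while the function names, the 70% headroom target, and the dollar amounts are made up purely for illustration. It also only looks at average utilization, so it ignores customers whose requests happen to overlap.

```python
import math

def customers_per_server(duty_cycle: float, target_utilization: float = 1.0) -> int:
    """Customers one server can carry on average.

    duty_cycle: fraction of the day a single customer keeps the GPU busy.
    target_utilization: fraction of the day we are willing to keep it busy
    (hypothetical headroom knob, 1.0 means no headroom at all).
    """
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    # Small epsilon guards against float rounding such as 0.7 / 0.05 -> 13.999...
    return math.floor(target_utilization / duty_cycle + 1e-9)

def monthly_margin(customers: int, revenue_per_customer: float, server_cost: float) -> float:
    """Revenue from one server's customers minus that server's running cost."""
    return customers * revenue_per_customer - server_cost

# One customer at 5% of the day leaves room for more on the same box.
full = customers_per_server(0.05)          # no headroom
safe = customers_per_server(0.05, 0.7)     # keep 30% headroom (assumed)

# Hypothetical prices: 20 per customer per month, 300 per server per month.
print(full, monthly_margin(full, 20.0, 300.0))
print(safe, monthly_margin(safe, 20.0, 300.0))
```

    With these invented prices the server turns a profit when fully packed and a loss at the more cautious headroom, which is the whole question: whether what one server earns covers what it costs to run.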

    And when it comes to training, that is an upfront cost that you can stop paying whenever you want (once you get a model to where you want it). I’m pretty surprised they haven’t been really leaning into training models for medical diagnoses, because once you have a model that can e.g. spot a type of tumor n% more accurately than a human, you don’t really have to refine it further if you don’t want to (after all, it’s not like the humans can choose to do it better themselves at that point, like they can with writing prompts).

    • Sas [she/her]@beehaw.org
      39 minutes ago

      I’d say they’ve probably long since reached the point where they have enough customers around the world to keep the load on their servers fairly constant. The example with one user only taking 5% of a server’s load only works for low customer counts, similar to how you can’t count on one wind turbine or solar plant to provide all of your energy, but if you have enough of them you can provide a baseline of fairly constant energy.
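
      To put a rough number on the smoothing effect, here is a small sketch. It assumes every customer is independently busy 5% of the time, which is a simplification (real usage follows time zones and working hours), and the function name is just for illustration. Under that assumption the total demand is binomial, so its standard deviation relative to its mean is sqrt((1 - p) / (n * p)).

```python
import math

def relative_load_fluctuation(n_customers: int, duty_cycle: float) -> float:
    """Standard deviation of total demand divided by its mean.

    Assumes each customer is busy independently with probability duty_cycle,
    so total demand is Binomial(n_customers, duty_cycle).
    """
    if n_customers < 1 or not 0 < duty_cycle <= 1:
        raise ValueError("need n_customers >= 1 and duty_cycle in (0, 1]")
    return math.sqrt((1 - duty_cycle) / (n_customers * duty_cycle))

# The swing relative to the average shrinks as the customer count grows.
for n in (1, 100, 10_000, 1_000_000):
    print(n, round(relative_load_fluctuation(n, 0.05), 4))
```

      With one customer the load swings by several times its average, so there is lots of idle time to sell. With a very large customer base the swing is a small fraction of the average, so the servers are already busy most of the time and more customers do mean more hardware.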