• 🇨🇦 tunetardis@piefed.ca
    English
    arrow-up
    46
    arrow-down
    1
    ·
    1 day ago

    For instance, if an AI model could complete a one-hour task with 50% success, it only had a 25% chance of successfully completing a two-hour task. This indicates that for 99% reliability, task duration must be reduced by a factor of 70.
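    The arithmetic in that quote can be checked with a rough sketch. It assumes a constant failure rate per unit of time, so the chance of success halves for every extra "half-life" of task length. That assumption is consistent with the 50% and 25% figures but is not stated in the quote, and the function names are made up for illustration.

```python
import math

def success_probability(task_hours, half_life_hours=1.0):
    # Assumed model: success halves for every `half_life_hours` of task length.
    return 0.5 ** (task_hours / half_life_hours)

def max_duration_for(reliability, half_life_hours=1.0):
    # Invert the model: longest task that still succeeds with `reliability`.
    return half_life_hours * math.log(reliability) / math.log(0.5)

two_hour = success_probability(2.0)        # one-hour 50% task, doubled in length
shrink = 1.0 / max_duration_for(0.99)      # how much shorter a 99% task must be

print(f"two-hour success: {two_hour:.2f}")
print(f"shrink factor for 99% reliability: {shrink:.0f}")
```

    Under this model the two-hour task comes out at 25%, and a 99% reliable task has to be roughly 69 times shorter than the 50% one, which matches the "factor of 70" after rounding.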

    This is interesting. I have noticed this myself. Generally, when an LLM boosts productivity, it shoots back a solution very quickly, and after a quick sanity check, I can accept it and move on. When it has trouble, that’s something of a red flag. You might get there eventually by probing it more and more, but there is good reason for pessimism if it’s taking too long.

    In the worst case scenario where you ask it a coding problem for which there is no solution—it’s just not possible to do what you’re asking—it may nevertheless engage you indefinitely until you eventually realize it’s running you around in circles. I’ve wasted a whole afternoon with that nonsense.

    Anyway, I worry that companies are no longer hiring junior devs. Today’s juniors are tomorrow’s elites and there is going to be a talent gap in a decade that LLMs—in their current state at least—seem unlikely to fill.

    • Schal330@lemmy.world
      arrow-up
      5
      ·
      16 hours ago

      In the worst case scenario where you ask it a coding problem for which there is no solution—it’s just not possible to do what you’re asking—it may nevertheless engage you indefinitely until you eventually realize it’s running you around in circles.

      Exactly this, and it’s frustrating as a Jr dev to be fed this bs when you’re learning. I’ve had multiple scenarios where it blatantly told me wrong things. One example was using string interpolation in a Terraform file to try to set a dynamic module source: what it gave me looked totally viable. It wasn’t until I dug around some more that I found out that terraform init can’t use variables in the source field.

      On the positive side, it helps give me some direction when I don’t know where to start. I use it with a highly pessimistic and cautious approach. I understand that today is the worst it’s going to be, and that I will be required to use it as a tool in my job going forward, so I’m making an effort to get to grips with it.

    • Zexks@lemmy.world
      arrow-up
      9
      ·
      1 day ago

      I’ve noticed this too, and it’s even weirder when you compare it to a physics question. It very consistently tells me when my recent brain fart of an idea is just plain stupid. But it will try eternally to help me find a coding solution even if it just keeps going in circles.

      • otacon239@lemmy.world
        arrow-up
        4
        ·
        1 day ago

        I think part of this comes down to the format. Physics can often be analogized and can be very conversational when it comes to demonstrating ideas.

        Most code also looks pretty similar if you don’t know how to read it, and unlike natural language, the syntax is absolute with no room for interpretation or translation.

        I’ve found it’s consistently good if you treat it like a project specification: include all of your requirements as a list in the very first message, have it pseudocode the draft, and have it list which libraries it wants to use so you can make sure they work how you expect.

        There’s some screening that goes into utilizing it well and that only comes with already knowing roughly how to code what you’re trying to make.

    • Modern_medicine_isnt@lemmy.world
      arrow-up
      11
      ·
      1 day ago

      Sadly, the lack of junior devs means my job is probably safe until I am ready to retire. I have mixed feelings about that. On the one hand, yay for me. On the other, sad for the new grads. And sad for software as a whole. But software truly sucks, and has only been getting more and more enshittified. Could a shake-up like this somehow help that? I don’t see how, but who knows.