• 0 Posts
  • 7 Comments
Joined 1 year ago
cake
Cake day: June 7th, 2023

help-circle

  • For the love of God please stop posting the same story about AI model collapse. This paper has been out since May, been discussed multiple times, and the scenario it presents is highly unrealistic.

    Training on the whole internet is known to produce shit model output, requiring humans to produce their own high quality datasets to feed to these models to yield high quality results. That is why we have techniques like fine-tuning, LoRAs and RLHF as well as countless datasets to feed to models.

    Yes, if a model for some reason was trained on the internet for several iterations, it would collapse and produce garbage. But the current frontier approach for datasets is for LLMs (e.g. GPT4) to produce high quality datasets and for new LLMs to train on that. This has been shown to work with Phi-1 (really good at writing Python code, trained on high quality textbook level content and GPT3.5) and Orca/OpenOrca (GPT-3.5 level model trained on millions of examples from GPT4 and GPT-3.5). Additionally, GPT4 has itself likely been trained on synthetic data and future iterations will train on more and more.

    Notably, by selecting a narrow range of outputs, instead of the whole range, we are able to avoid model collapse and in fact produce even better outputs.



  • I don’t know what type of chatbots these companies are using, but I’ve literally never had a good experience with them and it doesn’t make sense considering how advanced even something like OpenOrca 13B is (GPT-3.5 level) which can run on a single graphics card in some company server room. Most of the ones I’ve talked to are from some random AI startup that have cookie cutter preprogrammed text responses that feel less like LLMs and more like a flow chart and a rudimentary classifier to select an appropriate response. We have LLMs that can do the more complex human tasks of figuring out problems and suggesting solutions and that can query a company database to respond correctly, but we don’t use them.