Cake day: June 14th, 2023

  • These models are Markov chains, yes. But many things are Markov chains, and I’m not sure that describing them that way helps anyone understand them.

    The way these models generate text is iterative: they do it word by word (strictly speaking, token by token). Every time they need to generate a word, they randomly pick one from their vocabulary. The trick to generating coherent text is that the pick is weighted: how likely each word is depends on the words that came before.

    For example, after “that is a huge grey”, the word “elephant” is more likely than “flamingo”.
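
    To make the weighted pick concrete, here is a minimal sketch in Python. The table `NEXT_WORD_PROBS` and the helper `sample_next_word` are hypothetical names, and the probabilities are made up for illustration; a real model computes them from the previous words rather than looking them up.

```python
import random

# Hypothetical probabilities for the word following "that is a huge grey".
# These numbers are invented for illustration only.
NEXT_WORD_PROBS = {
    "elephant": 0.70,
    "cloud": 0.20,
    "wall": 0.09,
    "flamingo": 0.01,
}

def sample_next_word(probs, rng):
    """Pick one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Draw many times to see how often each word comes up.
rng = random.Random(0)
counts = {w: 0 for w in NEXT_WORD_PROBS}
for _ in range(10_000):
    counts[sample_next_word(NEXT_WORD_PROBS, rng)] += 1
print(counts)
```

    Every word can still come up, “flamingo” included. It just comes up far less often than “elephant”.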

    The temperature controls how random that selection is. If it is very low, you will almost always select the most likely word. Increasing the temperature makes the choice more random, giving each word a more equal chance.
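
    One common way to implement temperature is to divide the model's raw scores by it before turning them into probabilities with a softmax. The sketch below assumes that approach; `SCORES` and `apply_temperature` are hypothetical names, and the scores are invented for illustration.

```python
import math

def apply_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores.values()]
    top = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return {w: e / total for w, e in zip(scores, exps)}

# Hypothetical raw scores for the word after "that is a huge grey".
SCORES = {"elephant": 4.0, "cloud": 2.5, "wall": 1.5, "flamingo": 0.0}

for t in (0.1, 1.0, 10.0):
    probs = apply_temperature(SCORES, t)
    print(t, {w: round(p, 3) for w, p in probs.items()})
```

    At a temperature of 0.1 nearly all of the probability lands on “elephant”. At 10 the four words end up much closer to an even split.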

    Since these models work by random sampling, there is nothing preventing them from producing unique text. Uniqueness alone is not the hard part, though: something like jsbHsbe d dhebsUd is unique but not very interesting.