

In conjunction with his comments about making it antiwoke by modifying the input data rather than relying on a system prompt after filling the model with everything, it’s hard not to view this as part of an attempt to ideologically monitor these tutors to make sure they won’t select against versions of the model that fall within the desired range of “closeted Nazi scumbag.”
I’m not comfortable saying that consciousness and subjectivity can’t in principle be created in a computer, but I think one thing this whole debate exposes is that we have basically no idea what actually makes consciousness happen, or how to define and identify it happening. Chatbots have always challenged the Turing test because they showcase how readily we project consciousness onto anything that vaguely resembles it (an interesting parallel to ancient mythologies explaining the whole world through stories about magic people). The current state of the art still fails at basic coherence over shockingly small spans of time and complexity, and even when it holds together it shows a complete lack of context and comprehension. It’s clear that complete-the-sentence-style pattern recognition and reproduction can be done impressively well in a computer, and that it can get you farther in language processing than I would have thought, at least imitatively. But it’s equally clear that there’s something more there, and just scaling up your pattern-maximizer isn’t going to replicate it.