A bunch of scientific papers are probably better data than a bunch of Reddit posts, but it’s still not good enough.
Consider the task we’re asking the AI to do. If you want a human to be able to correctly answer questions across a wide array of scientific fields, you can’t just hand them all the science papers and expect them to understand them. Even if we restrict it to a single narrow field of research, we expect that person to have an insane level of education. We’re talking 12 years of primary and secondary education, 4 years as an undergraduate and 4 more years doing their PhD, and that’s at the low end. During all that time the human is constantly ingesting data through their senses and getting constant training in the form of feedback.
When it comes to data quality, all the scientific papers in the world don’t even come close to an education like that.
Haha. Not specifically.
It’s more a comment on how hard it is to separate truth from fiction. Adding glue to pizza is obviously dumb to any normal human. Sometimes the obviously dumb answer is actually the correct one though. Semmelweis’s contemporaries lambasted him for what they saw as stupid and obviously nonsensical claims about doctors contaminating pregnant women with “cadaveric particles” after performing autopsies.
Those were experts in the field and they were unable to guess the correctness of the claim. Why would we expect normal people or AIs to do better?
There may be a time when we can reasonably have such an expectation. I don’t think it will happen before we can give AIs training that’s as good as, or better than, what we give the most educated humans. Reading all of Reddit doesn’t even come close to that.
That’s my point. Some of them wouldn’t even go through the trouble of making sure that it’s non-toxic glue.
There are humans out there who ate laundry pods because the internet told them to.
This is why actual AI researchers are so concerned about data quality.
Modern AIs need a ton of data and it needs to be good data. That really shouldn’t surprise anyone.
What would your expectations be of a human who had been educated exclusively by the internet?
It would depend on how well we can control it.
Ideally the material would be completely nonreactive for as long as you’re using it and then instantly degrade into component elements.
The faster things degrade, the higher the chance that they’ll degrade when you don’t want them to.