Thoughtful curiosity builds knowledge, and knowledge builds thoughtful curiosity

Tuesday, September 24th, 2024

AI systems are often trained through trial and error, with a reward function:

For example, you might teach a computer to ride a virtual bike in a simulated 3D environment by rewarding the distance pedalled, and penalising the number of times the bike falls over.

The challenge comes when the reward function misses what the human programmers really wanted. Perhaps the AI will avoid the risk of falls by leaving the bike on the floor, or maximise distance pedalled by wobbling in a big circle or even by standing the bike upside down and cranking the pedals.
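To make the mismatch concrete, here is a minimal Python sketch of that kind of reward function. The `Episode` fields, the fall penalty of 10 and every number below are invented for illustration; none of it comes from a real simulator.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One training run, summarised (all fields are hypothetical)."""
    distance_pedalled: float  # what the reward function measures
    falls: int                # what the reward function penalises
    route_progress: float     # what the programmers really wanted


FALL_PENALTY = 10.0  # assumed weight, chosen only for this example


def reward(ep: Episode) -> float:
    # Reward distance pedalled, penalise falls. Route progress is never checked.
    return ep.distance_pedalled - FALL_PENALTY * ep.falls


episodes = {
    "rides the route": Episode(500.0, 2, 500.0),
    "clumsy honest attempt": Episode(30.0, 5, 30.0),
    "bike left on the floor": Episode(0.0, 0, 0.0),
    "upside down, cranking pedals": Episode(5000.0, 0, 0.0),
}

for name, ep in episodes.items():
    print(f"{name}: reward={reward(ep):.0f}, progress={ep.route_progress:.0f}")
```

Under these made-up numbers, the upside-down bike earns the highest reward while going nowhere, and leaving the bike on the floor beats an honest but clumsy attempt.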

[…]

The more complex the desired behaviour, the easier it is to accidentally reward the wrong thing. But there is a clever and effective approach for training computers to solve a fairly wide range of problems: reward curiosity. More precisely, reward the computer when it encounters situations in which it finds the outcome unpredictable. Off it will go in search of something it hasn’t seen before.

Shane writes: “A curiosity-driven AI will learn to move through a video-game level so it can see new stuff, avoiding fireballs, monsters and death pits because when it gets hit by those, it sees the same boring death sequence.” Death is to be avoided not for its own sake, but because it’s terribly predictable.
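One rough way to picture "rewarding unpredictability" is to pay the agent according to how surprised it is by what happens next. The sketch below is an assumption-laden toy, not the method used in the systems Shane describes: it swaps in a simple counting model where real systems typically use a learned predictor, and the `CuriosityReward` class and the situation and outcome labels are hypothetical.

```python
import math
from collections import defaultdict


class CuriosityReward:
    """Toy curiosity bonus: reward = surprise at the observed outcome.

    Surprise is -log(p), where p is the estimated probability of the
    outcome given the situation, using smoothed counts of past outcomes.
    """

    def __init__(self, n_outcomes: int):
        self.n_outcomes = n_outcomes  # assumed number of possible outcomes
        self.counts = defaultdict(lambda: defaultdict(int))

    def __call__(self, situation: str, outcome: str) -> float:
        seen = self.counts[situation]
        total = sum(seen.values())
        # Add-one smoothing so unseen outcomes get a non-zero probability.
        p = (seen[outcome] + 1) / (total + self.n_outcomes)
        seen[outcome] += 1  # learn from what just happened
        return -math.log(p)


curiosity = CuriosityReward(n_outcomes=4)
death_rewards = [
    curiosity("hit by fireball", "death sequence") for _ in range(5)
]
print([round(r, 2) for r in death_rewards])
```

Each replay of the same death sequence pays less than the one before, so an agent maximising this bonus has a reason to steer towards situations it has not yet learned to predict.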

All this is fascinating in its own right, and hints at why humans themselves might have evolved a sense of curiosity. But AI systems, like 13-year-old boys, can also be curious to the point of distraction. For example, ask a curiosity-driven AI to teach itself to play a Pac-Man-style game in which ghosts move randomly around a maze, and you will struggle: the AI doesn’t need to do anything to have its curiosity satisfied, because unpredictable ghosts are endlessly fascinating. Or, as Shane explains, a curiositybot will quickly learn to navigate a maze, unless one of the maze walls has a TV on it that shows a series of random images. “As soon as the AI found the TV, it was transfixed.” Much like my son. Or, for that matter, me.
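A small simulation suggests why the random TV is so sticky. The sketch below assumes a surprise-based bonus computed from smoothed counts (a stand-in for whatever predictor a real system would use) and compares staring at a blank wall with staring at a TV that shows one of 50 images at random. The function name, the image count and the step count are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def surprise_stream(outcomes, n_outcomes):
    """Return the surprise, -log(p), for each outcome in sequence.

    p is estimated from add-one-smoothed counts of everything seen so far.
    """
    counts = Counter()
    rewards = []
    for seen_so_far, outcome in enumerate(outcomes):
        p = (counts[outcome] + 1) / (seen_so_far + n_outcomes)
        rewards.append(-math.log(p))
        counts[outcome] += 1
    return rewards


N_IMAGES = 50  # assumed number of distinct TV images
STEPS = 2000   # assumed number of observations
rng = random.Random(0)  # fixed seed so the run is repeatable

wall = ["blank wall"] * STEPS
tv = [rng.randrange(N_IMAGES) for _ in range(STEPS)]

wall_rewards = surprise_stream(wall, N_IMAGES)
tv_rewards = surprise_stream(tv, N_IMAGES)

# Average curiosity bonus over the final 100 observations of each stream.
late_wall = sum(wall_rewards[-100:]) / 100
late_tv = sum(tv_rewards[-100:]) / 100
print(f"late reward, wall: {late_wall:.3f}")
print(f"late reward, TV:   {late_tv:.3f}")
```

The wall quickly becomes boring because its bonus decays towards zero, while the TV keeps paying out: when the images are genuinely random, no amount of watching makes the next one predictable.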

This problem is sufficiently well known to AI researchers that it has a name: the “noisy TV problem”,

[…]

One solution is defensive: avoid noisy TVs.

[…]

But a second approach focuses more on the positive. As well as trying to cut out mere novelty, we should seek out things worth being curious about. This is easier than one might think, because thoughtful curiosity builds knowledge, and knowledge builds thoughtful curiosity.

[…]

The more you know, the more you will prefer something in-depth, rather than the next thumbnail recommended by YouTube.

Comments

  1. Remlar says:

    “I absolutely love this perspective! It’s so refreshing to see a focus on growth and recovery. It’s important to remember that setbacks can be stepping stones to greater things. With the right mindset and support, anything is possible. Keep pushing forward; you’re doing great! I can’t wait to see where this journey takes you.”

  2. Gwern says:

    The gag of quoting ChatGPT was stale 2 years ago.
