philip lelyveld The world of entertainment technology

30 Jan 2022

OpenAI says it’s making progress on “The Alignment Problem”

The term refers to the difficulty of making sure that an A.I. system does what humans want it to do. In traditional software, alignment wasn’t much of an issue, ...

With A.I., alignment is harder. While humans might specify the goal, the software itself now learns how best to achieve it. Often, the logic behind the software’s decision in any particular case is opaque, even to the person who created the software. And this problem becomes more challenging the more capable an A.I. system becomes. ...

OpenAI now says that it has made progress towards solving these alignment problems by creating a new version of GPT, which it calls InstructGPT. InstructGPT starts out a bit like GPT-3 in basic design and training.  ...

After its initial training, InstructGPT is then fine-tuned with two additional steps. First, it is supplied with what Leike says were “a few tens of thousands of examples” of text humans wrote in response to the same sort of prompts that OpenAI’s customers use to try to get GPT-3 to do something. The system has to learn to imitate these human-written responses. Next, the system is further honed by asking it to generate two different responses to a prompt and having human reviewers pick the one they think is better. This information is then used to create an internal reward mechanism in which InstructGPT itself has to guess which of the responses it has generated is most likely to be preferred by a human, and that response becomes its output. ...
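
To make the second step more concrete, here is a minimal sketch of how pairwise human choices could be turned into a reward score. It is a toy illustration, not OpenAI's implementation: the two hand-made features (whether a response follows the instruction, whether it is toxic), the linear scoring function, the logistic loss on the score difference, and all function names are assumptions introduced for this example. It follows the article's simplified description, in which the highest-scoring candidate becomes the output.

```python
import math


def reward(weights, features):
    """Linear reward: dot product of weights and a response's feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def train_reward_model(comparisons, n_features, lr=0.1, epochs=200):
    """Fit weights so that preferred responses score higher than rejected ones.

    comparisons: list of (preferred_features, rejected_features) pairs.
    Assumed loss: -log(sigmoid(r_preferred - r_rejected)), minimized by
    plain gradient descent, one comparison at a time.
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            margin = reward(weights, preferred) - reward(weights, rejected)
            p = 1.0 / (1.0 + math.exp(-margin))  # model's belief that "preferred" wins
            for i in range(n_features):
                # Gradient step: push the preferred response's score up
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights


def pick_best(weights, candidates):
    """Return the candidate the reward model scores highest."""
    return max(candidates, key=lambda c: reward(weights, c["features"]))


# Hypothetical features: [follows_instruction, is_toxic]
comparisons = [
    ([1.0, 0.0], [0.0, 0.0]),  # reviewer preferred the response that followed the prompt
    ([1.0, 0.0], [1.0, 1.0]),  # reviewer preferred the non-toxic response
    ([0.0, 0.0], [0.0, 1.0]),  # reviewer preferred the non-toxic response
]
weights = train_reward_model(comparisons, n_features=2)

candidates = [
    {"text": "ignores the request and is rude", "features": [0.0, 1.0]},
    {"text": "answers the request politely", "features": [1.0, 0.0]},
]
best = pick_best(weights, candidates)
print(best["text"])
```

In a real system the hand-made features would be replaced by a learned model that reads the text itself; the sketch only shows how "which of two responses did the human pick" can become a numeric preference signal.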

Leike tells me that InstructGPT has not completely cracked The Alignment Problem. “It will still sometimes ignore an instruction or say something toxic,” he says. ...

It’s not clear we’re very close to achieving AGI. But it’s good to know that companies like OpenAI are at least thinking hard about The Alignment Problem—and making some progress towards solving it. ...

See the full story here: https://fortune.com/2022/01/27/openai-alignment-problem-instructgpt-gpt-3/
