
Consider the task of summarizing a piece of text. Large pretrained models aren't very good at summarization. In the past we found that training a model with reinforcement learning from human feedback helped align model summaries with human preferences on short posts and articles. But judging summaries of entire books takes a lot of effort to do directly since a human would need to read the entire book, which takes many hours.

To address this problem, we additionally make use of recursive task decomposition: we procedurally break up a difficult task into easier ones. In this case we break up summarizing a long piece of text into summarizing several shorter pieces. Compared to an end-to-end training procedure, recursive task decomposition has the following advantages:

1. Decomposition allows humans to evaluate model summaries more quickly by using summaries of smaller parts of the book rather than reading the source text.
2. It is easier to trace the summary-writing process. For example, you can trace to find where in the original text certain events from the summary happen. See for yourself on our summary explorer!
3. Our method can be used to summarize books of unbounded length, unrestricted by the context length of the transformer models we use.

This work is part of our ongoing research into aligning advanced AI systems, which is key to our mission. As we train our models to do increasingly complex tasks, making informed evaluations of the models' outputs will become increasingly difficult for humans.
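The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not our actual training pipeline: `summarize_chunk` is a hypothetical stand-in for a learned summarization model (here it just truncates, so the sketch runs with no ML dependencies), and the chunk sizes are arbitrary. The key structure is that chunk summaries are concatenated and summarized again, recursively, until the text fits in a single chunk.

```python
def summarize_chunk(text: str, max_words: int = 20) -> str:
    """Placeholder summarizer: keep the first `max_words` words.
    In practice this would be a call to a trained model."""
    return " ".join(text.split()[:max_words])


def split_into_chunks(text: str, chunk_words: int = 100) -> list[str]:
    """Split text into pieces of roughly `chunk_words` words each."""
    words = text.split()
    return [" ".join(words[i:i + chunk_words])
            for i in range(0, len(words), chunk_words)]


def recursive_summarize(text: str, chunk_words: int = 100,
                        max_words: int = 20) -> str:
    """Summarize each chunk, join the chunk summaries, and recurse
    until the remaining text fits in a single chunk."""
    if len(text.split()) <= chunk_words:
        return summarize_chunk(text, max_words)
    chunk_summaries = [summarize_chunk(c, max_words)
                       for c in split_into_chunks(text, chunk_words)]
    return recursive_summarize(" ".join(chunk_summaries),
                               chunk_words, max_words)
```

Because the model only ever sees one chunk (or one batch of chunk summaries) at a time, the input length is bounded at every step, which is what lifts the restriction imposed by the transformer's context window.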
