Factored Cognition

In this project, we explore whether we can solve difficult problems by composing small and mostly context-free contributions from individual agents who don't know the big picture.

What is factored cognition?

Imagine the set of questions you can answer in 15 minutes using a computer. There's a lot you can do. You can look up facts, do calculations, weigh considerations, and thus answer many questions that require a bit of research and deliberation. But there's also a lot you can't do. If a question is about some field of physics you've never heard of, say "What does a field theory look like in which supersymmetry is spontaneously broken?", you probably won't have time to learn enough to give a good answer.

Now consider a thought experiment: imagine that during your 15 minutes you can delegate up to 100 tasks to fast copies of yourself. These assistants are just as capable and motivated as you are and also have only 15 minutes of subjective time, but they work much faster, so that when you delegate a task, you observe the answer immediately. Clearly, you can do a lot more with the help of your 100 assistants than you could on your own. We'll call this a one-step amplification of yourself.

What if we iterated this process, so that each of your assistants in turn had access to 100 assistants, and so on? What capabilities could we implement through iterated amplification, and what tasks would stay out of reach, if any?

Factored cognition refers to mechanisms like this, where sophisticated learning and reasoning are broken down (or factored) into many small and mostly independent tasks.
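
To make the recursive structure of the thought experiment concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not an implementation from this project: the names amplify, toy_agent, and delegate are hypothetical, the "agent" is a toy function that can only add two numbers on its own, and the 15-minute time limit is not modeled. Only the limit of 100 delegated tasks and the nesting of assistants are taken from the text.

```python
from typing import Callable

# Hypothetical types for illustration: an agent receives a question and a
# `delegate` function it may call to hand a subquestion to an assistant.
Delegate = Callable[[str], str]
Agent = Callable[[str, Delegate], str]


def amplify(agent: Agent, depth: int, budget: int = 100) -> Callable[[str], str]:
    """Return an answerer in which `agent` may delegate up to `budget` tasks.

    Each delegated task is handled by a copy of the agent that has been
    amplified `depth - 1` times. At depth 0 the agent works alone.
    """
    def answer(question: str) -> str:
        calls = 0  # number of tasks this copy has delegated so far

        def delegate(subquestion: str) -> str:
            nonlocal calls
            if depth == 0:
                raise RuntimeError("no assistants available at depth 0")
            if calls >= budget:
                raise RuntimeError("delegation budget exhausted")
            calls += 1
            return amplify(agent, depth - 1, budget)(subquestion)

        return agent(question, delegate)

    return answer


def toy_agent(question: str, delegate: Delegate) -> str:
    """Toy agent (an assumption): sums space-separated integers, but can
    only add two numbers itself; longer lists are split and delegated."""
    numbers = question.split()
    if len(numbers) == 1:
        return numbers[0]
    if len(numbers) == 2:
        return str(int(numbers[0]) + int(numbers[1]))
    mid = len(numbers) // 2
    left = delegate(" ".join(numbers[:mid]))
    right = delegate(" ".join(numbers[mid:]))
    return str(int(left) + int(right))


if __name__ == "__main__":
    # Three levels of amplification suffice for eight numbers.
    print(amplify(toy_agent, depth=3)("1 2 3 4 5 6 7 8"))  # prints 36
```

In this sketch, amplify(toy_agent, depth=1) corresponds to the one-step amplification described above, and larger values of depth correspond to iterating the process. Each copy sees only its own subquestion, which is one simple way to model contributions that are mostly context-free.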

Why does it matter?

Our mission is to find scalable ways to leverage machine learning for deliberation. This requires that we can view thinking as a task with data that we can train ML systems on. The only concrete way we know of doing that is to record how people deliberate using explicit actions in narrow contexts, i.e., to break deliberation into many small tasks. We can then use ML to imitate what people do and scale up by repeatedly applying the learned policy. This is a central component of Iterated Distillation and Amplification, Paul Christiano's approach to AI alignment.



Thanks to Paul Christiano, Rohin Shah, Daniel Dewey, Owain Evans, Andrew Critch, William Saunders, Ozzie Gooen, Ryan Carey, Jeff Wu, and Jan Leike for feedback on the chapters.