OpenAI’s Superalignment team will address the core technical challenges of controlling superintelligent AI systems and ensuring their alignment with human values and goals.
To accomplish this, the team is developing a ‘human-level automated alignment researcher,’ which is itself an AI system. This automated researcher will use human feedback and help evaluate other AI systems, playing a critical role in advancing alignment research. The ultimate aim is to build AI systems that can conceive, implement, and refine better alignment techniques.
OpenAI’s hypothesis is that AI systems can make faster and more effective progress on alignment research than humans can. By having AI systems and human researchers collaborate, the team aims to keep improving how well AI stays aligned with human values.
So, using AI to control other AI: what do you think?