AI Alignment
The Basic Idea
Imagine that you are trying to teach a toddler to behave properly in public. You want the child to understand and uphold your values and ethical judgments and to avoid inappropriate behavior. If the child behaves as you intended, you could say that they are aligned with your values. Conversely, if the child misbehaves and pursues their own objectives, they are misaligned with your values.
A similar challenge arises in the field of artificial intelligence (AI). AI alignment refers to the goal of designing AI systems so that their objectives and behavior match the values and goals of human users or society at large. Experts in the field often speak of the ‘alignment problem’: the concern that, as AI systems become more sophisticated and autonomous, they may act in ways that are inconsistent with human values or intentions. Achieving AI alignment is crucial for preventing the unintended consequences, risks, and ethical concerns associated with AI technologies.
As large language models, such as OpenAI’s ChatGPT or Google’s LaMDA, become more powerful, they begin to exhibit capabilities that were never explicitly programmed into them. The goal of AI alignment is to ensure that these emergent capabilities serve our collective goals and that AI systems continue to function as intended.
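To make the idea concrete, here is a minimal, hypothetical Python sketch of one common way misalignment can arise: a system optimizes a measurable proxy objective (engagement) instead of the harder-to-measure objective we actually care about (user wellbeing). All names and numbers below are illustrative assumptions, not taken from any real system.

```python
# A toy illustration of the alignment problem: an agent that maximizes a
# proxy metric can diverge from the true goal its designers intended.
# Every action, score, and label here is hypothetical.

# Candidate actions, each with a measurable proxy score (engagement)
# and a true value that is hard to measure directly (wellbeing).
actions = {
    "recommend_balanced_content": {"engagement": 0.6, "wellbeing": 0.9},
    "recommend_outrage_content":  {"engagement": 0.9, "wellbeing": 0.2},
    "recommend_nothing":          {"engagement": 0.1, "wellbeing": 0.5},
}

# The system can only optimize what it can measure: the proxy.
chosen = max(actions, key=lambda a: actions[a]["engagement"])

# What the designers actually wanted it to optimize.
intended = max(actions, key=lambda a: actions[a]["wellbeing"])

print(f"Agent chooses:   {chosen}")    # recommend_outrage_content
print(f"Humans intended: {intended}")  # recommend_balanced_content
print("Misaligned!" if chosen != intended else "Aligned.")
```

Even in this toy setting, the action that scores highest on the proxy is not the action humans intended; closing that gap between what a system optimizes and what we actually value is precisely what alignment research aims to do.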
About the Author
Dr. Lauren Braithwaite
Dr. Lauren Braithwaite is a Social and Behaviour Change Design and Partnerships consultant working in the international development sector. Lauren has worked with education programmes in Afghanistan, Australia, Mexico, and Rwanda, and from 2017 to 2019 she was Artistic Director of the Afghan Women’s Orchestra. Lauren earned her PhD in Education and MSc in Musicology from the University of Oxford, and her BA in Music from the University of Cambridge. When she’s not putting pen to paper, Lauren enjoys running marathons and spending time with her two dogs.