
Space Technology
Preventing Embedded Yudkowskian Outer Misalignment
Outer alignment is the condition in which a system’s observable outputs and interactions conform to human intent, regardless of the internal mechanisms driving those behaviors; the focus is on whether what the system does matches what its operators want it to do. Inner alignment is the condition in which the system’s learned goals match the reward or objective function specified by the developers, ensuring that the optimization process it

Yatin Taneja
Mar 3 · 8 min read


