The ability to guide an AI system's behavior according to human intentions or specific objectives while avoiding unintended outcomes through design mechanisms and feedback loops during development.