Interpretability, or understanding how AI models work, can help mitigate many AI risks, such as misalignment and misuse, that stem from AI systems' opacity (Dario Amodei) April 26, 2025 by appcompact