Help us track the emergence of concerning AI capabilities.
Detailed instructions for contributing are available in our contributing.md on GitHub.
Open a PR, and one of the repo owners will review and approve it.
Important: Please fact-check everything you submit, as well as existing content. We strive for accuracy and reproducibility.
We welcome research papers, technical reports, and documented incidents that demonstrate concerning AI behaviors. We track evidence from arXiv, major conferences, lab publications, and reproducible experiments.
Suggest new capability categories or refine existing ones. Are there concerning behaviors we should be tracking that don't fit our current categories?
Help identify well-founded directions for improving the safety and security of AI systems to prevent software-ultron from taking shape.
Found an error or outdated information? Let us know. Accuracy is critical if this resource is to be useful.