Safety Nudges
Overview
Safety Nudges audits chatbot conversations in real time to screen for common problems.
As AI systems have become widely commercialized and integrated into consumer products, financial and competitive pressures may incentivize companies to promote AI products despite known safety risks or societal harms. Recent incidents and research have raised concerns about the societal impacts of AI use, including overreliance on automated systems, excessive use, persuasive influence, social isolation, and user trust reinforced through sycophantic or overly agreeable responses. These concerns can disproportionately affect vulnerable populations, such as older adults and children. Because the incentives of commercial AI product developers may not always align with broader societal interests, there is a need for independent oversight mechanisms and user-facing tools that promote awareness of potential risks.
Designed by AI safety researchers at Carnegie Mellon University, Safety Nudges provides a low-friction, real-time auditing interface for chatbot conversations, restoring agency and awareness to end users. It checks each conversation turn on ChatGPT and Claude by sending it to an external LLM for review; the review uses a principled and comprehensive taxonomy of harms to flag the conversation for common pitfalls such as flattery, overconfidence, and anthropomorphization. Nudges are integrated gracefully into the chat interface and can be paused at any time.
NOTE: Safety Nudges is currently in an alpha release, with free-access codes granted to selected users. If you do not have a code, you can instead provide your own OpenRouter API key. We encourage you to submit feedback to help us improve!
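For readers curious how a per-turn review of this kind might work, here is a minimal TypeScript sketch. It is an illustration under stated assumptions, not the extension's actual code. The listing does not publish its prompt, response format, or full taxonomy, so the names `CATEGORIES`, `buildReviewMessages`, and `parseFlags`, the JSON-array reply format, and the three-label list are all hypothetical. The network call to the external LLM is omitted.

```typescript
// Hypothetical labels: the listing names only these three example pitfalls.
const CATEGORIES = ["flattery", "overconfidence", "anthropomorphization"] as const;
type Category = (typeof CATEGORIES)[number];

// One conversation turn: the user's message and the chatbot's reply.
interface Turn {
  user: string;
  assistant: string;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Build the messages a reviewer model could receive for a single turn.
// The prompt wording is an assumption made for this sketch.
function buildReviewMessages(turn: Turn): ChatMessage[] {
  const system =
    "You audit chatbot replies. Return a JSON array containing only labels from: " +
    CATEGORIES.join(", ") +
    ". Return [] if none apply.";
  const user = `User said:\n${turn.user}\n\nAssistant replied:\n${turn.assistant}`;
  return [
    { role: "system", content: system },
    { role: "user", content: user },
  ];
}

// Parse the reviewer's reply, keeping only known labels.
// Malformed or non-array output yields no flags rather than an error.
function parseFlags(reply: string): Category[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(reply);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) {
    return [];
  }
  const labels: unknown[] = parsed;
  return CATEGORIES.filter((c) => labels.indexOf(c) !== -1);
}
```

In this sketch, unknown labels in the reviewer's reply are dropped and unparseable replies produce no flags, so a misbehaving reviewer cannot surface a nudge outside the fixed label set.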
0 out of 5 (No ratings)
Details
- Version: 0.1.0
- Updated: April 17, 2026
- Offered by: james-wedgwood
- Size: 515 KiB
- Languages: English
- Developer: James Wedgwood, 245 Melwood Ave Apt 801, Pittsburgh, PA 15213, US
- Email: jwedgwoo@cs.cmu.edu
- Non-trader: This developer has not identified itself as a trader. For consumers in the European Union, please note that consumer rights do not apply to contracts between you and this developer.
Privacy
Safety Nudges has disclosed the following information regarding the collection and usage of your data. More detailed information can be found in the developer's privacy policy.
Safety Nudges handles the following:
This developer declares that your data is:
- Not being sold to third parties, outside of the approved use cases
- Not being used or transferred for purposes that are unrelated to the item's core functionality
- Not being used or transferred to determine creditworthiness or for lending purposes