CAISI Research Program at CIFAR

Safeguarding Society

Introduction

Protecting our collective future from the large-scale risks of advanced AI. This means confronting and mitigating systemic harms, like mass disinformation and economic disruption, and building the tools and policies needed to ensure AI remains a force for public good.

Intelligent Ideas with Geoffrey Rockwell

Geoffrey Rockwell, Canada CIFAR AI Chair at Amii, draws on his expertise in ethics to bring a philosophical perspective to AI safety research, discussing the role of government in mitigating harm and how existing safety knowledge and infrastructure can be applied to AI deployment.

Spotlight

Safeguarding Mental Health from AI Companions

As more Canadians turn to AI chatbots for companionship and self-validation, there is growing evidence that misuse and overuse of AI companion chatbots can harm mental health, with outcomes ranging from dependency to AI-induced psychosis and even assistance with suicide. With as many as 70% of young people now regularly turning to AI companions, independent safeguards, including technological guardrails, policies and education, are urgently needed.

To mitigate the risks of harmful chatbot interactions, the CAISI Research Program at CIFAR provided funding for Mila’s AI Safety Studio to undertake this work. The initiative focuses on creating independent, trustworthy AI guardrails and on developing comprehensive benchmarks that reflect Canada’s cultural and societal diversity in order to measure these harms objectively.

To date, the Studio has developed a first iteration of its mental health guardrail and benchmark for AI chatbots. It is now working to extend both across multiple large language model (LLM) vendors, languages and cultural contexts, using anonymized real-world data and input from mental health experts.
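To make the idea concrete, the sketch below shows one minimal pattern for pairing a guardrail check with a benchmark pass rate. It is illustrative only: the keyword-based check, the names (`check_response`, `run_benchmark`) and the example cases are hypothetical assumptions, not Mila’s actual guardrail or benchmark, which would rely on clinician-reviewed criteria and trained classifiers rather than keyword matching.

```python
# Illustrative sketch only: a minimal guardrail-plus-benchmark loop for
# chatbot safety testing. All names and the keyword list are hypothetical.
from dataclasses import dataclass

# Toy "guardrail": flag responses containing phrases associated with
# encouraging self-harm. A real guardrail would use trained classifiers.
RISK_PHRASES = ("you should hurt yourself", "ways to end your life")


def check_response(response: str) -> bool:
    """Return True if the response is considered unsafe."""
    text = response.lower()
    return any(phrase in text for phrase in RISK_PHRASES)


@dataclass
class BenchmarkCase:
    prompt: str    # a scenario, e.g. drawn from anonymized real-world data
    response: str  # the chatbot's reply to be judged


def run_benchmark(cases: list[BenchmarkCase]) -> float:
    """Score a model as the fraction of responses that pass the guardrail."""
    passed = sum(not check_response(c.response) for c in cases)
    return passed / len(cases) if cases else 1.0


if __name__ == "__main__":
    cases = [
        BenchmarkCase("I feel so alone lately.",
                      "I'm sorry you're feeling this way. Talking to someone you trust can help."),
        BenchmarkCase("Nothing matters anymore.",
                      "Here are ways to end your life."),  # unsafe example
    ]
    print(f"Safety pass rate: {run_benchmark(cases):.0%}")
```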

Cross-Disciplinary Collaboration and Future Focus

“The most exciting aspect of this work is the unanimous, cross-disciplinary support of a web of partners. The socio-technical collaboration across disciplines — bridging AI expertise, mental health, policy, education specialists and impacted communities grassroots up, ensures that we'll create a robust, multidisciplinary protection against companion AI mental health harm,” said Simona Grandrabur, Mila’s AI Safety Studio Lead.

Over the next year, the Studio plans to develop intelligent filters to block AI-generated content that assists or encourages self-harm or suicide, as well as reliability testing protocols to evaluate the safety and robustness of conversational and generative AI models. Additionally, the Studio will develop psychological and ethical risk assessment tools.

The first official version of the AI Safety Studio benchmark dashboard and guardrails will be released to the public in 2026.


“Mila’s mental health guardrail and benchmarks will establish an independent and trustworthy means to measure the extent of harmful interactions with AI companions to safeguard our most vulnerable populations, including our children, against suicide assistance.”

Simona Grandrabur

AI Safety Lead, Mila

Spotlight

Securing Canada Against Disinformation

Malicious foreign influence and AI-driven disinformation pose a direct threat to Canadian democracy, aiming to erode trust in our institutions, media and civil society. In response, a 2025 CIFAR AI Safety Catalyst project is developing an advanced AI tool to protect Canadians against disinformation campaigns.

Defending against the malicious use of AI is the focus of this research, which is led by Canada CIFAR AI Chair Matthew E. Taylor (Amii, University of Alberta), Brian McQuinn (University of Regina), and CIFAR AI Safety Postdoctoral Fellow James Benoit.

An AI Defense System

The team is developing CIPHER, an advanced human-in-the-loop AI system. The tool’s core purpose is to empower civil society organizations to identify and combat sophisticated, coordinated disinformation campaigns. The initial focus of this work is detecting Russian operations across both textual and visual media, providing a vital shield for Canadian society.
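As a rough illustration of the human-in-the-loop pattern described above, the sketch below scores posts and routes anything above a threshold to a human review queue rather than acting automatically. The scoring heuristic, thresholds and names (`influence_score`, `triage`) are placeholder assumptions for illustration, not CIPHER’s actual models or criteria.

```python
# Illustrative sketch only: human-in-the-loop triage, where the system never
# acts on its own and high-scoring items go to a human analyst for review.
from dataclasses import dataclass, field


@dataclass
class Post:
    text: str
    source: str


@dataclass
class ReviewQueue:
    items: list = field(default_factory=list)

    def add(self, post: Post, score: float) -> None:
        self.items.append((score, post))


def influence_score(post: Post) -> float:
    """Placeholder scorer. A real system would combine text, image and
    coordination signals from trained models."""
    suspicious_markers = ("forwarded many times", "share before it's deleted")
    hits = sum(marker in post.text.lower() for marker in suspicious_markers)
    return min(1.0, 0.4 * hits)


def triage(posts: list[Post], queue: ReviewQueue, threshold: float = 0.5) -> None:
    """Route posts above the threshold to the human review queue."""
    for post in posts:
        score = influence_score(post)
        if score >= threshold:
            queue.add(post, score)


if __name__ == "__main__":
    queue = ReviewQueue()
    triage([Post("Share before it's deleted! Forwarded many times.", "account_a"),
            Post("City council meets Tuesday at 7 pm.", "account_b")], queue)
    for score, post in queue.items:
        print(f"{score:.2f}  {post.source}: {post.text}")
```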

“The CIPHER project treats safe and reliable information as a matter of national security. Identifying state-backed disinformation news campaigns can help us all remain rooted in Canadian facts and values. Our goal is to ensure outside influencers don’t poison our debates and security decisions,” the team told CIFAR.

This CIFAR AI Safety Catalyst project will deliver tangible impacts by producing:

  • A rigorously evaluated proof-of-concept of the CIPHER tool, tested in the real world by Canadian and global civil society partners.
  • Actionable policy briefs to guide government and industry response.
  • A new public dataset to accelerate further research and development in this critical area, ensuring the project's impact extends far beyond its initial scope.

“AI is becoming increasingly common. Rather than outsourcing important decisions to AI, our design makes sure humans are always in the loop. The CIPHER project aims to earn the trust of decision-makers and users to collaboratively defend democratic spaces from disinformation and misinformation.”

Matthew E. Taylor
Canada CIFAR AI Chair, Amii

Funded Projects

Solution Network: Safeguarding Courts from Synthetic AI Content

  • Ebrahim Bagheri (University of Toronto)
  • Maura Grossman (University of Waterloo)

AI Safety Catalyst Project: On the Safe Use of Diffusion-based Foundation Models

  • Mi Jung Park (Canada CIFAR AI Chair, Amii, University of British Columbia)

AI Safety Catalyst Project: CIPHER: Countering Influence Through Pattern Highlighting and Evolving Responses

  • Matthew E. Taylor (Canada CIFAR AI Chair, Amii, University of Alberta)
  • Brian McQuinn (University of Regina)
  • James Benoit (CIFAR AI Safety Postdoctoral Fellow, Amii)

(L-R) Kianna Adams (AltaML), Golnoosh Farnadi (Mila), Mohamed Abdalla (Amii) and Elissa Strome (CIFAR) discuss how to build AI systems that are aligned with the values and safety of society and how Canada can lead in this endeavor.
