Posts tagged AI x-risk
7 posts
Vivaria: METR's platform for evaluating AI agents
METR open-sourced its platform for evaluating AI agents. It's called Vivaria.
The METR Task Standard
At METR, I've helped define a standard for tasks that evaluate language model agents for autonomous capabilities.
Dangerous capabilities evaluations for AI
A talk I gave at meetups of Toronto AI Safety and the Wisconsin AI Safety Initiative.
I'm joining ARC Evals
Why I think this role at this organization will let me meaningfully reduce AI x-risk.
Reproducing ARC Evals' recent report on language model agents
I was able to build an agent roughly as capable as ARC Evals'.
I'm leaving my job. Next, AI x-risk
Why I'm leaving and why I want to help reduce the risk of humanity going extinct because of AI.
Creating an AI safety chatbot using LangChain and GPT-3
chat-langchain and ChatGPT made it easy.