Posts tagged AI x-risk

7 posts

Vivaria: METR's platform for evaluating AI agents

2024-08-14

METR has open-sourced Vivaria, its platform for evaluating AI agents.

The METR Task Standard

2024-03-02

At METR, I've helped define a standard for tasks that evaluate language model agents for autonomous capabilities.

Dangerous capabilities evaluations for AI

2023-12-02

A talk I gave at meetups of Toronto AI Safety and the Wisconsin AI Safety Initiative.

I'm joining ARC Evals

2023-11-03

Why I think this role at this organization will let me meaningfully reduce AI x-risk.

Reproducing ARC Evals' recent report on language model agents

2023-09-01

I was able to build an agent roughly as capable as ARC Evals'.

I’m leaving my job. Next, AI x-risk

2023-08-16

Why I’m leaving and why I want to help reduce the risk of humanity going extinct because of AI.

Creating an AI safety chatbot using LangChain and GPT-3

2023-03-27

chat-langchain and ChatGPT made it easy.