Posts tagged AI x-risk

8 posts

METR publishes RE-Bench

2024-12-30

I contributed to METR's AI research and development benchmark, RE-Bench.

2024-08-14

METR open-sourced Vivaria, its platform for evaluating AI agents.

2024-03-02

At METR, I've helped define a standard for tasks that evaluate language model agents for autonomous capabilities.

2023-12-02

A talk I gave at meetups of Toronto AI Safety and the Wisconsin AI Safety Initiative.

2023-11-03

Why I think this role at this organization will let me meaningfully reduce AI x-risk.

2023-09-01

I was able to build an agent roughly as capable as ARC Evals'.

2023-08-16

Why I’m leaving and why I want to help reduce the risk of humanity going extinct because of AI.

2023-03-27

chat-langchain and ChatGPT made it easy.