The Inside View

Aug 23 2024 52 ep. 85 mins 128

The goal of this podcast is to create a place where people discuss their inside views about existential risk from AI.

Owain Evans - AI Situational Awareness, Out-of-Context Reasoning

Aug 23 2024 135 mins

Owain Evans is an AI Alignment researcher, research associate at the Center for Human-Compatible AI at UC Berkeley, and now leading a new AI safety research group. In this episode we discuss two of his recent papers, “Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs” and “Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Trai

[Crosspost] Adam Gleave on Vulnerabilities in GPT-4 APIs (+ extra Nathan Labenz interview)

May 17 2024 136 mins

This is a special crosspost episode where Adam Gleave is interviewed by Nathan Labenz from the Cognitive Revolution. At the end I also have a discussion with Nathan Labenz about his takes on AI. Adam Gleave is the founder of Far AI, and he and Nathan discuss finding vulnerabilities in GPT-4's fine-tuning and Assistant APIs, Far AI's work exposing exploitable flaws in "

Ethan Perez on Selecting Alignment Research Projects (ft. Mikita Balesni & Henry Sleight)

Apr 09 2024 36 mins

Ethan Perez is a Research Scientist at Anthropic, where he leads a team working on developing model organisms of misalignment. YouTube: https://youtu.be/XDtDljh44DM Ethan is interviewed by Mikita Balesni (Apollo Research) and Henry Sleight (Astra Fellowship) about his approach to selecting projects for AI Alignment research. A transcript & write-up will be availabl

Emil Wallner on Sora, Generative AI Startups and AI optimism

Feb 20 2024 102 mins

Emil is the co-founder of palette.fm (colorizing B&W pictures with generative AI) and was previously working in deep learning for Google Arts & Culture. We were talking about Sora on a daily basis, so I decided to record our conversation, and then proceeded to confront him about AI risk. Patreon: https://www.patreon.com/theinsideview Sora: https://openai.com/sora Palett

Evan Hubinger on Sleeper Agents, Deception and Responsible Scaling Policies

Feb 12 2024 52 mins

Evan Hubinger leads the Alignment Stress-Testing team at Anthropic and recently published "Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training". In this interview we mostly discuss the Sleeper Agents paper, but also how this line of work relates to his work with Alignment Stress-Testing, Model Organisms of Misalignment, Deceptive Instrumental Alignmen

[Jan 2023] Jeffrey Ladish on AI Augmented Cyberwarfare and compute monitoring

Jan 27 2024 33 mins

Jeffrey Ladish is the Executive Director of Palisade Research, which aims to "study the offensive capabilities of AI systems today to better understand the risk of losing control to AI systems forever". He previously helped build out the information security program at Anthropic. Audio is an edit & re-master of the Twitter Space on "AI Governance and cyberwarfar

Holly Elmore on pausing AI

Jan 22 2024 100 mins

Holly Elmore is an AI Pause Advocate who has organized two protests in the past few months (against Meta's open sourcing of LLMs and before the UK AI Summit), and is currently running the US front of the Pause AI Movement. Prior to that, Holly worked at a think tank, and she has a PhD in evolutionary biology from Harvard. [Deleted & re-uploaded because there were is

Podcast Retrospective and Next Steps

Jan 08 2024 63 mins

https://youtu.be/Fk2MrpuWinc

Paul Christiano's views on "doom" (ft. Robert Miles)

Sep 29 2023 4 mins

YouTube: https://youtu.be/JXYcLQItZsk Paul Christiano's post: https://www.lesswrong.com/posts/xWMqsvHapP3nwdSW8/my-views-on-doom

Neel Nanda on mechanistic interpretability, superposition and grokking

Sep 21 2023 124 mins

Neel Nanda is a researcher at Google DeepMind working on mechanistic interpretability. He is also known for his YouTube channel, where he explains what is going on inside neural networks to a large audience. In this conversation, we discuss what mechanistic interpretability is, how Neel got into it, his research methodology, his advice for people who want to get started, but also