EA - Concrete Advice for Forming Inside Views on AI Safety by Neel Nanda
The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Concrete Advice for Forming Inside Views on AI Safety, published by Neel Nanda on August 17, 2022 on The Effective Altruism Forum.

A lot of people want to form inside views on AI Safety, and this is a post with concrete advice on how to get started. I have a lot of hot takes on inside views and ways I think people misframe them, so I'll begin with some thoughts on this - if you just want concrete advice, I recommend skipping ahead. This post is aimed at people who already have some context on AI Safety and inside views and want help making progress; it's not pitched as a totally introductory resource.

Meta Thoughts on Inside Views

This is mostly a compressed version of a previous post of mine: How I Formed My Own Views About AI Safety (and ways that trying to do this was pretty stressful and counterproductive) - go read that post if you want something more in-depth!

Inside Views are Overrated

First point - I think people often wildly overrate inside views. I think they're important, and worth trying to cultivate (else I wouldn't write this post), but less so than many people (especially in the Bay Area) often think. Why? The obvious reason to form inside views is to form truer beliefs - AI X-risk is a weird, controversial and confusing thing, and it's important to have good beliefs about it. I think this is true to an extent, but in practice inside views tend to feel true a lot more than they are true. When I have an inside view, that just feels like how the world is; it feels deeply true and compelling. But empirically, a lot of people have inside views that are mutually contradictory, and all find their own views compelling. They can't all be right!

An alternate framing: there's a bunch of people in the world who are smarter than me and have spent longer thinking about AI Alignment than me. Two possible baselines for an outside view:

- Make a list of my top 5 'smart & high-status alignment researchers', and for any question, take a majority vote of what they all think
- Randomly pick one of the top 5 and believe everything they believe

Neither of these baselines is great. But I would predict that this actually tracks truth better than the vast majority of inside views - having correct beliefs about controversial and confusing topics is just really hard!

Relatedly, it's much more important to understand other people's views than to evaluate them - if I can repeat a full, gears-level model of someone's view back to them in a way that they endorse, that's a lot more valuable than figuring out how much I agree or disagree with their various beliefs and conclusions. I'm a lot more excited about someone who has a good gears-level model of what their top 5 alignment researchers believe and why, than I am about someone who confidently has their own beliefs but a fuzzy model - having several models lets you compare and contrast them, figure out novel predictions, better engage with technical questions, do much better research, etc.

Forming a "true" inside view - one where you fully understand something from first principles with zero deferring - is wildly impractical. For example, take the question of AI timelines. Really understanding this requires deep engagement with diverse topics like economics, AI hardware, international relations, tech financing, deep learning, politics, etc.
I'd guess that no one in the world is remotely close to an expert in all of these.

People often orient to inside views pretty unhealthily. Some themes I've noticed (especially in myself!):

- I should get incredibly stressed about this
- There's one true perspective on AI Alignment and I can find it if I just try hard enough
- Everyone around me seems confident that AI Safety matters, so they must all have great inside views, so this must be easy and I'm just not trying hard enough
- It's ...
