Joseph Carlsmith - Utopia, AI, & Infinite Ethics

Dwarkesh Patel · Beginner ·📄 Research Papers Explained ·3y ago


Key Takeaways

Joseph Carlsmith discusses utopia, AI, and infinite ethics, exploring the concept of a profoundly better future and the challenges of existential risk from artificial intelligence.

Full Transcript

So utopia, for me, just means a kind of profoundly better future, and I think it's important because I think it's just actually possible. I just think it's actually something that we could do. If we sort of play our cards right, we could just build a world that is radically better than the world we live in today.

Infinite ethics is ethics that tries to grapple with how we should act with respect to infinite worlds.

There's the middle ground between "I shall ignore this completely" and "I shall, you know, be a Jain," which is recognizing that this is a real trade-off, there's uncertainty here, and taking responsibility for how you're responding to that.

The future is a big thing to try to model with this tiny mind, and so, of necessity, you need to use these extremely lossy abstractions.

[Music]

[inaudible] of interviewing Joe Carlsmith, who's a senior research analyst at Open Philanthropy and a doctoral student in philosophy at the University of Oxford. Joe has a really interesting blog that I got to check out, called Hands and Cities, and that's the reason I wanted to have him on the podcast, because it has a bunch of thought-provoking and insightful posts about philosophy, morality, ethics, and the future. So I really wanted to talk to you, Joe, but do you want to give a bit of a longer intro on what you're up to?

Sure. So I work at Open Philanthropy on existential risk from artificial intelligence. I think about what's going to happen with AI, how we can make sure it goes well, and in particular how we can make sure that advanced AI systems are safe. And then I have a side project, which is this blog, where I write about philosophy and the future and things like that. That emerges partly from my background: before getting into AI and working at Open Philanthropy, I was in academic philosophy.

Okay, yeah, that's quite an ambitious side
project. I mean, given the length and the regularity of those posts, it's actually quite stunning. Do you want to talk more about what you're working on in AI at Open Philanthropy?

So it's a mix of things. Right now I'm thinking about AI timelines and what's called takeoff speeds, meaning how fast the transition is from pretty impressive AI systems to AI systems that are radically transformative. And I'm trying to use that to provide more perspective on the probability that everything goes terribly wrong.

I see. Okay. Well, what are the implications? I suppose it's higher or lower than I would expect. I guess if it's higher, maybe I should work on AI alignment, but other than that, what are the implications of that figure changing?

I think there are a number of implications just from understanding timelines with respect to how you prioritize. To some extent, the sooner something is, the more you need to be planning for it coming sooner, cutting more corners, or counting less on having more time. And I think overall, the higher you think the probability of catastrophe is, the easier it is for this to become the most important priority. I do think there's a range of probabilities where it maybe doesn't matter that much, but I think the difference between, say, one and ten percent is quite substantive, and the difference between 10 and 90 is quite substantive. And I know people in all of those ranges.

Gotcha. Okay, interesting. So let's back up here and talk a bit more about the philosophy motivating this. I think you identify as a longtermist. So maybe a big-picture question here: you have an interesting blog post about what the future, looking back on us, might think about the 21st century given the risks we're taking. So what do you think about the
possibility that we're potentially giving up resources, potentially dedicating, well, I'm not, you're dedicating your career, to building a future that, given the fact that you're alive now, you might find strange or disturbing or disgusting? To add more context to the question: from a utilitarian perspective the present is clearly much, much better than the past, but somebody from the past might think that there are a lot of things about the present that are kind of disturbing. They might not like how isolating a modern city might be. They might find the kinds of free or cheap information that you can access on your phone kind of disturbing. So how do you think about that?

Yeah, a few comments there. One, I do think that for most people throughout history, if you brought them to the present day, my guess is that fairly quickly, and depending on exactly the circumstances, they would come to prefer living in the present day to the past, even if there's a bit of future shock and some things being alienating or disturbing. But that said, I think the gap between historical humans and the present is actually much, much smaller, both in terms of time and other factors, than the gap I envision between present-day humans and future humans who are living, ideally, in a radically better situation. So I do expect a greater distance and possibly greater alienation when you first show up. My personal view is that the best futures are going to be such that if you really understood them, and if you really experienced what they're like, which may be a big step and might require extensive engagement and possibly changes to your capacities to understand and experience, then you would
think it's really good. And I think that's the relevant standard. So I worry less if the future is initially alienating, and the question for me is how I feel once I've really understood what's going on.

I see. So I wonder how much we should value that kind of inside view you would get into the future from being there. If you think about many existing ideologies, like, I don't know, think of an Islamist or something, they might say, "Listen, if you could just come to Iraq and feel the bliss of fighting for the caliphate, you would understand better than you can from the outside view, just sitting on a couch eating Doritos, what it's like to fight for a cause." And maybe their experience is kind of blissful in some way, but I feel like the outside view is more useful than the inside view there.

Well, I think there are a couple of different questions there. One is what the experience would be if you had it from the inside. And then there's, I think, a subtly different question, which is what your take on this would be if you fully understood, where fully understanding is not just a matter of having the internal experience of being in a certain situation, but also a matter of understanding what that situation is causing, what sort of beliefs are structuring the ideology, whether those beliefs are true, and all sorts of other factors. And it's the latter thing that I have in mind. So I'm not just imagining, "Oh, the future will feel good if you're there," because, sort of by hypothesis, the people who are there, at least one hopes, are enjoying it, or one hopes they're thumbs up. If the people who are there aren't thumbs up, that's a strange utopia. But I'm thinking more that, in addition to their perspective, there's a more holistic perspective, which is the full understanding, and
that's the perspective from which you would endorse this situation.

I see. And then another respect in which it's interesting to think about what they might think of us is, what will they think of the crazy risks we're taking by not optimizing for existential risks? One analogy you could offer, and I think Will MacAskill does this in his new book, is to think of us as teenagers in our civilization's history, and then think of the crazy things you did as a teenager. Maybe there is an aspect in which one would wish they could take back the crazy things they did as a teenager, but my impression is that most adults probably think that while the crazy things were kind of risky, they were very formative and important, and they feel nostalgic about the things they did in the past. So do you think that people in the future, looking back, are going to regret the way we were living in the 21st century, or will they look back and think, "Oh, that was kind of a cool time"?

I mean, I guess this is conditional on there being a future, which takes away a lot of the mystery here, but I doubt that they will look back with pleasure at the risks and horrors of the 21st century. If you just think about how we, or at least I, tend to think about something like the Cuban Missile Crisis or World War II, I don't personally have a kind of nostalgia of, "Oh, sure, it was risky, but it made me who I am," or something like that. I also want to say, I think it's true that when you look back on your teenage years there is often a sort of... let's say you did something crazy. You and your friends used to race around and play chicken or something at the local quarry, and it's like, "Oh, all right, but you survived." And the real reason not to do that is
the chunk of probability where you just died. So to some extent the ex post perspective of looking back on certain sorts of risk is not the right one. Especially for death risks, that's not the right perspective to use to calibrate your understanding of how to feel about it overall.

I see. Okay, so you brought up utopia, and you have a really interesting post about the concept of utopia. Do you want to talk a little bit more about this concept and why it's important, and also why we have so much trouble thinking of a compelling utopia?

Yeah. So utopia, for me, just means a kind of profoundly better future, and I think it's important because I think it's just actually possible. I think it's actually something that we could do. If we play our cards right, in non-crazy ways, we could just build a world that is radically better than the world we live in today. In particular, I think in thinking about that sort of possibility we often underestimate just how big the difference in value could be between our current situation and what's available. I think utopias are often anchored too hard on the status quo, changing it in small ways but imagining our fundamental situation basically unaltered. It's a little bit like the difference between having a crappy job and a beach vacation, and utopia is like everyone has a beach vacation. I don't know how you feel about beach vacations, but I think the difference is more like being asleep versus being awake, or living in a cave versus living under the open sky. I think it's a really big difference, and that matters a lot.

That's
interesting, because I remember in the essay you had a section where you mentioned that you expect utopia to be recognizable to a person alive now. The way you put it just earlier made it seem like it would be a completely different category of experience than we would be familiar with. So is there a contradiction there, or am I missing something?

So I think there's at least a tension, and the way I see the tension playing out, or being reconciled, is specifically via the notion I referenced earlier: that you would, if you truly understood, come to see the utopia as genuinely good. Ideally, I think the way we end up building utopia is that we go through a long, patient process of becoming wiser and better and more capable as a species, and it's in virtue of that process culminating that we're in a position to build a civilization that is profoundly good and radically different. But that's a long process. So, as I say, if I just transported you right there and you skipped the process, then you might not like it, and it is quite alien in some sense. But if you went through the process of really understanding and becoming wiser, you would endorse it.

Uh-huh. That's interesting to me, that you think the process to get to utopia is, maybe I'm misconstruing it, but you mentioned it's a process of getting wiser. So it sounds like it's a more philosophical process, rather than, I don't know, we figure out how to convert everything to hedonium and it's eternal bliss from then on. So am I getting it right that you think it's more a philosophical process, and then why is it that you think so?

Yeah, so I definitely don't sit around thinking that we sort of know what utopia is right now
and it's hedonium. I'm not especially into the notion of hedonium. I think the brand is bad. People talk about pleasure with this kind of dismissive attitude sometimes, and hedonium implies this kind of sterile uniformity. People talk about how they're going to tile the universe with hedonium, and it's like, wow, this sounds rough. Whereas I think the relevant perspective when you're thinking about something like hedonium is the internal perspective, from which the experience of the subject is something joyful and boundless and energizing, whatever pleasure is actually like. Pleasure is not a trivial thing. I think pleasure is a profound thing in a lot of ways. But I really don't assume that that's what utopia is about at all. My own values seem to be quite complicated. I don't think I just value pleasure; I value a lot of different things. And more broadly, I have a lot of uncertainty about how I would think and feel about things if I were to go through a process of significantly increasing my capacity to understand. I think sometimes when people imagine that, they imagine, "Oh, we're going to sit around and do a bunch of philosophy, and then we'll have solved normative ethics, and then we'll implement our solution to normative ethics." That's not what I'm imagining by wisdom. I'm imagining something richer, and that also importantly involves a kind of enhancement to our cognitive capacity. I think we're really limited in our ability to understand the universe right now, and I think there's just a huge amount of uncharted territory in terms of what minds can be and do and see. And so I want to sort
of chart that territory before we start making big and irreversible decisions about what sort of civilization we want to build in the long term.

I see. And then another maybe concerning part of utopia is that, as you mention in the piece, many of the worst ideologies in history have had elements of utopian thinking in them. To the extent that EA and utilitarianism generally are compatible with utopian thinking, maybe they don't advocate utopian thinking, but they are compatible with it, do you see that as a problem for the movement's health and potential impact?

Is the question something like: is this a red flag? We look at other ideologies throughout history and they've been compatible with utopian thinking, and maybe effective altruism or utilitarianism is similarly compatible, so should we worry in the same way? Is that the question?

Yeah, partly. And another part is, maybe it's still right that, morally speaking, utopia is compatible with this worldview and the worldview is correct, but the implication is that somebody misunderstands what is best, they identify as an EA, and this leads to bad consequences when they try to implement their scheme.

Yeah. So I think there are certainly reasons to be cautious in this broad vein. I don't see them as very specific to EA or to utilitarianism (I don't identify as a utilitarian). I see them as better understood as risks that come from believing that something is very important at all. And I think it's true that acting from a space of conviction, especially where that conviction has a flavor of... you know, it's interesting what exactly constitutes an ideology, but I think it's reasonable to look at EA and be like, "This looks like an ideology." And I think, you know,
I think that's right, and I think it's important to have the relevant red flags about that. I think it's pretty hard to have a view of the world that doesn't in some sense imply that it could be a lot better, or at least a plausible view of the world. And when I say utopia, I don't really mean anything much different from that. I'm not saying a perfect thing. I do have a more specific view about exactly how much better things could be, but more broadly, it seems to me many people believe in the possibility of a much better world and are fighting for that in different ways. So I wouldn't pin the red flag specifically to the belief that things can be better. I think it would have more to do with what degree of rigidity you're relating to that belief with, how you're acting on it in the world, and how much you're willing to break things or act in uncooperative ways in virtue of that conviction. And there I think caution is definitely warranted.

I see. I'm not sure I agree that most people have a view or an ideology that implies anywhere close to the kind of utopian thinking one can have. If you think of modern political parties in a developed democracy, like in the United States for example, if you think of what utopian vision either party has, it's actually quite banal. It's like, "Oh, we'll have universal healthcare," or, I don't know, "GDP will be higher in the next couple of decades," which doesn't seem utopian to me. It does seem like a limited worldview, where they're not really thinking about how much better or worse things could be, but it doesn't exactly seem utopian. I'll let you react to that.

I
think that's a good point. So maybe the relevant notion of "utopian" here is something like: to what extent is a concept of a radically better world operative in your day-to-day engagement? To some extent, what I meant is that if I sat down and talked with most people, we could eventually, with some constraints on reasonableness, come to agree that things could be a lot better in the world. We could just cure cancer, we could cure XYZ disease. We could go through a few things like that. We could talk about the degree of abundance that could be available. But the question is whether that's a structuring or important dimension to how people are relating to the world. I think you're right that it's often not, and that's part of the thing I'm hoping to push back against with that post. I actually think this is a really important feature of our situation. I think it's true that it can be dangerous, and if you're wrong about it, or if you're acting in an unwise way with respect to it, that can be really bad. But I also think it's just a really basic fact, and I think we need to learn to deal with it maturely. Pretending it's not true, I think, isn't the way to do that.

I see. But to me at least, "utopian" or "utopia" sounds like some sort of peak, and maybe you didn't mean it this way. So are you saying, in the essay and generally, that you think there is some sort of carrying capacity to how good things can get? Or that beyond a certain point things can keep getting indefinitely better, but at this point we're willing to say that we have reached utopia?

Yeah, so I certainly don't have a hard threshold of, "Here's exactly where I'm going to call it utopia." I mean something that is profoundly better. I do think that
at a very basic level, if there's only a finite number of states that the affectable universe can be in, and your ranking of these states in terms of how good they are is transitive and complete, then there will be a top. But I don't think that's an important thing to focus on from the perspective of just taking seriously that things could be radically better at all. I think talking about exactly how good, and what's the perfect thing, is often distracting in that respect. It gets into these issues about, "Oh, how much suffering is good to have?" A lot of the discourse on utopia, I think, gets distracted from basic facts, like: at the very least, we can do just a ton better. And that's important to keep in mind.

I see. You point out in the piece that many religions and spiritual movements have done the most thinking on what a utopia could look like. And there's a very interesting essay by Nick Bostrom from 2008 where he lays out his vision of what somebody speaking from a future utopia, talking back to us, would sound like. When you read it, it sounds very much like a mystical essay, the kind of thing that, change a few words, a Christian could have written. C.S. Lewis could have written it about what it's like to speak down from heaven. So to what extent, and I don't mean this pejoratively, is there some sort of spiritual or religious dimension to utopian thinking that relies on some amount of faith that things can get indescribably better in some sort of ephemeral, indescribable way?

So I think there are definitely analogs and similarities between some ways of relating to the notion of utopia and attitudes and orientations that are common in religious contexts and spiritual
contexts. Personally, I don't think it needs to be like that. As I say, I don't think it requires faith. I don't think it requires anything mystical. I think it's just a basic fact about our current cognitive situation and our current civilizational situation that things could be radically better. It's ephemeral in the sense that it's quite hard to imagine. For me, an important source of evidence here is variance in the quality of human experiences. If you think about your peak experiences, it's often a really big deal. You're sitting there going, "Wow, this is serious," and feeling in touch, or feeling that this is in some sense something you would trade much mundane experience for the sake of. And the thing that I think we need to do is extrapolate from there. You look at the trajectory that your mind moved along as you moved into some experience, or something broader and non-experiential, like your community got a lot better, your relationships got a lot better. Look at that trajectory, and then stare down where that is going. I do think that requires, I don't want to call it faith, but a kind of extrapolation into a zone that is in some sense beyond your experience, but that is deeply worthy and important. And I think that's something that is often associated with spirituality and religion, and I think that's okay. But I actually think there are a number of really important differences between utopia and something like heaven. Centrally, utopia will be a concrete, limited situation. There are going to be
frictions, there are going to be resource constraints, it's going to be finite. It's still going to be in the real world, whereas I think most religious visions don't have those constraints, and that's an important feature of their situation.

Yeah, speaking of constraints, this reminds me of Robin Hanson's theory that eventually the universal economy will just be made up of these digital people, ems, and that because of competition their wages will be driven down to subsistence levels. Maybe that's compatible with some engineering of their ability to experience, such that it's still blissful for them to work at subsistence levels of compute or whatever. But it seems like this sort of first-order economic thinking implies that there will be no utopia. In fact, things will get worse on average, but maybe better overall if you just add up all the experiences. So, I don't know, this vision seems incompatible with yours of a utopia. What do you think?

Yeah, I would not call Robin's world a utopia. A thing I haven't been talking about is what our overall probability distribution should be with respect to different qualities of futures: exactly how possible is it, and how likely is it, that we build something that is profoundly good, as opposed to mediocre or much worse? I would class Robin's scenario in the mediocre or much worse zone.

But do you have a criticism of the logic he uses to derive that?

To some extent. I think my main criticism, or the first thing that would come to mind, is that competitive pressures are a source of pushing the world in bad directions, but I also think there are ways
in which wise forms of coordination and preemptive action can stave off the bad effects of competitive pressures. So that's the way I imagine avoiding stuff in the vicinity of what Robin is talking about, though there are a lot of complexities there.

Yeah. The last few years have not reinforced my belief in the possibility of wise coordination. Anyway, one thing I want to talk to you about is that you have a paper on what it would take to match the human brain's computational capacity, and associated with that you have a very good summary on Open Philanthropy. So do you want to talk about the approach you took to estimate this, and why this is an important metric to try to figure out?

Yeah. So the approach I took was to look at the evidence from neuroscience and the literature on the computational capacity of the human brain, to talk to a bunch of neuroscientists, and to try to see what we know right now about the number of floating-point operations per second that would be sufficient to reproduce the task-relevant aspects of human cognition in a computer. As for why that's important, it's actually not clear to me exactly how important this parameter is to our overall picture. I think the way in which it's relevant to thinking that I've been doing, and that Open Phil has been doing, is as an input into an overall methodology for estimating when we might see human-level AI systems. That methodology proceeds by first trying to estimate roughly the computational capacity of the brain, or the size of an AI system, its overall parameter count and compute capacity, that would be analogous to humans, and then you extrapolate from that to the
training cost, the cost to create a system of that kind using current methods in machine learning and current scaling laws. That methodology, though, brings in a number of additional assumptions that I think aren't transparent, in the sense of, "Oh yeah, of course that's how we would do it." And so I think you have to be a little bit more in the weeds to see exactly how it feeds in.

I see. And I think you said it was 10 to the 15 FLOPS for a human, right? Did you have an estimate for how many FLOPs it would take to train something like the human brain? I know that GPT-3 is only 175 billion parameters or something, which can fit onto a microSD card even, but it was like, oh, 20 million dollars to train. So were you able to come up with some sort of estimate for what it would cost to train something like this?

Yeah, so my focus in that report was not on the training extrapolation. That was work that Ajeya Cotra at Open Philanthropy did, using my report's estimate as an input. Her methodology involves assigning different probabilities to different ways of using that input to derive an overall training estimate. In particular, an important source of uncertainty there is the amount of compute required, or the number of times we need to run a system, per data point that it gets. In the case of something like GPT-3, you get a meaningful data point, and a gradient update as to how well you're performing, with each token that you output as you're doing GPT-3 style training. So you're predicting text from the internet, you suggest a next token, and then your training process says, "Nope, do better next time," or something like that. Whereas if you're, say, learning to play Go, and you have to play, I mean, this isn't
exactly how a Go system would work, but it's an example: if you have to play the full game out, and that's hundreds of moves, before you get an update as to whether you're playing well or poorly, then that's a big multiplier on the compute requirement. That's one of the central pieces, which is what Ajeya calls the horizon length of training, and it's a very important source of uncertainty in getting to your overall training estimate. But ultimately she ends up with this big, spread-out distribution. I think GPT-3 was like 10^24, yeah, 4 times 10^23 or something like that, and she spreads out all the way up to the evolution anchor, which I think is something like 10^41, and I think her distribution is centered somewhere in the low 30s. Okay, that's still quite a bit. I guess, how much does this rely on the scaling hypothesis? If one thought that the current approach was not likely to lead, or at least not likely to lead in a sample-efficient way, towards human intelligence, it might be analogous to somebody saying we have enough deuterium on Earth to power civilization for millions of years, but if you haven't figured out fusion then it may be an irrelevant statistic. Yeah, so I think the approach does assume that you can train a human-level or transformative AI system with a non-astronomical amount of compute and data, without major conceptual or algorithmic breakthroughs relative to what's currently available. Now, the actual methodology Ajeya uses allows you to assign probabilities to that assumption too, so if you want you can say, "I'm only 20% on that." And then there are a few other options, so you can also rerun evolution,
and so that's an anchor that she provides. And this is often what people will say is a sort of upper bound on how hard it is to create human-level systems: to do something analogous to simulating evolution. There are a lot of open questions as to how hard that is. But I do think this methodology is a lot more compelling and interesting if you are compelled by the available techniques in deep learning and by scaling-hypothesis-like views, at least as an upper bound. I think it's important that there are different ways of being interested in algorithmic breakthroughs. One is because you think deep learning isn't enough. Another is because you think they will provide a lot of efficiency relative to deep learning, such that an estimate like Ajeya's is an overestimate, because actually we won't have to do that; we'll make some sort of breakthrough and it'll happen a lot earlier. And I put weight on that view as well. Yeah, that's really interesting. So that implies that even if you think the current techniques are not optimal, maybe that should update you in favor of thinking it could happen sooner. So then how did you go about estimating the amount of FLOPs it would take to emulate the interactions that happen in the brain? Obviously it would be unreasonable to say that you have to emulate every single atomic interaction, but then what is the proxy that you think it would be sufficient to emulate? So I used a few different methodologies and tried to synthesize them. One was looking at the mechanisms of the brain, and what we know about the complexity of what they're doing and how hard it is to capture the task-relevant, or our best guess
about the task-relevant, dimensions of the signaling happening in the brain. Then I also tried to bring in comparisons with existing AI systems that are replicating chunks of functionality that the human brain has, in particular in the context of vision: how do our current vision systems compare with the parts of the brain that are plausibly doing analogous processing, though often they're doing other things as well. Then I use a third method, which has to do with physical limits on the energy consumption per unit of computation that the brain is possibly doing, and then a fourth method that I sort of gesture at, which tries to extrapolate from the communication capacity of the brain to its computational capacity using comparisons with current computers. So it's a triangulation: you look at a bunch of different sources of evidence, all of which in my opinion are pretty weak. Well, the physical-limit stuff is maybe more complicated, but it's a sort of upper bound. I think we are significantly uncertain about all of this, and my distribution is pretty spread out, but the hope is that by looking at a bunch of things at once you can at least get a sort of educated guess. And then, I'm very curious: is there consensus in neuroscience or other relevant fields that we understand the signaling mechanisms well enough that we can say, basically, this is what's involved, this is what the system is reducible to, and so this is how many bits you need to represent, I don't know, all the synaptic connections here? Or is there a variance of opinion about just how complicated the enterprise is? There's definitely disagreement, and it was interesting, and in some sense disheartening, to talk with neuroscientists about just how difficult neuroscience is. You
know, a consistent message, and I have a section on this in the report, was how far we are from really understanding what's going on in the brain, especially at an algorithmic level. So in some sense the report is somewhat opinionated, in that there are experts that I found more compelling than others. There are experts who are much more in an agnosticism mode of "we just don't know, the brain is really, really complicated," who err on the side of very large compute estimates, with a lot of emphasis on biophysical detail and a lot of emphasis on mysterious things that could be happening. And then there are other neuroscientists who are more willing to say stuff like, "well, we basically know what's going on at a mechanistic level," which isn't the same as knowing the algorithmic organization overall and how to replicate it. I lean towards the latter view, though I give weight to both, and try to synthesize the opinions of people I saw. Overall, just looking at the post itself, and I haven't really looked deeper into the actual paper it's derived from, it seemed like to estimate the FLOPs mechanistically you were adding up the different systems at play here. So should we expect it to be additive in that way, or maybe it's multiplicative, or there's a more complicated interaction where the FLOPs grow superlinearly with the inputs? I know that probably sounds really naive to someone who has studied it, but from a first glance that's a question I had. Yeah, so the way I was understanding and breaking down the forms of processing that you would need to replicate in the brain made them seem not multiplicative in this way. An example would be, I mean, here's a
sort of simple example. Suppose we have some neurons and they're signaling centrally via spikes through synapses or something like that, and then we have glial cells as well, which are signaling via slower calcium waves, and it's a sort of separate network. You could think that if the rate of calcium signaling is dependent on the rate of spikes through synapses, or something like that, then that's an important interaction. But overall, if you imagine this kind of network processing, you can just estimate them independently and then add them up. They're not actually multiplicative processes on that conception. I do think there are correlations between the estimates for the different parts, but it's additive at a fundamental level. I see, okay. And then how much credence do you put in these almost woo-woo hypotheses? I don't know, Roger Penrose has that thing about there being something quantum mechanical happening in the brain that's very important for understanding cognition. To what extent do you put credence in those kinds of hypotheses? I put very little credence in those hypotheses. I don't see a lot of reason to think that, and I see a good amount of reason not to think it, but it wasn't something I dug in on a ton. Okay, gotcha. All right, so you have this really interesting blog post about infinite ethics. Do you want to talk about why this is an important topic, why it's important to integrate into a worldview, and so on? Sure. So infinite ethics is ethics that tries to grapple with how we should act with respect to infinite worlds: how should we rank them, how should they enter into our expected utility calculations or our attitudes towards risk.
And I think this is important for both theoretical and practical reasons. At a theoretical level, when you try to do this with a lot of common ethical theories and constraints and principles, they just break on infinite worlds, and I think that's an important clue as to their viability, because I think infinite worlds are at the very least possible. Even if our world is finite, or even if our causal influence is finite, or our influence overall is finite, it's possible to have infinite worlds, and we have opinions about them: an infinite heaven is better than an infinite hell. Often in ethics we expect our ethical principles to extend to ranking or acting in hypothetical scenarios, to all possible situations rather than just our actual situation, and I think infinities come in there. But then, maybe more importantly, I think it's an issue with practical relevance, and a way to see that is that I think we should have non-zero credence that we live in an infinite world. It's a very live physical hypothesis that the universe is infinite, even if I think the mainstream view is that our causal influence on that universe is finite, in virtue of things like entropy and light speed. But the universe itself may well be infinite, and possibly in a number of different ways. Max Tegmark has some work on all the different ways the universe could be really very large. And there are a number of ways in which I think we should have non-zero credence that we can have infinite influence with our actions now. The limitations on our causal influence could be wrong; it may be that there are ways in the future we'll be able to do infinite
things. And then I also think, somewhat more exotically, that there are ways of having an acausal influence on an infinite universe even if you are limited in your causal influence, and that comes from some additional work I've done on decision theory. And so if you try to incorporate that, if you're an expected-value reasoner, it just very quickly starts to dominate, or at least break, your expected value calculations. You mentioned longtermism earlier, and a natural argument for getting interested in longtermism is: in the future there could be all these people, their lives are incredibly important, so if you do the EV calculation, your effect on them is what dominates. But actually, if you have even a tiny credence that you can do an infinite thing, either that dominates or it breaks, and then if you have tiny credences on doing different types of infinite things and you need to compare them, you need to know how to do it. So I just think this is actually a part of our epistemology now, though I think we often don't treat it that way, because we're often not doing EV reasoning or really thinking that these are questions that just apply to us. Yeah, that's super fascinating. If it is the case that we can only have an impact on a finite amount of stuff, then maybe it is true that there's infinite suffering or happiness in the universe at large, but the delta between the best-case scenario for what we do and the worst-case scenario is finite. But I don't know, that still seems less compelling if the hell or heaven we're surrounded by overall doesn't change. Can you talk a bit more? I think you mentioned in your other work having infinite impact beyond the scope of what light speed and
entropy would allow us. Okay, can you talk a bit more about how that might be possible? Sure. So a common decision theory, though it's not, I think, the mainstream decision theory, it is a contender in the literature, is evidential decision theory, where you should act such that you would be, roughly speaking, happiest to learn that you had acted that way. The reason this allows you a kind of acausal influence, a way of thinking about it, is: suppose that you are a deterministic simulation, and there's a copy of you being run too far away for you to ever causally interact with it. But you know that it's a deterministic copy, and so it'll do exactly what you do, absent some sort of computer malfunction. And now you're deciding, you know, you have two options: you can send a million dollars to that, well, it's a little complicated because he's too far away, but just in general, if I raise my hand, or if I want to write stuff on my whiteboard, or let's say I have to make some ethical decision, like whether I should take an expensive vacation or donate that money to save someone's life. Because the other guy is going to act just like I do, even though I can't cause him to do that, in some sense when I make my choice, after doing so I should think that he made the same choice. And so evidential decision theory treats his action as in some sense under my control. And so if you imagine an infinite universe where there are an infinite number of copies of you, or even not copies, people whose actions are correlated with yours, such that when you act a certain way that gives you evidence about what they do, in some sense their actions are under your control. And so if there are an infinite number of them, on evidential decision theory and a few other
decision theories, then in some sense you're having infinite influence on the universe. Yeah, this sounds really similar to this thought experiment in quantum mechanics called the EPR pair, which you might have heard of. The basic idea is: if you have two entangled bits and you take them very far away from each other, and before they're brought apart you come up with some rule like, "hey, if it's plus we do this, if it's minus we do the other thing," and then you measure one of them, it seems at first glance that measuring something yourself has an impact on what the other person does, even though it shouldn't be allowed by light speed. It gets resolved if you take the many-worlds view. But yeah, that's very interesting. Is this just a thought experiment, or is this something that we should anticipate, for some cosmological reason, to actually be a way we could have influence on the world? So I haven't dug into the cosmology a lot, but my understanding is that it's at the very least a very live hypothesis that the universe is infinite, in the sense that it's infinite in extent, and suitably far away there are copies of us having just this conversation, and even further away there are copies of us having this conversation but wearing raccoons for hats, and all the rest. Which is itself something to wonder about and sit with, but my understanding is that this is just a live hypothesis, and more broadly infinite universes are just a part of mainstream cosmology at this point. So I don't think it's just a thought experiment. I think infinite universes are live, and then I think these non-causal decision theories are actually my best-guess decision theories, though that's not a mainstream view. So
it's fairly, I think it comes in fairly directly and substant
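
The copy example in the discussion above (you and a far-away deterministic copy choosing between a vacation and a donation) can be made concrete with a toy calculation. This is only a sketch under loud assumptions that are not in the conversation: each donation is worth one unit of value, personal costs are ignored, and the function names, `n_copies`, `prior_donate`, and `correlation` are all made up for illustration. It contrasts a causal-style evaluation, which holds the copies' behaviour fixed, with an evidential-style evaluation, which conditions the copies' behaviour on your own act.

```python
# Toy comparison of causal vs. evidential evaluation of one choice.
# All names and numbers are illustrative assumptions, not from the interview.

def cdt_value(action, n_copies, prior_donate, life_value=1.0):
    """Causal-style value: the copies' behaviour does not depend on my act."""
    own = life_value if action == "donate" else 0.0
    # Each copy donates with the same prior probability whatever I do.
    return own + n_copies * prior_donate * life_value

def edt_value(action, n_copies, correlation, life_value=1.0):
    """Evidential-style value: my act is evidence about what the copies do."""
    own = life_value if action == "donate" else 0.0
    # Each copy makes the same choice as me with probability `correlation`.
    p_copy_donates = correlation if action == "donate" else 1.0 - correlation
    return own + n_copies * p_copy_donates * life_value

def advantage(value_fn, **kwargs):
    """How much better 'donate' looks than 'vacation' under value_fn."""
    return value_fn("donate", **kwargs) - value_fn("vacation", **kwargs)

if __name__ == "__main__":
    for n in (1, 1_000, 1_000_000):
        causal = advantage(cdt_value, n_copies=n, prior_donate=0.5)
        evidential = advantage(edt_value, n_copies=n, correlation=1.0)
        print(f"copies={n}: causal={causal}, evidential={evidential}")
```

On these assumptions the causal advantage of donating stays at one unit however many copies exist, while the evidential advantage is 1 + n_copies * (2 * correlation - 1), which grows without bound as the number of perfectly correlated copies grows. That unbounded growth is the toy analogue of the "infinite influence" point; with correlation at 0.5 (no evidence either way) the two evaluations agree.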

Original Description

Joseph Carlsmith is a senior research analyst at Open Philanthropy and a doctoral student in philosophy at the University of Oxford.

Learn about Joseph and read his work at: https://www.josephcarlsmith.com/
Podcast website: https://www.dwarkeshpatel.com/p/joseph-carlsmith
Apple Podcasts: https://apple.co/3Qcboq0
Spotify: https://spoti.fi/3zrElYb
Follow me: https://twitter.com/dwarkesh_sp
Follow Joseph: https://twitter.com/jkcarlsmith

Timestamps:
00:00:00 Preview
00:00:55 Introduction
00:03:42 How to define a better future?
00:10:08 Utopia
00:26:01 Robin Hanson’s EMs
00:28:24 Human Computational Capacity
00:35:04 FLOPS to emulate human cognition?
00:41:04 Infinite Ethics
01:01:40 SIA vs SSA
01:18:42 Futurism & Unreality
01:24:25 Blogging & Productivity
01:29:32 Book Recommendations
01:30:53 Conclusion

Joseph Carlsmith discusses the concept of Utopia and infinite ethics, highlighting the challenges of existential risk from artificial intelligence and the importance of research methods, paper reading, and AI ethics. The conversation explores the possibilities of a profoundly better future and the need for careful consideration of the potential risks and benefits of advanced technologies.

Key Takeaways

Read research papers on AI ethics
Conduct research on existential risk from artificial intelligence
Analyze concepts of infinite ethics
Apply AI ethics principles
Understand utopian thinking and its implications

💡 The concept of Utopia and infinite ethics highlights the need for careful consideration of the potential risks and benefits of advanced technologies, and the importance of research methods, paper reading, and AI ethics in navigating these challenges.

Chapters (13)

0:00 Preview

0:55 Introduction

3:42 How to define a better future?

10:08 Utopia

26:01 Robin Hanson’s EMs

28:24 Human Computational Capacity

35:04 FLOPS to emulate human cognition?

41:04 Infinite Ethics

1:01:40 SIA vs SSA

1:18:42 Futurism & Unreality

1:24:25 Blogging & Productivity

1:29:32 Book Recommendations

1:30:53 Conclusion
