a16z Podcast | Data Network Effects
Key Takeaways
The a16z Podcast discusses data network effects, a crucial concept for software-based businesses now that machine learning and deep learning are prevalent in startups. Vijay Pande and Alex Rampell explain how to use data to create value and how to balance building a data team with accumulating data.
Full Transcript
Hi everyone, welcome to the a16z Podcast. I am Sonal, and today we're doing a podcast on data network effects. We have two general partners here to have that conversation with us: Vijay Pande, who covers all things bio, and Alex Rampell, who covers all things fintech as well as other areas. Welcome, guys.

Hey, thank you.

Okay, so first let's just kick off by talking about what a data network effect is. In its simplest form, it's a network effect that results from data. And if a network effect is defined as something where the value to users and all the participants increases as more users use a particular platform or marketplace, how does this play out with data?

So if you think about eBay: more buyers go to eBay because more sellers go to eBay, and more sellers go to eBay because more buyers go to eBay. That is the canonical network effect, and there, commerce is happening. That's the transaction. For a data network effect, typically there's no commerce per se. There's an extraction. You're either reading or writing, and in most cases you're reading.

And by reading or writing, you mean in the database?

It's like reading from a database, and as more people write, the value of each read goes up. That's the way of thinking about it. So an example would be the credit score. I could figure out what your credit is by just looking at you and profiling you, in legal ways, not illegal ways, and saying, here's what I think your proclivity to repay is. But if every bank on earth is using one central repository, then they will pay more money to actually extract, to read. The reads become far more valuable. And if a new company started tomorrow and said, hey, we're going to do credit scores, we're going to charge a dollar per extraction, per read, and not $10 per extraction, well, there's nothing to extract. They can't actually provide any value. If they end up having more data than the current number one person, then they could charge a lot more than ten
dollars. They could charge a hundred dollars, and in fact the value of the number two person goes to zero, because they actually have a demonstrably poorer product, which is why there aren't really any competitors to eBay.

Right. It's also why people often talk about companies that have network effects as winner-take-all markets.

Yeah, which is generally the case, or winner-take-most, or winner-take the vast majority. But I think if you think about it in the database sense of reads and writes, the reads just become disproportionately more valuable as more people are using a central repository of data.

You know, on the medical side there are interesting aspects that combine with machine learning as well, because the database model is, I think, a very natural one. But then if you put data science and machine learning on top, the reads can become much higher value because of the insights that you can gain from the data as well. Especially these new, modern machine learning methods like deep learning just crave data, and so often you have to reach a critical mass before they can even be used.

I'll give you another example, like Google right now, and Facebook. Think about translation services, which has nothing to do with fintech. I want to go translate text, or I want to go look at images and figure out what they are. Google, Baidu, Facebook, people that have large corpuses, have such a huge advantage, because if I want to figure out translation at scale, I have no data on which to draw. This isn't a read/write problem, because you're not making a central repository. eBay is a central repository: it's a marketplace where people trade. The credit bureaus are marketplaces where people trade. Anti-fraud companies that work with lots of e-commerce companies are central marketplaces where people trade. Here it's just that Google could have not the best computer science. They probably do have the best computer science, but
imagine that they didn't, but they had the biggest corpus of data. They can go acquire the best computer science. And the unfair advantage that they have, their data network effect, is effectively that as they get better translation, they can actually use that to make their translation software even better, and they also have users to autocorrect that as well. So that's another example of a data network effect, where the corpus is the demonstrable advantage.

So one thing that confuses me here, and I feel like we overuse this as a result, is that sometimes people conflate having a lot of data with having a data network effect. To your point, in this case the large corpus is required in order to create better results, which is a feature of machine learning and deep learning. But sometimes people have a lot of data and say, we have a data network effect, and that's actually not true. So how do we travel from having data to actually having a data network effect that results from that data?

Yeah, you have to have a plan to actually do something with the data, right? And usually this is something where you are providing something of higher quality. Let's say in diagnostics: because you know so many other results, you can actually do a better job at predicting and diagnosing. Or you do something cheaper. And obviously the combination of the two, higher quality at lower cost, is really a game changer.

Well, the other thing is that having a lot of data is not a network effect if having a lot of data doesn't come with a plan to make your data better. Go back to the credit bureau. I have a lot of data, I'm Experian, therefore people write to and read from me, and therefore I get more data and my data gets better. As opposed to, look at Visa. They have a tremendous amount of data. They could predict the US economy down to, like, probably the ninth decimal point. But having all of that data doesn't make their data better. People don't go transact because of it. It's an output, so it's like
exhaust. So a lot of data actually takes that form of an exhaust, and it can be very, very valuable, but there's typically no network effect to exhaust-type data. That's as opposed to when the data is actually a key component of the business model, and there is this concept that more people want to write because more people want to read, and more people want to read because more people want to write. Replace that with the commerce aspect of buyers and sellers.

Right. So if you were to operationalize that and make it even more concrete, one thing I've heard is that you have to have an algorithm to actually take the data out and then, to your point, add value back. How would you operationalize this more concretely for people who are building products? If they want to build data network effects, what should they do?

Yeah, I mean, that really varies, obviously, in terms of the domain and the company. Some sort of data science or machine learning is very natural to apply to this. But I think sometimes this doesn't have to be fancy, machine learning or anything like that. It's just the ability to monetize something from that, and really something where your company gets better.

Yeah, I think part of the problem as well is algorithms. If you look at compression algorithms over time, there's one called LZW, which has been around for a very, very long time. It's pretty good. And then the next one that was better was maybe 1% better, and the next one that was better was 2% better. If you are an algorithm company, it's very, very hard to build any kind of value, because somebody else comes up with a marginally better algorithm. So you need to pair the algorithm with the data. And there's actually a shift going on right now from outputting just the data to outputting a learned aspect on top of that. That's the algorithm part. I mentioned fraud and anti-fraud companies. There are a bunch of companies. There's one in our portfolio called
Signifyd. There's another one that I invested in as an angel a long time ago called Sift Science. And for a long time, many of these companies would tell you, okay, do we think this is risky? They'd tell you all of the answers that got pulled from the data. So go back to credit reports. Credit reports are a combination of what you did in your past: you got this thing when you got out of college, you didn't pay that loan on time, you were a deadbeat for this doctor, or whatever. And then there's a credit score, which is a heuristic that's built on top of that. So it's actually interesting: if you look at credit reporting right now, the goal of applying machine learning is to come up with a better heuristic. So this is the thing where you need the data repository, and ideally it's proprietary to you, because then you can extract more economic rent if you're building a company here. And then you want to have a better set of heuristics on top of that. That's the algorithm. And neither one alone is really sufficient. I mean, it is sufficient, I guess you could say, if you have the data. The data network effect tends to be more valuable than the algorithm. But you can extract more value if you're not saying, here are 50 things for you to go analyze on your own, and we're the only ones that have access to that, and then you have to hire a team of 50 people to go analyze it. Instead, you actually have an algorithm that outputs a decision, and you can use that decision. That's an even bigger advantage for a tech company that has a data network effect.

And while they're not formally related, usually one follows the other. If you're the one with the big giant corpus, you'll attract the very best data scientists, because they want to dive into that. They'll come up with the right features and the right ideas, and that will be another sort of effect on top.

So how do you solve the chicken-and-egg problem in this scenario? And by the chicken-and-egg problem, we're talking about
the conundrum of where you start. Like the example you just shared, Vijay: is it the corpus that comes first and then the data scientists, or do you get the data scientists first to create that corpus? How does it come together?

You know, there are a couple of different strategies. One common strategy is to sell something at cost, or not necessarily with huge margins, in order to be able to gather data. In principle, 23andMe was doing something where they're getting these kits out and gathering huge data sets, and then downstream making big research deals. That's a canonical example. But it's not easy to build up that size so quickly.

Yeah. Another example is Google. Google didn't set out in 1998, or whenever they were incorporated, a long time ago, almost 20 years ago, to become a deep learning company. It was almost like, wow, we've been scanning the web forever, we have hundreds of thousands of servers, or however many they have, around the planet, we have all these images that we've stored. Now we have a corpus, and we also have a very profitable business. Let's go get a bunch of data scientists and machine learning people and figure out what we can do. So call that the accident. That's the atypical one. It's atypical, but at the same time it is quite typical, because some of the best people out there today are working at companies like Google or Facebook. And Facebook didn't want to be an image recognition company back in the day. It fell into it because they have that enormous corpus.

The other example is that you move up the value chain over time. I'll talk about the fraud example here. Twitter has a fraud problem, but what is the economic impact of fraud on Twitter? It means that somebody opened an account and they've been spamming somebody, or there's trust and abuse, or things that don't have massive economic impact. They're annoyances, but they're not
really, really problematic. Blue Nile has a much, much bigger problem. Blue Nile sells diamonds online. As you know, diamonds are very, very expensive and very, very small, and one pound of diamonds is worth millions of dollars. So if you lose the equivalent of one pound in diamonds to fraud, that's not good, right? That has economic impact. So you can imagine a fraud scale. And yet there's actually overlap, because bad people tend to do lots of bad things. Somebody who's truly a bad person might open up a bad Twitter account, and then actually steal a credit card number as well, and then use that stolen credit card number to go steal a diamond, and then they might do all sorts of other unsavory things as well. And the nice thing is that bad people don't exist in pockets. There is horizontal overlap here across all these different verticals. It's almost like what Vijay was saying, where it's not even giving it away for free, because that's hard to sustain for too long, but you can go to people that have vast, vast numbers of writes, going back to the read/write analogy. So Twitter would be able to say, okay, we will give you information on everybody who's potentially a bad account or a good account. We'll just let you watch these people, not watch their data, but just profile them: here's their browser type, here's their IP address, here's a cookie that was on their machine, things like that. And now you build up 50 million bad people, and Twitter will pay a little bit of money for this, not that much, because there isn't a data network effect. Then you merge that with Tumblr, then you merge that with somebody else, and none of these people will pay that much. But now the value of a read is getting to a substantial size for Blue Nile, the diamond company, or for any other e-commerce company. Whereas if you went to Blue Nile from scratch and you said, hey, you should use our anti-fraud technology and not these guys' anti-fraud technology,
you don't have a data network effect at all. So it's very hard to say. You might have a better algorithm, but again, it's hard to extract that much economic rent from a marginally better algorithm, because it's only marginally better today and potentially not tomorrow, and you don't have enough data as well. So you might bootstrap yourself through a different vertical.

So part of what you also touch on is this notion of pooling data among different sources. How does this play out in both fintech and bio? Because I would think, if data is your advantage and yet you need more data, and especially in science we have open science and sharing, how do you then overcome that silo effect and create that shared central repository when everyone wants to protect their data?

Yeah, it's a huge challenge on the health side because of things like HIPAA, which require anonymity and become natural barriers. But that's also therefore an opportunity for the company that can put everything together. And what's interesting is that there is just so much data there, whether we're talking about data from clinical trials or from patients or from pharma. So the opportunity is huge if a company can work out those logistical issues.

Yeah, and likewise, it's very hard to get competitors to work together. As an example, if you carry credit card debt, imagine that you have five credit cards. Every credit card company should want to know how much you're spending on the other credit cards. Because imagine that you decide, I'm going to flee the country and renounce my US citizenship and never pay any of my debts back, and you have five credit cards that each have a twenty-thousand-dollar limit. Well, you could just go steal a hundred thousand dollars with impunity, and that would be very bad. Chase should want to know how much you're charging on your Amex card at any point in time. Amex doesn't want to tell Chase. And in many cases this actually
creates the opportunity for a separate company. You anonymize everything, you wash it, you make sure that nothing is actually of discernible value, because if Amex is turning over their complete customer list to Chase every night, I can't imagine that agreement ever happening. So part of what the data company does is they figure out how to sanitize it, they deal with the political issues, and then everybody benefits from being part of this cooperative. It's very hard to get these things off the ground, but the nice thing is that the companies themselves, left to their own devices, will never do it, and yet at the same time it's a very, very big problem for them.

So is the ideal opportunity then for a startup to be at the center of all these different players, to play a broker-like role, or to try to create something in its own vertical? Where do the opportunities lie here for startups in both of your spaces and beyond?

I hate to say it depends, but it really depends. In some cases you're creating something new. In the fraud case, it's not like you're extracting very, very confidential information and sanitizing it. Or there's a company called Yodlee, which is very, very interesting. Pretty much every fintech company on earth right now is in some way, shape, or form using Yodlee to aggregate information across all of these different financial services companies. So you have an E*TRADE account, you have your IRA with Fidelity, you've got your bank account with Bank of America, and you want to put them in a Mint-like interface, whether on mobile or on the desktop. Yodlee is typically the player behind the scenes that's aggregating all of that. But then Yodlee actually retains all that information as well, and they can use it, on an anonymized basis, for their own purposes. That didn't exist before. So people are doing all
sorts of cool things on that data as well to figure out what's happening in the world. So it really depends on whether it's the "I have to build a cooperative" case, where there are only ten companies that have this data and I'm going to be the UN between them. Sure, that's very, very valuable, but it's very, very hard to be the UN, because these are very large, monolithic companies that can't agree on anything, and getting them to agree to work with you, or anybody for that matter, is an uphill battle. If you can get it, there's a lot of value there. I tend to like the companies that are not reliant on playing peacemaker with ten, where there are thousands. Eventually you can build up with the thousands, and then, sure, those ten have no choice but to use your information, because acting in a centralized manner is so important and there's nothing else quite like it.

Going back to the network effect piece, I think there's a lot that the healthcare side can learn from the fintech side. My assessment of things is that it's maybe a little bit further behind, and there are a lot of different reasons for this. One reason is that even just the use of electronic medical records, or EMRs, is only relatively recent. That's really changing, but it's much more recent. And to speak to Alex's point, there are generally just a few big players. There aren't a thousand health insurance companies or something like that. So there are these new challenges, but I'm always curious, when Alex and I chat, to see what tricks can be borrowed from the fintech space into the healthcare space.

So one question, and you may not have the answers for it, but I think it's worth us discussing, is the ethical implications for users in a system where the biggest value, or the network effect, now accrues from data. As it is, there are a lot of advocacy groups who say users should have the right to extract their
data and do whatever they want with their own data, which is a separate point, but related in the sense that it touches on how much agency there is, who has that agency, and what the ethics associated with all of this are. Vijay, we should probably start off with you, because I think with HIPAA there's automatically a constraint in place.

Yeah, there's HIPAA, which requires anonymization, and sometimes that is not as obvious as you might think. It's not just removing someone's name. If someone has a scan of your brain, like an MRI of your brain, is that anonymized? Because maybe that could go back to you. Is your genome sequence anonymized? Just having that sequence alone might be enough to connect it to you. With a blood test, probably it is. So it's actually a much more profound, sort of philosophical issue to think about. But on the flip side, the upside could be really quite huge. It could be the difference that pooling everyone's information makes in being able to predict whether you're going to get cancer or not. And I would like to have my information in there, and I'd like to know those things. So there's going to be something that we're going to have to figure out on the policy side, to find the best way to balance these two forces.

The other interesting point there is that in economics there is this concept of the public good, or the free-rider problem, and you often have that. So going back to reads and writes: everybody wants to read, but nobody wants to write. And in many cases, if writing means giving your blood, actually going to a phlebotomist and getting blood drawn, well, reading is very easy. Reading is a lot of fun. Writing actually requires a lot of work. That's obviously a health-related analogy, but there are two ways that I think about this. On the one hand you've got regulatory issues, and there's also just a lack of
consumer understanding. I remember a good friend of mine, who's not very literate computationally or technologically, was saying, oh my god, Alex, I have all these cookies on my computer. I'm being tracked, this is terrible. Cookies are dangerous. And I think some tech column had contributed to all of this confusion over cookies. I was trying to explain to this friend: if you go to the New York Times, do you want to have to log in every time you go to the New York Times site? He's like, no, I'd hate to log in every time, that's annoying. It's like, well, that's what a cookie is doing. It's remembering some information in your own browser so the New York Times can reference you and actually de-anonymize you. And when he understood it that way, he was like, oh, I like cookies. It was just this kind of fundamental misunderstanding.

The benefit to the users is sort of greater.

Right. So part of it is the free-rider thing. Sometimes providing data actually makes you better off. If I'm willing to give up more information for insurance purposes, like, okay, will I let my car insurance company see how fast I'm driving? On the one hand that sounds really, really spooky. Oh my god, they're watching what I'm doing, it's Big Brother, this is 1984, that sounds terrible. On the other hand, if I'm willing to give that up and I show that I never drive past the speed limit and I never veer out of my lane, I get a better insurance rate. So part of it is, it's not caveat emptor, it's whatever the Latin phrase would be for "choose your own destiny." Some people value time more than money, and some people value money more than time. The same thing goes for privacy. Some people value privacy more than money, and some people value money more than privacy. And I think part of it is just making it transparent. So that's one side. The other side is how you educate people.
Right. I think how you talk about it, to your point, transparency, how you talk about it, and sometimes giving users a choice to opt in or out of a system.

It also doesn't have to be black or white. Especially with machine learning, you could learn features from data without having to share the data itself, and that's useful for IP or for HIPAA and so on. So I think there are a lot of ways that one can contribute to network effects without making your data publicly known, or even exchanging data, necessarily.

Right. And I think in most cases it deserves the benefit of the doubt. Right now it's like the company that's using data is the evil company, and they're up to some pernicious whatever, and that's almost never the case. Congress goes and investigates company XYZ because they're using data, or asks what they're doing with consumer data, and the default assumption is that these guys are out to get you. In most cases that's not true, and there are a lot of good things that do come from being part of this cooperative. I would love it if you don't get charged more, because how would people react, negatively or positively? Poorly, if your insurance company says, hey, we saw that you were speeding, you're getting charged a lot more. Can you imagine how terribly people would react to that? Up in arms, congressional inquiries, blah, blah, blah. On the other hand, if you got a giant rebate check from your insurance company saying, hey, you've been driving very safely, or you haven't gone to see the doctor in a long, long time and the last time you went all your vitals were better, here's a rebate check, people would love that. And that's coming from data as well. So I think part of it is just the psychology of how you reward people for sharing their data, when in many respects it's already being shared anyway.

Right. Okay, so this has been
helpful so far. So let's talk about the fact that we think data network effects are really important for software-based companies, especially in this age, as you mentioned, of machine learning, deep learning, AI, all these trends coming together. What concretely can entrepreneurs do to build data network effects, or to think more strategically about it early on versus by accident? And secondly, what do you want to see in pitches from entrepreneurs when they talk about data network effects?

Yeah. So in terms of a startup, usually startups do well when they focus on one area, and so the challenge here is how the data network effect can really accelerate what they're doing. I think too often what happens is the data network effect almost suggests a side business or something like that. So one challenge is to think about what the real go-to-market is. Is the data network effect really germane and central and key to the focus? And then how can you monetize it? How can you take advantage of it? What often happens is, I think, there's the aspiration of taking advantage of the data network effect, or the assumption that it will just come, but often we see situations where maybe that plan hasn't been well thought out yet.

You know, I would say that in many cases it's about going up the value chain. So you start at the bottom, where you're accumulating writes with the purpose of hopefully charging for reads down the road, and/or you hop across different verticals. So you start off in vertical X, where again you're write-heavy, which is great, because every write that you're getting is more data that you can eventually learn from. And even if you're not learning from it, as we talked about, there's a network effect that might play out there. And then eventually you go into an area that has high monetary value, and you're charging for reads, but you're still continuing to get writes along the way. And I think economics is really the
best way of looking at how effective this really is, because there are a lot of people that claim, I have a data network effect, or they'll say, I will have one eventually. And it's like, sure, Google had one eventually, Facebook had one eventually. Everybody has one eventually, but it's very hard to prognosticate that eventuality and when it happens. How you make it more deterministic is where economics comes in. So imagine that you're at the stage where you actually are charging for your product. Assuming that you started off in the low-monetary-value area and now you're charging for reads in the high-monetary-value area, a good sign is if you are charging more than the incumbents. Normally you say, oh, if I can charge one-tenth as much, then it's going to be very disruptive and I'm shrinking the market. But you actually have the opportunity to charge a lot more: value-based pricing. So if you can really show that you're charging twenty, thirty, forty percent more than the competition, and customers are actually willing to pay for it, and they're switching from a lower-priced product, then either they're totally irrational, they say, hey, I want to lose more money this year and increase my costs, which by the way almost never happens, or you've actually demonstrated in the eyes of many, many customers that they are willing to pay more because your data is better. And they're contributing back to this collective as well, which almost de facto means that you do have a data network effect. It's not about prognosticating. It's actually real.

That's a great example. I will say that the exception, it seems to me, is in the very early days of a company, where you can't really use the proxy of people paying quite yet. So then how do you figure it out?

There can be other network effects. I mean, Google and Facebook clearly had other network effects, and that sort of helped create
the data network effects. So I think that's one mechanism by which this can be bootstrapped.

Or likewise, it works as long as the entrepreneur has a pretty clear plan and they have access to a lot of writes. They don't necessarily have to be charging for those, but it just has to seem pretty evident. The hard thing, in my example on e-commerce fraud, is how do you get Twitter to sign up to supply you with the writes? Or how do you get some other massive publisher that doesn't really have that much economic downside from fraud but has tons and tons of data? If it actually signed up with a company like this, if you sign up ten of those, it's almost very easy to see the blueprint. Wow, you've solved the biggest problem, which is you now have the writes. You have a different, more execution-oriented problem, which is how do you go charge for the reads, and how do you show that you have enough value? But at least you've solved the write side of the database.

So what I'm really hearing as a theme from you guys is that it can happen by accident, but if you have a plan, if you're even aware and intentional about some of the decisions you make, those are all contributing factors to actually create and be better at building data.

And the other thing, going back to this winner-take-all thing: you're never going to get to a network effect if there are 25 companies doing exactly what you do, and they're all about the same size, and nobody gets that big, and nobody has a demonstrably better system. Then the data actually looks more like the algorithm. Remember, we talked about how the algorithm gets one percent better every year, and this company can out-algorithm that company until they can't. The same thing goes for data. If nobody really gets to that critical point, then it's never going to be that much better, and you can't charge excess rent either.

Great. So any parting advice for entrepreneurs before we wrap it up?

You know, on the data
science side, my personal opinion: some of the best data scientists are ones where people can go deep within the domain, and it's something where it's not just taking off-the-shelf algorithms and so on. And this is especially important in this case with a data network effect, because this is where, speaking to Alex's discussion of data and algorithms, often the two are really tightly connected, and having a deep experience in the domain and on the algorithm side can really bring that about.
We call it founder-market fit, but it's almost like data-algorithm-founder fit. And there's a profound HR implication of this as well, because the algorithm side is related to the data side. If you have the best data, then guess what, you're going to be able to hire the best people, because if you're an amazing statistician, you don't want to work on a database that has five rows in it; you want to work on a database that has five trillion rows in it. And if you have that, then you come up with a better algorithm, and therefore more people want to contribute data; you're getting more writes, therefore you're getting more reads, and that kind of continues on and on. And the HR component is very important, because the best people, again, you can attract them to a startup. I often advise companies where they say, "Oh, you know, we're going to hire these five data scientists," but they don't have any data yet. And what they don't really realize is that if they're data scientists who are happy to take out the trash and, you know, clean the toilets and do all the other things that are fun about running a startup, then that's fantastic. But if they really are very laser-focused on this one task, they're going to burn out, or not even burn out, they're just going to leave; there's nothing to do. Chronologically, I mean, it's great if that's in the founding team DNA, but you also just have to be careful about not overbuilding until you actually know that you have enough there.
When you say not overbuilding, you
mean?
Well, I mean, I just look at a lot of companies where they say, "Wow, you know, we're going to hire, here's a new ad network, or a new this, and we have 20 data scientists and they're amazing." I see these companies all the time: they're just overweighted on the data science side, and they have no data. And what they don't realize is that they're increasing their burn. If they can't find these people and it's a 20-month hiring process, then okay. But they're going to lose these people unless they've managed the other side of their network, unless they manage the supply side of the data and actually figure out how they get the writes coming in the door. You're going to lose your team. That's the vicious cycle. The virtuous cycle is obviously the opposite: if you have data, you can hire the best people; if you hire the best people, you get the best algorithms; you get the best clients; you get the best data.
I'm glad you brought that up, because you're actually focusing on the flywheel effect for data network effects as including talent as a component.
Well, there's a flip side of this, which is that I think if the data science is tacked on too late, it also, I think, isn't strongly connected. So what Alex spoke to, I think he mentioned it being in the founder DNA. I think that's what I love, where in terms of the vision of the company, from the beginning it's there but not overbuilt, you know, before you're ready to go to war in that area. But for that to be in the founders' DNA, I think it's perfect.
Yeah, and I think part of that is just, what is the architecture? I mean, you know, I was mentioning the rows of the database. If it's a non-technical team and they've got two columns to their database, they're not really collecting that much stuff. That makes it harder to actually append the data scientists to do all of these great things, especially if you end up becoming a data company by accident. So
imagine that you are a background check company. What is an exhaust of a background check company? Well, it's how many people are applying for jobs. I mean, there's all sorts of interesting things on the data side. But hopefully you are collecting things not in freeform text entry, but you're doing it in a much more itemized way, or in a much more defined and controlled way, or enumerated way, or a pre-enumerated way, where you can do things that are much more relevant.
So this is a bland generalization, but is it fair then to say that, almost by definition, as every company becomes a software company, every software company is by definition a data company?
Well, maybe. I think part of it depends on how lucrative your primary business model is, because every company of scale has amazing exhaust, and the question is whether or not you want... Like, you know, I'll go back to Visa. Visa's exhaust is very, very valuable, but if they shared that, then their clients wouldn't like them very much, and then they lose their clients. So even though their exhaust is probably worth billions of dollars a year, you can predict the economy down to the [unclear]; [unclear] on earth would pay for that. They just can't do it, and they're not making a mistake by not doing it. So yes, they are a data company, but they also are a network, and being a network is probably more important than being a data company. But it really just depends on the particular use case.
I mean, yeah, I think you will have different products. Every company that gets to scale, that's touching enough consumers or businesses, will have the opportunity to have a very, very valuable data suite beyond just whatever they use for their own purposes. But the question is whether or not they want to, and part of that is just how lucrative... Like, Apple could be the biggest, you know, insert the X there; a lot of that pertains to data. But Apple makes too much money selling an iPhone, so I don't think they're going to do that.
You know, towards that end, the
accident that Alex referred to is often really an inevitability; you know this will happen. The question is what do you do with it: is it part of your core business, or is it something that you have to leave on the side?
Well, thank you guys, and we'll talk more about this a lot. Thank you.
Yeah, thank you.
Original Description
If network effects are one of the most important concepts for software-based businesses, then that may be especially true of data network effects -- a network effect that results from data. Particularly given the prevalence of machine learning and deep learning in startups today.
But simply having a huge corpus of data does not a network effect make! So how can startups ensure they don't get a lot of data exhaust but get insight out of and add value to that data and the network? How can they make sure that the (arguably inevitable) data aspect of their business isn't just a sideshow or accident? How should founders strike the balance between not overbuilding/ building a data team vs. having enough data for those data scientists to work with in the first place? And finally, what are the ethical considerations of all this?
The a16z general partners most focused on bio and fintech -- Vijay Pande and Alex Rampell -- join this episode of the a16z Podcast to share their observations and advice on all things data network effects.