Open Telemetry

Data Skeptic · Intermediate · ☁️ DevOps & Cloud · 4y ago

Key Takeaways

The video discusses OpenTelemetry and its integration with Splunk, a data analysis platform, with insights from John Watson, Principal Software Engineer at Splunk.

Full Transcript

[Music]

This is Data Skeptic: Time Series, the podcast about how to predict the future based on historical sequential data. Episode number [unclear].

OpenTelemetry is a set of tools that enable software developers to instrument their code in a common standard and have the flexibility to direct metrics, logs, and trace data into any system that can work with the standards the OpenTelemetry project has defined. Monitoring the time series of information that pours out of a distributed software system is a challenge only possible with a standard like this, and today I interviewed John Watson about the OpenTelemetry project.

[Music]

My name is John Watson. I work for Splunk at the moment, but I also have a PhD in astrophysics from Northwestern University and have been working in the software industry since 1998.

So how did you get from astrophysics to where you are today?

So as a part of my dissertation work there was a lot of software engineering, or at least software programming, I would say at that point, involved. I had to write code to analyze my data. I wrote some code to model what my data might be showing. And, you know, I've been writing code since I was a little kid, and I found, probably about halfway through my dissertation work, that what I really enjoyed was writing software. I mean, the astronomy was interesting, but what I was passionate about was the software. I had a fantastic thesis advisor at Northwestern, Dr. David Meyer, and he was very supportive of me wrapping up the research that we had in progress, writing a paper, writing my dissertation, defending, and then moving on and going to do what I was really passionate about.

So I've used Splunk for over a decade now, I think, off and on, and it feels like a household name brand to me. But for listeners who aren't familiar with it, what does Splunk do?

Well, classically Splunk has been in the business of aggregating log data and enabling searches on log data, and more recently Splunk is
getting involved in the larger observability space. Observability is more than just logs: it's logs, and it's traces, and it's metrics. Because of that, we're very interested in the OpenTelemetry project in particular. We see it as the future of what will be used, from a library and data collection perspective, for general software and hardware telemetry.

And do you want to do an "opinions are my own" kind of situation, or where do we stand with you and Splunk?

Yeah, so I work for Splunk, they pay the bills, but anything I say here is really my opinions. I'm not speaking on Splunk's behalf in any way, shape, or form. In fact, I really want to mostly focus on talking about the OpenTelemetry project as a whole and its multi-vendor outlook on the world. Splunk is just one of many, many players in that business.

Well, let's start there. What is OpenTelemetry?

Before we talk about OpenTelemetry, I want to step back a few years and talk a little bit about the projects that predated OpenTelemetry. There were two major tracing projects, one called OpenTracing and another called OpenCensus, which was primarily the open sourcing of a Google project called Census. OpenCensus included both distributed tracing and metrics collection, and OpenTracing was really just distributed tracing data collection. Before I got involved with the project, these two groups, the OpenCensus team and the OpenTracing project, got together and decided it didn't make sense to try to build two standards, that they wanted to get together and work on one standard. So OpenTelemetry was formed, probably about three years ago, with the goal of joining the OpenTracing and OpenCensus projects, deprecating those two projects, and then moving forward building what we think is going to be the industry standard for software and hardware data collection.

In that data collection, what is offered? Is it a schema
or protocol? What is the essence of the project?

It's actually a lot of pieces. It might be easier to talk first about what it isn't, and the big thing that it isn't is this: the OpenTelemetry project is not concerned with the visualization or analysis of any of the data that's being collected. The goal of the OpenTelemetry project is really to build and define a standard for the wire format for software and hardware telemetry; to provide programming-language-specific libraries to collect data and transform it into that open standard wire format; and then thirdly to provide a specification for what we call the semantic conventions, so that when you're describing a user interaction or some request that comes into a server, we have a very standard way of describing that data semantically. For example, if you're instrumenting an HTTP server, and HTTP is a protocol with very well-defined verbs, we are defining what we think are the standard attributes that should be collected on an HTTP server request and then how that data should be transformed into wire formats. There's one final piece of the OpenTelemetry project, and one of the goals, and that is what we call the OpenTelemetry Collector, which is a piece of software that will collect telemetry data from essentially any source that someone could write an ingest format for, and then be able to arbitrarily transform that data and export it to wherever it needs to go.

So then in terms of deployment, do I deploy an instance or a fleet of those and interact with them, or what's the typical setup?

There are lots of different models for how you might deploy OpenTelemetry. The very simplest possible model is, let's say you're a Java programmer and you want to collect data about the runtime environment that your application is running in and the requests that are coming in, or whatever is going on inside your software, and you
don't want to pay a vendor for a back-end analysis piece of software. You could configure the OpenTelemetry Java SDK to report that data to one of the open source analysis tools, such as Jaeger or Zipkin, or Prometheus, being the open source metrics back end, and you could just report the data to a local copy of those data visualization tools. The next step up is if you have an observability vendor that you work with, for example Lightstep or Splunk, or Microsoft, I think, has their own observability platform in Azure, and there's any number, Dynatrace, et cetera, et cetera. There are very many observability vendors. If you are already a customer of one of those vendors, you can redirect your telemetry toward that vendor's ingest endpoints and then do your analysis using the vendor's tools. And then if you need to scale up, and you don't have just a single instance but a fleet of servers running a fleet of containers, for example running in AWS or wherever they might be, then it might make sense to actually deploy the OpenTelemetry Collector as a common place to aggregate all of that data and send it to whatever back end you want to send it to, in one centrally managed place.

Well, I imagine it's the "open" in OpenTelemetry that makes it possible for this variety of vendors to all offer solutions that extend on top of this. Can you talk a little bit about how that came into being?

Yeah, so when the OpenTelemetry project was conceived, as I said, it came from OpenCensus and OpenTracing originally, and there were a number of vendors already involved in both of those projects. Google primarily in OpenCensus, but there were other people who were using OpenCensus so they could ingest OpenCensus data. And in OpenTracing there was also a fairly strong involvement from some of the vendors: Lightstep and New Relic and Splunk and I'm sure a few
others that I can't think of off the top of my head. But the vendors basically all decided they were going to work together to define this open standard. And one of the big reasons why all of the vendors agreed, and not only the vendors but also the cloud providers, so Microsoft and Amazon and Google are all very heavily involved in the project, is that the vendors and the cloud providers really feel that it's important for all of our customers to be able to instrument their code in one way and then have a choice of where they want to send that data. If there's a common standard for all of this data collection, then it makes it really easy for people to publish software libraries that are already pre-instrumented with OpenTelemetry. And then if you're a New Relic customer or a Splunk customer or just a Google Cloud customer, you don't have to do anything special. The data is already going to be produced in the same format, using the same standards that the OpenTelemetry project has defined. So we really feel that the benefit to users is not the specific details of the way the telemetry is collected; the benefit is the fact that there is just going to be one way to collect all that data. And the vendors and the cloud providers, we're happy to compete on our experience and our analysis and not on some sort of customized instrumentation that we spend thousands of human hours writing per year. Before OpenTelemetry, each one of the observability vendors would have to maintain their own code bases of instrumentation. For example, I'm a Java programmer mostly, so I know the Java world better than most of the rest, but every vendor would have to produce their own instrumentation for the Apache HTTP client, which means that essentially the same instrumentation is being written by five or six different teams at five or six different companies, all producing what is nominally the same kind of data, just using
proprietary formats and proprietary collection tools. That's a lot of wasted effort on everybody's part. So all of the vendors feel very strongly that they'd rather not be investing in that; they'd rather we all work together on writing that instrumentation, and then we can compete on the experience and the analysis that we deliver to the end user.

So I know the observability space is not packed, but it's a growing market and a lot of people are trying to get their share of it. You've also got the three cloud platforms involved, who in some sense could one day release their own solutions to compete. Everybody's vying for similar areas. How do you all get along in the project?

Yeah, it's a fascinating question. I think the thing that, from a personal perspective, is the most interesting to me is that we all do get along. We all continually acknowledge that we're competitors, but we are all really working to try to find something that's great for our users. And we really all have the same users: they're software engineers, they're reliability engineers, they're operators of software at big companies and small companies. We all have the same customers, and I think we are all very aware that those same people are the ones that we're trying to make happy. So even though we're fiercely competitive on a day-to-day basis from a business perspective, we all really do recognize that the end user is the one we're trying to make happy. So we work as hard as we can and we work together, even when we disagree about the technical details of something. We don't really get into the politics of "my company's better than your company." In fact, it's really kind of verboten on the CNCF Slack to tear down or even mention other competitors unless you're saying something good about them. It is really interesting to me, at this point in time, that this kind of open standard and this open source software is something that these business competitors can collaborate
on. And I would say for 75 to 80 percent of the people involved in the project, we're all being paid by one of these competitors to do this, either full-time or part-time. So our companies already acknowledge that this is something that, even though it feels weird and awkward from a traditional business perspective, is really important to work together on.

Well, I'm wondering if you'd be willing to imagine with me some academic person who's deeply researching something like anomaly detection. They've got some novel algorithms they've come up with; maybe they run with a low memory footprint on real-time-ish data. So it's a good opportunity for them to maybe plug their solution into OpenTelemetry and create something, maybe even a commercial venture, out of this. How do they get started taking that core algorithmic concept and getting it integrated?

So I think the first thing that this hypothetical person would need to know is what language they're writing their code in. Once they know what that language is, then they should go to the OpenTelemetry project for that specific language. There's OpenTelemetry Java and Python and .NET and Go and Ruby; all of the main languages have OpenTelemetry support. Then they should really understand the APIs and the kinds of data that can be produced. We haven't really talked very much about the kinds of telemetry signals that OpenTelemetry is talking about, but there are three kinds of signals that we talk about in OpenTelemetry, and I think in the observability world, and those are logs, traces, and metrics. I think most people, especially academic people, will probably be very familiar with metrics. You're measuring numbers about the software that you're writing and the runtime, like the counts of how many requests, or the counts of whatever it is that you're counting, or you could produce histograms around the shape of the data distribution if you're talking about
timing things. So I think most people will be familiar with what metrics are, and there are a number of open source metric libraries that have been around for a long time. That is one of the signals that we can produce, and probably the one that I would guess would be most interesting to people in the academic space. But there are two other signals that are, well, not necessarily newer. One of them is logs, and that's not new at all. Logs have been around since the very beginning; really, well before computers existed, people were writing logs on ships crossing the ocean. They're really just a point-in-time piece of data that describes something that happened. OpenTelemetry is also working on defining a standard wire format for logs. Again, that's maybe not something that's super interesting to an academic. But the final one is actually a relatively new telemetry signal in the programming world, which is distributed tracing. With distributed tracing, essentially, you have a collection of services, which can include databases and load balancers and actual business services. When a user request comes in, let's say you have a shopping cart application running on the web, a user goes and clicks on a button in your shopping cart, and that will produce a number of different requests in those back-end services. It might make an authorization request, and then that authorization request will talk to a database to confirm the user's password, et cetera, and after the authorization has happened, there may be a request to some sort of back-end shopping cart service, et cetera. In order to understand that user's actual interaction, and how long things took, and where there might be bottlenecks in the actual software, a distributed trace essentially creates a tree of what we call spans in OpenTelemetry. A span is essentially a timed operation with a name and a set of attributes associated with it, and in order to get the distributed trace, each of
those spans may have a parent. The result of that is a tree structure of timings that can describe a fairly complex interaction, especially in the ever-growing microservices architecture. So the thing that someone would want to do, as an academic or any practitioner who's interested in generating the standard shapes of telemetry, is to decide which of those three signals their data is best modeled with, or which combination of those three signals. There are point-in-time events, which can be logs; there are these spans, which do a good job of describing detailed information about timings in a distributed system; and then there's metrics, which is just all the numbers you want to record. Once you've figured out how your data and the information you want to collect fits into those three general shapes of data, then is the time to dig into the APIs and understand exactly how to generate the information that is going to be useful to you.

Makes sense, yes. I would imagine metrics might be a place that I would come if I had some specialized solution. It seems like those could be rather custom. Are there any boundaries, guardrails, or best practices for what I can stuff in the metrics component of my logging?

So the metrics API for OpenTelemetry has only in the past month been stabilized. Basically, we're defining a specification for the kinds of data, or how we'll be recording data, for metrics. I'm probably not going to remember everything off the top of my head, but there are probably four or five different kinds of what we call metric instruments. One of those is a counter, which basically lets you count things, right? It makes sense as a counter. You can increment singly or increment by any given amount you want, but a counter can't have negative values recorded, so it's monotonically increasing only. But we do have an instrument called the up-down counter, which does let you decrement as well as
increment. You might ask the question, why do you need both a counter and an up-down counter? The reason is that telemetry back ends often treat that data differently: if data is monotonically increasing, you can apply much simpler algorithms to the analysis than if it can both rise and fall. So the two simplest instruments are the counter and the up-down counter. There's an instrument called a histogram, which is, I think, a little bit of a strange name for an instrument. It's what I usually think of as an output of an instrument, a distribution of numbers. But after probably spending six to nine months of bikeshedding on the name of what this thing should be, we finally decided that histogram is going to be the easiest thing for people to not argue about too much. I think it's the name that's used in several other common metrics libraries. I think it might be what Prometheus uses, and I think the Dropwizard Metrics library also calls it this. It's essentially a way to record values. In fact, an early version of the OpenTelemetry API specification actually called this instrument a value recorder, and it was too generic and nobody really understood what it was, because it felt like it was an "anything" instrument. But with the histogram instrument, basically, you record values, and by default those values will be recorded into some sort of histogram, usually a fixed-width bucketed histogram, but we're also working on defining a standard exponential histogram as well. The histogram instrument is the one you would use if you really need to collect distribution data about the recordings that you were interested in. And then we have two other instruments which really have a very specialized use case, which is that you want to record things that some other process has already recorded and you just want to report them. For instance, if you're running the Java virtual machine, it collects information about the current
amount of memory that's being used, and so if you want to observe those numbers and report them to some sort of telemetry metrics back end, we provide several of what we call observable instruments, which essentially are just there to do this callback style: we're going to periodically query this data and report data that's already been aggregated by something else. So we have this observable counter, and there's an observable up-down counter, I think, to go along with it, and then I think we have one other one which we call the gauge. A gauge is just a point-in-time number that you record. So these are the instruments that OpenTelemetry provides: the counters and the histogram and these background collection gauges and observable instruments. The idea would be, depending on how your data is being collected and what the shape of it is, you would choose one or more of those instruments to record it. Then once you've instrumented what you're interested in actually seeing, and that data is all being written to the OpenTelemetry APIs, you would configure what we call an SDK, which actually takes those API recordings and does something with them. The standard SDKs will record that data and then deliver it to some sort of exporter to send it to a system downstream.

When it comes to something like the histograms and these counters, am I correct in assuming that a new greenfield install of OpenTelemetry is going to require someone to go and hand-craft those and configure it, or is there anything automagical that can do that for me?

So for some languages, like Java and Python, I think somewhat in JavaScript, and in .NET, the OpenTelemetry project is also building what we call auto-instrumentation libraries. These auto-instrumentation libraries use all sorts of fairly arcane and complex techniques to automatically inspect the code at runtime and collect this data for you. So for example in
Java there are well-known techniques for introspecting the bytecode that is executing in a running system and adding instrumentation, adding additional bytecode to that, to actually record the metrics and traces and logs automatically without the user having to do anything. And then we provide some standard ways to configure the auto-instrumentation SDK, using environment variables and things like that, to set up the parameters you want to apply to your running project and where you want to send that data, for example.

So that metrics API you gave us the overview on, you'd mentioned it's a recent launch, or recently stabilized API, I guess. Can you talk a little bit about the process to arrive at that? I know it's a collaborative situation. How do these decisions get made?

Yeah, so that's a fantastic question, and it actually gets at the heart of the way OpenTelemetry works as a project. OpenTelemetry is organized into a fairly large set of what we call special interest groups. On top of the project there's the governance committee, which does the organizational control of the project: if we need new Zoom meetings set up, they do that kind of stuff, or if there's a problem or someone's violating the code of conduct, they can step in. And there's the technical committee, which is really a group of people who are the technical experts in this observability field, and they are in control of what we call the OpenTelemetry specification. There's a special interest group specifically for the specification that meets twice a week to talk about the specification and these details. And over the past two years there's been a sub special interest group, or a SIG as we call it, a sub-SIG of the specification SIG, just to work on metrics. They have been meeting twice a week for about a year and a half to really try to hammer out exactly what the common set of metric instruments is, and the shape of the data we're going to be producing. But
all these meetings are completely open. Anyone who wants to be involved in the project can join them. They're all on a public calendar on the CNCF website; that's the Cloud Native Computing Foundation, which OpenTelemetry is a part of. Essentially, if there's a particular topic you're interested in, you find the special interest group that you're interested in joining and participating in, and you show up at the meeting and you start participating, and you start asking questions, or you volunteer to write specification, or you volunteer to help implement something. But the process is really quite ad hoc. We keep open Google Docs with meeting notes for every meeting, where decisions are made, or action items, the usual kind of things you do in a corporate meeting, except that it's not corporate; it's just people who are interested in the project or in that particular group's work. So the metrics group put together a very preliminary API definition, probably about a year and a half ago if memory serves, and then they asked for participants in the metrics group who also had language interests and expertise to go and prototype these APIs, and the SDK implementation of those APIs, in their specific language, and then report back: where are there problems, where are the things that work well, and the like. We released alpha versions of the software for users to try out and give us feedback. Many of us who work for observability vendors used the stuff ourselves and had our teammates within our companies use these alpha versions to get feedback early. And what actually ended up happening when we started working on metrics in particular is that the elephant in the room when it comes to open source metrics, the Prometheus group, which I think has recently been at least somewhat rebranded into OpenMetrics, took notice, and several of the people there joined the metrics special interest group. And they,
having done metrics instrumentation and aggregation for many years, had a lot of opinions about what we were doing wrong or what we could do differently. So at that point we took a step back and worked with the OpenMetrics group to talk through a bunch of these issues and tweak some of the ideas. We also talked to metrics practitioners from the Java community, Micrometer being a big metrics library for Java, and we got their feedback on some of the naming. Like I said, we originally called the histogram a value recorder, and then we got feedback from the community that nobody understood what this thing was, so we decided to do a renaming based on that feedback. So now, after a couple of years of deliberation and prototypes and feedback and tweaking things and more feedback, et cetera, et cetera, we think we've come to a point where the metrics specification, the API specification, has been stabilized, just in the past month, and now all of the language groups are going to take those API definitions and implement them. Meanwhile, the metrics group within the specification group is working hard on specifying what the standard implementation, what we call the SDK, will actually look like and how it will behave. Hopefully that specification will be solidifying and stabilizing in the next month or so, at which point all of the languages will essentially be able to go and implement that SDK specification. Of course, meanwhile all those languages are also prototyping all of that, and we have working implementations of what we think the final version might end up looking like. But based on feedback, again, from users and from practitioners, we'll make adjustments and we'll tweak it as we go.

So if I instrument my app with OpenTelemetry and I set up my own collector, my metrics are getting pushed there. Is the collector then a system of record, or is it a pass-through to what I choose to be my ultimate persistence layer?

Yeah,
it's 100% a pass-through. The purpose of the collector really is to be that universal hub that understands all the formats that could be sent to it and can transform them to any format that you want to send the data to in the end. It doesn't do any storage of any data at all. Anything that it has is purely in memory before it gets shipped off to whatever external system you want to be the system of record.

And what are some of the typical choices about where people ultimately persist?

Well, people who are really open source first, open source only, are probably going to be running, for metrics, Prometheus or StatsD or Grafana, the open source version of Grafana. They're going to be running all of those metrics back ends themselves, and they'll send that data there. So there are open source options for people who don't want to pay somebody, who want to run their own infrastructure and their own systems. If you are a customer of one of the cloud vendors, you're running your software in AWS or Azure or Google Cloud Platform, they provide back ends for all that data, so you point your collectors at them. Many of them actually will run collectors for you; I think Amazon has a collector that they will run for their customers. That data will go to the cloud vendor, and then the cloud vendor will have visualization tools, et cetera, for that. And then finally there are the observability vendors, like Splunk and New Relic and Lightstep and Dynatrace, et cetera. If you're a customer of one of those vendors, or they'll be happy to make you a customer if you're not, and you pay them money to store and visualize that data, then you can send your data to one of those vendors.

So observability is a natural and kind of obvious use case that we've been over a little bit. I see a lot of potential for other things here as well. These metrics could be aggregated. It wouldn't surprise me to learn that there's a machine learning engineer somewhere doing
predictive system maintenance or something like that from IoT devices. Have you gotten the opportunity to see any of the unconventional or interesting ways people are using the service?

So I don't have a lot of insight into how machine learning practitioners might be using this data. Really, what OpenTelemetry is, is data collection, right? We're not doing any analysis. So certainly this is a signal that could feed into any machine learning system that you might be interested in, especially time series systems, right, because what we're generating is time series. But one atypical, non-standard use that I have seen outside the data science case is people instrumenting their continuous delivery pipelines. In modern software engineering, you write your code, you check it into your source control, something automatically picks that up, builds the code, deploys it to some sort of test environment, automated tests run, and then once that's done it gets deployed into your staging environment, where some more automated tests will run, and then once those have all passed it will be deployed into your production environment. Well, this is a fairly complex set of operations where it is also very important to make sure you understand where it is or isn't breaking down. So this is a case where it's not the observability of your end systems but the observability of the pipelines and infrastructure that build your systems as well. This is one case that I have seen people getting more interested in. In Java at least, I've seen people instrumenting the Maven and the Gradle build systems. I did a side project where I did some Gradle instrumentation myself. Then you plug that into your CI/CD build pipelines and have that data emitted and visible, so you can see what's going on not only in your running software but in your build. So where is the time being taken in your build? Is there maybe some place where you could
optimize it or is there some way you could kind of combine build steps and to make it better or something like that so this is this is a maybe a non-standard but growing area of where instrumentation is being used and telemetry data is being very useful and what's the future of open telemetry it's a work in progress but also stable to a certain point where do you see things evolving down the road so where we are right now is the tracing specification is stable the tracing implementations have been out now for a while in most of the major languages as i said metrics is really in the process of becoming stable in the next few months with the api being stabilized and the sdk specification coming very soon the next thing after that is logs logs are super important kind of the oldest of the telemetry sources really and the open telemetry is defining a wire format standard wire format for log data and it's unclear exactly what the shape of the logging api is because every language probably already has a logging api and we don't want to reinvent that wheel if we don't need to but over the next year we're almost certainly going to be stabilizing the logging apis and sdks so once we've got those three things done and stable then we're done right no more work obviously not right i think there's a actually a new group that's being with two groups uh one of which i am actually actively involved with is what we call client telemetry more often referred to in the industry as real user monitoring so this is measuring the performance of actual users using applications and being able to detect like mobile like if your mobile app is crashing a lot being able to detect that and then understand why and fix those bugs so that group which we call client telemetry and a new network monitoring group based on the ebpf which i don't know anything about network monitoring so i'm probably going to say things that aren't correct but there's a network monitoring group also that is really 
getting ramped up in open telemetry and the two groups the client telemetry and network have really identified a new signal which we're preliminarily calling an event so it's something like a log but it's a little bit different and so this event api maybe we're not exactly sure what it means but over the next year we're going to try to understand what an event is it's a point in time thing with data associated with it so it's you know fits into this these ideas but it is shaped slightly differently than logs probably it doesn't really look like a span and a trace so i think the next step after logs is really to have a come to a common understanding of what this event api might be so we can collect both for client telemetry and network telemetry and we think probably for iot use cases as well understand this event like an example of an event in the in the mobile space for example is your network might change from being wi-fi to being a 5g like if you're walking down the street and so recording that event is very interesting because if the performance of the application then becomes radically different on those two different networks it's very interesting to know that oh there was a point in time when something changed and then be able to reason about maybe there's a causal relationship between the performance characteristics of the app and the fact that this change in the network occurred so this event api or this event data model whatever it ends up being it's we're really still trying to figure it out that's what's coming next i think after metrics and logs are complete that's interesting it seems like a natural and dare i say obvious thing to add but yet when you think about all of the variety of situations one might want to apply that the standard feels to me a little bit intimidating how do we do something that's pleasing universally maybe to wrap up do you have any guiding thoughts on how you'll approach that well i think the way that we often tend to approach 
this is to really think about the most so if we have a new signal for example this event signal like think about the superset of all of the shapes of the data that that might take and by shape of the data i mean we're going to be recording something about that event that happened like probably some sort of name and then a bunch of some sort of attributes but are those attributes simple strings or are they string are they numbers are they more complex data structures and so try to find the superset of all of the use cases that we know about for these events so looking at network looking at iot looking at mobile and you know web browsers and find the superset data model so that's kind of step one is to kind of define that data model and from the data model we'll actually define probably the wire format at that point so that's kind of step one and then this is very speculative but i think most people are thinking right now that the data model for events is probably going to look exactly like logs and so we probably will use the log data model for events because the log data model is very very flexible but once we've got that data model settled then we need to start thinking about how are people who are writing code people who are writing applications people who are building telemetry into their iot devices like how do we make an api that's going to be really easy and obvious for them to use and record that data and i suspect at this point this event api will bifurcate and we may have one that's very specific for mobile devices and one that's very specific for iot devices and one that's specific for network devices but the next step then is to define those apis and maybe there will be a common api for events but maybe we'll bifurcate the api into kind of implementation or use case specific apis but that'll be the general process that we go through well john what's the best place for people to follow the project online yeah the first place to go is opentelemetry.io 
which is the website where there's a ton of documentation and you know lots of words and pretty pictures and some cartoons i think and then if you're a software engineer all of our code is up on github all of the meetings that are run as a part of the project are all public the public calendars are all linked to from the opentelemetry.io website i believe if not they're certainly in the community repository in github under open telemetry but if you want to get involved find the thing you're interested in and join in a meeting come to a meeting we love newcomers we'd love to have new people and new points of view in the project because the more points of view we get the more general and useful solution we're going to end up with and what's the best place for people to follow you yourself online so on twitter i'm jkwatson on github i'm jkwatson well you've got an impressive green set of tiles on your github i have to say yeah i write a lot of code it's what i love john thank you so much for taking the time to come on the show well thank you very much it's been a pleasure being here that concludes another installment of data skeptic time series our guest today john watson myself claudia armbruster as associate producer vanessa bly guest coordinator and our host kyle polich [Music]
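
The "event" signal described in the interview (a point-in-time record with a name and a set of attributes, such as a phone switching from wi-fi to 5g) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Event` class, the `network.change` name, the attribute keys, and the `split_by_event` helper are all hypothetical and are not part of any OpenTelemetry API, which the speaker says was still undefined at the time.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

# Hypothetical attribute value types; the interview leaves open whether
# attributes are simple strings, numbers, or more complex structures.
AttributeValue = Union[str, int, float, bool]


@dataclass(frozen=True)
class Event:
    """Hypothetical point-in-time record: a name plus attributes."""
    name: str
    timestamp_ms: int
    attributes: Dict[str, AttributeValue] = field(default_factory=dict)


def split_by_event(
    samples: List[Tuple[int, float]], event: Event
) -> Tuple[List[float], List[float]]:
    """Split (timestamp_ms, value) samples into those before and after the event."""
    before = [value for ts, value in samples if ts < event.timestamp_ms]
    after = [value for ts, value in samples if ts >= event.timestamp_ms]
    return before, after


def mean(values: List[float]) -> float:
    """Arithmetic mean, returning 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


# Made-up example data: request latencies in milliseconds around a network switch.
network_change = Event(
    name="network.change",
    timestamp_ms=3000,
    attributes={"from": "wifi", "to": "5g"},
)
latencies = [(1000, 40.0), (2000, 44.0), (3500, 120.0), (4000, 130.0)]

before, after = split_by_event(latencies, network_change)
print(f"mean latency before {network_change.name}: {mean(before)} ms")
print(f"mean latency after  {network_change.name}: {mean(after)} ms")
```

Comparing the two averages is the kind of reasoning the speaker has in mind: the recorded event marks the moment something changed, so a shift in performance can be lined up against it. A difference in means alone does not establish a causal relationship.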

Original Description

John Watson, Principal Software Engineer at Splunk, joins us today to talk about Splunk and OpenTelemetry.

This video teaches the basics of OpenTelemetry and its integration with Splunk, providing insights into data analysis and observability. John Watson shares his expertise as a Principal Software Engineer at Splunk. The video is essential for those interested in data analysis and monitoring.

Key Takeaways
  1. Understand the basics of OpenTelemetry
  2. Learn how to integrate OpenTelemetry with Splunk
  3. Discover how to use OpenTelemetry for distributed tracing
  4. Explore how to use OpenTelemetry for metrics and logging
💡 OpenTelemetry provides a standardized way to collect and manage telemetry data, making it easier to monitor and analyze systems.
