NVIDIA Tools to Train, Build, and Deploy Intelligent Vision Applications at the Edge

NVIDIA Developer · Intermediate · 🛠️ AI Tools & Apps · 6y ago


Key Takeaways

NVIDIA provides tools such as the Transfer Learning Toolkit (TLT), DeepStream, and Triton Inference Server for training, building, and deploying intelligent vision applications at the edge, enabling high-throughput, low-latency video analytics pipelines. TLT supports transfer learning from pre-trained models, model pruning, and INT8 export, and DeepStream is slated to run models from other popular deep learning frameworks through Triton Inference Server.

Full Transcript

Hi everyone, and thanks for joining us today for GTC Digital. We've been working hard from home to bring you this webinar, and as you know, the work-from-home experience comes with its own challenges, so we appreciate your understanding as we bring you this event today. Here are some tips that can help make this event the best it can be. To maximize the quality of the audio stream, please close any open applications aside from your browser window. A good old-fashioned browser refresh can also cure many ills, so if your audio sputters or the slides seem to be lagging, give that a try first; you can also try opening this same event in a different browser. Lastly, if you encounter any other technical issues today, please let us know in the Q&A and we will try to troubleshoot with you in real time. Now, without further ado, I'll turn the event over to our speaker. Chintan, please begin the presentation.

Thank you. Good afternoon everyone, and thank you for joining today. My name is Chintan Shah. I'm a product manager at NVIDIA and I manage several of our AI toolkits. Today I'm going to talk about the different tools we have that will help you train, build, and deploy your intelligent video analytics applications at the edge.

So what are the major challenges with intelligent video analytics? Number one: to create an intelligent system you need AI, and creating highly accurate and reliable AI is difficult. It's an iterative process that requires a lot of good labeled data, and obtaining this data is time-consuming and expensive. Models are trained and retrained continuously. So how do you create this highly accurate AI?

The second challenge is how to achieve high throughput, especially on the edge. Due to bandwidth limitations, you want to process as much as possible on the edge and only send vital information upstream. When you deploy the application, you not only want to achieve high throughput but also to generate real-time insights. Increasing the number of streams per device
reduces overall cost for your customer. And besides AI, there are a lot of other compute-intensive processes involved in video analytics.

Finally, the last major hurdle is at-scale deployment. Deploying and managing tens, hundreds, or thousands of edge devices is challenging. Physically connecting to the edge device is not always possible, so you need a way to orchestrate and manage all these appliances from a central location, and you need to do it securely.

In this talk I'll cover how you can use various NVIDIA tools to create highly accurate AI, build AI applications that achieve high throughput and real-time insights, and finally deploy these applications at scale.

First, let's jump into how to create AI: training. To create AI you need to collect lots of labeled data. One option is to start training from scratch with your dataset. Think of this as building a house from raw materials: you start with nails, wood, bricks, and cement, and you build everything from scratch. Although it can be done, it's rather time-consuming and not scalable. A better option is to build your house with some pre-built material. Similar concepts apply in machine learning and deep learning: you can build your model from scratch, or you can apply transfer learning to build on pre-trained models.

Transfer learning is the process of transferring learned features from one model to another. Surprisingly, you can take a model that was trained to recognize people and use it to recognize cats, given some images of cats. The key benefit of transfer learning is that you require less data to train an accurate model than if you were to train from scratch. Later on I'll show you how accurately you can train with a limited dataset by applying transfer learning. Because you only need a small dataset, you spend less time collecting it, which lowers your overall cost. In addition, you can also train much quicker with a small
dataset. To apply transfer learning, NVIDIA released a toolkit called the Transfer Learning Toolkit (TLT) to help developers, data scientists, and engineers train neural networks quickly and efficiently.

Here's the entire TLT software stack. At the top of the stack are the things that can be done with the Transfer Learning Toolkit. This toolkit is geared toward quickly developing accurate AI models. You can use TLT to add or remove classes in an already trained model. Say, for example, a model detects cars, trucks, and buses, and you want to add a fourth class. This can easily be done with TLT. TLT can also prune models: pruning removes parameters from the model to reduce its overall size without compromising accuracy. And finally, you can do scene adaptation: adapt the network to your dataset, your point of view, your camera angle.

To train with TLT, we provide more than 25 pre-trained models to get started. These models are available on NGC, NVIDIA's GPU cloud registry. Models are available for object detection and image classification, and they provide a good starting point for someone looking to start their training. TLT consists of just a few steps to train. This includes data preparation and augmentation, training, and pruning. I'll cover the workflow in more detail in the next slide.

TLT runs on top of the Keras training framework, but for an end user this is completely abstracted away. Users only need to modify an easy-to-use configuration file to get started. The toolkit sits on top of the NVIDIA CUDA-X stack. This includes all the lower-level libraries: the NVIDIA container runtime for GPU acceleration from inside the container, the CUDA and cuDNN libraries for parallel processing and solving DNN equations, and TensorRT for model export for inference. Finally, all of this runs on top of the NVIDIA computing platform. Models trained with TLT can be deployed on edge-based EGX servers. These models can be trained on DGX machines.
These are supercomputers in a box. Training can also be done on a cloud instance with a GPU, or on your workstation with an NVIDIA GPU.

Let's look at the workflow for training with TLT and a pre-trained model. It consists of just five commands: data augment, train, prune, evaluate, and export. In this workflow it is assumed that the dataset is your own. You first start by pulling the container and the pre-trained model from NGC. Next, you take your dataset and augment it. Augmentation enhances your dataset to significantly improve accuracy, especially if you have a small dataset. The spatial augmentations supported by TLT are image rotation, zoom in, zoom out, and image shift. You can also apply color augmentations such as color shift, hue rotation, saturation, and contrast adjustment.

Once you have the augmented dataset, you can start your training. After initial training, you evaluate the model against the validation set. If accuracy is acceptable, you move on to the next step, which is model pruning; if not, you go back, adjust your hyperparameters, and restart training. Pruning removes nodes that the algorithm thinks don't contribute to the overall accuracy. I'll talk more about this later. Pruning will result in some loss of accuracy, so the next step after pruning is to retrain to regain the accuracy lost because of pruning. After retraining, you evaluate again to see if you have gained back the accuracy that was lost. If you're not able to regain the accuracy, you go back, adjust the pruning threshold, and repeat, or restart from the beginning; you might need to increase your dataset.

Once you finally deem the model good enough for deployment, you export the model for inference. Here you can generate an INT8 calibration table to do inferencing at INT8 precision. This highlights the entire workflow of taking a pre-trained model from NGC and your own dataset to generate an accurate
model that is suitable for your use case.

Now let's assume you are trying to train a 20-class classifier with a thousand images per class. This is a rather small dataset to train accurately. One option is to train from scratch with the 20,000 images; the other option is to train using TLT and a pre-trained model. As you can see with the blue line, when you train from scratch you can achieve about 60% accuracy: good, but not great. Let's see if we can do better. If you take the same 20,000 images and use one of the pre-trained models, you start seeing close to 80% to 90% accuracy without even increasing your dataset. So by applying transfer learning you can drastically increase the accuracy, from 60% to 80%, without increasing the dataset. This was done on a ResNet-18 classifier network. In fact, you can use transfer learning to transfer weights to a completely different domain.

Pruning, as I mentioned earlier, is a technique where model size is reduced by removing connections. It's popular in decision trees, where certain connections are removed by traversing the graph and deleting connections. The pruning algorithm is an NVIDIA proprietary algorithm, and I feel it is one of the key differentiators between TLT and other training frameworks. Pruning can help increase the number of inferences per second, increasing your overall throughput. Pruning reduces the number of parameters by an order of magnitude, leading to a model that runs many times faster for inference. Pruning is a two-part approach: first you prune the model, and when you prune you will lose some accuracy; the second step is to quickly retrain to gain back that accuracy.

The graph on the right highlights the key benefits of pruning. On a ResNet-18 network we were able to reduce the memory by more than 6x by pruning, which translated into more than a 2x increase in our inference throughput. Again, this is very much network and dataset dependent, and improvements can vary based
on the type of network architecture and your dataset. Overall, pruning is very effective in increasing your channel density for inference.

In TLT 1.0, which was released last year, we started with a few models, and now we're building on top of that by adding newer model architectures. In 1.0 we started with basic ResNet, VGG, GoogLeNet, MobileNet, and SqueezeNet as our backbones for classifiers. For detection we started with NVIDIA's DetectNet_v2 network and an alpha version of Faster R-CNN and SSD. We are adding larger and more complex classifiers such as ResNet-34, ResNet-101, and DarkNet-19 and DarkNet-53 to the TLT roadmap. In addition, we are adding new detection algorithms: in the future we'll support the uber-popular YOLOv3 network as well as RetinaNet and DSSD networks. All the detection networks will support most of the backbones. YOLOv3, SSD, and Faster R-CNN are three of the top networks based on the number of projects on GitHub, and we feel very confident that most developers and data scientists will be able to use these networks for their own use case. All of these models will be trained on the Google Open Images dataset, with close to half a million images. Jupyter notebooks are provided in the container for developers to get started with training, and in addition we published a step-by-step blog that walks through the entire training process with TLT. In summary, in the future we'll have support for most of the popular networks for detection and classification.

In addition to the networks shown on the previous slide, we are also looking to add some highly accurate purpose-built networks. These are application-specific networks that can be retrained or deployed out of the box. They are trained on millions of our own proprietary images. We spent years collecting images, labeling them, and training on them, and we feel this provides an excellent starting point for a specific application. The PeopleNet model is trained to detect people in
many different environments at different camera angles. This model can be used to build applications such as people counting or generating heat maps. The TrafficCamNet model can recognize cars, bikes, pedestrians, and road signs. This is very useful for smart city applications where you want to track cars on the road for traffic analysis. DashCamNet is trained on images collected from a dashboard camera inside a car. It is useful for in-car applications where you want to create an alert based on what is happening in front of the car. VehicleTypeNet and VehicleMakeNet are useful for smart city applications where you want to categorize cars on the road. And finally, the FaceDetect-IR model is an extremely accurate model for detecting faces right in front of the camera, and it can be used for applications such as monitoring drivers in a car. All of these models will be distributed on NGC, and you will be able to train and retrain them using the Transfer Learning Toolkit. You're also more than welcome to use these models as is, out of the box, without any training. For smart city and smart retail types of applications, we feel these models will make your job of training highly accurate AI much easier.

This chart shows the inference performance of two of these purpose-built models, TrafficCamNet and DashCamNet. This is running on an NVIDIA T4 with a batch size of 64 and INT8 precision. The performance is shown for both an unpruned model and a pruned model. These two networks use DetectNet_v2 as the detector with ResNet-18 as the feature extractor. The inference performance for the unpruned model is almost identical for both networks. For TrafficCamNet we are seeing about a 2x increase in inference throughput going to a pruned model. This means that without any optimization in your code, you can increase your channel density by 2x, which is significant. You can do even better with DashCamNet: the improvement we are seeing is about 3x.
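To make the pruning idea in this part of the talk concrete, here is a minimal sketch in plain Python. TLT's pruning algorithm is described in the talk as NVIDIA proprietary, so this is not that algorithm: it is a generic magnitude-based channel pruning stand-in, and the names `channel_norms`, `prune_channels`, and the `threshold` parameter are hypothetical, chosen only for illustration. It shows how dropping low-magnitude channels reduces the parameter count; the retraining step that recovers lost accuracy is not shown.

```python
import math


def channel_norms(weights):
    """Return the L2 norm of each output channel (one row per channel)."""
    return [math.sqrt(sum(w * w for w in row)) for row in weights]


def prune_channels(weights, threshold):
    """Keep channels whose norm relative to the largest norm is >= threshold.

    Generic magnitude-based pruning (an assumption for illustration),
    not NVIDIA's proprietary TLT algorithm.
    """
    norms = channel_norms(weights)
    largest = max(norms) or 1.0  # avoid dividing by zero for an all-zero layer
    keep = [i for i, n in enumerate(norms) if n / largest >= threshold]
    return [weights[i] for i in keep], keep


# A toy layer with four output channels and three weights per channel.
layer = [
    [0.9, -1.2, 0.7],     # strong channel
    [0.01, 0.02, -0.01],  # near-zero channel
    [0.5, 0.4, -0.6],     # moderate channel
    [0.0, 0.03, 0.0],     # near-zero channel
]

pruned, kept = prune_channels(layer, threshold=0.1)
params_before = sum(len(row) for row in layer)
params_after = sum(len(row) for row in pruned)

print(kept)                         # [0, 2]
print(params_before, params_after)  # 12 6
```

A higher threshold removes more channels, which is the "aggressive pruning" trade-off discussed in the talk: a smaller model, but more accuracy to win back during retraining.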
With DashCamNet you're able to prune much more aggressively, thus giving a higher improvement in inference throughput. Pruning depends a lot on the model architecture as well as your dataset. We suggest using a large dataset if you want to prune aggressively. This is because if you try to prune aggressively with a small dataset, then when you retrain you might start overfitting on your training dataset. So it's ideal to use a large dataset when you want to prune aggressively.

All right, let's switch gears. Now that you have learned how to use TLT to create AI, let's dive into how to build high-performance applications. After you create your model, the next step is to create an application that consumes this model for real-time analytics. What you see here is a very high-level view of what a typical IVA graph looks like. Due to bandwidth requirements and the need for real-time data, a lot of the heavy lifting has to be done on the edge, from pixels to insights. Pixels from the camera are captured, decoded, and processed. Decoding streams into frames is a fairly compute-intensive task, and it can become a bottleneck if done inefficiently. Pre-processing, such as changing the resolution or converting the color space before inference, can also hog computing resources and add latency. After pre-processing, the next step is to apply AI. This is where you get insights from the video. This needs to be done efficiently on the GPU to maintain high throughput. Finally, after generating insights, you send this metadata to a data center or cloud. This is the edge-to-cloud component: messages need to be exchanged between the edge and the cloud. This pipeline for processing and understanding video is industry agnostic, and in the next few slides I'll show you how to create this workflow efficiently with DeepStream.

The DeepStream SDK is a streaming analytics toolkit for AI-based video and image understanding. With DeepStream you can build efficient applications that allow you to do real-time
AI. This is the entire software stack, from the application level all the way down to the hardware. At the very top is the application layer. DeepStream applications can be built using native C/C++, or in Python using the DeepStream Python bindings. A lot of the AI and deep learning community uses Python, and to enable more developers we are bringing full capability to Python. Now you can build complete applications in Python.

Under the application layer is the SDK. The core SDK consists of various hardware-accelerated plugins that provide the highest throughput for any computer vision or IVA application. These are plugins for hardware-accelerated decode, encode, and other processing tasks. For deployment, customers can deploy the application on their bare-metal systems or in cloud-native containers using the NVIDIA container runtime. We are continuously improving our SDK, and one of the features we are adding is support for bidirectional IoT messaging. This allows the cloud to control and configure DeepStream applications running on the edge. Over-the-air model update is also a new feature we are looking to add. This allows for instantaneous model updates while the application is running. In addition, we provide several reference applications and Helm charts for container orchestration to kickstart your IVA projects.

The next level of the stack is the CUDA-X stack, which includes various software technologies and lower-level libraries used by the DeepStream plugins. This includes CUDA, TensorRT, Triton Inference Server, and multimedia libraries. Finally, at the very bottom is the full hardware stack that can be used to deploy DeepStream applications. This includes the NVIDIA container runtime and Kubernetes on GPUs, and for the hardware platform, DeepStream can be deployed on NVIDIA Jetson devices or on T4 or V100 GPUs.

The thing that is common across all of these industries is that you need to process pixels from camera to insights. The type of application might differ, but the flow remains
common across all the use cases. Some applications might require processing on the edge, either on an edge device such as NVIDIA Jetson or on on-prem servers like NVIDIA EGX with T4s. The use case can be in security, ranging from a small business to large buildings such as airports or shopping malls. Cameras are used in retail to understand customer behavior, to automate checkout systems, for loss prevention, and more. Cameras at a construction site can be used to monitor worker safety. Cameras in manufacturing can be used to detect defects, the really tiny ones that are impossible for humans to detect. DeepStream offers a streaming analytics toolkit to efficiently process and understand video across many industries.

In an AI system, networks are constantly trained and deployed. Training generally happens in the cloud. You can use TLT or another training framework to train the network and deploy it on the edge running DeepStream. The insights and metadata collected on the edge are then sent to the cloud. DeepStream supports sending data over open-source Kafka, AMQP, or MQTT message brokers, or you can use one of the IoT services from leading cloud providers: DeepStream can send messages to Microsoft Azure IoT or to AWS. Finally, the insights collected in the cloud can be used to generate long-term trend visualization dashboards or further business insights.

This is the DeepStream graph architecture. This is a typical IVA pipeline constructed with DeepStream plugins. Underneath the plugins is the underlying hardware that is leveraged by each plugin. The first step is capturing the streaming data. This could come from an RTSP stream, from a file, or from a USB or CSI camera. After capture, next is decode. Decoding is very compute-intensive, so we use NVDEC, the hardware-accelerated decoder on NVIDIA hardware. This is separate from the CUDA cores in the GPU, so you can rest assured that the core GPU will be
completely utilized by AI inference. After decoding, there might be some pre-processing before inferencing. This could be image conversion, scaling, or cropping, or, if the stream comes from a 360-degree camera, you might need to dewarp the image. There are various accelerators for these types of operations. After image processing, you batch the streams before sending them for inference. Batching allows us to utilize the entire GPU efficiently for inferencing. After batching, we send the data for inference. Here you can do object detection, classification, or semantic segmentation, and this can run on the GPU or on the DLA, the deep learning accelerator, which is available on Jetson AGX Xavier or Xavier NX. After inference, you might need to track objects for insights, for example tracking cars on the road or people inside a building. Finally, the last step is output. Here the options are: you can view the video with the bounding boxes and metadata, store the video with all the important information, or send just the metadata to the cloud for further analytics.

So far I've talked about the capabilities of DeepStream. Now let me switch gears and talk about what's coming up on our roadmap for the next version. The features that we'll provide in future releases can be divided into three major buckets: usability of the SDK, inference of AI models, and IoT-related features.

First, how do we improve the usability of our SDK? We prioritize usability features based on the feedback we receive directly from our customers and developers. In improving usability, we want to make it easy for developers to get started with DeepStream. One of the major things we're looking to add is full support for Python. We started with an alpha version of Python and got good feedback from the community, and we are fully committed to providing a production-ready Python path. Other features that we're looking to add are more reference applications. These show the full capability of
this SDK. We have tons of great features, and without a reference application, users might find it overwhelming to use all of them.

The next bucket is AI inference. At its core, DeepStream is a high-performance SDK for building AI-based applications. We are constantly looking to increase the models that we support, but it's a rapidly moving target, so we're opening up support to run models natively in their training framework. Our goal is to make it as effortless as possible for developers and data scientists working on these complex models to deploy them. For that, we are integrating NVIDIA's Triton Inference Server with DeepStream. In addition, we're looking to provide full interoperability between the models trained with the Transfer Learning Toolkit and DeepStream.

Lastly, in a world of always-connected devices, transmitting data efficiently and securely between edge devices and the cloud is extremely important. With that in mind, we're looking to add tons of IoT features to the DeepStream roadmap. Sending messages from edge to cloud is important, but receiving messages from the cloud on the edge device and acting upon them is also very important. With bidirectional messaging, you can control the edge from the cloud. You can change configuration remotely without physically connecting to the device. It can also allow over-the-air updates of your deep learning model. Also, with so much data being generated and transmitted, it's very important to do it securely, and we are looking to add some security measures for IoT messaging between the edge and the cloud.

Next I'll go over all these features in more detail. For an end-to-end workflow from training to deployment, you start with pre-trained models and train them with your dataset using the Transfer Learning Toolkit. Once trained and pruned, you export the model. Models trained with TLT will work out of the box in DeepStream. This includes all the new purpose-built pre-trained models that I spoke about earlier, as well as some of the most popular
models, such as YOLOv3, Faster R-CNN, and SSD. They will work seamlessly with DeepStream. These models are fully compatible with TensorRT, NVIDIA's accelerated inference runtime. Efficient training with TLT and real-time inference with DeepStream will give you the highest possible throughput for any video analytics application.

As I mentioned earlier, we are fully committed to providing full support to Python developers. We released an alpha version last year, and based on the feedback we have received, we are improving and extending what we provided. In the future release we are targeting full support for Python. This means that most of the native C/C++ APIs will now be supported in Python. In addition, we're adding support for post-processing the tensor data from inference in Python. What that means for a developer is that instead of writing a C routine to parse bounding boxes, you can now take the raw tensor and parse bounding boxes directly in Python. To get started, we provide several useful sample applications in Python on GitHub, and we will continue to add more in the future. One of the notable apps we are providing shows the ability to save an image from a pipeline. This is useful when you want to save some image data of interest. With the Python API and sample apps, users can incorporate these features into their own applications. We will also look to add an example that ingests streaming data over RTSP and streams it out over RTSP. This is a typical use case for a lot of IVA applications, where streaming video comes from an IP camera over RTSP and you might need to send it upstream through RTSP. There are also several Jupyter notebooks on GitHub that make it easy to get started with DeepStream from Python.

Up until DeepStream 4.0, networks needed to be completely supported by TensorRT to work with DeepStream. We are changing that in the future by adding support for NVIDIA Triton Inference Server, formerly known as TensorRT Inference Server. Triton Inference
Server is an inference microservice which was originally designed for data center GPUs and will now be extended to embedded platforms such as Jetson. Triton Inference Server is available in a ready-to-deploy container from NGC. The advantage of Triton Inference Server is that it can support almost all the deep learning frameworks. This includes TensorFlow, TensorRT, PyTorch, ONNX Runtime, and others. Now DeepStream users will be able to use Triton Inference Server natively from a DeepStream app.

That said, native TensorRT is still supported, and it is the go-to path for inference if you're looking for the highest performance. The downside is that TensorRT can sometimes be limiting because certain layers are not supported out of the box. TensorRT does provide custom plugins to add missing layers, but that is not always trivial. The ideal use case for TensorRT with DeepStream is when you're ready to deploy: we recommend converting the model to TensorRT to get the highest performance. On the other hand, if you want the flexibility of running any network out of the box with DeepStream, then Triton Inference Server is the correct approach. This is usually the case when you're prototyping and still coming up with the right architecture: you're experimenting, and you don't want to spend a lot of time and effort converting a model to TensorRT. In this case you can create the entire video analytics pipeline, get all the benefits of an efficient pipeline, and use Triton Inference Server for inference. In summary, you have a few options to run inference now, and you can decide which option works best for your network.

Now you have trained the AI and built your awesome app. Next, you're ready to productize and deploy this application. Imagine the awesome app that you built to detect and track cars now needs to be deployed across hundreds or maybe thousands of edge devices all across the city. Or say your app counts people in a retail store and you need to
deploy it across hundreds of retailers across the country. How do you do it? In either of these cases, it's not feasible, and sometimes impossible, to be physically connected to the devices. For at-scale deployment, you want to be able to deploy and control your apps from a central location or cloud. Cloud-native applications allow you to scale out. You develop applications in a container-based environment and deploy them as microservices. A container consists of the entire runtime environment needed to run the application. This includes the code, tools, libraries, and settings, everything bundled into a lightweight, standalone executable. Containers abstract away the underlying OS and other infrastructure. So first you start by containerizing your application. Cloud-native platforms like Kubernetes allow you to automate the scaling and management of containerized applications. Kubernetes creates a cluster that consists of one master and hundreds of worker nodes. Now you can take your containerized application and deploy it across thousands of edge devices across a city or across the country. From one central location you can manage and update your apps. DeepStream applications can also be containerized using Docker containers from NGC. These containers can be orchestrated using Helm charts, which can be used to manage your Kubernetes packages. To get started, we released a demo on NGC that uses Helm and Kubernetes to deploy a containerized DeepStream application.

Now imagine your application is in production. It's processing on the edge and using AI to generate insights, and it sends these insights back to the cloud. In addition, it also sends other stats, such as the health of the device, logs, or errors. It can also transmit multimedia files such as images or videos of interest. In the cloud, you find that the current model is not performing well under certain conditions, so you retrain, and you want to update the model on the edge. Or you want to change some parameters: this could be adding or
removing cameras, changing camera settings, or changing a region of interest. Or maybe you need to pause, resume, or stop an application. For all of these things you need a mechanism to receive and act upon instructions on your edge device. This requires bidirectional communication between the edge and the cloud. DeepStream 4.0 only supported sending messages from the edge to the cloud, but in the future we will have bidirectional communication. This will allow DeepStream to receive messages from the cloud and make changes in the application as required. Bidirectional communication will be over the Kafka message broker. With the Kafka message broker, you will be able to subscribe to messages from the cloud as well as publish and subscribe to messages on multiple topics. There will be several sample applications for developers to get started with bidirectional messaging, and once we release this, we invite developers to take our examples and build on top of them for their use case.

AI is an iterative process. Your model works best on the very first day of your deployment. As more and more corner cases are discovered, the AI model you deployed starts to perform poorly on them, so you need to collect new data, retrain your model, and deploy it back in the field. This process repeats to continuously improve the quality of your model. This continuous model update means that you need to communicate to the app running on the edge that a new model is available and ask the app to pick it up. This is where you need the bidirectional messaging functionality to send messages from cloud to edge.

Another use case for on-the-fly model update is when you want to change models based on the time of day or weather conditions, because certain models might work better under certain lighting conditions. With OTA model update, you can swap to a different model on the fly. Or imagine a robot doing inventory management: the robot might need a different model for each aisle, because the product
SKUs in each aisle are completely different. So instead of having one monolithic model, you can have multiple more accurate models that are swapped instantaneously. The next DeepStream release will support this over-the-air model update. This means that while the application is running you can instantaneously swap the model with zero downtime, and this all happens without shutting down and restarting the application. This is extremely important for mission-critical applications that cannot tolerate any downtime.

Another use case for bidirectional messaging is to initiate a recording based on a trigger on the edge device. Let's say, for example, you have a smart city application where you're tracking the direction of the cars on the road, and you want to monitor for anomalies and flag cars going in the opposite direction. The way you have architected your design, you do car detection and tracking on the edge using DeepStream, and then you send these insights to a data center or cloud for further analytics. In the cloud you have an anomaly detector that takes in the metadata and looks for anomalies on the road, one being a car going in the opposite direction. Once it detects that a car is going in the opposite direction, it immediately triggers the DeepStream application to record a clip for some finite amount of time. The trigger doesn't necessarily have to come from the cloud; it could come from any other microservice running on the edge. In the future we'll provide APIs to start and stop recording based on an external trigger.

The main benefit of this selective record feature is that it saves valuable disk space. Imagine a city with thousands of cameras recording footage 24 hours a day, which is very expensive, and 99.9% of the time it is not needed. Another benefit is that when something interesting happens, instead of reviewing hundreds of hours of footage, you can quickly review a few clips, which can save valuable time. And there are a lot of other use cases for a trigger-based or smart record system.

Lastly, for a successful scalable deployment, one of
the most important and most overlooked aspects is security: being able to securely communicate between the edge device and the cloud. Security around IoT devices is a growing concern. With so much data, and sometimes sensitive data, generated on the edge, you want to ensure that you're sending it to a trusted location. One of the ways we are looking to bolster our security offering is to provide support for two-way TLS authentication based on SSL certificates and encrypted communication. The two-way SSL authentication works with the Kafka message broker. The client certificate is generated and stored outside of the DeepStream application. Secure authentication can be configured in the DeepStream app via the message broker plugin.

This two-way TLS authentication is based on a handshake protocol between the client and the server. When a client is ready to communicate, it sends a request to the server. The server sees the request and sends the server certificate to the client. The client then verifies the server certificate against the certificate authority and, if it is successfully verified, presents its own client certificate back to the server for two-way authentication. Then the server verifies the client-side certificate and, if it is successfully verified, acknowledges to the client that a successful communication was set up. Only after secure authentication is established can the client and server begin to communicate.

So finally, I would like to close this talk by recapping the key challenges and NVIDIA's software solutions for building an intelligent video analytics solution, from training all the way to deployment. Creating highly accurate and reliable AI is challenging. You need to train and retrain on lots of data. Now, with NVIDIA's Transfer Learning Toolkit and a collection of pre-trained models, you can create highly accurate AI for smart city applications and smart retail use cases. You can use any of the highly accurate purpose-built
networks such as PeopleNet, TrafficCamNet, and FaceDetect out of the box, or you can retrain with your own dataset to get even higher accuracy for your use case. For other use cases you can use a generic pre-trained model for some of the most popular network architectures, such as YOLOv3, Faster R-CNN, SSD, and others. The pre-trained networks and TLT containers are all available on NVIDIA NGC.

Building an efficient pipeline to process and use AI on video is not trivial. There are lots of hardware-level optimizations that can improve performance. With the DeepStream SDK we have done a lot of the heavy lifting to get you the highest channel density. With efficient memory management and hardware-accelerated plugins, you can rest assured that you'll get the maximum throughput. And it's never been easier to get started with DeepStream. In the future, with full support for Python and Triton Inference Server, you can quickly prototype and iterate on your full video analytics pipeline with very little effort. There are lots of great sample applications in C++ and Python to get started, and we're constantly looking to add more features and applications.

And finally, for deployment at scale, you need to manage and orchestrate hundreds or thousands of devices from a central location or cloud. Building cloud-native applications using containers and orchestrating them with Kubernetes platforms allows you to scale out. The DeepStream application can be built in cloud-native containers and deployed with Kubernetes. Once deployed, the application needs to send metadata and other information to the cloud and be able to receive instructions and messages from the cloud. DeepStream, with its bidirectional messaging capability, allows for this level of communication, and with SSL authentication you can be comfortable sending messages securely to a certified server.

So, in summary: use the Transfer Learning Toolkit and pre-trained models to train AI, use DeepStream to build your application,
and use Kubernetes and the IoT features to deploy your AI applications. So get started today on building your AI models with TLT and deploying these applications using DeepStream, and stay tuned for the next version of DeepStream and Transfer Learning Toolkit.

Here are some important developer resources to get started. Three months ago we released a free self-paced course on DLI on getting started with DeepStream on Jetson Nano. It's an introductory course, and if you have never used DeepStream, it's a great course to get started and learn how to build a video analytics pipeline. And finally, we'll have Connect with the Experts sessions for both DeepStream and TLT over the next week. The TLT Connect with the Experts session is scheduled for tomorrow, Friday, March 27th, from 2 to 3 p.m. The DeepStream Connect with the Experts session is scheduled for next Friday, April 3rd, from 2 p.m. to 3 p.m. You'll get a chance to interact and ask technical questions directly to our technical team and the engineers who work on building these technologies. And that's it. I'll pass it to Rachel to get started on Q&A.

Great, thank you, Chintan. What a great presentation. Just as a note for those of you who saw the last slide with all those links and are wondering how you can get those: a PDF of the presentation is going to be available on the session catalog, so you can go back there and find it in the next day or so. Just a reminder, you can ask a question using the Q&A window to the right of the slide presentation, and we'll give Chintan a moment to look over the questions. Please begin when you're ready.

Thank you. The first question I received was: how can we create our own training dataset of images for a specific purpose, when the available ones from Google, Microsoft, or NVIDIA are not good? So this will require you to either collect your own dataset, which could be just collecting images, or buy datasets from companies. There are several companies who label datasets and provide labeled datasets
as a service, so you can work with those companies to get those datasets.

Another question is: what is your opinion about Darknet pre-trained networks? Are they good enough for generating deployable object detection models? The answer is that it depends on the use case. The Darknet pre-trained models are trained over a large collection of images, so if you want to take one and apply it to your use case, you will need to provide your own dataset to make it usable. Otherwise it's not going to be very useful for your use case.

One question is: what is the most promising pruning algorithm you recommend? Actually, we don't specify a particular pruning algorithm. We have our own proprietary pruning algorithm that is part of TLT, so if you use the Transfer Learning Toolkit you can use the pruning functionality within the framework. The same applies for INT8 as well: TLT has a way to export in INT8 precision.

One question: is Jetson Nano ready to run in production? Yes, Jetson Nano is ready to be run in production, and in fact DeepStream on Jetson Nano is for production.

Can you share the link to the demo for deploying a DeepStream app using Helm? The link is actually in the deck itself, so when you download the deck the link should be in there.

Do you have any suggestions on how you would convert a prototype running on Triton Inference Server to TensorRT for production? For that, there are examples in TensorRT for converting a model. For example, for TensorFlow there's TF-TRT, for PyTorch there's a PyTorch-to-TRT path, and I believe ONNX runs directly with TRT as long as all the layers are supported by TensorRT. So there are multiple different scripts available for each framework.

One question: what is smart record? So
smart record is a way for you to control when to start and stop your recording. The way it works is that DeepStream provides an API, and you, as a user of this feature, will have to send a request to start and stop the recording. So the smart part, the decision part, actually has to come from you: what the criteria are to start and stop the recording. DeepStream will provide the API for that.

Does TLT handle quantization as well? We don't support quantization-aware training at the moment, but it is on our roadmap to support in the future.

Do you have IR pre-trained models? One of the purpose-built pre-trained models, for face detection, is for IR, and it is specifically trained on an IR dataset.

Where can I find the Jupyter notebooks to get started with DeepStream Python? The Jupyter notebooks are part of our GitHub repo. If you search for the NVIDIA-AI-IOT repo on Google and scroll down, there's a section for DeepStream Python apps. That's where you'll see all the Python apps that we released in our alpha version of Python support, and a couple of Jupyter notebooks there.

Can you comment on the performance of Python apps compared to C/C++? The perf of a Python app basically depends on what kind of processing is done in Python. Heavy processing in Python will introduce performance degradation compared with native C.

One question is how to set up a DeepStream development environment on Linux or Windows. DeepStream is not supported on Windows, and to set up the DeepStream environment on Linux, please refer to the DeepStream development guide.

What is OTA model update? OTA stands for over-the-air update, and this is useful when you want to update a model, certain configs, or the application running on the edge from the cloud.

One question is: are there integrated containers with both TLT and DeepStream? No, right now they're separate containers. They're actually two
different tools used for different applications. One, TLT, is focused on training, which generally happens on one of our large training GPUs, and the model that is exported will run on DeepStream, which is more for inference or deployment and can run on Jetson or any of our T4 GPUs.

Can any DeepStream application be deployed out of the box on Nano? The DeepStream SDK sample apps run on Nano out of the box. To use these samples, check out the development guide, and you will see the performance for Nano.

So I'll take one more question and then we'll wrap up, but one thing I would like to say: we have Connect with the Experts sessions for both TLT, which is tomorrow, and DeepStream, next Friday, so I highly recommend signing up for those if you haven't already.

And the last question is how to use custom TensorFlow models and Python code with DeepStream in one container. To run TensorFlow models directly, natively, you will have to use the new feature, Triton Inference Server with DeepStream, together with the upcoming Python bindings. And with that, thank you everyone for joining.

Great, thank you, and thank you all again for joining us for this session of GTC Digital. We appreciate your time and all of your questions as well. An on-demand version of this webinar will be available about an hour after this event ends, and you can access it using the same link that you used to access this live event. This link will expire in about 48 hours, so after that you can go back to the session catalog, where you can find a video recording, the PDF, and all the other materials related to this seminar. So with that, I'll thank you all again for joining and wish you all a great day.
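
The two-way TLS handshake described in the talk can be sketched with Python's standard `ssl` module. This is only an illustration of mutual certificate verification under stated assumptions, not DeepStream's or Kafka's actual configuration: the function names and the certificate file arguments are hypothetical, and a real deployment would pass real certificate, key, and CA files.

```python
import ssl


def make_server_context(cert_file=None, key_file=None, ca_file=None):
    """Server side of two-way TLS: present a certificate and demand one back."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Requiring a client certificate is what makes the handshake two-way:
    # a client that cannot present a valid certificate is rejected.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # CA used to verify the certificate presented by the client.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


def make_client_context(cert_file=None, key_file=None, ca_file=None):
    """Client side: verify the server, then present the client certificate."""
    # PROTOCOL_TLS_CLIENT enables server certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if cert_file:
        # Certificate the client sends back when the server asks for it.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # CA used to verify the certificate presented by the server.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


# Hypothetical usage: contexts built without files only show the policy settings.
server_ctx = make_server_context()
client_ctx = make_client_context()
```

Each side then wraps its socket with its context; the connection only becomes usable after both certificate checks succeed, which matches the order of steps in the talk.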

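The trigger-based smart record idea from the talk can be illustrated with a small pure-Python sketch. It is not the DeepStream API: the class name, its methods, and the short pre-trigger history buffer are assumptions made for illustration. Frames are stored only for a finite window after an external trigger, so most footage is never written.

```python
from collections import deque


class TriggeredRecorder:
    """Hypothetical recorder that keeps frames only around an external trigger."""

    def __init__(self, history=3):
        self.history = deque(maxlen=history)  # rolling pre-trigger frames (assumption)
        self.remaining = 0                    # frames still to record in the open clip
        self.clips = []                       # finished clips
        self._current = None                  # clip being recorded, if any

    def trigger(self, duration):
        """Start (or extend) a recording for `duration` more frames."""
        if self._current is None:
            self._current = list(self.history)
        self.remaining = max(self.remaining, duration)

    def push(self, frame):
        """Feed one frame; it is stored only while a recording is open."""
        if self._current is None:
            self.history.append(frame)
            return
        self._current.append(frame)
        self.remaining -= 1
        if self.remaining <= 0:
            # The finite recording window is over: close the clip.
            self.clips.append(self._current)
            self._current = None
            self.history.clear()


# Usage: integers stand in for video frames; the trigger could come from the
# cloud or from another service on the edge.
rec = TriggeredRecorder(history=2)
for frame in range(5):
    rec.push(frame)
rec.trigger(duration=3)
for frame in range(5, 10):
    rec.push(frame)
```

The decision of when to call `trigger` stays outside the recorder, which mirrors the point made in the Q&A that the start and stop criteria have to come from the user.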
Original Description

Learn how to make sense of data ingested from sensors, cameras, and other internet-of-things devices. See how to train with massive datasets and deploy in real time to create high-throughput, low-latency, end-to-end video analytics pipelines. We'll show you how to optimize your training workflow and use pre-trained models to build applications such as smart parking, infrastructure monitoring, disaster relief, retail analytics, logistics, and more. Get to know the suite of tools available to create, build, and deploy video apps that will gather insights and deliver business efficacy. Learn More at www.developer.nvidia.com/deepstream-sdk Join the NVIDIA Developer Community www.developer.nvidia.com

This video teaches how to use NVIDIA tools to train, build, and deploy intelligent vision applications at the edge, enabling high-throughput and low-latency video analytics pipelines. The tools support transfer learning, model pruning, and knowledge distillation, and can be used with popular deep learning frameworks and models.

Key Takeaways

Train AI models with NVIDIA TLT
Deploy AI models with Triton Inference Server
Optimize models with transfer learning and pruning
Use DeepStream for computer vision applications
Deploy models with Kubernetes and NGC

💡 NVIDIA's proprietary pruning algorithm can increase the number of inferences per second and overall throughput, making it possible to deploy complex models effortlessly.
