Scikit Learn Machine Learning Tutorial for investing with Python p. 5

sentdex · Beginner ·📐 ML Fundamentals ·11y ago

Key Takeaways

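This video tutorial demonstrates how to extract a single feature for machine learning in investing with Python: it opens locally saved Yahoo Finance HTML files, pulls the total debt-to-equity ratio out of each page with two string splits, and prints it alongside the stock ticker. Plotting with Matplotlib and structuring the data with pandas are only mentioned as later steps.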

Full Transcript

What is going on, everybody? Welcome to the fifth video in our Python for machine learning using scikit-learn tutorial series. In the last video we were pulling some of the necessary information from our data files, and in this video we're going to talk about how to actually acquire the value that we're interested in. For us, that value is the total debt-to-equity ratio. As we move forward we will add many more features for each company. I don't think this one is going to give us anything too useful, at least until we separate companies by sector; then this number might become useful. For now we want to keep it simple so we can actually visualize the output, and then we'll start making it more complex and likely a little more interesting.

So far we've got the date and the Unix time, and now we need to acquire the data. Again, we're not actually parsing a website in this example. What we have is a bunch of HTML files that are basically identical to what you would have gotten if you had parsed Yahoo Finance. For example, here I have the stock ticker A, for Agilent Technologies, and if we scroll down we can see the total debt-to-equity ratio, which is 0.407.

Generally, you can use a module called Beautiful Soup to do web parsing. In my honest opinion, Beautiful Soup is almost never necessary unless you're doing some really complex parsing, so I'm going to show you how simple it is to parse pretty much anything. What I tend to do is this: I want to eventually find this number, so I keep in mind that the number is 0.407 and copy the element right before it. Since we're on an HTML page, we can press Ctrl+U in Chrome, or right-click and choose View Page Source, and then press Ctrl+F. What we're looking for is Total Debt/Equity (mrq), which takes us here, and we see the value we were interested in, 0.407. We can't search for 0.407 itself, because that number is obviously going to change from document to document.

We no longer need to print the datetime and the Unix time, but we'll keep the sleep there for now. What we want to do now is figure out how to pull this data. Again, you could use something like Beautiful Soup and its table-reading functionality, or you can just do the following. We currently have no way of opening the full file, so we need to specify how to build the entire path. Right now we have the path, then the stats path added to it, and from there we haven't done anything. Basically, path plus stats path plus file equals what we want. So we'll say full_file_path = each_dir + '/' + file. That gives us the file path, because each_dir is each directory in stock_list, so this gives us the actual path to our file.

First, let's print this path. Then we want the source code, so we say source = open(full_file_path, 'r').read(). Normally this would be something like a urllib open call with a read at the end, but since we're not actually pulling from the website, we're opening a file instead: open it with the intention to read, and then read. Let's print the source just to see if we're on the right track. Save, press F5 to run; it might take a second. Okay, here is the entire source code, so we are indeed on the right track. Let's comment out the printing of source before we get in trouble.

Now we actually want to pull the value that we're looking for, and it turns out that Yahoo very rarely changes much. In the source we have the label for the thing we're trying to gather, then basically a colon and a bit of markup before the actual value. That's easy enough. We'll say value = source.split(...), and we want to split by gather, whatever we're trying to gather, plus that extra markup. This is how you split a big block of string data: you can split it up by a value. (Let me zoom in here. Sometimes I look at these videos afterwards and think that was pretty much impossible to read, even though I'm filming in 1080.) Total debt-to-equity, the orange text, is gather, and then we want to add the next bit to it, so we literally just highlight that, copy it, and come over here: gather plus the copied string. Does it have any quotes in it? Yes, it has double quotes, so we'll wrap it in single quotes. As long as you enclose double quotes in a different form of quotes you're totally fine; you could also use the escape character, but this will be fine.

So we split by that, and once we have split, on which side of the split is the element we're interested in? It's on the right side, so we don't use element zero; we use element one, which is the value and then everything after it. Then we do one more split, and this one is a little easier. What would be the most sane thing to split by? Just the closing table data tag, </td>. So copy that, split by the closing table data tag, and take the zeroth element. Done: we have parsed this table with basically one line of code. We read the source and used one line of code to get the data we needed, and over the course of a decade, as far as I've seen so far, this has not changed on Yahoo Finance.

Now let's say we want to print the ticker and then the debt-to-equity ratio. First we need to define the ticker we're working with right now. The ticker, as in the stock ticker, equals each_dir.split('\\')[1]: we split by a backslash, which we write as two backslashes, and the element we want is on the right-hand side. Then we come down here and print ticker + ':', value. This gives us the ticker and the price-to-equity, I mean the debt-to-equity ratio, and then we tab over the sleep. This will give us the ticker and the debt-to-equity ratio for that company across the decade. Save and run that. I forgot that we were still printing out the directory every time; we'll fix that in a second. But you can see: here it is, here it is, here it is. They went to zero for a little bit. Here it is, all the way down to now. I mean, this company is in massive debt. Oh wait, sorry, we had gone on to AA when we stopped it. So really it's this company that, at the end, is in a large amount of debt.

It'll be interesting to graph debt to equity for all of the companies. It'll be 500 elements times probably about 20 or something, but that should be okay to plot in Matplotlib. I'd be really interested, because I think all the companies I've seen so far have taken on a lot of leverage and debt over the years, especially very recently, which is really worrying if you ask me. All of the companies are in massive debt, in a market that is, in theory, itself in massive debt from QE right now. It's insane. Anyway, that's interesting.

So now we've got the ticker and the value. We've parsed the data that we want, and now we need to store it and structure it in such a way that we can use it. We've acquired the data, but we have to save it so we can access it later, and we want to save it for all of the companies. So we'll go through all the companies and save these values, and maybe eventually save them by sector or something like that. That's it for this video. In the next video we're going to use pandas to structure our data and then output it to CSV, so that later we can access it with pandas and be very efficient. If you have any questions or comments, feel free to leave them below. Otherwise, as always, thanks for watching, thanks for all the support and subscriptions, and until next time.
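The two-step split described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the exact code from the video: the HTML fragment, the class names in it, and the separator string are assumptions standing in for whatever markup the saved page really contains, and extract_value is a hypothetical helper name.

```python
# Hypothetical fragment standing in for one saved Yahoo Finance page.
# The exact markup between the label and the number is an assumption.
source = (
    '<tr><td class="label">Total Debt/Equity (mrq):</td>'
    '<td class="data">0.407</td></tr>'
)

gather = "Total Debt/Equity (mrq)"
# Assumed markup; in practice, copy the real string from the page source.
separator = ':</td><td class="data">'


def extract_value(source, gather, separator):
    """Return the text between gather + separator and the next </td>."""
    # Element 1 is everything to the right of the label and separator.
    right_side = source.split(gather + separator)[1]
    # Element 0 is everything before the closing table data tag.
    return right_side.split("</td>")[0]


value = extract_value(source, gather, separator)
print(value)
```

If the label is missing from a page, the first split returns a single element and indexing [1] raises IndexError, so a real loop over many files would probably want to catch that case. The result is still a string; converting it with float() would be a separate step.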

Original Description

In this video, we build on the previous machine learning with scikit-learn tutorial and pull out the specific data point that we're interested in using as a feature. Sample code: http://pythonprogramming.net http://seaofbtc.com http://sentdex.com http://hkinsley.com https://twitter.com/sentdex Bitcoin donations: 1GV7srgR4NJx4vrk7avCmmVQQrqmv87ty6


This video covers the data-extraction step of a Python and scikit-learn workflow for investing. It gives a hands-on example of reading saved HTML files and extracting each company's debt-to-equity ratio with plain string operations.

Key Takeaways
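  1. Build the full file path from each stock directory and the file name
  2. Open the file with open() in read mode
  3. Read the source code from the file
  4. Print the source to verify the file opened correctly, then comment the print out
  5. Split the source string on the label plus the markup that precedes the value
  6. Take element 1 of that split, split it on the closing </td> tag, and take element 0
  7. Define the ticker from the directory name and print the ticker with its debt-to-equity ratio
  8. Indent ("tab over") the sleep call
  9. Run the script to print the ratio for each saved file
  10. Save the extracted data for later use (covered in the next video with pandas and CSV)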
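Steps 1 through 7 in the list above can be sketched end to end as a loop over saved pages. This is a sketch under stated assumptions rather than the code from the video: the directory layout (one folder per ticker), the file name, the separator markup, the sample value for "aa", and the key_stats function name are all made up for illustration. It also uses os.path.join and os.path.basename in place of the string concatenation and backslash split described in the video, so that it runs outside Windows as well.

```python
import os
import tempfile

GATHER = "Total Debt/Equity (mrq)"
SEPARATOR = ':</td><td class="data">'  # assumed markup, not taken from the video


def key_stats(stats_path, gather=GATHER, separator=SEPARATOR):
    """Collect (ticker, file name, value) for every saved page under stats_path."""
    results = []
    for name in sorted(os.listdir(stats_path)):
        each_dir = os.path.join(stats_path, name)
        if not os.path.isdir(each_dir):
            continue
        # Portable stand-in for splitting the path on a backslash.
        ticker = os.path.basename(each_dir)
        for file in sorted(os.listdir(each_dir)):
            full_file_path = os.path.join(each_dir, file)
            with open(full_file_path, "r") as f:
                source = f.read()
            try:
                value = source.split(gather + separator)[1].split("</td>")[0]
            except IndexError:
                continue  # label not present in this page
            results.append((ticker, file, value))
    return results


# Build a tiny throwaway data set so the sketch runs on its own.
# The file name and the "aa" value are invented sample data.
with tempfile.TemporaryDirectory() as stats_path:
    for ticker, val in [("a", "0.407"), ("aa", "1.25")]:
        os.makedirs(os.path.join(stats_path, ticker))
        page = '<td class="label">' + GATHER + SEPARATOR + val + "</td>"
        with open(os.path.join(stats_path, ticker, "20040130190102.html"), "w") as f:
            f.write(page)
    rows = key_stats(stats_path)

for ticker, file, value in rows:
    print(ticker + ":", value)
```

Returning a list of tuples instead of printing inside the loop keeps the extraction separate from the output, which should make the later step of saving the values easier to add.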
💡 The video shows that a value can be pulled out of saved Yahoo Finance HTML with plain Python string splits, without Beautiful Soup; Matplotlib plotting and pandas storage are mentioned only as follow-up steps.
