Python vs R | The BEST programming language for your Data Science Project
Key Takeaways
The video compares the Python and R programming languages for data science projects, discusses their strengths and weaknesses, and offers guidance on choosing the best language for a project. It mentions tools such as TensorFlow, PyTorch, and NumPy, and concepts such as statistical data analysis and machine-learning-centric projects.
Full Transcript
Hello everyone, welcome to the channel Data Science with Harshit. I hope you are all well and fine. In this video I want to address the most talked-about question in the data science community: which is the best language for data science, Python or R? For me, I don't think there is a clear winner, because both of these languages have their own specialities, and I don't think this is the right question to ask. The right question is: which is the best programming language for your own data science project? So I am going to talk about the four questions that you should be answering in order to learn about the best programming language for your own project, considering the work environment that you are in or the amount of time that you have to invest. I will be talking about all of these factors through these questions, so let's look at them right away.

Among many other languages like Java, Scala, and MATLAB, Python and R are the most widely used languages for statistical data analysis or machine-learning-centric projects. Both of these are state-of-the-art open-source programming languages with great community support, and you keep learning about new libraries and tools achieving new levels of performance and complexity in each of these languages.

Now, when we talk about R: R was developed by academicians and statisticians over two decades ago. R today enables many statisticians, analysts, and developers to carry out their analysis. We have over 12,000 packages available in CRAN, which is an open repository. It was developed keeping statisticians in mind, so R becomes the first choice for all the core scientific and statistical analysis. R today has a very rich ecosystem, and we have a package for almost every kind of analysis. So R stands out as a clear winner if you want to carry out core scientific and statistical analysis, and it has become very easy to communicate your results using concise and elegant reports if you're using tools
like RStudio.

Python is a well-known language for its easy-to-learn and readable syntax. With a general-purpose language like Python, you can build complete scientific ecosystems without worrying much about compatibility or interfacing issues. Python code has a low maintenance cost and is arguably more robust. From data wrangling to feature selection, web scraping, and deployment of our machine learning models, Python can get almost everything done, with integration support from all the major machine learning and deep learning APIs and frameworks like Flask, Django, Theano, TensorFlow, and PyTorch.

So how does one make the right choice for the work at hand? We have these four questions to learn about the best-suited language for your project.

Now, the first question is: which language or framework is preferred in your organization or industry? Depending on the industry you are working in and the language most commonly used by your peers or your competitors, you might want to speak the same language. Here is an analysis carried out by David Robinson, who's a data scientist at Stack Overflow, and the analysis is basically a reflection of the popularity of R by industry. You can see that R is outstandingly being used in academia and healthcare. So if you are someone who wants to go into research, academia, bioinformatics, or healthcare, you might want to consider R over Python, because you are already aware of the intricacies of the algorithms and you can definitely go ahead and use R, which comprises very complex and advanced packages for such scientific processing.

The other side of this coin is software industries, application-driven organizations, and product-based companies. You might have to go hand in hand with the technical stack of your organization's infrastructure or the language that your colleagues or teams are using. Most of the organizations and most of the industries today use Python for solving their problems. So if you
look at this graph, we have academia, electronics, healthcare, fintech, then insurance, then energy. All of these industries currently have a technical stack based in Python. This is basically showing you the amount of traffic from each of these industries on Stack Overflow. So for an aspiring data scientist, it is a clear choice to learn something which has manifold applications and which could increase their chances of getting a job, and we can see that Python has a greater percentage of usage within industries.

Now, the second question is: what is the scope of your project? This is an important question, because before you pick up a language you must have an agenda for your project and the extent to which you want to work on it. For example, if you want to simply solve a statistical problem through a data set, perform some multivariate analysis, and prepare a report or a dashboard explaining the insights, then R might turn out to be a better choice because of its powerful visualization and communication libraries. On the other hand, if the aim is to carry out exploratory analysis, develop a deep learning model, and then also deploy that model within a web application, then Python's web frameworks and support from all the major cloud providers make it a clear winner. You can also deploy pipelines; depending upon what the scope of the project is, you can go very far with Python and your toolchain.

So the next question is: how much do you want to understand the underlying algorithms? It's a personal choice again. For a beginner in data science who has limited familiarity with statistics and mathematical concepts, Python might turn out to be a better choice, because it lets you code the fragments of an algorithm with ease. With libraries like NumPy, you can manipulate matrices and code algorithms yourself. As a novice, it's always better to learn to build things from scratch rather than hopping on to using machine learning libraries directly. If you
already know the fundamentals of machine learning algorithms, you can pick up either of the languages to get started with.

Now, the fourth and last question is: how much time do you have at hand? The amount of time you can invest makes another case for your choice. Depending on your experience with programming and the delivery time of your project, you might choose one language over another to get started in the field. So if there is a high-priority project and you don't know either of the languages, R might be an easier option for you to get started, as you need limited or no experience with programming: you can write statistical models with a few lines of code using existing libraries. Whereas Python is a great option to start off with if you have some bandwidth to explore the libraries and learn about the methods of exploring a data set, which in the case of R can be done quickly using RStudio.

In a nutshell, the gap between the capabilities of R and Python is getting narrower day by day. Most of the jobs can be done using both languages, and both have rich ecosystems to support you. Choosing a language for your project, then, will depend on the following conclusive points.

First, your prior experience with data science, statistics, mathematics, and programming; that would highly influence your decision on taking up one particular language.

Second, the domain of the project at hand and the extent of statistical or scientific processing that is required in the project. So maybe if you have a bioinformatics project, or a project which requires higher mathematical or scientific processing, you might be inclined towards using R's existing libraries in that particular project.

Third, the future scope of your project, which would tell you whether you would later need to deploy your machine learning or deep learning model, what kind of frameworks you would be interacting with, and what kind of cloud providers you would be using to deploy your application;
at that point, Python might turn out to be a better option.

At the end, consider the language or the framework which is most widely supported in your team, your organization, or your industry. Based on that, you would decide how easy it would be to contribute to your teams, and the kind of technologies that are coming up in your industry would influence your decision on which language you should pick up.

So I hope I have helped you get over the dilemma of Python or R for your project. Now, if you found this video useful, do not forget to give it a thumbs up, like the video, and subscribe to the channel so that we can grow and go faster, and comment down below if you have any questions or queries regarding Python or any sort of question that you might have; I'll try my best to answer all of you. So, until my next video, keep learning Data Science with Harshit.
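
As a rough illustration of the point made under the third question, that NumPy lets you manipulate matrices and code an algorithm yourself, here is a minimal sketch of ordinary least squares solved through the normal equations. It is not shown in the video; the function name `fit_least_squares` and the noise-free toy data are assumptions made purely for illustration.

```python
import numpy as np


def fit_least_squares(X, y):
    """Fit y ≈ X @ weights + intercept by solving the normal equations."""
    # Append a column of ones so the intercept is learned as the last coefficient.
    X_aug = np.column_stack([X, np.ones(len(X))])
    # Solve (X^T X) w = X^T y instead of inverting the matrix explicitly.
    coeffs = np.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y)
    return coeffs[:-1], coeffs[-1]


# Hypothetical toy data generated from y = 2*x1 - 3*x2 + 5 with no noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = X @ np.array([2.0, -3.0]) + 5.0

weights, intercept = fit_least_squares(X, y)
print("weights:", weights)
print("intercept:", intercept)
```

Because the toy data contain no noise, the recovered weights and intercept should match the generating values up to floating-point error. A machine learning library would hide these steps behind a single call, which is the trade-off the speaker describes.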
Original Description
I intend to present to you the right set of questions you should be asking in order to decide upon choosing the best programming language for your data science project.
Blog: https://towardsdatascience.com/and-the-best-programming-language-for-data-science-goes-to-f80b9a5b439c
Growth of Python: https://stackoverflow.blog/2017/09/06/incredible-growth-python/
Growth of R: https://stackoverflow.blog/2017/10/10/impressive-growth-r/
You can connect with me here:
Twitter: https://twitter.com/tyagi_harshit24
LinkedIn: https://www.linkedin.com/in/tyagiharshit/
Medium, where I write: https://medium.com/@harshit_tyagi
Instagram (for health and wellness): https://www.instagram.com/upgradewithharshit/?hl=en