How do I change display options in pandas?
Key Takeaways
The video demonstrates how to change display options in pandas using the pd.get_option() and pd.set_option() functions, including options such as display.max_rows, display.max_columns, display.max_colwidth, display.precision, and display.float_format.
Full Transcript
Hello and welcome back to my Q&A video series about the pandas library in Python. The question for today comes from a YouTube viewer who asks, "Is there a simple way to format large numbers with commas when printing or graphing on the axes?" Great question. I will answer that, but first I'm going to answer the more general question, which is: how do we change display options in pandas?
As always, we need an example dataset, so we're going to import pandas as pd, and the dataset for today will be alcohol consumption by country: drinks = pd.read_csv() with the URL bit.ly/drinksbycountry. Instead of checking out the head, let's just try printing it out, so we'll type drinks and hit enter. What you'll see is that it prints out the first 30 rows, then an ellipsis (dot dot dot), and then the last 30 rows. It's doing this because it doesn't want to overrun your display. But you'll notice there are only 193 rows. So what if I want to change this and show all of the rows? How do we do that? Well, it turns out that pandas has display options, and I will show you how to modify those.
First, you need to find the relevant option you want to change. One easy way to look at the options is to go to the documentation for pandas.get_option. You can Google for this, or there is a link to this page in the description of this video below. Go to get_option, scroll down, and you will see a list of all the available options and their descriptions. We are going to find display.max_rows. It tells you it's an int, and it says that if max_rows is exceeded, switch to truncate view, and a None value means unlimited. So this is the option we want to change: display.max_rows. What I like to do first, before I change an option, is look at its current value.
So we're going to use the top-level function pd.get_option, and all you do is pass it the name of the option, display.max_rows. We see that it is 60, which is why by default it printed out 30 rows at the top and 30 rows at the bottom. To change that, it's easiest to just edit this line and change it to pd.set_option. We pass it the name of the option and the new value. I could just say 200, and yes, that would accomplish it. But what I'll do instead is say None, which means show all rows. You want to be careful with this, because if you have a million rows, this is not going to work very well. But we know that we only have 193, so this will work just fine. We'll go ahead and print out drinks, and now you will see that I can see every single row. I can scroll all the way down to the bottom and browse the data.
Now, let's say I've finished browsing the data and I no longer want to keep this option. How do I change it back? You just reset it. Again, I will just edit this line because it's easiest: all I say is pd.reset_option, and I pass it the name of the option. You can confirm by rerunning this, and you'll see that it's back to displaying the first 30 rows and the last 30 rows. So again, all I did was call pd.reset_option and pass it the option to reset.
I will note there's another option for when you have a lot of columns. You might have guessed this already, but there's an option called display.max_columns, and the default for that is 20, or at least it is on my system. If you have 25 columns and you want to look at them all, then just change it to the appropriate number, or just change it to None. I want to show you two more options, and then I will get back to Mark's question.
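The get, set, and reset steps described above can be sketched roughly as follows. This is a minimal illustration rather than the notebook from the video: the 193-row frame `df` is a made-up stand-in for the drinks data so the snippet runs without a network connection, and the variable names are placeholders. On recent pandas versions a truncated frame may show fewer than 30 rows at each end, since a separate display.min_rows option also plays a role there.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data: 193 rows, one column.
df = pd.DataFrame({"n": range(193)})

# Look at the current value before changing anything (60 by default).
default_max_rows = pd.get_option("display.max_rows")

# None means "no limit": every row is rendered. Avoid this on huge frames.
pd.set_option("display.max_rows", None)
full_repr = repr(df)

# Restore the default; the frame is truncated again when displayed.
pd.reset_option("display.max_rows")
truncated_repr = repr(df)

# The same pattern applies to wide frames via display.max_columns.
pd.set_option("display.max_columns", None)
pd.reset_option("display.max_columns")
```

Capturing `repr(df)` in a variable is only for illustration; in a notebook you would simply evaluate `df` in a cell.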
For my next example, I am going to use a different dataset: the training dataset from Kaggle's Titanic competition. So what I'll do is train = pd.read_csv(), and as always it's bit.ly, and this time it's kaggletrain, and I'll just go ahead and say train.head(). So this is our dataset. The first thing I want to highlight is this ellipsis here, the dot dot dot. Why is it here? Well, this is a column that contains strings, and there is a maximum set on how many characters it will display. Why? Again, it just doesn't want to overrun your screen if this is 10,000 characters. But sometimes it's actually not that many characters, and I just want to see the data. I don't want you to hide it from me. So how do we change that? We're going to say pd.get_option, and this is display.max_colwidth. That is what's controlling this: it's saying only show the first 50 characters. So let's change that and see if it works. We'll change it to set_option. You can't use None in this case, so we're just going to set a large value. We'll just say 1,000. Then I'll break this into two cells and display the head again. Now you can see there's no more dot dot dot in this cell. It's going to show all the text, in all the columns, up to a thousand characters.
I also want to point out this Fare column. This is the fare, in dollars I think, or maybe a different currency. You might have noticed there are four decimal places, and you might think, I don't really want to see four decimal places, I want to see maybe two. So let's change that. We'll say pd.get_option, and this one is display.precision. It's 6, which means six digits after the decimal point. Now let's change that to 2 with the set_option function.
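The column-width option described above can be tried on a small frame like this. It is only a sketch: the `Note` column and its text are invented for illustration and are not taken from the Titanic data. The video sets a large integer because None was not accepted at the time; as far as I'm aware, newer pandas releases also accept None for this option, but the snippet sticks to the approach shown in the video.

```python
import pandas as pd

# Hypothetical string longer than the default 50-character column width.
long_text = "A deliberately long passenger name that runs past fifty characters"
df = pd.DataFrame({"Note": [long_text]})

# Check the current limit first (50 by default).
default_width = pd.get_option("display.max_colwidth")
short_repr = repr(df)  # the string is cut off and ends in "..."

# Raise the limit so strings up to 1,000 characters are shown in full.
pd.set_option("display.max_colwidth", 1000)
wide_repr = repr(df)

# Put the option back when done.
pd.reset_option("display.max_colwidth")
```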
Now when we do train.head(), it will only display two digits after the decimal point. Let me be clear: that did not affect the underlying data. It only affected what is displayed.
Let me finally get back to Mark's question, which is: how do we format large numbers so that they have commas when you're printing or graphing? I'm going to use the drinks DataFrame, so let's show that one again: drinks. We don't have any really large numbers in here, so I'm going to add some columns that are really large. I'm going to say drinks['x'], so I'm just creating a new column called x, and I'm going to set it to drinks.wine_servings times 1,000. I'm just broadcasting this multiplication across an entire Series, the wine servings. Now, x is not a meaningful column. I just need something with large numbers. And I'm going to add one more: drinks['y'] equals drinks dot total, the total column, times 1,000. Now let's say drinks.head(). You can see the two new columns, x and y. They're just this column and this column times 1,000. Mark is asking how we get a comma in there, so that it reads like 54,000 and 4,900.
The option we're going to use is pd.set_option, with display.float_format. Here's the thing we're going to pass it, which may look confusing if you haven't seen these before: a string containing braces with a colon and a comma inside, and then after the string I'm going to put .format, that is, '{:,}'.format. What I've done is pass a Python format string, and that format string means use commas as the thousands separator. This is not something that pandas invented. It's something that exists in Python generally, which is format strings. So when we do drinks.head(), you will notice we've got 4,900 with the comma, you've got 12,400, and so on. But it affected y and not x.
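The two number-formatting options covered so far, display.precision and display.float_format, can be sketched on made-up data as follows. The frames `fares` and `big` and their values are hypothetical stand-ins, not the Titanic or drinks data. The second half reproduces the observation above that only the float column picks up the commas.

```python
import pandas as pd

# Hypothetical fares with up to four decimal places.
fares = pd.DataFrame({"Fare": [7.25, 71.2833]})

# Show two digits after the decimal point; the stored values are unchanged.
pd.set_option("display.precision", 2)
fares_repr = repr(fares)
pd.reset_option("display.precision")

# Hypothetical large numbers: x is an integer column, y is a float column.
big = pd.DataFrame({"x": [54000, 4900], "y": [4900.0, 12400.0]})

# Pass the bound .format method of a Python format string;
# "{:,}" asks for commas as the thousands separator.
pd.set_option("display.float_format", "{:,}".format)
big_repr = repr(big)  # y is shown with commas, x is not
pd.reset_option("display.float_format")
```

Because float_format takes any callable that turns a float into a string, other format strings can be passed the same way.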
So the question is, why did that happen? You might have already figured this out from the name of the option, which is float_format, because drinks.dtypes tells us that x is an int and y is a float. There is no int_format option. There's a float_format, and it only affects floating-point columns. So there isn't actually an easy way I know of to get it to affect integers as well, but it is pretty simple for floating-point numbers.
The second part of Mark's question was how to get this to affect plotting and the axes. Unfortunately, there's no option for that. For that, you will actually have to learn some matplotlib, and there should be a way to get it to do that with matplotlib. Matplotlib is the plotting library that controls the plots that pandas produces.
As always, we're going to end with a bonus, and in fact I've got two bonuses for you today. For the first bonus, let's say you're not connected to the internet, or you just don't want to do yet another Google search for the pandas documentation, but you want to read up on the pandas options. How can you do that? I'm going to show you a cool function called pd.describe_option. You run that and it will display every option: its name, its type, the default value, and the current value on your system. You can scroll through it and see every last option. Pretty cool. Another trick with this one: if you know a particular option you're trying to find, or maybe you remember part of the name, you can search for it. If you search for 'rows', it will only show you the options with 'rows' in the name. It's not searching the description text. It's just searching the names of the options.
For the second bonus, let's say you've changed a bunch of options and you want to reset them without restarting your kernel or restarting your IDE.
There's another trick for that, which is pd.reset_option. You're thinking, do I have to do it one by one? No. There is a special keyword, 'all'. Just pass it as a string, and it will reset all of your options to the defaults. Now, you will get a warning, and that's because some of the options being reset have already been deprecated, but you can safely ignore any warnings here.
So that's it for today. Thank you so much for joining me. I really appreciate it. If you'd like to see more videos like this, please click subscribe. As always, I'd love to hear from you, so please leave a comment below if you have a comment or a question, and maybe I'll answer your question in a future video. But that's it. Thanks again for joining me, and I hope to see you again soon.
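The two bonus functions from the end of the transcript can be sketched as below. This is an illustration under assumptions: capturing the printed output with `redirect_stdout` and silencing warnings with `catch_warnings` are choices made here so the snippet runs quietly, not steps from the video, and whether a warning appears at all may depend on the pandas version installed.

```python
import contextlib
import io
import warnings

import pandas as pd

# describe_option prints to stdout; capture it here so it can be inspected.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    pd.describe_option("rows")  # only options whose names match "rows"
rows_help = buffer.getvalue()

# Change a couple of options, then reset everything in one call.
pd.set_option("display.max_rows", 200)
pd.set_option("display.precision", 2)
with warnings.catch_warnings():
    # Deprecated options may trigger warnings during a full reset.
    warnings.simplefilter("ignore")
    pd.reset_option("all")
```

Calling `pd.describe_option()` with no argument lists every option, which is the offline alternative to the documentation page mentioned in the video.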
Original Description
Have you ever wanted to change the way your DataFrame is displayed? Perhaps you needed to see more rows or columns, or modify the formatting of numbers? In this video, I'll demonstrate how to change the settings for five common display options in pandas.
== RESOURCES ==
GitHub repository for the series: https://github.com/justmarkham/pandas-videos
"get_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.get_option.html
"set_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.set_option.html
"reset_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.reset_option.html
"describe_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.describe_option.html
More about options and settings: http://pandas.pydata.org/pandas-docs/stable/options.html