How do I change display options in pandas?

Data School · Intermediate ·📰 AI News & Updates ·9y ago

Key Takeaways

The video demonstrates how to change display options in pandas using the pd.get_option() and pd.set_option() functions, including options such as display.max_rows, display.max_columns, display.max_colwidth, display.precision, and display.float_format.

Full Transcript

Hello and welcome back to my Q&A video series about the pandas library in Python. The question for today comes from a YouTube viewer who asks, "Is there a simple way to format large numbers with commas when printing or graphing on the axes?" Great question. I will answer that, but first I'm going to answer the more general question, which is: how do we change display options in pandas?

As always, we need an example data set, so we're going to import pandas as pd, and the data set for today will be alcohol consumption by country. So drinks = pd.read_csv, and it's a URL, bit.ly/drinksbycountry. Instead of checking out the head, let's just try printing it out. We'll type drinks and hit enter. What you'll see is that it prints out the first 30 rows, then this ellipsis, dot dot dot, and then the last 30 rows. It's doing this because it doesn't want to overrun your display. But you'll notice there are only 193 rows. So what if I want to change this and show all of the rows? How do we do that? Well, it turns out that pandas has display options, and I will show you how to modify those.

First you need to find the relevant option you want to change. One easy way to look at the options is to go to the documentation for pandas.get_option. You can Google for this, or there is a link to this page in the description of this video below. Go to get_option, scroll down, and you will see a list of all the available options and their descriptions. We are going to find display.max_rows. It tells you it's an int, and it says that if max_rows is exceeded, it switches to a truncated view, and a None value means unlimited. So this is the option we want to change: display.max_rows. What I like to do first, before I change an option, is look at its current value.

We're going to use the top-level function pd.get_option, and all you do is pass it the name of the option, display.max_rows. We see that it is 60, which is why it printed out 30 rows at the top and 30 rows at the bottom by default. To change that, it's easiest to just edit this cell: we change it to pd.set_option and pass it the name of the option and the new value. I could just say 200, and yes, that would accomplish it. But I think what I'll do instead is say None, which means show all rows. You want to be careful with this, because if you have a million rows, this is not going to work very well. But we know that we only have 193, so this will work just fine. We'll go ahead and print out drinks, and now you will see that I can see every single row. I can scroll all the way down to the bottom and browse the data.

Now, let's say I've finished browsing the data and I no longer want to keep this option. How do I change it back? You actually just reset it. Again, I will just edit this because it's easiest. All I say is pd.reset_option, and I pass it the name. You can confirm by rerunning this, and you'll see that it's back to displaying the first 30 rows and the last 30 rows. So again, all I did was call pd.reset_option and pass it the option to reset.

I will note there's another option if you have a lot of columns. You might have guessed this already, but there's an option called display.max_columns, and the default for that is 20, or at least it is on my system. If you have 25 columns and you want to look at them all, just change it to the appropriate number, or just change it to None. I want to show you two more options, and then I will get back to Mark's question.

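The get, set, and reset cycle described above can be sketched as follows. This is a minimal illustration rather than the notebook from the video: the `drinks` DataFrame here is a hypothetical in-memory stand-in with invented values (the video loads the real data from a URL), and the variable names are assumptions. Recent pandas versions may also show fewer rows in a truncated view than the 30 and 30 seen in the video.

```python
import pandas as pd

# Hypothetical stand-in for the drinks data: 193 rows, invented values.
drinks = pd.DataFrame({
    "country": [f"country_{i}" for i in range(193)],
    "wine_servings": range(193),
})

# 1. Look at the current value before changing anything.
print(pd.get_option("display.max_rows"))     # 60 on a default install
print(pd.get_option("display.max_columns"))  # default depends on the environment

# 2. None means "no limit": every row is shown.
pd.set_option("display.max_rows", None)
full_repr = repr(drinks)

# 3. Restore the default for this one option.
pd.reset_option("display.max_rows")
truncated_repr = repr(drinks)

print(len(full_repr.splitlines()), "lines when unlimited")
print(len(truncated_repr.splitlines()), "lines when truncated")
```

Capturing `repr(drinks)` is only a way to compare the two outputs in a script; in a notebook you would simply evaluate `drinks` in a cell, as in the video.
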
For my next example I am going to use a different data set, the training data set from Kaggle's Titanic competition. So train = pd.read_csv, and as always it's bit.ly, and this time it's kaggletrain, and I'll just go ahead and say train.head(). So this is our data set, and the first thing I want to highlight is this ellipsis here, the dot dot dot. Why is it here? Well, this is a column that contains strings, and there is a maximum set on how many characters it will display. Why? Again, it just doesn't want to overrun your screen if this is 10,000 characters. But sometimes it's actually not that many characters, and I just want to see the data; I don't want it hidden from me. So how do we change that? We're going to say pd.get_option, and this is display.max_colwidth. That is what's controlling this: it's saying only show the first 50 characters. So let's change that and see if it works. We'll change it to set_option. You can't use None in this case, so we're just going to set a large value; we'll just say 1,000. Then I'll break this into two cells and display the head again. Now you can see there's no more dot dot dot in this cell. It's going to show all the text in all the columns, up to a thousand characters.

I also want to point out this Fare column. This is the fare in dollars, I think, or maybe a different currency. You might have noticed there are four decimal places, and you might think, I don't really want to see four decimal places, I want to see maybe two. So let's change that. We'll say pd.get_option, and this one is display.precision. It's six, which means six digits after the decimal point. Now we change that to two with the set_option function.

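The column-width and precision changes described above might look like this in code. It is a sketch under stated assumptions: the two-row `train` DataFrame is a hypothetical stand-in for the Titanic data, and the long passenger name is invented so that it exceeds the 50-character default. Newer pandas releases are documented as also accepting None for display.max_colwidth, so the "can't use None" remark may not apply to your version.

```python
import pandas as pd

# Invented name, deliberately longer than 50 characters.
long_name = ("Hypothetical, Mrs. Example Passenger "
             "(Name Long Enough To Be Truncated By Default)")

# Hypothetical stand-in for the Titanic training data.
train = pd.DataFrame({"Name": [long_name, "Short, Mr. Name"],
                      "Fare": [7.25, 71.2833]})

default_repr = repr(train)   # long name is cut off with "..."

# Allow up to 1,000 characters per cell.
pd.set_option("display.max_colwidth", 1000)
wide_repr = repr(train)      # long name now shown in full

# Show two digits after the decimal point (display only).
pd.set_option("display.precision", 2)
precise_repr = repr(train)

# Put both options back to their defaults.
pd.reset_option("display.max_colwidth")
pd.reset_option("display.precision")

print(precise_repr)
print(train["Fare"].tolist())  # underlying values are unchanged
```
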
Now when we do train.head(), it will only display two digits after the decimal point. Let me be clear: that did not affect the underlying data; it only affected what is displayed.

All right, let me finally get back to Mark's question, which is how do we format large numbers so that they have commas when you're printing or graphing? I'm going to use the drinks DataFrame, so let's show that one again: drinks. We don't have any really large numbers in here, so I'm going to add some columns that are really large. I'm going to say drinks['x'], so I'm creating a new column called x, and it equals drinks.wine_servings * 1000. I'm just broadcasting this multiplication across an entire Series, the wine servings. Now, x is not a meaningful column; I just need something with large numbers. And I'm going to add one more: drinks['y'] equals the drinks total column times 1,000. Now let's say drinks.head(), and you can see the two new columns, x and y. It's just this column and this column times 1,000. Mark is asking how we get a comma, so that it reads like 54,000 and 4,900. How do we get some commas in there?

The option we're going to use is pd.set_option, with display.float_format. And here's the thing we're going to pass it, which may look confusing if you haven't seen these before: a string with a colon and a comma in braces, '{:,}', and then after the string, .format. What I've done is pass a Python format string, and that format string means use commas as the thousands separator. This is not something that pandas invented. This is something that exists in Python generally, which is format strings. So when we do drinks.head(), you will notice we've got 4,900 with the comma, you've got 12,400, etc. But it affected y and not x.

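The float_format step can be reproduced with a small stand-in DataFrame. This is a sketch, not the video's notebook: the three rows and the column name `total_litres` are assumptions made for illustration, while `'{:,}'.format` is the format call described in the transcript.

```python
import pandas as pd

# Hypothetical miniature of the drinks data; values are illustrative.
drinks = pd.DataFrame({
    "wine_servings": [0, 54, 14],        # integers
    "total_litres": [0.0, 4.9, 12.4],    # floats (assumed column name)
})

# Two throwaway columns that just provide large numbers.
drinks["x"] = drinks["wine_servings"] * 1000   # stays an integer column
drinks["y"] = drinks["total_litres"] * 1000    # stays a float column

# Pass a callable: the bound .format method of a Python format string.
# "{:,}" asks for a comma as the thousands separator.
pd.set_option("display.float_format", "{:,}".format)
formatted = repr(drinks)
pd.reset_option("display.float_format")

print(formatted)       # commas appear in y, but not in x
print(drinks.dtypes)   # x is an integer dtype, y is a float dtype
```
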
So the question is, why did that happen? You might have already figured this out from the name of the option, which is float_format, because drinks.dtypes tells us that x is an int and y is a float. There is no int_format option. There's a float_format, and it only affects floating-point columns. So there isn't actually an easy way I know of to get it to affect integers as well, but it is pretty simple for floating-point numbers.

The second part of Mark's question was how do we get this to affect plotting and the axes. Unfortunately, there's no option for that. For that, you will actually have to learn some matplotlib, and there should be a way to get it to do that with matplotlib. Matplotlib is the plotting library that controls the plots that pandas produces.

As always, we're going to end with a bonus, and in fact I've got two bonuses for you today. The first bonus: let's say you're not connected to the internet, or you just don't want to do yet another Google search for the pandas documentation, but you want to read up on the pandas options. How can you do that? I'm going to show you a cool function called pd.describe_option. You run that, and it will display every option: the name of it, the type, the default value, and the current value on your system. You can scroll through it and see every last option. Pretty cool. Another trick with this one: if you remember part of the name of the option you're looking for, say you search for rows, it will only show you the options with rows in the name. It's not searching the description text; it's just searching the names of the options.

Second bonus: let's say you've changed a bunch of options and you want to reset them without restarting your kernel or restarting your IDE.

There's another trick for that, which is pd.reset_option. You're thinking, do I have to do it one by one? No. There is a special keyword, all: just pass it as a string, and it will reset all of your options to the defaults. Now, you will get a warning, and that's because some of the options being reset have already been deprecated, but you can safely ignore any warnings here.

So that's it for today. Thank you so much for joining me. I really appreciate it. If you'd like to see more videos like this, please click subscribe. As always, I'd love to hear from you, so please leave a comment below if you have a comment or a question, and maybe I'll answer your question in a future video. But that's it. Thanks again for joining me, and I hope to see you again soon.
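The two bonus tips from the end of the transcript can be sketched like this. It is a minimal example under assumptions: the option values set at the top are arbitrary, and whether `pd.reset_option("all")` emits a warning depends on the pandas version, so the sketch silences warnings rather than relying on any particular message.

```python
import warnings

import pandas as pd

# Change a couple of options so there is something to reset.
pd.set_option("display.max_rows", 200)
pd.set_option("display.precision", 2)

# Bonus 1: offline documentation. The argument is matched against
# option names only, not against the description text.
pd.describe_option("max_rows")

# Bonus 2: reset every option at once with the special string "all".
# Some versions warn about deprecated options here; ignore those.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    pd.reset_option("all")

rows_after_reset = pd.get_option("display.max_rows")
precision_after_reset = pd.get_option("display.precision")
print(rows_after_reset, precision_after_reset)
```

Calling `pd.describe_option()` with no argument prints every option, which is the long listing scrolled through in the video.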

Original Description

Have you ever wanted to change the way your DataFrame is displayed? Perhaps you needed to see more rows or columns, or modify the formatting of numbers? In this video, I'll demonstrate how to change the settings for five common display options in pandas.

SUBSCRIBE to learn data science with Python: https://www.youtube.com/dataschool?sub_confirmation=1
JOIN the "Data School Insiders" community and receive exclusive rewards: https://www.patreon.com/dataschool

== RESOURCES ==
GitHub repository for the series: https://github.com/justmarkham/pandas-videos
"get_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.get_option.html
"set_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.set_option.html
"reset_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.reset_option.html
"describe_option" documentation: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.describe_option.html
More about options and settings: http://pandas.pydata.org/pandas-docs/stable/options.html

== LET'S CONNECT! ==
Newsletter: https://www.dataschool.io/subscribe/
Twitter: https://twitter.com/justmarkham
Facebook: https://www.facebook.com/DataScienceSchool/
LinkedIn: https://www.linkedin.com/in/justmarkham/

This video teaches how to change display options in pandas to customize the way DataFrames are displayed, including options for rows, columns, and formatting. It provides step-by-step instructions on how to use the pd.get_option() and pd.set_option() functions to modify these options.

Key Takeaways
  1. Import pandas as pd
  2. Load data from a CSV file using pd.read_csv()
  3. Display the DataFrame by typing its name (for example, drinks) and running the cell
  4. Use pd.get_option() to get the current value of a display option
  5. Use pd.set_option() to set a new value for a display option
  6. Use display.max_rows and display.max_columns to control how many rows and columns are shown, and pd.reset_option() to restore a default
  7. Use display.max_colwidth to change the maximum number of characters displayed in a column
  8. Use display.precision to change the number of decimal places displayed
  9. Use display.float_format to change the format of floating point numbers
💡 The pd.get_option() and pd.set_option() functions can be used to customize the display of DataFrames in pandas, including options for rows, columns, and formatting.
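
The transcript notes that display.float_format leaves integer columns untouched and that the speaker knows of no easy option for them. One possible workaround, which is not shown in the video, is to format the integers as strings for display only. In this sketch the small DataFrame and the name `display_copy` are assumptions; `Series.map` and the `formatters` argument of `DataFrame.to_string` are existing pandas features.

```python
import pandas as pd

# Hypothetical frame with one integer and one float column.
drinks = pd.DataFrame({"x": [0, 54000, 14000],
                       "y": [0.0, 4900.0, 12400.0]})

# Approach 1: a display-only copy whose integer column becomes strings.
display_copy = drinks.copy()
display_copy["x"] = display_copy["x"].map("{:,}".format)

# Approach 2: leave the data alone and format while rendering to text.
text = drinks.to_string(formatters={"x": "{:,}".format,
                                    "y": "{:,}".format})

print(display_copy)
print(text)
```

Both approaches leave the original `drinks` columns numeric, so later calculations are unaffected; only the copy or the rendered text contains the commas.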
