PickupBrain

Pandas Head() & Tail() Functions

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Pandas introduces two new data types to Python: Series and DataFrame.

Datasets can run to millions or even billions of rows. You rarely want to print all of that; usually you just want a glimpse of the data you are working with. The head() and tail() methods serve this purpose.

head() Function

Suppose we want to see only the top 5 rows of our dataset. For this we can use the head() method, which Pandas provides to return the first n rows of a dataset.

The head() method returns the first n rows of a DataFrame or Series; n defaults to 5.

#by default the read_csv function will read a comma separated file

import pandas as pd
df = pd.read_csv('CompleteCricketData.csv')

#we use the head function with argument 3 so Python only shows us the first 3 rows
print(df.head(3))

#Output
Unnamed: 0      Player  Country   ...      Ground        Date   
0           0   RG Sharma    INDIA   ...     Kolkata   11/13/2014          
1           1  MJ Guptill       NZ   ...  Wellington    3/21/2015          
2           2    V Sehwag    INDIA   ...      Indore    12/8/2011          

In the example above, head() is called with a single argument, 3. This argument is optional; setting it fixes the number of rows returned from the DataFrame.

This is useful for checking that the data loaded properly and for getting a sense of the columns, their names, and their contents.
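The behaviour described above can be tried without the CSV file. The following is a minimal sketch that assumes a small, made-up DataFrame (the player labels and run values are hypothetical and are not taken from CompleteCricketData.csv). It shows the default of 5 rows, an explicit n, an n larger than the data, and head() on a Series.

```python
import pandas as pd

# Hypothetical sample data, used here instead of the CSV file from the article
df = pd.DataFrame({
    "Player": ["P1", "P2", "P3", "P4", "P5", "P6", "P7"],
    "Runs": [50, 12, 87, 3, 45, 66, 9],
})

default_rows = df.head()          # no argument: first 5 rows
three_rows = df.head(3)           # first 3 rows
all_rows = df.head(20)            # n larger than the data: every row is returned
series_rows = df["Runs"].head(2)  # head() also works on a Series

print(default_rows)
print(len(three_rows), len(all_rows), list(series_rows))
```

Asking for more rows than exist does not raise an error; the call simply returns everything that is available.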

tail() Function

The tail() function returns the last n rows of the object. It is useful for quickly verifying data.

If no argument is provided, it returns the last 5 rows; with an argument n, it returns the last n rows.

#example of tail method without argument
import pandas as pd
df = pd.read_csv('CompleteCricketData.csv')

#we use the tail function without argument so it only shows us the last 5 rows
print(df.tail())

#Output
       Unnamed: 0          Player   ...      Date   Unnamed: 15
92847       92847      KA Maharaj   ...   3/7/2020          NaN
92848       92848  AL Phehlukwayo   ...   3/7/2020          NaN
92849       92849        A Nortje   ...   3/7/2020          NaN
92850       92850         A Zampa   ...  3/13/2020          NaN
92851       92851    JR Hazlewood   ...  3/13/2020          NaN

[5 rows x 17 columns]

#example of tail method with argument
import pandas as pd
df = pd.read_csv('CompleteCricketData.csv')

#we use the tail function with argument 2 so it only shows us the last 2 rows
print(df.tail(2))

#Output
       Unnamed: 0        Player  Country   ...  Ground       Date   Unnamed: 15
92850       92850       A Zampa      AUS   ...  Sydney  3/13/2020           NaN
92851       92851  JR Hazlewood      AUS   ...  Sydney  3/13/2020           NaN

[2 rows x 17 columns]

The argument passed to tail() in the example above is optional. Setting it fixes the number of rows returned from the end of the DataFrame.
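As with head(), tail() can be tried on a small in-memory table. This is a rough sketch using made-up data (the column names and values are assumptions chosen for illustration). Note that the returned rows keep their original index labels, just as the output above shows 92850 and 92851 rather than 0 and 1.

```python
import pandas as pd

# Hypothetical sample data with the default integer index 0..6
df = pd.DataFrame({
    "Player": ["P1", "P2", "P3", "P4", "P5", "P6", "P7"],
    "Runs": [50, 12, 87, 3, 45, 66, 9],
})

last_five = df.tail()   # no argument: last 5 rows
last_two = df.tail(2)   # last 2 rows

print(last_two)
print(list(last_two.index))  # original index labels are preserved: [5, 6]
```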

head() and tail() are useful not only for viewing the top and bottom rows but also for checking the result each time you change the data.

Let's see an example. I have created a DataFrame from a dictionary.

#Example of head and tail method
#Creating a dataframe from dictionary
import pandas as pd
sample_data = {'Model Number': ['1101', '1102', '1103', '1104', '1105', '1106', '1107', '1108', '1109', '1110'],
               'Price': [10000, 20000, 30000, 40000, 7000, 50000, 500, 4500, 6800, 5500],
               'Quantity': [2, 3, 4, 5, 2, 8, 10, 15, 20, 4]}
df = pd.DataFrame(sample_data)

#Creating Revenue
df['Revenue'] = df['Quantity'] * df['Price']

#Get data from head method
df.head(3)

#Output
Model Number	Price	Quantity	Revenue
0	1101	10000	2	20000
1	1102	20000	3	60000
2	1103	30000	4	120000


#get data from tail method
df.tail()

#Output
Model Number	Price	Quantity	Revenue
5	1106	50000	8	400000
6	1107	500	10	5000
7	1108	4500	15	67500
8	1109	6800	20	136000
9	1110	5500	4	22000

Here, the DataFrame starts with 3 columns and 10 rows. I then add a Revenue column, computed as Quantity * Price.

To check that the modification succeeded, we use the head() and tail() methods.

These functions save a lot of time: instead of printing the whole dataset, we can inspect just a few rows.
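To make the size difference concrete, here is a small sketch on a larger table. The one-million-row DataFrame and its column names are made up for illustration; the point is only that head() and tail() hand back five rows each, however large the source is.

```python
import pandas as pd

# Hypothetical large table: one million rows of made-up values
big = pd.DataFrame({"id": range(1_000_000), "value": range(1_000_000)})

# A modification we want to spot-check, similar to the Revenue column above
big["double"] = big["value"] * 2

top = big.head()      # first 5 rows
bottom = big.tail()   # last 5 rows

print(big.shape)                 # (1000000, 3)
print(top.shape, bottom.shape)   # (5, 3) (5, 3)
print(top)
print(bottom)
```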
