Linux Hint, posted July 31, 2023

“Count distinct” is a common operation in data analysis that returns the number of unique values within a column. In Python, the Pandas “groupby()” function is combined with methods such as “nunique()” and “unique()” to group data by a common value and count the unique values in each group.

This article explains how to count the distinct values in a Pandas DataFrame group using the following methods:

- Using the “nunique()” Method
- Using the “value_counts()” Method
- Using the “unique()” Method
- Using the “agg()” Method

## Method 1: Count Distinct Values in a Pandas DataFrame Group Using the “nunique()” Method

The “nunique()” method returns the number of unique values in a Pandas DataFrame column. Applied to a grouped column, it counts the distinct values in each group.

### Example 1: Using a Single Column

The code below counts the distinct values of a single column within each group:

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['Lily', 'Carry', 'Lily', 'Sybil', 'Lily', 'Lily', 'Sybil'],
        'Age': [15, 17, 16, 19, 15, 15, 21],
        'Score': [55, 66, 25, 88, 55, 66, 18],
    })
    print(df)

    # Number of distinct ages for each name
    df1 = df.groupby('Name')['Age'].nunique()
    print('\n', df1)

In the above example, the Pandas module is imported and a DataFrame with three columns is created. Next, “df.groupby()” groups the DataFrame by the single column “Name”. The “nunique()” method is then applied to the “Age” column of each group to count its distinct values.

Output

The result is a Series indexed by “Name”: 1 distinct age for Carry, 2 for Lily, and 2 for Sybil.
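As a small extension of the example above, the following sketch shows how the grouped “nunique()” call treats missing values. The NaN entry is an assumption added here purely for illustration and is not part of the original data; “dropna” is a parameter of the grouped “nunique()” method that defaults to True.

```python
import numpy as np
import pandas as pd

# Same names as the article's example; one age is replaced with NaN
# (hypothetical data, for illustration only).
df = pd.DataFrame({
    'Name': ['Lily', 'Carry', 'Lily', 'Sybil', 'Lily', 'Lily', 'Sybil'],
    'Age': [15, 17, 16, 19, 15, np.nan, 21],
})

# Default behaviour: NaN is ignored when counting distinct values per group.
distinct_ages = df.groupby('Name')['Age'].nunique()

# dropna=False: NaN is counted as one additional distinct value.
distinct_ages_with_nan = df.groupby('Name')['Age'].nunique(dropna=False)

print(distinct_ages)
print('\n', distinct_ages_with_nan)
```

With this data, Lily has 2 distinct ages by default and 3 when NaN is counted, while the counts for Carry and Sybil stay the same in both calls.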
### Example 2: Using Multiple Columns

The following code counts the distinct values of several columns within each group:

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['Lily', 'Carry', 'Lily', 'Sybil', 'Lily', 'Lily', 'Sybil'],
        'Age': [15, 17, 16, 19, 15, 15, 21],
        'Score': [55, 66, 25, 88, 55, 66, 18],
    })
    print(df)

    # Number of distinct ages and scores for each name
    df1 = df.groupby('Name')[['Age', 'Score']].nunique()
    print('\n', df1)

In this code, “df.groupby()” again groups the DataFrame by the single column “Name”. The “nunique()” method is then applied to two columns, “Age” and “Score”, and counts the distinct values of each one per group.

Output

The result is a DataFrame with one row per name. Carry has 1 distinct age and 1 distinct score, Lily has 2 distinct ages and 3 distinct scores, and Sybil has 2 of each.

## Method 2: Count Distinct Values in a Pandas DataFrame Group Using the “value_counts()” Method

The “value_counts()” method returns how many times each unique value occurs in one or more columns. Applied to a group, it lists every distinct value in the group together with its frequency, so the number of entries listed for a group equals the number of distinct values in that group.

### Example 1: Using a Single Column

Here is example code that lists the distinct values of a single column per group:

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['Cyndy', 'Carry', 'Lily', 'Sybil', 'Cyndy', 'Lily', 'Sybil'],
        'Age': [15, 17, 18, 19, 15, 16, 19],
    })
    print(df)

    # Frequency of each distinct age within each name
    df1 = df.groupby('Name')['Age'].value_counts()
    print('\n', df1)

In the above code, “df.groupby()” is used along with “value_counts()” to count the occurrences of each distinct value of the single column named “Age”.

Output

Each row of the result pairs a name with one of its distinct ages and the number of times that age occurs. Cyndy has one distinct age (15, twice), Carry has one (17), Lily has two (16 and 18, once each), and Sybil has one (19, twice).

### Example 2: Using Multiple Columns

The same approach works for multiple columns. Calling “value_counts()” on a grouped DataFrame requires pandas 1.4 or later:

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['Cyndy', 'Carry', 'Lily', 'Sybil', 'Cyndy', 'Lily', 'Sybil'],
        'Age': [15, 17, 18, 19, 15, 16, 19],
        'Score': [55, 66, 55, 88, 55, 66, 88],
    })
    print(df)

    # Frequency of each distinct (Age, Score) pair within each name
    df1 = df.groupby('Name')[['Age', 'Score']].value_counts()
    print('\n', df1)

In the above code, “df.groupby()” creates one group for each value of the “Name” column.
The “value_counts()” method then counts how often each distinct combination of “Age” and “Score” occurs within each group.

Output

Each row of the result is a distinct combination of “Name”, “Age”, and “Score” with its count. Only Lily has two distinct combinations; Cyndy, Carry, and Sybil have one each.

## Method 3: Count Distinct Values in a Pandas DataFrame Group Using the “unique()” Method

The “unique()” method returns the unique values of a Pandas column. We can use the code below to retrieve the distinct values of each DataFrame group:

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['Lily', 'Carry', 'Lily', 'Sybil', 'Lily', 'Lily', 'Sybil'],
        'Age': [15, 17, 16, 19, 15, 15, 21],
        'Score': [55, 66, 25, 88, 55, 66, 18],
    })
    print(df)

    # Array of distinct ages for each name
    df1 = df.groupby('Name')['Age'].unique()
    print('\n', df1)

Here, “df.groupby()” is used with “unique()”, which returns a Series holding an array of the unique values for each group rather than a count. The number of distinct values in a group is the length of its array.

Output

The result contains the arrays [17] for Carry, [15, 16] for Lily, and [19, 21] for Sybil, which correspond to 1, 2, and 2 distinct values.

## Method 4: Count Distinct Values in a Pandas DataFrame Group Using the “agg()” Method

The “agg()” method can also be used to count the distinct values of a Pandas DataFrame group by passing it the name of the aggregation. Here is an example:

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['Lily', 'Carry', 'Lily', 'Sybil', 'Lily', 'Lily', 'Sybil'],
        'Age': [15, 17, 16, 19, 15, 15, 21],
        'Score': [55, 66, 25, 88, 55, 66, 18],
    })
    print(df)

    # Apply the 'nunique' aggregation to the Age column of each group
    df1 = df.groupby('Name')[['Age']].agg(['nunique'])
    print('\n', df1)

In the above code, “df.groupby()” is used along with “agg()” to apply the “nunique” aggregation to the specified column of each group. Because the aggregation is passed as a list, the result is a DataFrame with a “nunique” column nested under “Age”.

Output

The result shows 1 distinct age for Carry, 2 for Lily, and 2 for Sybil.

## Conclusion

The “nunique()”, “value_counts()”, “unique()”, and “agg()” methods can all be used to determine the count of distinct values in a Pandas DataFrame group. The “nunique()” and “agg()” methods return the count directly, while “value_counts()” and “unique()” return the distinct values themselves, from which the count can be derived. These methods work on single or multiple DataFrame columns within each group.
The DataFrame is first grouped by the chosen column, and one of these methods is then applied to the groups to obtain the distinct values. This guide has covered counting the distinct values of a Pandas DataFrame group using several examples.
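The four approaches covered in this guide can be cross-checked against each other. The sketch below is one possible way to do that, not the only one: it reuses the “Name” and “Age” data from the examples and reduces each result to a per-group count. The variable names and the extra steps after “value_counts()” and “unique()” (counting rows per group, taking the array length) are choices made here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Lily', 'Carry', 'Lily', 'Sybil', 'Lily', 'Lily', 'Sybil'],
    'Age': [15, 17, 16, 19, 15, 15, 21],
})
grouped = df.groupby('Name')['Age']

# 1. nunique(): the distinct count per group, directly.
via_nunique = grouped.nunique()

# 2. value_counts(): one row per (Name, Age) pair, so counting the rows
#    under each Name (index level 0) gives the distinct count.
via_value_counts = grouped.value_counts().groupby(level=0).size()

# 3. unique(): an array of distinct values per group; its length is the count.
via_unique = grouped.unique().apply(len)

# 4. agg(): the same aggregation, requested by name.
via_agg = grouped.agg('nunique')

# Side-by-side comparison, aligned on the Name index.
summary = pd.DataFrame({
    'nunique': via_nunique,
    'value_counts': via_value_counts,
    'unique': via_unique,
    'agg': via_agg,
})
print(summary)
```

All four columns of the summary agree: 1 distinct age for Carry, 2 for Lily, and 2 for Sybil. When only the count is needed, “nunique()” is the most direct of the four.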
