Convert number to percentage python dataframe

I have a dataframe that I created by a groupby:

hmdf = pd.DataFrame(hm01)
new_hm01 = hmdf[['FinancialYear','Month','FirstReceivedDate']]

hm05 = new_hm01.pivot_table(index=['FinancialYear','Month'], aggfunc='count')
vals1 = ['April    ', 'May      ', 'June     ', 'July     ', 'August   ', 'September', 'October  ', 'November ', 'December ', 'January  ', 'February ', 'March    ']

df_hm = new_hm01.groupby(['Month', 'FinancialYear']).size().unstack(fill_value=0).rename(columns=lambda x: '{}'.format(x))
df_hml = df_hm.reindex(vals1)

The DF looks like this:

FinancialYear   2014/2015   2015/2016   2016/2017   2017/2018
Month               
April               34          24          22          20
May                 29          26          21          25
June                19          39          22          20
July                23          39          18          20
August              36          30          34           0
September           35          23          41           0
October             36          37          27           0
November            38          31          30           0
December            36          41          23           0
January             34          30          35           0
February            37          26          37           0
March               36          31          33           0

The column names are from variables (threeYr,twoYr,oneYr,Yr), and I want to convert the dataframe so that the numbers are percentages of the total for each column, but I can't get it to work.

This is what I want:

FinancialYear       2014/2015   2015/2016   2016/2017   2017/2018
Month               
April                   9%          6%          6%         24%
May                     7%          7%          6%         29%
June                    5%         10%          6%         24%
July                    6%         10%          5%         24%
August                  9%          8%         10%          0%
September               9%          6%         12%          0%
October                 9%         10%          8%          0%
November               10%          8%          9%          0%
December                9%         11%          7%          0%
January                 9%          8%         10%          0%
February                9%          7%         11%          0%
March                   9%          8%         10%          0%

Could anyone help me with doing this?

Edit: I tried the answer found at this link: pandas convert columns to percentages of the totals. I could not get it to work for my dataframe, and it does not explain well (to me) how to make it work for any DataFrame. In my opinion, the response from John Galt is better than that answer.
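
For readers following along, here is a minimal sketch of one way to get the column percentages. It is not necessarily the approach from the linked answer or from John Galt's response. It rebuilds the counts table from the question by hand (with unpadded month names, which is an assumption), divides each column by its own total, and only then formats the result as strings. The names share and pct are made up for this illustration.

```python
import pandas as pd

months = ['April', 'May', 'June', 'July', 'August', 'September',
          'October', 'November', 'December', 'January', 'February', 'March']

# Counts copied from the table in the question.
df_hml = pd.DataFrame(
    {
        '2014/2015': [34, 29, 19, 23, 36, 35, 36, 38, 36, 34, 37, 36],
        '2015/2016': [24, 26, 39, 39, 30, 23, 37, 31, 41, 30, 26, 31],
        '2016/2017': [22, 21, 22, 18, 34, 41, 27, 30, 23, 35, 37, 33],
        '2017/2018': [20, 25, 20, 20, 0, 0, 0, 0, 0, 0, 0, 0],
    },
    index=pd.Index(months, name='Month'),
)
df_hml.columns.name = 'FinancialYear'

# Divide every column by its own total; the result stays numeric (fractions of 1).
share = df_hml.div(df_hml.sum(axis=0), axis=1)

# Format for display only: each fraction becomes a string such as '9%'.
pct = share.apply(lambda col: col.map('{:.0%}'.format))

print(pct)
```

Keeping share as a numeric frame and formatting into a separate pct frame means the fractions can still be summed or plotted later. A column whose total is 0 would produce NaN here, so that case would need its own handling.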

Data Cleaning and Formatting Tricks for Pandas Beginners

In the data world, raw data rarely comes to us with a ready-to-consume format. Some level of data cleaning, wrangling, and formatting is almost always needed before we move forward to the data analysis or modeling phase.

I wrote a few posts in my blog last week that aim to help beginners save some time in figuring out how to do some very common yet a bit tricky data wrangling tasks in Pandas. I hope you find those short tutorials and code snippets helpful and convenient (links have been provided at the end of this article). In this post, I will continue to share with you another piece of code that deals with a very common data wrangling task: convert a percentage string column to numeric or vice versa.

Convert Percentage String to Numeric

Let’s look at a simple example below using a sample dataframe that I created from Realtor.com’s open dataset. The raw data can be downloaded for free from here.

In this sample dataframe, we can see that median_listing_price_yy and active_listing_count_yy are displayed as percentages and stored as strings. This may be fine when presenting the table as a report, but we cannot perform any meaningful mathematical operations or analysis on them because they are not numeric variables. How do we convert these percentage strings to numeric data types?

The solution here is to first use the pandas.Series.str.rstrip() method to remove the trailing ‘%’ character and then use astype(float) to convert the result to numeric. Dividing by 100 then turns the value into a fraction (for example, 5% becomes 0.05). You can also use Series.str.lstrip() to remove leading characters and Series.str.strip() to remove both leading and trailing characters.

This is the piece of the code that does the trick of converting a percentage string to numeric using our example:

df['median_listing_price_yy'] = df['median_listing_price_yy'].str.rstrip("%").astype(float)/100
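
If the column contains stray whitespace, placeholders, or missing values, astype(float) will raise an error. The sketch below shows one possible variation that uses pd.to_numeric with errors='coerce' so that unparseable entries become NaN. The sample values, including 'n/a', are made up for illustration and are not from the Realtor.com dataset.

```python
import pandas as pd

# Hypothetical values; 'n/a' and None stand in for messy entries.
df = pd.DataFrame({'median_listing_price_yy': ['5.25%', ' -1.50% ', 'n/a', None]})

# Trim surrounding whitespace first, then drop the trailing percent sign.
cleaned = df['median_listing_price_yy'].str.strip().str.rstrip('%')

# Entries that cannot be parsed become NaN instead of raising an error.
df['median_listing_price_yy'] = pd.to_numeric(cleaned, errors='coerce') / 100

print(df)
print(df.dtypes)
```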

If you want to change the number of decimal places shown, say to 2, you can use the following code. Note that this changes only how floats are displayed, not the stored values:

pd.options.display.float_format = '{:,.2f}'.format
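
Setting pd.options.display.float_format applies to every DataFrame for the rest of the session. If that is too broad, one alternative is to scope the setting with pd.option_context, as in the sketch below. The sample values are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'median_listing_price_yy': [0.0525, 0.1234]})

# Apply the two-decimal display format only inside this block.
with pd.option_context('display.float_format', '{:,.2f}'.format):
    rendered = df.to_string()

print(rendered)

# The stored values are untouched; only the rendering changed.
print(df['median_listing_price_yy'].tolist())
```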

Convert Numeric to Percentage String

Now, how do we go the other way and convert the numeric column back to a percentage string? We use Python’s string format syntax '{:.2%}'.format, which multiplies the value by 100, keeps two decimal places, and adds the ‘%’ sign back. Then we use the Series map() method to apply the formatting to every row in the ‘median_listing_price_yy’ column.

df["median_listing_price_yy"] = df["median_listing_price_yy"].map('{:.2%}'.format)
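
Once a column holds percentage strings, it can no longer be used for arithmetic. A small sketch of one way around that is to keep the numeric column and write the formatted text to a separate column. The column name median_listing_price_yy_str and the sample values are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'median_listing_price_yy': [0.0525, -0.015, 0.12]})

# Keep the numeric column for calculations and add a display-only string column.
df['median_listing_price_yy_str'] = df['median_listing_price_yy'].map('{:.2%}'.format)

print(df)
```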

To summarize, if you have a percentage string column in your Pandas dataframe and want to convert it to numeric/float, use the following code:

df[column] = df[column].str.rstrip("%").astype(float)/100

If you have a numeric column and want to convert it to a percentage string, use this code:

df[column] = df[column].map('{:.2%}'.format)

Thanks for reading! I hope you find this short tutorial helpful. Here are a few more Pandas beginners’ tutorials for data cleaning and formatting if you are interested.

How do I change a number to percentage in pandas?

To convert a numeric column to a percentage string, use Python's string format syntax '{:.2%}'.format to add the '%' sign. Then use the Series map() method to apply the formatting to all the rows in the column, for example 'median_listing_price_yy'.

How do you find the percentage of a DataFrame in Python?

You can calculate each value's percentage of a total with the groupby() and transform() methods. transform() applies a function to each group and returns a result aligned with the original rows, so each value can be divided by its own group total. If you instead divide by the sum of the whole column, the percentages are calculated using all the data.
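
As a rough illustration of the difference between the two totals, the sketch below computes both on a small made-up table. The column names FinancialYear, Month, Count, PctOfYear, and PctOfAll are assumptions chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'FinancialYear': ['2014/2015', '2014/2015', '2015/2016', '2015/2016'],
    'Month': ['April', 'May', 'April', 'May'],
    'Count': [34, 29, 24, 26],
})

# transform('sum') repeats each group's total on every row of that group.
year_total = df.groupby('FinancialYear')['Count'].transform('sum')

# Percentage of the row's own group total.
df['PctOfYear'] = df['Count'] / year_total * 100

# Percentage of the grand total across all rows.
df['PctOfAll'] = df['Count'] / df['Count'].sum() * 100

print(df)
```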

How do you calculate percent change in a DataFrame in Python?

The pct_change() method returns the fractional change between each row and, by default, the previous row (multiply by 100 to express it as a percentage). Which row to compare with can be specified with the periods parameter.
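
A short sketch with made-up numbers shows the default behavior and the periods parameter:

```python
import pandas as pd

s = pd.Series([100.0, 110.0, 99.0])

# Change relative to the previous row; the first row has nothing to compare with.
change = s.pct_change()

# Change relative to the row two positions earlier.
change_2 = s.pct_change(periods=2)

# pct_change returns fractions, so scale by 100 to read them as percentages.
as_percent = change * 100

print(change)
print(change_2)
print(as_percent)
```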

What does pct_change do in pandas?

Percentage change between the current and a prior element. Computes the percentage change from the immediately previous row by default. This is useful in comparing the percentage of change in a time series of elements.