How do you pass a DataFrame column into a function in Python?

I have a large number of columns in a pandas DataFrame and need to pass groups of them through a function. The function is large, but I'll give an example below. I am not sure how to pass a reference to df.varName into the function without getting an error that the variable is not defined. When I try a function such as:

def bianco2(df, varX, varT):
    stdX = np.std(df.varX)
    stdT = np.std(df.varT)
    newVar = stdX + stdT
    return newVar

I get the error that varX isn't defined. So I wrote the function where I would pass the whole phrase:

def bianco3(varX, varT):
    stdX = np.std(varX)
    stdT = np.std(varT)
    newVar = stdX + stdT
    return newVar

Where "varX = df.varX".

This worked but isn't practical for a large number of variables, because I would still have to update each varX and varT manually. So I tried creating a list of variables in the format df.varX and then using a for loop to pass the list of variables. The issue is that Python sees each entry as a string and not as a reference. I looked at using functools.partial but was unsuccessful.

Any ideas on how to write this in a simple format and be able to pass pandas columns to a function?

asked Feb 22, 2017 at 21:13


You may want to try this: pass the column names as strings and index the DataFrame with df[varX] instead of df.varX.

def bianco2(df, varX, varT):
    stdX = np.std(df[varX])
    stdT = np.std(df[varT])
    newVar = stdX + stdT
    return newVar

print(bianco2(df, 'Customer', 'Policy'))

input

   Policy  Customer  Employee CoveredDate   LapseDate
0     123      1234      1234  2011-06-01  2013-01-01
1     124      1234      1234  2016-01-01  2013-01-01
2     124      5678      5555  2014-01-01  2013-01-01

output

  2095.39309492
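
For the many-columns case in the question, one possible extension is to keep the column names as plain strings and loop over pairs of them. This is only a minimal sketch, assuming the numeric columns from the input above; the names `column_pairs` and `results` are hypothetical and introduced for illustration, and `.to_numpy()` is used here so that `np.std` computes the population standard deviation on a plain array.

```python
import numpy as np
import pandas as pd

def bianco2(df, varX, varT):
    # Look each column up by its string name with df[name].
    stdX = np.std(df[varX].to_numpy())
    stdT = np.std(df[varT].to_numpy())
    return stdX + stdT

# Numeric columns from the sample input above.
df = pd.DataFrame({
    'Policy': [123, 124, 124],
    'Customer': [1234, 1234, 5678],
    'Employee': [1234, 1234, 5555],
})

# Hypothetical list of column-name pairs to process.
column_pairs = [('Customer', 'Policy'), ('Employee', 'Policy')]

# One result per pair, keyed by the pair of names.
results = {pair: bianco2(df, *pair) for pair in column_pairs}
for pair, value in results.items():
    print(pair, value)
```

Because the names are ordinary strings, the list of pairs can be built programmatically, for example from df.columns.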

answered Feb 22, 2017 at 21:20

Shijo

    In this article, we will learn different ways to apply a function to single or selected columns or rows in a DataFrame. We will use the DataFrame/Series.apply() method to apply a function.

    Syntax: Dataframe/series.apply(func, convert_dtype=True, args=())

    Parameters: This method takes the following parameters:
    func: A function that is applied to all values of the pandas Series.
    convert_dtype: Convert the dtype as per the function's operation.
    args=(): Additional positional arguments to pass to the function after the series value.

    Return Type: Pandas Series after the function/operation has been applied.
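
To make the args parameter concrete, here is a minimal sketch. The helper `add_offset` and its `offset` parameter are hypothetical names introduced only for illustration; the example assumes that extra positional arguments given in args are passed to the function after each series value.

```python
import pandas as pd

def add_offset(value, offset):
    # Hypothetical helper: receives one element plus the extra argument from args.
    return value + offset

s = pd.Series([1, 2, 3], index=list('abc'))

# Each element is passed as `value`; args supplies `offset`.
shifted = s.apply(add_offset, args=(10,))
print(shifted)
```

Note the trailing comma in `(10,)`: args expects a tuple, even for a single extra argument.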

    Method 1: Using DataFrame.apply() and a lambda function.
    Example 1: For Column

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    # Build a 3x3 DataFrame with columns x, y, z and rows a, b, c
    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Square column 'z'; leave the other columns unchanged
    new_df = df.apply(lambda x: np.square(x) if x.name == 'z' else x)
    new_df

    Output :

       x  y   z
    a  1  2   9
    b  4  5  36
    c  7  8  81

    Example 2: For Row.

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # axis=1 passes each row to the lambda; square row 'b' only
    new_df = df.apply(lambda x: np.square(x) if x.name == 'b' else x,
                      axis=1)
    new_df

    Output :

        x   y   z
    a   1   2   3
    b  16  25  36
    c   7   8   9

    Method 2: Using DataFrame/Series.apply() & [ ] Operator.

    Example 1: For Column.

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Select column 'z' with [ ], square it, and assign it back
    df['z'] = df['z'].apply(np.square)
    df

    Output :

       x  y   z
    a  1  2   9
    b  4  5  36
    c  7  8  81

    Example 2: For Row.

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Select row 'b' with .loc, square it, and assign it back
    df.loc['b'] = df.loc['b'].apply(np.square)
    df

    Output :

        x   y   z
    a   1   2   3
    b  16  25  36
    c   7   8   9

    Method 3: Using numpy.square() method and [ ] operator.
    Example 1: For Column

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Call np.square directly on the selected column
    df['z'] = np.square(df['z'])
    print(df)

    Output :

       x  y   z
    a  1  2   9
    b  4  5  36
    c  7  8  81

    Example 2: For Row.

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Call np.square directly on the selected row
    df.loc['b'] = np.square(df.loc['b'])
    df

    Output :

        x   y   z
    a   1   2   3
    b  16  25  36
    c   7   8   9

    We can also apply a function to more than one column or row in the DataFrame.

    Example 1: For Column

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Square columns 'x' and 'y'; leave 'z' unchanged
    new_df = df.apply(lambda x: np.square(x) if x.name in ['x', 'y'] else x)
    new_df

    Output :

        x   y  z
    a   1   4  3
    b  16  25  6
    c  49  64  9

    Example 2: For Row.

    import pandas as pd
    import numpy as np

    matrix = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]

    df = pd.DataFrame(matrix, columns=list('xyz'), index=list('abc'))

    # Square rows 'b' and 'c'; leave row 'a' unchanged
    new_df = df.apply(lambda x: np.square(x) if x.name in ['b', 'c'] else x,
                      axis=1)
    new_df

    Output :

        x   y   z
    a   1   2   3
    b  16  25  36
    c  49  64  81
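
As an alternative to testing x.name inside a lambda, the columns can be selected by name first and the function applied only to that selection. This is a sketch of that variation rather than the article's own method; the names `cols` and `new_df` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  columns=list('xyz'), index=list('abc'))

cols = ['x', 'y']      # names of the columns to transform
new_df = df.copy()     # work on a copy so df itself is left untouched

# Apply np.square only to the selected columns and write the result back.
new_df[cols] = new_df[cols].apply(np.square)
print(new_df)
```

Since `cols` is an ordinary list of strings, it can be built or filtered programmatically when many columns are involved.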


    How do you apply a function to a DataFrame column?

    Create a two-dimensional, size-mutable, potentially heterogeneous tabular data structure, df.
    Print the input DataFrame, df.
    Override column x with the expression lambda x: x*2 using the apply() method.
    Print the modified DataFrame.
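
The four steps above can be sketched as follows. The column names and values are made up for illustration and are not taken from the source.

```python
import pandas as pd

# Step 1: create the DataFrame (illustrative values).
df = pd.DataFrame({'x': [1, 2, 3], 'y': [10, 20, 30]})

# Step 2: print the input DataFrame.
print(df)

# Step 3: override column x by applying lambda x: x*2 to each of its values.
df['x'] = df['x'].apply(lambda x: x * 2)

# Step 4: print the modified DataFrame.
print(df)
```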

    Can we pass DataFrame to function?

    With axis=1, the apply() method applies a function to each row of a DataFrame. Here apply() passes the rows of the df DataFrame to the string_to_float() function one at a time. Inside the function, row['value'] is the entry in the 'value' column of the row currently being passed in.
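
A minimal sketch of that pattern is shown here. The body of string_to_float, the sample data, and the `value_float` column name are assumptions made for illustration; only the function name and the 'value' column come from the text above.

```python
import pandas as pd

def string_to_float(row):
    # Hypothetical implementation: convert the 'value' field of one row to float.
    return float(row['value'])

df = pd.DataFrame({'name': ['a', 'b'], 'value': ['1.5', '2.25']})

# axis=1 passes each row to the function as a Series, one row at a time.
df['value_float'] = df.apply(string_to_float, axis=1)
print(df)
```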

    How do I pass a specific column in pandas?

    Selecting columns based on their name: this is the most basic way to select a single column from a DataFrame. Just put the string name of the column in brackets; this returns a pandas Series. Passing a list in the brackets lets you select multiple columns at the same time.
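
Both forms of selection can be sketched briefly; the DataFrame contents and the variable names `single` and `several` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2], 'y': [3, 4], 'z': [5, 6]})

single = df['x']          # a string name selects one column as a Series
several = df[['x', 'z']]  # a list of names selects those columns as a DataFrame

print(type(single).__name__, type(several).__name__)
```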

    How do you use data frames in a function?

    DataFrame.apply() function: the apply() function is used to apply a function along an axis of the DataFrame. Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
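
The difference between the two axes can be seen in a small sketch; the data and the names `col_sums` and `row_sums` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=list('xyz'), index=list('ab'))

# axis=0 (the default): the function receives each column as a Series
# indexed by the DataFrame's index ('a', 'b').
col_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series
# indexed by the DataFrame's columns ('x', 'y', 'z').
row_sums = df.apply(lambda row: row.sum(), axis=1)

print(col_sums)
print(row_sums)
```

The result of the axis=0 call has one entry per column, and the result of the axis=1 call has one entry per row.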
