pydit.wrangling.coalesce_dataframe_columns.coalesce_columns
- pydit.wrangling.coalesce_dataframe_columns.coalesce_columns(df: DataFrame, *column_names, target_column_name: str | None = None, default_value: int | float | str | None = None, operation=None, separator=' ', silent: bool = False) → DataFrame
Coalesce columns.
Coalescing merges several columns into one. By default, the first non-null value in each row (reading left to right) prevails; alternatively, the last non-null value can be taken, or all values can be concatenated.
- Parameters:
df (pandas.DataFrame) – The dataframe to clean up
*column_names (str) – The column names to coalesce
target_column_name (str, optional, default None) – The name of the column to store the coalesced values in. If None, the values will be stored in the first column.
default_value (int, float, str, optional, default None) – The default value to use if the target column is empty.
operation (str, optional, "concatenate", "last" or None, default None) – If None, the first non-NaN value prevails, reading left to right, and the rest of the row is ignored. If "concatenate", all values are converted to text and concatenated, with NaNs replaced by an empty string. If "last", the last non-NaN value prevails, reading right to left.
separator (str, optional, default " ") – The separator to use when concatenating values
silent (bool, optional, default False) – If True, suppress logging output
- Returns:
Pandas DataFrame with coalesced column.
- Return type:
pandas.DataFrame
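
The three operations described above can be illustrated with plain pandas. This is a minimal sketch of the documented semantics, not pydit's actual implementation: the column names (`phone_home`, `phone_work`), the sample data, and the `"unknown"` default are made up for illustration, and details such as how stray separators are handled when concatenating may differ in the library.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two columns that partially overlap.
df = pd.DataFrame(
    {
        "phone_home": ["111", np.nan, np.nan],
        "phone_work": ["222", "333", np.nan],
    }
)
cols = ["phone_home", "phone_work"]
separator = " "

# operation=None: the first non-null value wins, reading left to right.
first = df[cols].bfill(axis=1).iloc[:, 0]

# operation="last": the last non-null value wins, reading right to left.
last = df[cols].ffill(axis=1).iloc[:, -1]

# operation="concatenate": NaN becomes "", values are cast to text and joined.
# In this sketch an empty value still contributes a separator.
concat = (
    df[cols]
    .fillna("")
    .astype(str)
    .apply(lambda row: separator.join(row), axis=1)
)

# default_value: assumed here to fill rows where every source column is null.
first_filled = first.fillna("unknown")

# Store the result in a new target column, leaving the sources untouched.
result = df.assign(phone=first_filled)
```

With the function itself, the first case would presumably be written as `coalesce_columns(df, "phone_home", "phone_work", target_column_name="phone", default_value="unknown")`, based on the signature shown above.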