pydit.wrangling.anonymise.anonymise_key

pydit.wrangling.anonymise.anonymise_key(df_list, key_list, map_table_or_dict=None, hash_list=None, create_new_hash_list=False, hash_list_size=1000000)

Anonymise a column of one or many dataframes with a scrambled list of integers.

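The mapping persists across every dataframe in the list, so a given key value is replaced by the same integer wherever it appears. The function returns the translation table it used.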

Parameters:
  • df_list (list of pandas.DataFrame) – List of dataframes to anonymise.

  • key_list (list of str) – List of key column names to anonymise, one per dataframe, in the same order as df_list.

  • map_table_or_dict (dict or pandas.DataFrame, optional, default None) – Dictionary or dataframe to use for the translation table.

  • hash_list (list of int, optional, default None) – List of integers to use for the translation table.

  • create_new_hash_list (bool, optional, default False) – If True, a new list of integers will be created.

  • hash_list_size (int, optional, default 1000000) – Size of the hash list to create.

Returns:

A tuple of the translation table and the hash list.

Return type:

tuple
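
Example:

The idea described above (one scrambled pool of integers, one translation table shared by all dataframes) can be sketched roughly as follows. This is an illustrative stand-in and not pydit's implementation: the function name anonymise_key_sketch, the seed argument, and the sample dataframes are all hypothetical, and unlike anonymise_key the sketch also returns the anonymised copies so the result is easy to inspect.

```python
import random

import pandas as pd


def anonymise_key_sketch(df_list, key_list, hash_list_size=1000, seed=None):
    """Hypothetical helper, NOT pydit's code: replace each key column with
    integers from a shuffled pool, reusing one mapping for all dataframes."""
    if len(df_list) != len(key_list):
        raise ValueError("df_list and key_list must have the same length")

    # Distinct key values across all dataframes, in first-seen order.
    all_keys = pd.concat(
        [df[key] for df, key in zip(df_list, key_list)], ignore_index=True
    )
    unique_keys = pd.unique(all_keys)
    if len(unique_keys) > hash_list_size:
        raise ValueError("hash_list_size is smaller than the number of distinct keys")

    # Scrambled pool of integers; each distinct key takes the next one.
    hash_list = list(range(hash_list_size))
    random.Random(seed).shuffle(hash_list)
    translation = dict(zip(unique_keys, hash_list))

    # Return anonymised copies; the input dataframes are left unchanged.
    anonymised = [
        df.assign(**{key: df[key].map(translation)})
        for df, key in zip(df_list, key_list)
    ]
    return anonymised, translation, hash_list


# Made-up sample data: the same customers appear under different column names.
customers = pd.DataFrame(
    {"customer_id": ["A1", "B2", "C3"], "name": ["Ann", "Bob", "Cy"]}
)
orders = pd.DataFrame({"cust": ["B2", "B2", "A1"], "amount": [10, 20, 30]})

anon, translation, hash_list = anonymise_key_sketch(
    [customers, orders], ["customer_id", "cust"], hash_list_size=100, seed=0
)
```

Because both dataframes are translated with the same table, the anonymised orders can still be joined to the anonymised customers on the key column, while the original identifiers appear only in the translation table.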