Featurizer

from cider.datastore import DataStore
from cider.featurizer import Featurizer

Load some mobile phone metadata. See standardized data formats for file schemas.

# This path should point to your cider installation, where configs and data for this demo are located.
from pathlib import Path
cider_installation_directory = Path('../../cider')

datastore = DataStore(config_file_path_string=cider_installation_directory / 'configs' / 'config_quickstart.yml')
featurizer = Featurizer(datastore=datastore, clean_folders=True)
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/03/31 13:21:02 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Loading CDR...
Loading recharges...
SUCCESS!
Loading mobile data...
Loading mobile data...
Loading antennas...
Warning: 10 antennas missing location

Remove duplicate records from the CDR, recharge, mobile data, and mobile money datasets, filter to a specific date range, and remove spammers and outlier days based on call and text volumes.

# Deduplication
featurizer.ds.deduplicate()

# Filter to just January 1 - February 28 (inclusive)
featurizer.ds.filter_dates('2020-01-01', '2020-02-28')

# Remove transactions involving spammers who place 1.8+ calls/texts per active day
spammers = featurizer.ds.remove_spammers(spammer_threshold=1.8)
Number of spammers identified: 19
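
To make the threshold concrete, here is a minimal pandas sketch of how a "transactions per active day" rate could be computed and compared against 1.8. It is an illustration under stated assumptions, not Cider's implementation: the column names `caller_id` and `timestamp`, the toy data, and the helper `find_spammers` are all hypothetical.

```python
import pandas as pd


def find_spammers(cdr: pd.DataFrame, threshold: float) -> list:
    """Return caller IDs averaging at least `threshold` transactions per active day."""
    # An "active day" here means a calendar day with at least one transaction.
    with_day = cdr.assign(day=cdr["timestamp"].dt.normalize())
    per_caller = with_day.groupby("caller_id").agg(
        n_transactions=("day", "size"),
        active_days=("day", "nunique"),
    )
    rate = per_caller["n_transactions"] / per_caller["active_days"]
    return sorted(rate[rate >= threshold].index)


# Toy data: caller "a" has 4 transactions over 2 active days (rate 2.0),
# caller "b" has 2 transactions over 2 active days (rate 1.0).
cdr = pd.DataFrame({
    "caller_id": ["a", "a", "a", "a", "b", "b"],
    "timestamp": pd.to_datetime([
        "2020-01-01 09:00", "2020-01-01 10:00", "2020-01-01 11:00",
        "2020-01-02 09:00", "2020-01-01 09:00", "2020-01-02 09:00",
    ]),
})

spammers = find_spammers(cdr, threshold=1.8)
# Drop every transaction placed by a flagged caller.
clean = cdr[~cdr["caller_id"].isin(spammers)]
print(spammers)
```

With this toy data only caller "a" is flagged, so `clean` keeps the two rows belonging to "b".
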
# Remove all records from days more than 2 standard deviations from the mean transaction volume
outlier_days = featurizer.ds.filter_outlier_days(num_sds=2)
Outliers removed: 2020-01-24, 2020-02-06
/Users/leo/Documents/gpl/cider/cider/datastore.py:435: FutureWarning: The default value of numeric_only in DataFrameGroupBy.sum is deprecated. In a future version, numeric_only will default to False. Either specify numeric_only or select only columns which should be valid for the function.
  timeseries = timeseries.groupby('day', as_index=False).agg('sum')
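
The outlier-day rule can be pictured with a short pandas sketch: count transactions per day, then flag days whose count lies more than `num_sds` standard deviations from the mean. This is a sketch of the idea only, not the code in `datastore.py`; the helper `find_outlier_days` and the daily counts below are made up for illustration.

```python
import pandas as pd


def find_outlier_days(daily_counts: pd.Series, num_sds: float) -> list:
    """Return the days whose volume is more than `num_sds` SDs from the mean."""
    mean = daily_counts.mean()
    sd = daily_counts.std()  # sample standard deviation (ddof=1)
    is_outlier = (daily_counts - mean).abs() > num_sds * sd
    return list(daily_counts.index[is_outlier])


# Hypothetical transaction counts for ten consecutive days, with one spike.
daily_counts = pd.Series(
    [100, 102, 98, 101, 99, 100, 300, 100, 101, 99],
    index=pd.date_range("2020-01-01", periods=10, freq="D"),
)

outliers = find_outlier_days(daily_counts, num_sds=2)
print([day.strftime("%Y-%m-%d") for day in outliers])
```

Here the mean is 120 and the sample standard deviation is roughly 63, so only the spike on the seventh day exceeds the two-SD band.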

Produce summary statistics and plots.

print(featurizer.diagnostic_statistics())
{'CDR': {'Days': 59, 'Transactions': 91299, 'Subscribers': 981, 'Recipients': 981}, 'Recharges': {'Days': 59, 'Transactions': 9315, 'Subscribers': 783}, 'Mobile Data': {'Days': 59, 'Transactions': 9388, 'Subscribers': 592}, 'Mobile Money': {'Days': 59, 'Transactions': 5371, 'Subscribers': 682, 'Recipients': 682}}
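
The nested dictionary printed above is easier to scan as a table. The following is a minimal sketch, assuming pandas is available; the `stats` literal copies the output shown above, and `summary` is just an illustrative name. In a live session, `stats` could presumably be assigned from `featurizer.diagnostic_statistics()` instead.

```python
import pandas as pd

# Copied from the printed output above.
stats = {
    'CDR': {'Days': 59, 'Transactions': 91299, 'Subscribers': 981, 'Recipients': 981},
    'Recharges': {'Days': 59, 'Transactions': 9315, 'Subscribers': 783},
    'Mobile Data': {'Days': 59, 'Transactions': 9388, 'Subscribers': 592},
    'Mobile Money': {'Days': 59, 'Transactions': 5371, 'Subscribers': 682, 'Recipients': 682},
}

# One row per dataset, one column per statistic; statistics that a dataset
# does not report (e.g. recipients for recharges) become NaN.
summary = pd.DataFrame.from_dict(stats, orient='index')
print(summary)
```
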
featurizer.diagnostic_plots()
[Output: 8 diagnostic plots]

Featurize the data.

featurizer.cdr_features_spark()
featurizer.international_features()
featurizer.location_features()
featurizer.recharges_features()
featurizer.mobiledata_features()
featurizer.mobilemoney_features()

featurizer.all_features()

Now we read the features back in with pandas to see what the table looks like. This works because our synthetic dataset is small, but with a large number of subscribers such files can be too large to fit in memory; Cider uses PySpark to manage large datasets. Another option for working with large datasets is Dask.

import pandas as pd
path_to_all_features = datastore.cfg.path.working.directory_path / 'featurizer' / 'datasets' / 'features.csv'
pd.read_csv(path_to_all_features).head()
name active_days_allweek_allday active_days_allweek_day active_days_allweek_night active_days_weekday_allday active_days_weekday_day active_days_weekday_night active_days_weekend_allday active_days_weekend_day active_days_weekend_night ... mobilemoney_outgoing_p2p_amount_min mobilemoney_outgoing_p2p_amount_max mobilemoney_outgoing_p2p_balance_before_mean mobilemoney_outgoing_p2p_balance_before_min mobilemoney_outgoing_p2p_balance_before_max mobilemoney_outgoing_p2p_balance_after_mean mobilemoney_outgoing_p2p_balance_after_min mobilemoney_outgoing_p2p_balance_after_max mobilemoney_outgoing_p2p_txns mobilemoney_outgoing_p2p_contacts
0 AASxnSdfla 56 46 50 40 31 35 16 15 15 ... 31.799332 58.921886 187.554639 138.324660 228.47707 138.780946 106.525330 180.20863 6.0 6.0
1 ANnpaBqoKb 55 45 47 40 36 33 15 9 14 ... 37.976850 57.337646 182.190155 158.963560 223.29039 137.548449 106.978870 185.31354 3.0 3.0
2 AWrbWzWkYp 55 45 46 39 32 33 16 13 13 ... 53.090126 58.511192 202.108686 108.117546 272.34055 146.345222 52.118343 219.25041 4.0 4.0
3 AaJProAtlR 56 48 48 41 36 34 15 12 14 ... 32.858997 68.323395 235.638050 194.502760 287.86615 188.334709 143.178710 242.73692 4.0 4.0
4 ApDvhzOUJr 56 39 51 40 28 36 16 11 15 ... 44.307076 52.375270 192.653709 185.676930 199.63050 144.312531 133.301650 155.32341 2.0 2.0

5 rows × 1122 columns
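
If the feature file is too large to load in one call, plain pandas can still help by reading only the columns of interest (`usecols`) or by streaming the file in pieces (`chunksize`). The sketch below writes a tiny stand-in CSV to a temporary directory so that it runs on its own; the file contents and the subscriber names are toy values, and only the two feature column names are taken from the table above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Tiny stand-in for features.csv (toy values, hypothetical subscriber names).
csv_path = Path(tempfile.mkdtemp()) / "features.csv"
pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "active_days_allweek_allday": [56, 55, 55, 56, 56],
    "active_days_allweek_day": [46, 45, 45, 48, 39],
}).to_csv(csv_path, index=False)

# Option 1: load only the columns that are needed.
subset = pd.read_csv(csv_path, usecols=["name", "active_days_allweek_allday"])

# Option 2: stream the file in chunks and keep only running totals in memory.
total = 0
n_rows = 0
for chunk in pd.read_csv(csv_path, usecols=["active_days_allweek_allday"], chunksize=2):
    total += chunk["active_days_allweek_allday"].sum()
    n_rows += len(chunk)

mean_active_days = total / n_rows
print(n_rows, mean_active_days)
```

Chunked reading suits statistics that can be accumulated incrementally (sums, counts, minima, maxima); operations that need the whole table at once are a better fit for PySpark or Dask.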

Plot the distributions of some of the features.

featurizer.feature_plots()
[Output: 8 feature distribution plots]
---------------------------------------------------------------------------
AnalysisException                         Traceback (most recent call last)
Cell In[51], line 1
----> 1 featurizer.feature_plots()

File ~/Documents/gpl/cider/cider/featurizer.py:739, in Featurizer.feature_plots(self, read_from_disk)
    737 if subscribers is not None:
    738     users = self.features['cdr'].join(subscribers, how='inner', on='name')
--> 739     slice = users.select(['name', features[a]]).toPandas()
    740     slice['slice_name'] = slice_name
    741     boxplot.append(slice)

File ~/miniconda3/envs/cider/lib/python3.8/site-packages/pyspark/sql/dataframe.py:2023, in DataFrame.select(self, *cols)
   2002 def select(self, *cols: "ColumnOrName") -> "DataFrame":  # type: ignore[misc]
   2003     """Projects a set of expressions and returns a new :class:`DataFrame`.
   2004 
   2005     .. versionadded:: 1.3.0
   (...)
   2021     [Row(name='Alice', age=12), Row(name='Bob', age=15)]
   2022     """
-> 2023     jdf = self._jdf.select(self._jcols(*cols))
   2024     return DataFrame(jdf, self.sparkSession)

File ~/miniconda3/envs/cider/lib/python3.8/site-packages/py4j/java_gateway.py:1321, in JavaMember.__call__(self, *args)
   1315 command = proto.CALL_COMMAND_NAME +\
   1316     self.command_header +\
   1317     args_command +\
   1318     proto.END_COMMAND_PART
   1320 answer = self.gateway_client.send_command(command)
-> 1321 return_value = get_return_value(
   1322     answer, self.gateway_client, self.target_id, self.name)
   1324 for temp_arg in temp_args:
   1325     temp_arg._detach()

File ~/miniconda3/envs/cider/lib/python3.8/site-packages/pyspark/sql/utils.py:196, in capture_sql_exception.<locals>.deco(*a, **kw)
    192 converted = convert_exception(e.java_exception)
    193 if not isinstance(converted, UnknownException):
    194     # Hide where the exception came from that shows a non-Pythonic
    195     # JVM exception message.
--> 196     raise converted from None
    197 else:
    198     raise

AnalysisException: Column 'cdr_active_days__allweek__day__callandtext' does not exist. Did you mean one of the following? [active_days_allweek_allday, active_days_weekday_allday, active_days_allweek_day, active_days_allweek_night, active_days_weekday_night, active_days_weekend_allday, active_days_weekday_day, active_days_weekend_night, call_duration_allweek_day_call_max, call_duration_allweek_day_call_mean, call_duration_allweek_day_call_std, active_days_weekend_day, call_duration_allweek_day_call_median, call_duration_allweek_day_call_min, interevent_time_allweek_day_call_max, interevent_time_allweek_day_call_mean, interevent_time_allweek_day_call_std, balance_of_contacts_allweek_day_call_max, balance_of_contacts_allweek_day_call_mean, balance_of_contacts_allweek_day_call_std, call_duration_allweek_allday_call_max, call_duration_allweek_allday_call_mean, call_duration_allweek_allday_call_std, call_duration_allweek_day_call_kurtosis, call_duration_allweek_day_call_skewness, interevent_time_allweek_day_call_median, interevent_time_allweek_day_call_min, balance_of_contacts_allweek_day_call_median, balance_of_contacts_allweek_day_call_min, call_duration_allweek_allday_call_median, call_duration_allweek_allday_call_min, call_duration_allweek_night_call_max, call_duration_allweek_night_call_mean, call_duration_allweek_night_call_std, interevent_time_allweek_allday_call_max, interevent_time_allweek_allday_call_mean, interevent_time_allweek_allday_call_std, interevent_time_allweek_day_call_kurtosis, interevent_time_allweek_day_call_skewness, number_of_contacts_allweek_day_call, number_of_contacts_allweek_day_text, balance_of_contacts_allweek_allday_call_max, balance_of_contacts_allweek_allday_call_mean, balance_of_contacts_allweek_allday_call_std, balance_of_contacts_allweek_day_call_kurtosis, balance_of_contacts_allweek_day_call_skewness, call_duration_allweek_allday_call_kurtosis, call_duration_allweek_allday_call_skewness, call_duration_allweek_night_call_median, 
call_duration_allweek_night_call_min, call_duration_weekday_day_call_max, call_duration_weekday_day_call_mean, call_duration_weekday_day_call_std, call_duration_weekend_day_call_max, call_duration_weekend_day_call_mean, call_duration_weekend_day_call_std, entropy_of_contacts_allweek_day_call, entropy_of_contacts_allweek_day_text, entropy_of_contacts_weekday_allday_text, interactions_per_contact_allweek_day_call_max, interactions_per_contact_allweek_day_call_mean, interactions_per_contact_allweek_day_call_std, interevent_time_allweek_allday_call_median, interevent_time_allweek_allday_call_min, interevent_time_allweek_night_call_max, interevent_time_allweek_night_call_mean, interevent_time_allweek_night_call_std, number_of_contacts_allweek_allday_text, number_of_contacts_weekday_allday_text, percent_at_home_allweek_allday, percent_at_home_weekday_allday, percent_nocturnal_weekday_call, percent_nocturnal_weekday_text, response_delay_text_allweek_day_max, response_delay_text_allweek_day_mean, response_delay_text_allweek_day_median, response_delay_text_allweek_day_skewness, response_delay_text_weekday_allday_max, response_delay_text_weekday_allday_mean, response_rate_text_weekday_allday, balance_of_contacts_allweek_allday_call_median, balance_of_contacts_allweek_allday_call_min, balance_of_contacts_allweek_night_call_max, balance_of_contacts_allweek_night_call_mean, balance_of_contacts_allweek_night_call_std, call_duration_allweek_night_call_kurtosis, call_duration_allweek_night_call_skewness, call_duration_weekday_day_call_median, call_duration_weekday_day_call_min, call_duration_weekend_day_call_median, call_duration_weekend_day_call_min, entropy_of_contacts_allweek_allday_text, frequent_antennas_weekday_allday, interactions_per_contact_allweek_day_call_median, interactions_per_contact_allweek_day_call_min, interevent_time_allweek_allday_call_kurtosis, interevent_time_allweek_allday_call_skewness, interevent_time_allweek_day_text_max, 
interevent_time_allweek_day_text_mean, interevent_time_allweek_day_text_median, interevent_time_allweek_day_text_std, interevent_time_allweek_night_call_median, interevent_time_allweek_night_call_min, number_of_contacts_allweek_allday_call, percent_at_home_allweek_day, response_delay_text_allweek_day_min, response_delay_text_allweek_day_std, response_delay_text_weekday_allday_min, response_delay_text_weekday_allday_std, balance_of_contacts_allweek_allday_call_kurtosis, balance_of_contacts_allweek_allday_call_skewness, balance_of_contacts_allweek_day_text_max, balance_of_contacts_allweek_day_text_mean, balance_of_contacts_allweek_day_text_median, balance_of_contacts_allweek_day_text_std, balance_of_contacts_allweek_night_call_median, balance_of_contacts_allweek_night_call_min, balance_of_contacts_weekday_day_call_max, balance_of_contacts_weekday_day_call_mean, balance_of_contacts_weekday_day_call_std, balance_of_contacts_weekend_day_call_max, balance_of_contacts_weekend_day_call_mean, balance_of_contacts_weekend_day_call_std, call_duration_weekday_allday_call_max, call_duration_weekday_allday_call_mean, call_duration_weekday_allday_call_std, call_duration_weekday_day_call_kurtosis, call_duration_weekday_day_call_skewness, call_duration_weekend_day_call_kurtosis, call_duration_weekend_day_call_skewness, entropy_of_antennas_weekday_allday, entropy_of_contacts_allweek_allday_call, entropy_of_contacts_weekday_day_text, entropy_of_contacts_weekend_allday_text, frequent_antennas_allweek_allday, interactions_per_contact_allweek_allday_call_max, interactions_per_contact_allweek_allday_call_mean, interactions_per_contact_allweek_allday_call_std, interactions_per_contact_allweek_day_call_kurtosis, interactions_per_contact_allweek_day_call_skewness, interevent_time_allweek_day_text_kurtosis, interevent_time_allweek_day_text_min, interevent_time_allweek_day_text_skewness, interevent_time_allweek_night_call_kurtosis, interevent_time_allweek_night_call_skewness, 
interevent_time_weekday_day_call_max, interevent_time_weekday_day_call_mean, interevent_time_weekday_day_call_std, interevent_time_weekend_day_call_max, interevent_time_weekend_day_call_mean, interevent_time_weekend_day_call_std, number_of_antennas_allweek_allday, number_of_antennas_weekday_allday, number_of_contacts_allweek_night_call, number_of_contacts_allweek_night_text, number_of_contacts_weekday_day_text, number_of_contacts_weekend_allday_text, number_of_interactions_in_allweek_day_call, number_of_interactions_in_allweek_day_text, number_of_interactions_in_weekday_allday_text, number_of_interactions_out_weekday_allday_text, percent_at_home_allweek_night, percent_at_home_weekday_night, percent_at_home_weekend_allday, percent_nocturnal_allweek_call, percent_nocturnal_allweek_text, percent_nocturnal_weekend_call, percent_nocturnal_weekend_text, radius_of_gyration_weekday_allday, response_delay_text_allweek_allday_max, response_delay_text_allweek_allday_mean, response_delay_text_allweek_allday_median, response_delay_text_allweek_allday_skewness, response_delay_text_allweek_day_kurtosis, response_delay_text_weekday_allday_median, response_delay_text_weekday_day_max, response_delay_text_weekday_day_mean, response_delay_text_weekday_day_median, response_delay_text_weekday_day_std, response_delay_text_weekend_allday_max, response_delay_text_weekend_allday_mean, response_rate_text_allweek_allday, response_rate_text_weekday_night, response_rate_text_weekend_allday, balance_of_contacts_allweek_day_text_kurtosis, balance_of_contacts_allweek_day_text_min, balance_of_contacts_allweek_day_text_skewness, balance_of_contacts_allweek_night_call_kurtosis, balance_of_contacts_allweek_night_call_skewness, balance_of_contacts_weekday_allday_text_max, balance_of_contacts_weekday_allday_text_min, balance_of_contacts_weekday_allday_text_std, balance_of_contacts_weekday_day_call_median, balance_of_contacts_weekday_day_call_min, balance_of_contacts_weekend_day_call_median, 
balance_of_contacts_weekend_day_call_min, call_duration_weekday_allday_call_median, call_duration_weekday_allday_call_min, call_duration_weekday_night_call_max, call_duration_weekday_night_call_mean, call_duration_weekday_night_call_std, call_duration_weekend_allday_call_max, call_duration_weekend_allday_call_mean, call_duration_weekend_allday_call_std, entropy_of_antennas_allweek_allday, entropy_of_contacts_allweek_night_call, entropy_of_contacts_allweek_night_text, entropy_of_contacts_weekday_night_text, frequent_antennas_allweek_day, frequent_antennas_weekday_night, frequent_antennas_weekend_allday, interactions_per_contact_allweek_allday_call_median, interactions_per_contact_allweek_allday_call_min, interactions_per_contact_allweek_night_call_max, interactions_per_contact_allweek_night_call_mean, interactions_per_contact_allweek_night_call_std, interactions_per_contact_weekday_day_call_max, interactions_per_contact_weekday_day_call_mean, interactions_per_contact_weekday_day_call_std, interactions_per_contact_weekend_day_call_max, interactions_per_contact_weekend_day_call_mean, interactions_per_contact_weekend_day_call_std, interevent_time_allweek_allday_text_max, interevent_time_allweek_allday_text_mean, interevent_time_allweek_allday_text_median, interevent_time_allweek_allday_text_min, interevent_time_allweek_allday_text_std, interevent_time_weekday_allday_text_max, interevent_time_weekday_allday_text_min, interevent_time_weekday_allday_text_std, interevent_time_weekday_day_call_median, interevent_time_weekday_day_call_min, interevent_time_weekend_day_call_median, interevent_time_weekend_day_call_min, number_of_antennas_allweek_day, number_of_contacts_weekday_night_text, number_of_interactions_alldir_weekday_allday_text, number_of_interactions_in_allweek_allday_text, number_of_interactions_out_allweek_day_call, number_of_interactions_out_allweek_day_text, percent_at_home_weekday_day, percent_pareto_interactions_weekday_allday_text, 
radius_of_gyration_allweek_allday, response_delay_text_allweek_allday_min, response_delay_text_allweek_allday_std, response_delay_text_allweek_night_max, response_delay_text_allweek_night_mean, response_delay_text_allweek_night_median, response_delay_text_allweek_night_skewness, response_delay_text_weekday_day_kurtosis, response_delay_text_weekday_day_min, response_delay_text_weekday_day_skewness, response_delay_text_weekday_night_max, response_delay_text_weekday_night_mean, response_delay_text_weekday_night_std, response_delay_text_weekend_allday_min, response_delay_text_weekend_allday_std, response_delay_text_weekend_day_max, response_delay_text_weekend_day_mean, response_delay_text_weekend_day_median, response_delay_text_weekend_day_skewness, response_rate_text_allweek_day, response_rate_text_weekday_day, balance_of_contacts_allweek_allday_text_max, balance_of_contacts_allweek_allday_text_mean, balance_of_contacts_allweek_allday_text_median, balance_of_contacts_allweek_allday_text_min, balance_of_contacts_allweek_allday_text_std, balance_of_contacts_weekday_allday_text_mean, balance_of_contacts_weekday_day_call_kurtosis, balance_of_contacts_weekday_day_call_skewness, balance_of_contacts_weekday_day_text_std, balance_of_contacts_weekend_day_call_kurtosis, balance_of_contacts_weekend_day_call_skewness, call_duration_weekday_allday_call_kurtosis, call_duration_weekday_allday_call_skewness, call_duration_weekday_night_call_median, call_duration_weekday_night_call_min, call_duration_weekend_allday_call_median, call_duration_weekend_allday_call_min, call_duration_weekend_night_call_max, call_duration_weekend_night_call_mean, call_duration_weekend_night_call_std, entropy_of_antennas_allweek_day, entropy_of_antennas_weekday_night, entropy_of_antennas_weekend_allday, entropy_of_contacts_weekday_allday_call, entropy_of_contacts_weekday_day_call, entropy_of_contacts_weekend_day_call, entropy_of_contacts_weekend_day_text, frequent_antennas_allweek_night, 
frequent_antennas_weekday_day, interactions_per_contact_allweek_allday_call_kurtosis, interactions_per_contact_allweek_allday_call_skewness, interactions_per_contact_allweek_day_text_max, interactions_per_contact_allweek_day_text_mean, interactions_per_contact_allweek_day_text_median, interactions_per_contact_allweek_day_text_std, interactions_per_contact_allweek_night_call_median, interactions_per_contact_allweek_night_call_min, interactions_per_contact_weekday_allday_text_max, interactions_per_contact_weekday_allday_text_min, interactions_per_contact_weekday_allday_text_std, interactions_per_contact_weekday_day_call_median, interactions_per_contact_weekday_day_call_min, interactions_per_contact_weekend_day_call_median, interactions_per_contact_weekend_day_call_min, interevent_time_allweek_allday_text_kurtosis, interevent_time_allweek_allday_text_skewness, interevent_time_allweek_night_text_max, interevent_time_allweek_night_text_mean, interevent_time_allweek_night_text_std, interevent_time_weekday_allday_call_max, interevent_time_weekday_allday_call_mean, interevent_time_weekday_allday_call_std, interevent_time_weekday_allday_text_mean, interevent_time_weekday_day_call_kurtosis, interevent_time_weekday_day_call_skewness, interevent_time_weekday_day_text_std, interevent_time_weekend_day_call_kurtosis, interevent_time_weekend_day_call_skewness, number_of_antennas_allweek_night, number_of_antennas_weekday_night, number_of_antennas_weekend_allday, number_of_contacts_weekday_allday_call, number_of_contacts_weekday_day_call, number_of_contacts_weekend_day_call, number_of_contacts_weekend_day_text, number_of_interactions_in_allweek_allday_call, number_of_interactions_in_weekday_day_text, number_of_interactions_in_weekend_allday_text, number_of_interactions_out_allweek_allday_text, number_of_interactions_out_weekday_day_text, number_of_interactions_out_weekend_allday_text, percent_at_home_weekend_night, percent_pareto_durations_weekday_allday, 
percent_pareto_interactions_allweek_day_call, percent_pareto_interactions_allweek_day_text, radius_of_gyration_allweek_day, radius_of_gyration_weekday_night, radius_of_gyration_weekend_allday, response_delay_text_allweek_allday_kurtosis, response_delay_text_allweek_night_min, response_delay_text_allweek_night_std, response_delay_text_weekday_allday_kurtosis, response_delay_text_weekday_allday_skewness, response_delay_text_weekday_night_median, response_delay_text_weekday_night_min, response_delay_text_weekend_allday_median, response_delay_text_weekend_day_min, response_delay_text_weekend_day_std, response_rate_text_allweek_night, response_rate_text_weekend_night, balance_of_contacts_allweek_allday_text_kurtosis, balance_of_contacts_allweek_allday_text_skewness, balance_of_contacts_allweek_night_text_max, balance_of_contacts_allweek_night_text_mean, balance_of_contacts_allweek_night_text_std, balance_of_contacts_weekday_allday_call_max, balance_of_contacts_weekday_allday_call_mean, balance_of_contacts_weekday_allday_call_std, balance_of_contacts_weekday_day_text_max, balance_of_contacts_weekday_day_text_mean, balance_of_contacts_weekday_day_text_min, balance_of_contacts_weekday_night_call_max, balance_of_contacts_weekday_night_call_mean, balance_of_contacts_weekday_night_call_std, balance_of_contacts_weekend_allday_call_max, balance_of_contacts_weekend_allday_call_mean, balance_of_contacts_weekend_allday_call_std, balance_of_contacts_weekend_allday_text_max, balance_of_contacts_weekend_allday_text_min, balance_of_contacts_weekend_allday_text_std, call_duration_weekday_night_call_kurtosis, call_duration_weekday_night_call_skewness, call_duration_weekend_allday_call_kurtosis, call_duration_weekend_allday_call_skewness, call_duration_weekend_night_call_median, call_duration_weekend_night_call_min, entropy_of_antennas_allweek_night, entropy_of_antennas_weekday_day, entropy_of_contacts_weekend_night_text, frequent_antennas_weekend_night, 
interactions_per_contact_allweek_day_text_kurtosis, interactions_per_contact_allweek_day_text_min, interactions_per_contact_allweek_day_text_skewness, interactions_per_contact_allweek_night_call_kurtosis, interactions_per_contact_allweek_night_call_skewness, interactions_per_contact_weekday_allday_text_mean, interactions_per_contact_weekday_day_call_kurtosis, interactions_per_contact_weekday_day_call_skewness, interactions_per_contact_weekday_day_text_std, interactions_per_contact_weekend_day_call_kurtosis, interactions_per_contact_weekend_day_call_skewness, interevent_time_allweek_night_text_median, interevent_time_allweek_night_text_min, interevent_time_weekday_allday_call_median, interevent_time_weekday_allday_call_min, interevent_time_weekday_day_text_max, interevent_time_weekday_day_text_mean, interevent_time_weekday_day_text_min, interevent_time_weekday_night_call_max, interevent_time_weekday_night_call_mean, interevent_time_weekday_night_call_std, interevent_time_weekend_allday_call_max, interevent_time_weekend_allday_call_mean, interevent_time_weekend_allday_call_std, interevent_time_weekend_allday_text_max, interevent_time_weekend_allday_text_min, interevent_time_weekend_allday_text_std, number_of_antennas_weekday_day, number_of_contacts_weekend_night_text, number_of_interactions_alldir_allweek_day_call, number_of_interactions_alldir_allweek_day_text, number_of_interactions_alldir_weekday_day_text, number_of_interactions_alldir_weekend_allday_text, number_of_interactions_in_allweek_night_call, number_of_interactions_in_allweek_night_text, number_of_interactions_in_weekday_night_text, number_of_interactions_out_allweek_allday_call, number_of_interactions_out_weekday_night_text, percent_at_home_weekend_day, percent_pareto_durations_allweek_allday, percent_pareto_interactions_allweek_allday_text, percent_pareto_interactions_weekday_day_text, percent_pareto_interactions_weekend_allday_text, radius_of_gyration_allweek_night, radius_of_gyration_weekday_day, 
response_delay_text_allweek_night_kurtosis, response_delay_text_weekday_night_kurtosis, response_delay_text_weekday_night_skewness, response_delay_text_weekend_day_kurtosis, response_delay_text_weekend_night_max, response_delay_text_weekend_night_mean, response_delay_text_weekend_night_std, response_rate_text_weekend_day, balance_of_contacts_allweek_night_text_median, balance_of_contacts_allweek_night_text_min, balance_of_contacts_weekday_allday_call_median, balance_of_contacts_weekday_allday_call_min, balance_of_contacts_weekday_allday_text_median, balance_of_contacts_weekday_day_text_median, balance_of_contacts_weekday_night_call_median, balance_of_contacts_weekday_night_call_min, balance_of_contacts_weekday_night_text_max, balance_of_contacts_weekday_night_text_min, balance_of_contacts_weekday_night_text_std, balance_of_contacts_weekend_allday_call_median, balance_of_contacts_weekend_allday_call_min, balance_of_contacts_weekend_allday_text_mean, balance_of_contacts_weekend_day_text_max, balance_of_contacts_weekend_day_text_mean, balance_of_contacts_weekend_day_text_median, balance_of_contacts_weekend_day_text_std, balance_of_contacts_weekend_night_call_max, balance_of_contacts_weekend_night_call_mean, balance_of_contacts_weekend_night_call_std, call_duration_weekend_night_call_kurtosis, call_duration_weekend_night_call_skewness, entropy_of_antennas_weekend_night, entropy_of_contacts_weekend_allday_call, frequent_antennas_weekend_day, interactions_per_contact_allweek_allday_text_max, interactions_per_contact_allweek_allday_text_mean, interactions_per_contact_allweek_allday_text_median, interactions_per_contact_allweek_allday_text_min, interactions_per_contact_allweek_allday_text_std, interactions_per_contact_weekday_allday_call_max, interactions_per_contact_weekday_allday_call_mean, interactions_per_contact_weekday_allday_call_std, interactions_per_contact_weekday_day_text_max, interactions_per_contact_weekday_day_text_mean, 
interactions_per_contact_weekday_day_text_min, interactions_per_contact_weekday_night_call_max, interactions_per_contact_weekday_night_call_mean, interactions_per_contact_weekday_night_call_std, interactions_per_contact_weekend_allday_call_max, interactions_per_contact_weekend_allday_call_mean, interactions_per_contact_weekend_allday_call_std, interactions_per_contact_weekend_allday_text_max, interactions_per_contact_weekend_allday_text_min, interactions_per_contact_weekend_allday_text_std, interevent_time_allweek_night_text_kurtosis, interevent_time_allweek_night_text_skewness, interevent_time_weekday_allday_call_kurtosis, interevent_time_weekday_allday_call_skewness, interevent_time_weekday_allday_text_median, interevent_time_weekday_day_text_median, interevent_time_weekday_night_call_median, interevent_time_weekday_night_call_min, interevent_time_weekday_night_text_max, interevent_time_weekday_night_text_min, interevent_time_weekday_night_text_std, interevent_time_weekend_allday_call_median, interevent_time_weekend_allday_call_min, interevent_time_weekend_allday_text_mean, interevent_time_weekend_day_text_max, interevent_time_weekend_day_text_mean, interevent_time_weekend_day_text_median, interevent_time_weekend_day_text_std, interevent_time_weekend_night_call_max, interevent_time_weekend_night_call_mean, interevent_time_weekend_night_call_std, number_of_antennas_weekend_night, number_of_contacts_weekend_allday_call, number_of_interactions_alldir_allweek_allday_text, number_of_interactions_alldir_weekday_night_text, number_of_interactions_in_weekday_allday_call, number_of_interactions_in_weekday_day_call, number_of_interactions_in_weekend_day_call, number_of_interactions_in_weekend_day_text, number_of_interactions_out_allweek_night_call, number_of_interactions_out_allweek_night_text, number_of_interactions_out_weekday_allday_call, number_of_interactions_out_weekday_day_call, number_of_interactions_out_weekend_day_call, 
number_of_interactions_out_weekend_day_text, percent_pareto_durations_allweek_day, percent_pareto_durations_weekday_night, percent_pareto_durations_weekend_allday, percent_pareto_interactions_allweek_allday_call, percent_pareto_interactions_weekday_night_text, radius_of_gyration_weekend_night, response_delay_text_weekend_allday_kurtosis, response_delay_text_weekend_allday_skewness, response_delay_text_weekend_night_median, response_delay_text_weekend_night_min, response_delay_text_weekend_night_skewness, balance_of_contacts_allweek_night_text_kurtosis, balance_of_contacts_allweek_night_text_skewness, balance_of_contacts_weekday_allday_call_kurtosis, balance_of_contacts_weekday_allday_call_skewness, balance_of_contacts_weekday_day_text_kurtosis, balance_of_contacts_weekday_day_text_skewness, balance_of_contacts_weekday_night_call_kurtosis, balance_of_contacts_weekday_night_call_skewness, balance_of_contacts_weekday_night_text_mean, balance_of_contacts_weekend_allday_call_kurtosis, balance_of_contacts_weekend_allday_call_skewness, balance_of_contacts_weekend_day_text_kurtosis, balance_of_contacts_weekend_day_text_min, balance_of_contacts_weekend_day_text_skewness, balance_of_contacts_weekend_night_call_median, balance_of_contacts_weekend_night_call_min, entropy_of_antennas_weekend_day, entropy_of_contacts_weekday_night_call, interactions_per_contact_allweek_allday_text_kurtosis, interactions_per_contact_allweek_allday_text_skewness, interactions_per_contact_allweek_night_text_max, interactions_per_contact_allweek_night_text_mean, interactions_per_contact_allweek_night_text_std, interactions_per_contact_weekday_allday_call_median, interactions_per_contact_weekday_allday_call_min, interactions_per_contact_weekday_allday_text_median, interactions_per_contact_weekday_day_text_median, interactions_per_contact_weekday_night_call_median, interactions_per_contact_weekday_night_call_min, interactions_per_contact_weekday_night_text_max, 
interactions_per_contact_weekday_night_text_min, interactions_per_contact_weekday_night_text_std, interactions_per_contact_weekend_allday_call_median, interactions_per_contact_weekend_allday_call_min, interactions_per_contact_weekend_allday_text_mean, interactions_per_contact_weekend_day_text_max, interactions_per_contact_weekend_day_text_mean, interactions_per_contact_weekend_day_text_median, interactions_per_contact_weekend_day_text_std, interactions_per_contact_weekend_night_call_max, interactions_per_contact_weekend_night_call_mean, interactions_per_contact_weekend_night_call_std, interevent_time_weekday_day_text_kurtosis, interevent_time_weekday_day_text_skewness, interevent_time_weekday_night_call_kurtosis, interevent_time_weekday_night_call_skewness, interevent_time_weekday_night_text_mean, interevent_time_weekend_allday_call_kurtosis, interevent_time_weekend_allday_call_skewness, interevent_time_weekend_day_text_kurtosis, interevent_time_weekend_day_text_min, interevent_time_weekend_day_text_skewness, interevent_time_weekend_night_call_median, interevent_time_weekend_night_call_min, number_of_antennas_weekend_day, number_of_contacts_weekday_night_call, number_of_interactions_alldir_allweek_allday_call, number_of_interactions_alldir_weekday_allday_call, number_of_interactions_alldir_weekday_day_call, number_of_interactions_alldir_weekend_day_call, number_of_interactions_alldir_weekend_day_text, number_of_interactions_in_weekend_night_text, number_of_interactions_out_weekend_night_text, percent_pareto_durations_allweek_night, percent_pareto_durations_weekday_day, percent_pareto_interactions_allweek_night_call, percent_pareto_interactions_allweek_night_text, percent_pareto_interactions_weekday_allday_call, percent_pareto_interactions_weekday_day_call, percent_pareto_interactions_weekend_day_call, percent_pareto_interactions_weekend_day_text, radius_of_gyration_weekend_day, response_delay_text_weekend_night_kurtosis, 
balance_of_contacts_weekday_allday_text_kurtosis, balance_of_contacts_weekday_allday_text_skewness, balance_of_contacts_weekend_allday_text_median, balance_of_contacts_weekend_night_call_kurtosis, balance_of_contacts_weekend_night_call_skewness, balance_of_contacts_weekend_night_text_max, balance_of_contacts_weekend_night_text_min, balance_of_contacts_weekend_night_text_std, entropy_of_contacts_weekend_night_call, interactions_per_contact_allweek_night_text_median, interactions_per_contact_allweek_night_text_min, interactions_per_contact_weekday_allday_call_kurtosis, interactions_per_contact_weekday_allday_call_skewness, interactions_per_contact_weekday_day_text_kurtosis, interactions_per_contact_weekday_day_text_skewness, interactions_per_contact_weekday_night_call_kurtosis, interactions_per_contact_weekday_night_call_skewness, interactions_per_contact_weekday_night_text_mean, interactions_per_contact_weekend_allday_call_kurtosis, interactions_per_contact_weekend_allday_call_skewness, interactions_per_contact_weekend_day_text_kurtosis, interactions_per_contact_weekend_day_text_min, interactions_per_contact_weekend_day_text_skewness, interactions_per_contact_weekend_night_call_median, interactions_per_contact_weekend_night_call_min, interevent_time_weekday_allday_text_kurtosis, interevent_time_weekday_allday_text_skewness, interevent_time_weekend_allday_text_median, interevent_time_weekend_night_call_kurtosis, interevent_time_weekend_night_call_skewness, interevent_time_weekend_night_text_max, interevent_time_weekend_night_text_min, interevent_time_weekend_night_text_std, number_of_contacts_weekend_night_call, number_of_interactions_alldir_allweek_night_call, number_of_interactions_alldir_allweek_night_text, number_of_interactions_alldir_weekend_night_text, number_of_interactions_in_weekend_allday_call, number_of_interactions_out_weekend_allday_call, percent_initiated_conversations_allweek_day, percent_initiated_conversations_weekend_day, 
percent_initiated_interactions_allweek_day, percent_initiated_interactions_weekday_allday, percent_initiated_interactions_weekend_day, percent_pareto_durations_weekend_night, percent_pareto_interactions_weekend_night_text, balance_of_contacts_weekday_night_text_median, balance_of_contacts_weekend_night_text_mean, interactions_per_contact_allweek_night_text_kurtosis, interactions_per_contact_allweek_night_text_skewness, interactions_per_contact_weekday_allday_text_kurtosis, interactions_per_contact_weekday_allday_text_skewness, interactions_per_contact_weekend_allday_text_median, interactions_per_contact_weekend_night_call_kurtosis, interactions_per_contact_weekend_night_call_skewness, interactions_per_contact_weekend_night_text_max, interactions_per_contact_weekend_night_text_min, interactions_per_contact_weekend_night_text_std, interevent_time_weekday_night_text_median, interevent_time_weekend_night_text_mean, number_of_interactions_alldir_weekend_allday_call, number_of_interactions_in_weekday_night_call, number_of_interactions_out_weekday_night_call, percent_initiated_conversations_allweek_night, percent_initiated_conversations_weekday_allday, percent_initiated_conversations_weekend_night, percent_initiated_interactions_allweek_night, percent_initiated_interactions_weekend_night, percent_pareto_durations_weekend_day, percent_pareto_interactions_weekend_allday_call, balance_of_contacts_weekend_allday_text_kurtosis, balance_of_contacts_weekend_allday_text_skewness, interactions_per_contact_weekday_night_text_median, interactions_per_contact_weekend_night_text_mean, interevent_time_weekend_allday_text_kurtosis, interevent_time_weekend_allday_text_skewness, number_of_interactions_alldir_weekday_night_call, number_of_interactions_in_weekend_night_call, number_of_interactions_out_weekend_night_call, percent_initiated_conversations_weekday_day, percent_initiated_conversations_weekday_night, percent_initiated_interactions_allweek_allday, 
percent_initiated_interactions_weekday_day, percent_initiated_interactions_weekday_night, percent_initiated_interactions_weekend_allday, percent_pareto_interactions_weekday_night_call, balance_of_contacts_weekday_night_text_kurtosis, balance_of_contacts_weekday_night_text_skewness, balance_of_contacts_weekend_night_text_median, interactions_per_contact_weekend_allday_text_kurtosis, interactions_per_contact_weekend_allday_text_skewness, interevent_time_weekday_night_text_kurtosis, interevent_time_weekday_night_text_skewness, interevent_time_weekend_night_text_median, number_of_interactions_alldir_weekend_night_call, percent_initiated_conversations_allweek_allday, percent_initiated_conversations_weekend_allday, percent_pareto_interactions_weekend_night_call, balance_of_contacts_weekend_night_text_kurtosis, balance_of_contacts_weekend_night_text_skewness, interactions_per_contact_weekday_night_text_kurtosis, interactions_per_contact_weekday_night_text_skewness, interactions_per_contact_weekend_night_text_median, interevent_time_weekend_night_text_kurtosis, interevent_time_weekend_night_text_skewness, interactions_per_contact_weekend_night_text_kurtosis, interactions_per_contact_weekend_night_text_skewness, name];
'Project [name#510300, 'cdr_active_days__allweek__day__callandtext]
+- Project [name#510300, active_days_allweek_allday#510301, active_days_allweek_day#510302, active_days_allweek_night#510303, active_days_weekday_allday#510304, active_days_weekday_day#510305, active_days_weekday_night#510306, active_days_weekend_allday#510307, active_days_weekend_day#510308, active_days_weekend_night#510309, number_of_contacts_allweek_allday_call#510310, number_of_contacts_allweek_allday_text#510311, number_of_contacts_allweek_day_call#510312, number_of_contacts_allweek_day_text#510313, number_of_contacts_allweek_night_call#510314, number_of_contacts_allweek_night_text#510315, number_of_contacts_weekday_allday_call#510316, number_of_contacts_weekday_allday_text#510317, number_of_contacts_weekday_day_call#510318, number_of_contacts_weekday_day_text#510319, number_of_contacts_weekday_night_call#510320, number_of_contacts_weekday_night_text#510321, number_of_contacts_weekend_allday_call#510322, number_of_contacts_weekend_allday_text#510323, ... 685 more fields]
   +- Join Inner, (name#510300 = name#566678)
      :- Relation [name#510300,active_days_allweek_allday#510301,active_days_allweek_day#510302,active_days_allweek_night#510303,active_days_weekday_allday#510304,active_days_weekday_day#510305,active_days_weekday_night#510306,active_days_weekend_allday#510307,active_days_weekend_day#510308,active_days_weekend_night#510309,number_of_contacts_allweek_allday_call#510310,number_of_contacts_allweek_allday_text#510311,number_of_contacts_allweek_day_call#510312,number_of_contacts_allweek_day_text#510313,number_of_contacts_allweek_night_call#510314,number_of_contacts_allweek_night_text#510315,number_of_contacts_weekday_allday_call#510316,number_of_contacts_weekday_allday_text#510317,number_of_contacts_weekday_day_call#510318,number_of_contacts_weekday_day_text#510319,number_of_contacts_weekday_night_call#510320,number_of_contacts_weekday_night_text#510321,number_of_contacts_weekend_allday_call#510322,number_of_contacts_weekend_allday_text#510323,... 685 more fields] csv
      +- Project [name#566678]
         +- Relation [name#566678,active_days_allweek_allday#566679,active_days_allweek_day#566680,active_days_allweek_night#566681,active_days_weekday_allday#566682,active_days_weekday_day#566683,active_days_weekday_night#566684,active_days_weekend_allday#566685,active_days_weekend_day#566686,active_days_weekend_night#566687,number_of_contacts_allweek_allday_call#566688,number_of_contacts_allweek_allday_text#566689,number_of_contacts_allweek_day_call#566690,number_of_contacts_allweek_day_text#566691,number_of_contacts_allweek_night_call#566692,number_of_contacts_allweek_night_text#566693,number_of_contacts_weekday_allday_call#566694,number_of_contacts_weekday_allday_text#566695,number_of_contacts_weekday_day_call#566696,number_of_contacts_weekday_day_text#566697,number_of_contacts_weekday_night_call#566698,number_of_contacts_weekday_night_text#566699,number_of_contacts_weekend_allday_call#566700,number_of_contacts_weekend_allday_text#566701,... 685 more fields] csv
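The plan above shows a projection of `cdr_active_days__allweek__day__callandtext` onto a relation whose columns are named in the style of `active_days_allweek_day`, so the requested name cannot be resolved. A minimal sketch of how such a mismatch could be caught before any selection is issued follows. The helper `check_columns` is hypothetical and not part of cider; it uses only the Python standard library and assumes the available column names can be obtained as a list of strings (for a PySpark DataFrame, `df.columns`).

```python
import difflib


def check_columns(requested, available, n_suggestions=3):
    """Return {missing_name: [close matches]} for requested names absent from `available`.

    Hypothetical helper for illustration; not part of the cider API.
    """
    available = list(available)
    known = set(available)
    report = {}
    for name in requested:
        if name not in known:
            # Suggest the most similar existing column names, if any are close enough.
            report[name] = difflib.get_close_matches(
                name, available, n=n_suggestions, cutoff=0.5
            )
    return report


# Column names copied from the logical plan above (a small subset).
available = [
    "name",
    "active_days_allweek_allday",
    "active_days_allweek_day",
    "active_days_allweek_night",
]
requested = ["name", "cdr_active_days__allweek__day__callandtext"]

report = check_columns(requested, available)
for missing, suggestions in report.items():
    print(f"Column not found: {missing}; closest matches: {suggestions}")
```

An empty report means every requested name exists. A non-empty one lists the names that would trigger the unresolved-column error, together with candidates that may be the intended columns.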
../_images/b42fe81f3d770dd5699f50572450516fa03a3512d1f471a3fe4683f87d6ec524.png