so4gp.data_gp.DataGP
- class DataGP(data_source, min_sup=0.5, eq=False)
- Parameters:
data_source (pd.DataFrame | str) – [required] a data source, either a file in CSV format or a Pandas DataFrame
min_sup (float) – [optional] minimum support threshold, the default is 0.5
eq (bool) – [optional] encode equal values as gradual, the default is False
- __init__(data_source, min_sup=0.5, eq=False)
A class for creating data-gp objects. A data-gp object stores the parameters that GP algorithms need to extract gradual patterns (GPs). It takes numeric data (a file in CSV format or a Pandas DataFrame) as input and converts it into an object whose attributes the algorithms use to extract GPs.
- Return type:
None
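The role of min_sup and eq can be illustrated with a small, library-independent sketch. It is not the so4gp implementation: the helper names gradual_bitmap and support are hypothetical, the support formula follows the usual GRAANK-style definition (concordant row pairs divided by n(n-1)/2), and the treatment of eq (ties counted as gradual) is only one plausible reading of "encode equal values as gradual".

```python
def gradual_bitmap(values, eq=False):
    """Hypothetical helper: cell [i][j] is 1 when row j's value exceeds row i's.

    With eq=True, tied values are also marked 1 (assumed meaning of `eq`),
    so a tie is counted in both directions.
    """
    n = len(values)
    bitmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if values[j] > values[i] or (eq and values[j] == values[i]):
                bitmap[i][j] = 1
    return bitmap


def support(bitmaps):
    """Hypothetical helper: share of row pairs concordant on every attribute."""
    n = len(bitmaps[0])
    # A pair (i, j) supports the pattern only if all attribute bitmaps agree.
    concordant = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and all(b[i][j] for b in bitmaps)
    )
    return concordant / (n * (n - 1) / 2)


# Illustrative data (made up): does salary increase as age increases?
age = [30, 35, 40, 50]
salary = [1000, 1200, 1200, 1500]
min_sup = 0.5

strict = support([gradual_bitmap(age), gradual_bitmap(salary)])
with_ties = support([gradual_bitmap(age, eq=True), gradual_bitmap(salary, eq=True)])

print(round(strict, 3), strict >= min_sup)        # 0.833 True
print(round(with_ties, 3), with_ties >= min_sup)  # 1.0 True
```

In this sketch, the tied salaries in rows 2 and 3 break the pattern when eq is False and support it when eq is True, which is why the support rises from 5/6 to 1.0. Either way, the pattern would be retained under the default threshold of 0.5.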
Methods
__init__(data_source[, min_sup, eq]) – A class for creating data-gp objects.
add_gradual_pattern(pattern) – Adds a gradual pattern to the list of gradual patterns.
analyze_gps(data_src, min_sup, est_gps[, ...]) – For each estimated GP, computes its true support using the GRAANK approach and returns the statistics (% error and standard deviation).
clean_data(df) – Cleans a data-frame (i.e., missing values, outliers) before extraction of GPs.
clear_gradual_patterns() – Clears the list of gradual patterns.
fit_bitmap([attr_data]) – Generates bitmaps for columns with numeric objects.
fit_warpingset() – Generates transaction ids (tids) for each column/feature with numeric objects.
gen_gradual_warping_set(pairwise_mat[, as_array]) – A method that decomposes the pairwise matrix of a gradual item/pattern into a warping set.
generate_output_files(alg_data[, ...]) – Generates output of results (as files) for the GP mining algorithm.
read(data_src) – Reads all the contents of a file (in CSV format) or a data-frame.
remove_subsets(gi_arr[, gradual_patterns]) – Removes subset GPs from the list.
test_time(date_str) – Tests if a str represents a date-time variable.
Attributes
attr_cols
attr_size
col_count
data
display_patterns
display_patterns_as_df
gradual_patterns
row_count
thd_supp
time_cols
titles
valid_bins
warping_set