so4gp.data_gp.DataGP

class DataGP(data_source, min_sup=0.5, eq=False)
Parameters:
  • data_source (pd.DataFrame | str) – [required] the data source, either a file in CSV format or a pandas DataFrame

  • min_sup (float) – [optional] minimum support threshold (default: 0.5)

  • eq (bool) – [optional] whether to encode equal values as gradual (default: False)

__init__(data_source, min_sup=0.5, eq=False)

A class for creating data-gp objects. A data-gp object stores all the parameters that gradual pattern (GP) mining algorithms need in order to extract GPs. It takes numeric data (a file in CSV format or a pandas DataFrame) as input and converts it into an object whose attributes the algorithms use to extract GPs.

Return type:

None
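
To make the role of min_sup concrete, here is a minimal pure-Python sketch of how a GRAANK-style support value can be computed and compared against the threshold. It does not use so4gp and is not the library's implementation: the helper names `pairwise_bitmap` and `pattern_support`, the sample columns, and the pairwise-comparison encoding are all assumptions made for illustration. The eq option is deliberately not modeled here.

```python
def pairwise_bitmap(column, direction="+"):
    """Build an n x n 0/1 matrix for one gradual item.

    Entry [i][j] is 1 when going from row i to row j moves the value
    in the requested direction ("+" for increase, "-" for decrease).
    Hypothetical helper, not part of so4gp.
    """
    n = len(column)
    bitmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a row is never compared with itself
            diff = column[j] - column[i]
            hit = diff > 0 if direction == "+" else diff < 0
            bitmap[i][j] = int(hit)
    return bitmap


def pattern_support(bitmaps):
    """Support = pairs concordant in every bitmap / (n * (n - 1) / 2)."""
    n = len(bitmaps[0])
    concordant = sum(
        all(b[i][j] for b in bitmaps)
        for i in range(n)
        for j in range(n)
    )
    return concordant / (n * (n - 1) / 2)


# Made-up sample data: does salary tend to rise as age rises?
age = [30, 35, 40, 50]
salary = [3.0, 2.5, 4.0, 4.5]
min_sup = 0.5

sup = pattern_support([pairwise_bitmap(age, "+"), pairwise_bitmap(salary, "+")])
is_frequent = sup >= min_sup  # the pattern is kept only if it meets min_sup
```

In this toy data, five of the six row pairs agree with the pattern {age+, salary+}, so its support is 5/6 and it passes a threshold of 0.5.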

Methods

__init__(data_source[, min_sup, eq])

A class for creating data-gp objects.

add_gradual_pattern(pattern)

Adds a gradual pattern to the list of gradual patterns.

analyze_gps(data_src, min_sup, est_gps[, ...])

For each estimated GP, computes its true support using the GRAANK approach and returns the statistics (percentage error and standard deviation).

clean_data(df)

Cleans a data-frame (handling missing values and outliers) before GPs are extracted.

clear_gradual_patterns()

Clears the list of gradual patterns.

fit_bitmap([attr_data])

Generates bitmaps for columns with numeric objects.

fit_warpingset()

Generates transaction ids (tids) for each column/feature with numeric objects.

gen_gradual_warping_set(pairwise_mat[, as_array])

Decomposes the pairwise matrix of a gradual item/pattern into a warping set.

generate_output_files(alg_data[, ...])

Generates output of results (as files) for the GP mining algorithm.

read(data_src)

Reads all the contents of a file (in CSV format) or a data-frame.

remove_subsets(gi_arr[, gradual_patterns])

Removes subset GPs from the list.

test_time(date_str)

Tests whether a string represents a date-time value.
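
As an illustration of the kind of check test_time describes, the sketch below tries a few date-time formats with the standard library and reports whether any of them parses the string. It is only a sketch under stated assumptions: the function name `looks_like_datetime` and the list of candidate formats are hypothetical, and the formats the library actually accepts may differ.

```python
from datetime import datetime

# Hypothetical list of accepted formats; the library's own list may differ.
CANDIDATE_FORMATS = (
    "%Y-%m-%d",
    "%Y-%m-%d %H:%M:%S",
    "%d/%m/%Y",
    "%H:%M:%S",
)


def looks_like_datetime(text):
    """Return True if `text` parses under at least one candidate format."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(cleaned, fmt)
            return True
        except ValueError:
            continue  # this format did not match; try the next one
    return False
```

A check of this kind could be used to separate date-time columns from the numeric columns that GP extraction works on.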

Attributes

attr_cols

attr_size

col_count

data

display_patterns

display_patterns_as_df

gradual_patterns

row_count

thd_supp

time_cols

titles

valid_bins

warping_set