so4gp.algorithms.graank_pso.ParticleGRAANK
- class ParticleGRAANK(*args, max_iter=1, n_particle=5, vel=0.9, coeff_p=0.01, coeff_g=0.9, **kwargs)
- Parameters:
args – [required] data source path or Pandas DataFrame, [optional] minimum-support, [optional] eq
max_iter (int) – [optional] maximum number of iterations, default is 1
n_particle (int) – [optional] initial particle population, default is 5
vel (float) – [optional] velocity, default is 0.9
coeff_p (float) – [optional] personal coefficient, default is 0.01
coeff_g (float) – [optional] global coefficient, default is 0.9
>>> from so4gp.algorithms import ParticleGRAANK
>>> import pandas
>>>
>>> dummy_data = [[30, 3, 1, 10], [35, 2, 2, 8], [40, 4, 2, 7], [50, 1, 1, 6], [52, 7, 1, 2]]
>>> dummy_df = pandas.DataFrame(dummy_data, columns=['Age', 'Salary', 'Cars', 'Expenses'])
>>>
>>> mine_obj = ParticleGRAANK(data_source=dummy_df, min_sup=0.5, max_iter=3, n_particle=10)
>>> result_json = mine_obj.discover()
>>> # print(result['Patterns'])
>>> print(result_json)
{"Algorithm": "PSO-GRAANK", "Best Patterns": [], "Invalid Count": 12, "Iterations": 2}
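The three tuning parameters (vel, coeff_p, coeff_g) have natural counterparts in the textbook PSO update rule. The sketch below shows that rule for a single one-dimensional particle. It assumes vel acts as the inertia weight, coeff_p as the personal (cognitive) coefficient and coeff_g as the global (social) coefficient; the function name pso_step and this mapping are illustrative assumptions, not a description of how so4gp implements its update.

```python
import random


def pso_step(position, velocity, personal_best, global_best,
             vel=0.9, coeff_p=0.01, coeff_g=0.9, rng=random):
    """One textbook PSO update for a single 1-D particle (hypothetical helper).

    Assumption: `vel` is used as the inertia weight. The defaults mirror the
    defaults documented for ParticleGRAANK, nothing more.
    """
    r_p, r_g = rng.random(), rng.random()  # fresh random factors in [0, 1)
    new_velocity = (vel * velocity
                    + coeff_p * r_p * (personal_best - position)   # pull towards own best
                    + coeff_g * r_g * (global_best - position))    # pull towards swarm best
    return position + new_velocity, new_velocity


# Move a particle at position 10 towards a swarm best located at 20.
new_position, new_velocity = pso_step(10.0, 2.0, personal_best=12.0, global_best=20.0)
```

With the documented defaults, the global term (0.9) outweighs the personal term (0.01), so under this reading particles are drawn mostly towards the best position found by the whole swarm.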
- __init__(*args, max_iter=1, n_particle=5, vel=0.9, coeff_p=0.01, coeff_g=0.9, **kwargs)
Extract gradual patterns (GPs) from a numeric data source using the Particle Swarm Optimization (PSO) approach (proposed in a published research paper by Dickson Owuor). A GP is a set of gradual items (GIs), and its quality is measured by its computed support value. For example, given a data set with 3 columns (age, salary, cars) and 10 objects, a GP may take the form {age+, salary-} with a support of 0.8. This implies that 8 out of 10 objects have the values of column ‘age’ increasing and the values of column ‘salary’ decreasing.
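To make the notion of support concrete, the sketch below computes it for a candidate pattern on the sample data used in the class example. It uses a pair-based definition that is common in the gradual pattern literature (the fraction of object pairs that can be ordered so that every gradual item holds). The function name pair_support and the choice of this definition are assumptions for illustration; so4gp's internal computation may differ.

```python
from itertools import combinations

columns = ['Age', 'Salary', 'Cars', 'Expenses']
data = [[30, 3, 1, 10], [35, 2, 2, 8], [40, 4, 2, 7], [50, 1, 1, 6], [52, 7, 1, 2]]
rows = [dict(zip(columns, values)) for values in data]


def pair_support(rows, pattern):
    """Fraction of object pairs that respect every gradual item in `pattern`.

    `pattern` maps a column name to '+' (increasing) or '-' (decreasing).
    Hypothetical helper, not part of the so4gp API.
    """
    def respects(a, b):
        # True if going from object a to object b follows every gradual item.
        return all(
            (b[col] > a[col]) if sign == '+' else (b[col] < a[col])
            for col, sign in pattern.items()
        )

    pairs = list(combinations(rows, 2))
    concordant = sum(1 for a, b in pairs if respects(a, b) or respects(b, a))
    return concordant / len(pairs)


print(pair_support(rows, {'Age': '+', 'Expenses': '-'}))  # 1.0
print(pair_support(rows, {'Age': '+', 'Salary': '-'}))    # 0.4
```

In the sample data every increase in Age comes with a decrease in Expenses, so all 10 pairs are concordant, whereas only 4 of the 10 pairs show Salary falling as Age rises.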
In this approach, it is assumed that every GP candidate may be represented as a particle that has a unique position and fitness. The fitness is derived from the computed support of that candidate: the higher the support value, the higher the fitness. The aim of the algorithm is to search through a population of particles (or candidates) and find those with the highest fitness as efficiently as possible.
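One way to picture a candidate as a particle is to let an integer position encode which gradual items are present. The sketch below reads a position as base-3 digits, one per column (0 = absent, 1 = increasing, 2 = decreasing), and scores it with a pair-based support. This encoding, the helper names (decode, fitness, label) and the support definition are all illustrative assumptions; the position scheme used inside so4gp is not documented here and may be different.

```python
import random
from itertools import combinations

COLUMNS = ['Age', 'Salary', 'Cars', 'Expenses']
DATA = [[30, 3, 1, 10], [35, 2, 2, 8], [40, 4, 2, 7], [50, 1, 1, 6], [52, 7, 1, 2]]


def decode(position):
    """Turn an integer position into a list of (column index, direction) items."""
    pattern = []
    for idx in range(len(COLUMNS)):
        position, digit = divmod(position, 3)
        if digit:  # 1 means increasing (+1), 2 means decreasing (-1)
            pattern.append((idx, 1 if digit == 1 else -1))
    return pattern


def fitness(position):
    """Fitness equals the support of the decoded candidate (0 for fewer than 2 items)."""
    pattern = decode(position)
    if len(pattern) < 2:
        return 0.0

    def follows(a, b):
        # True if going from row a to row b moves every column in its direction.
        return all((b[i] - a[i]) * sign > 0 for i, sign in pattern)

    pairs = list(combinations(DATA, 2))
    hits = sum(follows(a, b) or follows(b, a) for a, b in pairs)
    return hits / len(pairs)


def label(position):
    """Human-readable form of a position, e.g. {'Age+', 'Expenses-'}."""
    return {COLUMNS[i] + ('+' if sign > 0 else '-') for i, sign in decode(position)}


# A small random swarm of 10 particles; the fittest one is the best candidate so far.
rng = random.Random(0)
swarm = [rng.randrange(3 ** len(COLUMNS)) for _ in range(10)]
best = max(swarm, key=fitness)
print(label(best), fitness(best))
```

With 4 columns there are only 81 positions, so an exhaustive scan would be cheap here; the swarm-based search matters when the number of columns makes the candidate space too large to enumerate.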
Methods
__init__(*args[, max_iter, n_particle, vel, ...]) – Extract gradual patterns (GPs) from a numeric data source using the Particle Swarm Optimization Algorithm approach (proposed in a published research paper by Dickson Owuor).
add_gradual_pattern(pattern) – Adds a gradual pattern to the list of gradual patterns.
analyze_gps(data_src, min_sup, est_gps[, ...]) – For each estimated GP, computes its true support using the GRAANK approach and returns the statistics (% error, and standard deviation).
clean_data(df) – Cleans a data-frame (i.e., missing values, outliers) before extraction of GPs.
clear_gradual_patterns() – Clears the list of gradual patterns.
discover() – Searches through particle positions to find GP candidates.
fit_bitmap([attr_data]) – Generates bitmaps for columns with numeric objects.
fit_warpingset() – Generates transaction ids (tids) for each column/feature with numeric objects.
gen_gradual_warping_set(pairwise_mat[, as_array]) – A method that decomposes the pairwise matrix of a gradual item/pattern into a warping set.
generate_output_files(alg_data[, ...]) – Generates output of results (as files) for the GP mining algorithm.
read(data_src) – Reads all the contents of a file (in CSV format) or a data-frame.
remove_subsets(gi_arr[, gradual_patterns]) – Remove subset GPs from the list.
test_time(date_str) – Tests if a str represents a date-time variable.
Attributes
attr_cols
attr_size
col_count
data
display_patterns
display_patterns_as_df
gradual_patterns
row_count
thd_supp
time_cols
titles
valid_bins
warping_set