The classical c-means algorithm

CMeans(
  data,
  k,
  m,
  maxiter = 500,
  tol = 0.01,
  standardize = TRUE,
  robust = FALSE,
  noise_cluster = FALSE,
  delta = NULL,
  verbose = TRUE,
  init = "random",
  seed = NULL
)

Arguments

data

A dataframe with only numerical variables. Can also be a list of rasters (produced by the package raster). In that case, each raster is considered as a variable and each pixel is an observation. Pixels with NA values are not used during the classification.

k

An integer describing the number of clusters to find

m

A float for the fuzziness degree (must be greater than 1)
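To give an intuition for the effect of m, the sketch below applies the textbook fuzzy c-means membership formula in base R. The helper fcm_memberships is hypothetical and is not part of the package; it assumes squared Euclidean distances, and the package internals may differ.

```r
# Hypothetical helper (not part of the package): textbook FCM membership update.
# x: numeric matrix of observations (rows); centers: numeric matrix of centres (rows).
fcm_memberships <- function(x, centers, m) {
  # squared Euclidean distance from each observation to each centre
  d2 <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(x, 2, centers[j, ])^2)
  })
  d2 <- pmax(d2, 1e-12)      # guard against a zero distance
  w <- d2^(-1 / (m - 1))     # a larger m flattens these weights
  w / rowSums(w)             # each row sums to 1
}

x <- rbind(c(0, 0), c(1, 0), c(4, 0))
centers <- rbind(c(0, 0), c(4, 0))

u_crisp <- fcm_memberships(x, centers, m = 1.5)  # close to a hard partition
u_fuzzy <- fcm_memberships(x, centers, m = 3)    # memberships are more mixed
```

With m close to 1 the memberships approach a hard partition; larger values spread each observation more evenly across the groups.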

maxiter

An integer for the maximum number of iterations

tol

The tolerance criterion used in the evaluateMatrices function for convergence assessment
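The exact rule applied by evaluateMatrices is not described on this page. As a rough illustration only, the sketch below assumes that convergence is declared when the largest absolute change between two successive membership matrices falls below tol. The function has_converged is hypothetical.

```r
# Hypothetical convergence check (an assumption, not the package's code):
# compare two successive membership matrices element by element.
has_converged <- function(u_old, u_new, tol) {
  max(abs(u_new - u_old)) < tol
}

u_old <- matrix(c(0.600, 0.400,
                  0.300, 0.700), ncol = 2, byrow = TRUE)
u_new <- matrix(c(0.605, 0.395,
                  0.300, 0.700), ncol = 2, byrow = TRUE)

loose  <- has_converged(u_old, u_new, tol = 0.01)   # largest change is 0.005
strict <- has_converged(u_old, u_new, tol = 0.001)
```

A smaller tol therefore asks for more iterations before the algorithm stops, up to maxiter.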

standardize

A boolean to specify if the variables must be centred and reduced (default = TRUE)
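Centring and reducing a variable means subtracting its mean and dividing by its standard deviation. The base R sketch below shows the idea on a small made-up dataframe; it assumes the usual z-score definition and is not taken from the package source.

```r
# Made-up data with two variables on very different scales
df <- data.frame(a = c(1, 2, 3, 4), b = c(10, 30, 50, 70))

# scale() centres each column on 0 and divides it by its standard deviation
z <- as.data.frame(scale(df))

col_means <- colMeans(z)   # approximately 0 for every column
col_sds   <- sapply(z, sd) # approximately 1 for every column
```

Without this step, variables with large ranges would dominate the distances used by the clustering.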

robust

A boolean indicating if the "robust" version of the algorithm must be used (see details)

noise_cluster

A boolean indicating if a noise cluster must be added to the solution (see details)

delta

A float giving the distance of the noise cluster to each observation
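The general idea of a noise cluster can be sketched in base R: an extra group is placed at the same distance delta from every observation, so observations far from all real centres end up mostly in that group. The helper noise_memberships is hypothetical, and treating delta as a plain (unsquared) distance is an assumption; the package may scale it differently.

```r
# Hypothetical helper (not part of the package).
# d2: matrix of strictly positive squared distances (observations x centres).
noise_memberships <- function(d2, delta, m) {
  # assumption: delta is a distance, so it is squared to match d2
  d2_all <- cbind(d2, noise = delta^2)
  w <- d2_all^(-1 / (m - 1))
  w / rowSums(w)
}

# First row: an observation close to centre 1; second row: an outlier
d2 <- rbind(c(0.5, 9), c(100, 120))
u <- noise_memberships(d2, delta = 2, m = 2)
```

In this toy case the outlier receives most of its membership from the noise column, while the first observation stays attached to its nearest centre.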

verbose

A boolean to specify if the progress should be printed

init

A string indicating how the initial centres must be selected. "random" indicates that random observations are used as centres. "kpp" uses a distance-based method that results in more dispersed centres at the beginning. Both methods are heuristics.
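A distance-based seeding of this kind can be sketched as follows, assuming that "kpp" refers to a k-means++ style procedure (an assumption, since the page does not spell it out). The function kpp_centers is hypothetical; it picks the first centre at random and each following centre with probability proportional to its squared distance to the nearest centre already chosen.

```r
# Hypothetical k-means++ style seeding (not the package's implementation).
kpp_centers <- function(x, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # same seed, same starting centres
  n <- nrow(x)
  idx <- sample.int(n, 1)              # first centre: a random observation
  d2 <- rowSums(sweep(x, 2, x[idx, ])^2)
  while (length(idx) < k) {
    # distant observations are more likely to be picked
    nxt <- sample.int(n, 1, prob = d2)
    idx <- c(idx, nxt)
    # keep, for each observation, the distance to its nearest chosen centre
    d2 <- pmin(d2, rowSums(sweep(x, 2, x[nxt, ])^2))
  }
  x[idx, , drop = FALSE]
}

x <- as.matrix(iris[, 1:4])
c1 <- kpp_centers(x, k = 3, seed = 123)
c2 <- kpp_centers(x, k = 3, seed = 123)
same_start <- identical(c1, c2)
```

The sketch also shows the role of a seed: calling the function twice with the same value yields the same starting centres.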

seed

An integer used for random number generation. It ensures that the starting centres will be the same if the same value is selected.

Value

An S3 object of class FCMres with the following slots:

  • Centers: a dataframe describing the final centers of the groups

  • Belongings: the final membership matrix

  • Groups: a vector with the names of the most likely group for each observation

  • Data: the dataset used to perform the clustering (might be standardized)

  • isRaster: TRUE if rasters were used as input data, FALSE otherwise

  • k: the number of groups

  • m: the fuzziness degree

  • alpha: the spatial weighting parameter (if SFCM or SGFCM)

  • beta: beta parameter for generalized version of FCM (GFCM or SGFCM)

  • algo: the name of the algorithm used

  • rasters: a list of rasters with membership values and the most likely group (if rasters were used)

  • missing: a boolean vector indicating raster cells with data (TRUE) and with NA (FALSE) (if rasters were used)

  • maxiter: the maximum number of iterations used

  • tol: the convergence criterion

  • lag_method: the lag function used (if SFCM or SGFCM)

  • nblistw: the neighbours list used (if vector data were used for SFCM or SGFCM)

  • window: the window used (if raster data were used for SFCM or SGFCM)
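The relation between Belongings and Groups can be illustrated with a small made-up membership matrix: the most likely group of an observation is the column holding its highest membership value. The column names used below are hypothetical, and the package may label groups differently.

```r
# Made-up membership matrix: 2 observations, 3 groups
belongings <- matrix(c(0.7, 0.2, 0.1,
                       0.1, 0.3, 0.6), ncol = 3, byrow = TRUE)
colnames(belongings) <- paste0("V", 1:3)   # hypothetical group names

# For each row, take the name of the column with the largest membership
groups <- colnames(belongings)[max.col(belongings, ties.method = "first")]
```

On a fitted object, the same logic would presumably apply to the Belongings slot.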

Examples

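library(geocmeans)
data(LyonIris)
# Numeric variables used for the classification
AnalysisFields <- c("Lden", "NO2", "PM25", "VegHautPrt", "Pct0_14", "Pct_65", "Pct_Img",
                    "TxChom1564", "Pct_brevet", "NivVieMed")
# Drop the geometry column to obtain a plain dataframe
dataset <- sf::st_drop_geometry(LyonIris[AnalysisFields])
# Classical fuzzy c-means with 5 groups and a fuzziness degree of 1.5
result <- CMeans(dataset, k = 5, m = 1.5, standardize = TRUE)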