ExploratoryFA
ExploratoryFA(in_dataset, id_column, desc_columns, out_folder='./ResultsEFA/', rotation=RotationTypes.promax, method=MethodTypes.minres, nb_factors=None, train_dataset_size=0.5, random_state=1234, mean=False, median=False, verbose=False, save_parameters=False, overwrite=False)
Exploratory Factor Analysis
ExploratoryFA is a script that performs an exploratory factor analysis (EFA).
To perform only an EFA, use the flag --use_only_efa. Unless a number of factors is specified, the script uses Horn’s parallel analysis to determine the optimal number of factors to extract from the data. The final EFA model is then fitted using the provided rotation and method.
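As an illustration of the idea behind Horn’s parallel analysis, here is a minimal NumPy sketch. It is not the script's implementation: the function name `parallel_analysis`, the `n_iter` and `percentile` parameters, and the use of eigenvalues of the full correlation matrix (rather than a common-factor variant) are all assumptions made for this example.

```python
import numpy as np


def parallel_analysis(data, n_iter=100, percentile=95, random_state=1234):
    """Suggest a number of factors by comparing observed and random eigenvalues.

    Hypothetical helper for illustration only.
    """
    data = np.asarray(data, dtype=float)
    n_obs, n_vars = data.shape
    rng = np.random.default_rng(random_state)

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues of correlation matrices computed from uncorrelated
    # normal noise with the same shape as the data.
    simulated = np.empty((n_iter, n_vars))
    for i in range(n_iter):
        noise = rng.standard_normal((n_obs, n_vars))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)

    # Retain leading factors up to the first one that fails to beat the noise.
    exceeds = observed > threshold
    return n_vars if exceeds.all() else int(np.argmin(exceeds))


# Toy data: two latent factors, each driving three observed variables.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 2))
toy = np.repeat(latent, 3, axis=1) + 0.5 * rng.standard_normal((500, 6))
print(parallel_analysis(toy))
```

Factors are retained only while the observed eigenvalue exceeds what random data of the same size would produce, which guards against keeping factors that reflect sampling noise.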
It is also possible to perform EFA on a training dataset and export the test dataset to be used for further analysis (e.g. ConfirmatoryFA). The script will output the EFA model, the loadings, communalities, and the transformed dataset.
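The train/test split described above can be sketched as follows. This is a standard-library illustration under stated assumptions: `split_train_test` is a hypothetical helper, and the script's actual splitting routine and rounding behaviour may differ. The parameter names mirror `train_dataset_size` and `random_state` from the signature.

```python
import random


def split_train_test(row_ids, train_dataset_size=0.5, random_state=1234):
    """Shuffle row identifiers reproducibly and split them into two subsets.

    Hypothetical helper for illustration only.
    """
    shuffled = list(row_ids)
    # A dedicated, seeded generator makes the split reproducible.
    random.Random(random_state).shuffle(shuffled)
    n_train = round(len(shuffled) * train_dataset_size)
    return shuffled[:n_train], shuffled[n_train:]


train_ids, test_ids = split_train_test(range(10), train_dataset_size=0.5)
print(len(train_ids), len(test_ids))
```

The EFA would be fitted on the rows in `train_ids`, while the rows in `test_ids` are held out for a later step such as a confirmatory analysis.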
Input Specifications
The dataset can contain multiple descriptive columns before the variables of interest. Simply specify the number of descriptive columns using --desc-columns. Rows with missing values are removed by default; select the mean or median option to impute missing data instead (be cautious when doing this).
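The three ways of handling missing values described above (dropping rows, mean imputation, median imputation) can be sketched with NumPy. The helper `handle_missing` is hypothetical, and raising an error when both options are set is an assumption of this example, not documented behaviour of the script.

```python
import numpy as np


def handle_missing(values, mean=False, median=False):
    """Drop incomplete rows, or impute per column with the mean or median.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float)
    if mean and median:
        # Assumption: the two imputation options are mutually exclusive.
        raise ValueError("choose either mean or median, not both")
    missing = np.isnan(values)
    if not (mean or median):
        # Default described above: remove rows with any missing value.
        return values[~missing.any(axis=1)]
    # Column statistics computed while ignoring missing entries.
    fill = np.nanmean(values, axis=0) if mean else np.nanmedian(values, axis=0)
    return np.where(missing, fill, values)


raw = [[1.0, 2.0], [float("nan"), 4.0], [3.0, 9.0]]
print(handle_missing(raw))             # incomplete row removed
print(handle_missing(raw, mean=True))  # missing value replaced by column mean
```

Imputation keeps every subject but shrinks the variance of the imputed columns, which is one reason for the caution noted above.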
References
[1] Costa, V., & Sarmento, R. Confirmatory Factor Analysis. (https://arxiv.org/ftp/arxiv/papers/1905/1905.05598.pdf)
[2] Whether to use EFA or CFA to predict latent variables scores. (https://stats.stackexchange.com/questions/346499/whether-to-use-efa-or-cfa-to-predict-latent-variables-scores)
[3] Comparison of factor score estimation methods (https://github.com/gagnonanthony/NeuroStatX/pull/11)
Example Usage
Parameters
in_dataset : Input dataset(s) to use in the factor analysis. If multiple files are provided as input, they will be merged according to the subject ID column. For multiple inputs, use this: --in-dataset df1 --in-dataset df2 [...]
id_column : Name of the column containing the subject’s ID tag. Required for proper handling of IDs and merging multiple datasets.
desc_columns : Number of descriptive columns at the beginning of the dataset to exclude in statistics and descriptive tables.
out_folder : Path of the folder in which the results will be written. If not specified, the default folder in the current directory will be used (./ResultsEFA/).
rotation : Select the type of rotation to apply on your data.
method : Select the method for fitting the data.
nb_factors : Specify the number of factors to extract from the data. If not specified, the script will use Horn’s parallel analysis to determine the optimal number of factors.
train_dataset_size : Proportion of the input dataset to use as the training dataset in the EFA (value from 0 to 1).
random_state : Random seed for reproducibility.
mean : Impute missing values in the original dataset based on the column mean.
median : Impute missing values in the original dataset based on the column median.
verbose : If true, produce verbose output.
save_parameters : If true, save the parameters used in the analysis in a text file.
overwrite : If true, force overwriting of existing output files.
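To illustrate how multiple inputs passed to in_dataset can be combined through id_column, here is a small standard-library sketch. The helper `merge_on_id` is hypothetical, and keeping only subjects present in every input (an inner join) is an assumption of this example; the script's actual merge strategy is not stated here.

```python
def merge_on_id(tables, id_column):
    """Join several tables (lists of dict rows) on a shared ID column.

    Hypothetical helper for illustration only.
    """
    merged = {row[id_column]: dict(row) for row in tables[0]}
    for table in tables[1:]:
        by_id = {row[id_column]: row for row in table}
        # Assumption: keep only subjects found in every table (inner join).
        merged = {
            subject: {**row, **by_id[subject]}
            for subject, row in merged.items()
            if subject in by_id
        }
    return list(merged.values())


df1 = [{"ID": "s1", "a": 1}, {"ID": "s2", "a": 2}]
df2 = [{"ID": "s2", "b": 20}, {"ID": "s3", "b": 30}]
print(merge_on_id([df1, df2], "ID"))
```

Whatever the merge strategy, every input must share the same ID column name, which is why id_column is required.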