cli.ExploratoryFA

ExploratoryFA(in_dataset, id_column, desc_columns, out_folder='./ResultsEFA/', rotation=RotationTypes.promax, method=MethodTypes.minres, nb_factors=None, train_dataset_size=0.5, random_state=1234, verbose=False, save_parameters=False, overwrite=False)

Exploratory Factor Analysis

ExploratoryFA is a script that performs an exploratory factor analysis (EFA).

When only an EFA is performed (flag --use_only_efa), the script uses Horn’s parallel analysis to determine the optimal number of factors to extract from the data. The final EFA model is then fitted using the provided rotation and method.
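The idea behind Horn’s parallel analysis can be sketched as follows. This is a minimal illustration, not the script's own implementation: the function name parallel_analysis, the use of correlation-matrix eigenvalues, the 95th percentile threshold, and the number of random draws are all assumptions chosen for the example.

```python
import numpy as np


def parallel_analysis(data, n_iter=100, percentile=95, seed=1234):
    """Estimate how many factors to retain (hypothetical helper).

    Observed eigenvalues of the correlation matrix are compared with the
    chosen percentile of eigenvalues obtained from random normal data of
    the same shape. Leading eigenvalues above that threshold are retained.
    """
    data = np.asarray(data, dtype=float)
    n_obs, n_vars = data.shape
    rng = np.random.default_rng(seed)

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated random data with the same dimensions.
    random_eigs = np.empty((n_iter, n_vars))
    for i in range(n_iter):
        noise = rng.standard_normal((n_obs, n_vars))
        random_eigs[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Count leading eigenvalues that exceed the threshold, stopping at the
    # first one that does not.
    above = observed > threshold
    n_factors = n_vars if above.all() else int(np.argmin(above))
    return n_factors, observed, threshold
```

Calling parallel_analysis(values) on a two-dimensional array of the variables of interest returns the suggested number of factors together with the observed and threshold eigenvalues, which can be plotted as a scree plot.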

It is also possible to perform EFA on a training dataset and export the test dataset to be used for further analysis (e.g. ConfirmatoryFA). The script will output the EFA model, the loadings, communalities, and the transformed dataset.
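For readers unfamiliar with the term, communalities can be derived from the loadings. The sketch below is an illustration under stated assumptions, not the script's code: the function name communalities and the optional factor_corr argument are hypothetical. With uncorrelated factors, a variable's communality is the sum of its squared loadings; with an oblique rotation such as promax, the factor correlations also contribute.

```python
import numpy as np


def communalities(loadings, factor_corr=None):
    """Share of each variable's variance explained by the factors.

    loadings    : array of shape (n_variables, n_factors).
    factor_corr : optional (n_factors, n_factors) factor correlation
                  matrix; if omitted, factors are assumed uncorrelated.
    """
    L = np.asarray(loadings, dtype=float)
    if factor_corr is None:
        # Orthogonal case: row-wise sum of squared loadings.
        return (L ** 2).sum(axis=1)
    phi = np.asarray(factor_corr, dtype=float)
    # Oblique case: diagonal of L @ phi @ L.T, computed without building
    # the full matrix.
    return np.einsum("ij,jk,ik->i", L, phi, L)
```

A communality close to 1 means the variable is well represented by the extracted factors, while a value close to 0 means most of its variance is unique.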

Input Specifications

The dataset can contain multiple descriptive columns before the variables of interest. Specify the number of descriptive columns using --desc-columns. Rows with missing values are removed by default; select the mean or median option to impute missing data instead (be cautious when doing this).
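The difference between removing incomplete rows and imputing them can be illustrated with a short sketch. The helper handle_missing and its strategy argument are hypothetical names used only for this example. It assumes the descriptive columns have already been set aside and that the remaining variables are numeric.

```python
import numpy as np


def handle_missing(values, strategy="drop"):
    """Handle NaN entries in a 2-D array of variables (hypothetical helper).

    strategy="drop"   : remove every row containing at least one NaN.
    strategy="mean"   : replace NaN with the column mean.
    strategy="median" : replace NaN with the column median.
    """
    values = np.asarray(values, dtype=float)
    if strategy == "drop":
        return values[~np.isnan(values).any(axis=1)]
    if strategy == "mean":
        fill = np.nanmean(values, axis=0)
    elif strategy == "median":
        fill = np.nanmedian(values, axis=0)
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    # Column statistics broadcast across rows; only NaN cells are replaced.
    return np.where(np.isnan(values), fill, values)
```

Dropping rows reduces the sample size, whereas imputation keeps every subject but shrinks the variance of the imputed variables, which is one reason for the caution noted above.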

References

[1] Costa, V., & Sarmento, R. Confirmatory Factor Analysis. (https://arxiv.org/ftp/arxiv/papers/1905/1905.05598.pdf)

[2] Whether to use EFA or CFA to predict latent variables scores. (https://stats.stackexchange.com/questions/346499/whether-to-use-efa-or-cfa-to-predict-latent-variables-scores)

[3] Comparison of factor score estimation methods (https://github.com/gagnonanthony/NeuroStatX/pull/11)

Example Usage

ExploratoryFA --in-dataset df --id-column IDs --out-folder results_FA/ \
    --rotation promax --method ml --train_dataset_size 0.5 -v -f -s

Parameters

in_dataset : Input dataset to use in the factor analysis.

id_column : Name of the column containing the subject’s ID tag. Required for proper handling of IDs and merging multiple datasets.

desc_columns : Number of descriptive columns at the beginning of the dataset to exclude from statistics and descriptive tables.

out_folder : Path of the folder in which the results will be written. If not specified, the default folder ./ResultsEFA/ in the current directory is used.

rotation : Select the type of rotation to apply to the data.

method : Select the method for fitting the data.

nb_factors : Specify the number of factors to extract from the data. If not specified, the script will use Horn’s parallel analysis to determine the optimal number of factors.

train_dataset_size : Specify the proportion of the input dataset to use as the training dataset in the EFA (a value between 0 and 1).

random_state : Random seed for reproducibility.

verbose : If true, produce verbose output.

save_parameters : If true, save the parameters used in the analysis in a text file.

overwrite : If true, force overwriting of existing output files.
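As an illustration of how train_dataset_size and random_state interact, the sketch below splits row positions into a training and a test set. It is a minimal example under assumptions: the function split_rows is hypothetical, and the script's actual splitting routine may differ.

```python
import numpy as np


def split_rows(n_rows, train_dataset_size=0.5, random_state=1234):
    """Return (train_idx, test_idx) as disjoint, sorted row positions."""
    if not 0 <= train_dataset_size <= 1:
        raise ValueError("train_dataset_size must be between 0 and 1")
    # A fixed seed makes the shuffle, and therefore the split, reproducible.
    rng = np.random.default_rng(random_state)
    shuffled = rng.permutation(n_rows)
    n_train = int(round(n_rows * train_dataset_size))
    return np.sort(shuffled[:n_train]), np.sort(shuffled[n_train:])
```

Running the function twice with the same random_state yields the same partition, so the test rows exported for a later analysis always match the rows excluded from the EFA.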