What statistical test is used for categorical data?

From Wikipedia, the free encyclopedia

This is a list of statistical procedures that can be used for the analysis of categorical data, also known as data on the nominal scale or as categorical variables.

Contents

  • 1 General tests
  • 2 Binomial data
  • 3 2 × 2 tables
  • 4 Measures of association
  • 5 Categorical manifest variables as latent variable
  • 6 See also

General tests

  • Bowker's test of symmetry
  • Categorical distribution, general model
  • Chi-squared test
  • Cochran–Armitage test for trend
  • Cochran–Mantel–Haenszel statistics
  • Correspondence analysis
  • Cronbach's alpha
  • Diagnostic odds ratio
  • G-test
  • Generalized estimating equations
  • Generalized linear models
  • Krichevsky–Trofimov estimator
  • Kuder–Richardson Formula 20
  • Linear discriminant analysis
  • Multinomial distribution
  • Multinomial logit
  • Multinomial probit
  • Multiple correspondence analysis
  • Odds ratio
  • Poisson regression
  • Powered partial least squares discriminant analysis
  • Qualitative variation
  • Randomization test for goodness of fit
  • Relative risk
  • Stratified analysis
  • Tetrachoric correlation
  • Uncertainty coefficient
  • Wald test

Binomial data

  • Bernstein inequalities (probability theory)
  • Binomial regression
  • Binomial proportion confidence interval
  • Chebyshev's inequality
  • Chernoff bound
  • Gauss's inequality
  • Markov's inequality
  • Rule of succession
  • Rule of three (medicine)
  • Vysochanskiï–Petunin inequality
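
Two of the entries above, the binomial proportion confidence interval and the rule of three, reduce to short formulas. The sketch below is illustrative only: it assumes the Wilson score form of the interval (one of several in common use), and the function names and sample counts are hypothetical rather than taken from the list.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (hypothetical helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


def rule_of_three_upper(n: int) -> float:
    """Approximate 95% upper bound on an event rate when 0 events occur in n trials."""
    return 3 / n


# Made-up counts, for illustration only.
low, high = wilson_interval(successes=12, n=40)
print(f"Wilson 95% interval for 12/40: ({low:.3f}, {high:.3f})")
print(f"Rule of three upper bound, n=100: {rule_of_three_upper(100):.3f}")
```

The Wilson form is used here because it stays inside [0, 1] even for small samples; other intervals, such as Clopper–Pearson, make different trade-offs.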

2 × 2 tables

  • Chi-squared test
  • Diagnostic odds ratio
  • Fisher's exact test
  • G-test
  • Odds ratio
  • Relative risk
  • McNemar's test
  • Yates's correction for continuity
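
Several of the 2 × 2 procedures listed above can be computed directly from the four cell counts. The following is a minimal sketch using only the Python standard library; the function name, the row and column layout, and the counts are assumptions made for illustration.

```python
from math import erfc, sqrt


def two_by_two_summary(a: int, b: int, c: int, d: int) -> dict[str, float]:
    """Summarise the table [[a, b], [c, d]] (hypothetical helper).

    Assumed layout: rows are groups (e.g. exposed / unexposed),
    columns are outcome present / absent. All margins must be non-zero.
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    # Pearson chi-squared statistic for a 2x2 table (1 degree of freedom).
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / margins
    # Yates's continuity correction shrinks |ad - bc| by n/2 (not below zero).
    chi2_yates = n * max(abs(a * d - b * c) - n / 2, 0) ** 2 / margins
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return {
        "odds_ratio": odds_ratio,
        "relative_risk": relative_risk,
        "chi2": chi2,
        "p_value": erfc(sqrt(chi2 / 2)),
        "chi2_yates": chi2_yates,
        "p_value_yates": erfc(sqrt(chi2_yates / 2)),
    }


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
summary = two_by_two_summary(20, 80, 10, 90)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For small expected counts, Fisher's exact test is usually preferred over the chi-squared approximation; it is not implemented in this sketch.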

Measures of association

  • Aickin's α
  • Andres and Marzo's delta
  • Bangdiwala's B
  • Bennett, Alpert, and Goldstein’s S
  • Brennan and Prediger’s κ
  • Coefficient of colligation – Yule's Y
  • Coefficient of consistency
  • Coefficient of raw agreement
  • Conger’s Kappa
  • Contingency coefficient – Pearson's C
  • Cramér's V
  • Dice's coefficient
  • Fleiss' kappa
  • Goodman and Kruskal's lambda
  • Guilford’s G
  • Gwet’s AC1
  • Hanssen–Kuipers discriminant
  • Heidke skill score
  • Jaccard index
  • Janson and Vegelius’ C
  • Kappa statistics
  • Klecka's tau
  • Krippendorff's Alpha
  • Kuipers performance index
  • Matthews correlation coefficient
  • Phi coefficient
  • Press' Q
  • Renkonen similarity index
  • Prevalence adjusted bias adjusted kappa
  • Sakoda's adjusted Pearson's C
  • Scott's Pi
  • Sørensen similarity index
  • Stouffer’s Z
  • True skill statistic
  • Tschuprow's T
  • Tversky index
  • Von Eye's kappa
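
As one concrete instance of the measures above, Cramér's V rescales the chi-squared statistic of a contingency table to the range 0 to 1. The sketch below assumes the uncorrected form V = sqrt(χ² / (n · (k − 1))), where k is the smaller of the number of rows and columns; the function names and example tables are hypothetical.

```python
from math import sqrt


def chi_squared_statistic(table: list[list[int]]) -> float:
    """Pearson chi-squared statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table: list[list[int]]) -> float:
    """Cramér's V without bias correction (hypothetical helper)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return sqrt(chi_squared_statistic(table) / (n * (k - 1)))


# Made-up tables: perfect association and exact independence.
print(cramers_v([[10, 0], [0, 10]]))
print(cramers_v([[10, 10], [10, 10]]))
```

For a 2 × 2 table this value equals the absolute value of the phi coefficient, which also appears in the list.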

Categorical manifest variables as latent variable

  • Latent variable model
    • Item response theory
      • Rasch model
    • Latent class analysis

See also

  • Categorical distribution

What measure is best for categorical data?

The mode is the only measure of central tendency that is meaningful for nominal (unordered) categorical data. For ordinal data, where the categories have a natural order, the median can also be used.
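
A short sketch of the distinction, using only the Python standard library. The data, the variable names, and the assumed ordering of the sizes are invented for illustration.

```python
from collections import Counter
from statistics import median_low, multimode

# Hypothetical nominal data: the mode is the most frequent category.
colours = ["red", "blue", "red", "green", "blue", "red"]
print(Counter(colours).most_common(1))  # [('red', 3)]
modes = multimode(colours)              # returns every tied mode
print(modes)                            # ['red']

# Hypothetical ordinal data: a median needs an explicit category order.
size_order = ["S", "M", "L"]            # assumed ordering, smallest first
sizes = ["S", "M", "M", "L", "L", "L", "S"]
codes = [size_order.index(s) for s in sizes]
# median_low always returns an observed value, so it maps back to a category.
median_size = size_order[median_low(codes)]
print(median_size)                      # M
```

`multimode` is used rather than `mode` so that ties between categories are reported instead of silently resolved.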

Is t

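For a categorical variable, you can use a one-sample test of a proportion to check whether the share of observations in one category matches a hypothesized value. This is usually carried out as a z-test or an exact binomial test rather than a t-test, because the t-test assumes continuous measurements.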
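
The following sketch shows one way such a one-sample proportion test might be computed, both with the normal approximation and with an exact binomial calculation. The function names and the counts are hypothetical, and the two-sided exact p-value uses the "sum of outcomes no more likely than the observed one" convention, which is only one of several definitions.

```python
from math import comb, sqrt
from statistics import NormalDist


def proportion_z_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided z-test of H0: true proportion equals p0 (normal approximation)."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def binomial_exact_test(successes: int, n: int, p0: float) -> float:
    """Two-sided exact p-value: total probability of outcomes no more likely than the observed one."""
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not lost to floating-point noise.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-7)))


# Made-up example: 60 of 100 respondents choose category A; H0 says 50%.
z, p_approx = proportion_z_test(60, 100, 0.5)
p_exact = binomial_exact_test(60, 100, 0.5)
print(f"z = {z:.2f}, approximate p = {p_approx:.4f}, exact p = {p_exact:.4f}")
```

The exact version avoids the normal approximation and is the safer choice when the sample is small or p0 is close to 0 or 1.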

What techniques of statistical analysis are used for categorical data?

General tests:

  • Bowker's test of symmetry
  • Categorical distribution, general model
  • Chi-squared test
  • Cochran–Armitage test for trend
  • Cochran–Mantel–Haenszel statistics
  • Correspondence analysis
  • Cronbach's alpha
  • Diagnostic odds ratio

Is chi

Revised on November 10, 2022. Pearson's chi-square test is a statistical test for categorical data. It is used to determine whether the observed frequencies differ significantly from the frequencies expected under a hypothesis.
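
As a rough illustration of the comparison between observed and expected frequencies, the sketch below computes the goodness-of-fit statistic for a made-up three-category sample. The function name and the counts are hypothetical, and the p-value line relies on a closed form that holds only for 2 degrees of freedom (three categories); other table sizes would need a general chi-squared distribution function from a statistics library.

```python
from math import exp


def chi_squared_gof(observed: list[int], expected_probs: list[float]) -> float:
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over categories."""
    n = sum(observed)
    expected = [n * p for p in expected_probs]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up data: 100 observations over three categories, H0 says equal shares.
observed = [50, 30, 20]
stat = chi_squared_gof(observed, [1 / 3, 1 / 3, 1 / 3])

# Degrees of freedom = categories - 1 = 2.
# For exactly 2 degrees of freedom, P(X > x) = exp(-x / 2).
p_value = exp(-stat / 2)
print(f"chi-squared = {stat:.2f}, p = {p_value:.5f}")
```

A small p-value here suggests the observed counts are unlikely under the hypothesized shares; it does not indicate which category drives the difference.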