Using an extension of fragmentary data analysis methods to predict sex from fragments of bones

This project started when an anthropology student in my methods class asked me to serve on her advisory committee. She had bone measurements from individuals recovered in an archeological dig and needed to predict each individual’s sex. Some individuals could be sexed based on other characteristics. This sounds like a straightforward logistic regression problem, but it wasn’t, because many bone measurements were missing. Fragmentary data analysis is a new approach to predicting outcomes for all individuals, even those missing some covariate values. However, those methods rely on having many individuals with complete data. Mallory and Andrew’s data set had complete data for only 3 individuals.

My solution had four components:

  1. Identify a subset of known-sex individuals and a subset of bone measurements for which the data are complete.
  2. Use the complete data to fit logistic regressions, one bone at a time, to predict sex.
  3. Predict the logit probability that an individual was female from each bone available for that individual.
  4. Use model averaging to combine the logit probabilities into a single prediction.
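
Steps 3 and 4 can be illustrated with a minimal Python sketch. It assumes each single-bone model has already produced a logit for the probability of being female, and that a missing bone is recorded as None. The function name, the bone names, and the weighting scheme (equal weights unless others are supplied) are hypothetical choices made for illustration; the abstract does not say how the model-averaging weights were chosen.

```python
import math


def average_logits(bone_logits, weights=None):
    """Combine per-bone logit predictions for one individual.

    bone_logits: dict mapping bone name -> logit of P(female),
                 or None when that bone was not measured.
    weights:     optional dict mapping bone name -> nonnegative model
                 weight (hypothetical; equal weights are used if omitted).
    Returns (averaged logit, probability), or None if no bone is available.
    """
    # Keep only the bones actually measured for this individual.
    available = {b: v for b, v in bone_logits.items() if v is not None}
    if not available:
        return None
    if weights is None:
        weights = {b: 1.0 for b in available}
    # Renormalize the weights over the available bones only.
    total = sum(weights[b] for b in available)
    avg_logit = sum(weights[b] * available[b] for b in available) / total
    # Convert the averaged logit back to a probability.
    prob = 1.0 / (1.0 + math.exp(-avg_logit))
    return avg_logit, prob


# One individual with a missing tibia measurement.
print(average_logits({"femur": 2.0, "tibia": None, "humerus": 0.0}))
```

Averaging on the logit scale and renormalizing the weights over the bones that are present lets every individual receive a prediction, regardless of which measurements are missing.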

A Bayesian approach was needed because some bones showed complete separation: the measurements for that bone perfectly separated the two sexes, so the maximum likelihood estimates do not exist. The models were fit using functions from the rstanarm package. Predictions generally agreed with the sex of known-sex individuals. When they didn’t, the archeologists had reasons to suspect a mistake in the original sex determination.
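
To see why a prior helps under complete separation, here is a small Python sketch on made-up data. It is not the rstanarm analysis: rstanarm samples the full posterior, whereas this sketch only finds the posterior mode by gradient ascent. The data, the Normal(0, 2.5) priors on intercept and slope, and the function names are all assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_map_logistic(x, y, prior_sd=2.5, steps=5000, lr=0.05):
    """Posterior mode of a one-covariate logistic regression.

    Intercept a and slope b get independent Normal(0, prior_sd) priors
    (an assumed prior, chosen for illustration). Plain gradient ascent
    on the log posterior.
    """
    a, b = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the log prior.
        grad_a = -a / prior_sd ** 2
        grad_b = -b / prior_sd ** 2
        # Gradient of the Bernoulli log likelihood.
        for xi, yi in zip(x, y):
            resid = yi - sigmoid(a + b * xi)
            grad_a += resid
            grad_b += resid * xi
        a += lr * grad_a
        b += lr * grad_b
    return a, b


# Made-up, centered measurements: every value below 0 is male (0),
# every value above 0 is female (1), so the data are completely separated.
x = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
y = [0, 0, 0, 1, 1, 1]
a_hat, b_hat = fit_map_logistic(x, y)
print(a_hat, b_hat)
```

Without the prior terms, the likelihood for these data keeps increasing as the slope grows, so no finite estimate exists. The prior penalizes large slopes, and the fit settles at a finite value.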

Philip Dixon
Iowa State University