yoavram-lab/Cultural-Clusters-and-Networks

Repository for the paper "Cultural transmission, networks, and clusters among Austronesian-speaking peoples"

Abstract

With its linguistic and cultural diversity, Austronesia is important in the study of evolutionary forces that generate and maintain cultural variation. By analyzing publicly available datasets, we have identified four classes of cultural features in Austronesia and distinct clusters within each class. We hypothesized that there are differing modes of transmission and patterns of variation in these cultural classes and that geography alone would be insufficient to explain some of these patterns of variation. We detected relative differences in the verticality of transmission and distinct patterns of cultural variation in each cultural class. There is support for pulses and pauses in the Austronesian expansion, a west-to-east increase in isolation with explicable exceptions, and correspondence between linguistic and cultural outliers. Our results demonstrate how cultural transmission and patterns of variation can be analyzed using methods inspired by population genetics.

Software version, package, and license information

These files include all of the data necessary to generate the results of our paper, as well as useful Python and MATLAB functions. Data files are organized by file type. Be sure to check the beginning of each program file for its data and package dependencies. For the raw, unprocessed data and the feature encodings, download the original data from D-PLACE: https://github.com/D-PLACE

All code in this repository is available under a Creative Commons Attribution 4.0 International license. Authors wishing to modify this code for their own purposes should cite the version of this work archived in Zenodo. The MATLAB scripts in this repository use only modules from the base MATLAB installation and were written using release R2020b. We have tested the code with release R2023b and found no compatibility issues. The packages used in the Python notebooks are dirichlet, pandas, numpy, seaborn, matplotlib, py-pcha, MNE, panel, scipy, sklearn, and sys. We have tested this software with Python 3.10.14 using dirichlet 0.9, pandas 2.2.2, numpy 1.26.4, seaborn 0.13.2, matplotlib 3.9.2, py-pcha 0.1.3, MNE 1.7.1, panel 1.4.5, scipy 1.13.1, and sklearn 1.5.2, and found no compatibility issues; sys is part of the Python standard library and has no separate version.
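
To compare a local environment against the tested versions listed above, a minimal sketch along the following lines may help. The distribution names used as keys (for example scikit-learn for sklearn, mne for MNE) are assumptions about how each package is registered with pip, and the helper compare_versions is hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions reported in this README. Keys are ASSUMED pip distribution names,
# which can differ from import names (e.g. scikit-learn vs. sklearn).
TESTED_VERSIONS = {
    "dirichlet": "0.9",
    "pandas": "2.2.2",
    "numpy": "1.26.4",
    "seaborn": "0.13.2",
    "matplotlib": "3.9.2",
    "py-pcha": "0.1.3",
    "mne": "1.7.1",
    "panel": "1.4.5",
    "scipy": "1.13.1",
    "scikit-learn": "1.5.2",
}


def compare_versions(expected):
    """Return {name: (tested, installed)}; installed is None if not found."""
    report = {}
    for name, tested in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        report[name] = (tested, installed)
    return report


if __name__ == "__main__":
    for name, (tested, installed) in compare_versions(TESTED_VERSIONS).items():
        status = "matches" if installed == tested else "differs or missing"
        print(f"{name}: tested {tested}, installed {installed} ({status})")
```

A differing version does not necessarily mean the notebooks will fail; it only indicates that the combination has not been tested by the authors.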

Authors

Joshua C. Macdonald, Javier Blanco-Portillo, Marcus W. Feldman, and Yoav Ram

Corresponding authors contact

YR: yoavram-AT-tauex.tau.ac.il, MWF: mfeldman-AT-stanford.edu

I. Metadata (start here)

A. Map rows to cultures and other metadata

  • EAAustronesian.csv
  • Pulotu_idents.csv

B. Variables analyzed

  • VariablesAnalyzed.csv

II. Generated data files

A. Mean-centered and binarized data reconstructions

  • EA_VBPCA_Recon_Subsist.csv
  • EA_VBPCA_Recon_Kinship_Org.csv
  • PUL_VBPCA_Recon_Iso.csv
  • PUL_VBPCA_Recon_Rel.csv

B. Variational Bayesian Principal Components

  • Kinship_Org_PC.csv
  • Subsist_PC.csv
  • Rel_PC.csv
  • Iso_PC.csv

C. Archetypes

  • Kinship_Arch.csv
  • Subsist_Arch.csv
  • Rel_Arch.csv
  • Iso_Arch.csv

D. Pairwise distance matrices (for use in SplitsTree)

  • KinDistsAll.dist
  • SubDistsAll.dist
  • RelDistsAll.dist
  • IsoDistsAll.dist
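
For readers who want to build a comparable distance file from their own coordinates, the sketch below computes a square pairwise distance matrix and writes it as PHYLIP-style text. Both the Euclidean metric and the PHYLIP layout are assumptions made for illustration; the README does not state which metric or file layout the .dist files above use, and the function names are hypothetical.

```python
import math


def pairwise_euclidean(rows):
    """Square matrix of Euclidean distances between all pairs of rows."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def to_phylip(labels, dists):
    """Format a square distance matrix as PHYLIP-style text.

    ASSUMPTION: labels are padded to 10 characters, as in strict PHYLIP;
    longer labels are written unchanged and may need shortening.
    """
    lines = [str(len(labels))]
    for label, row in zip(labels, dists):
        values = " ".join(f"{d:.4f}" for d in row)
        lines.append(f"{label:<10} {values}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy coordinates standing in for per-culture component scores.
    coords = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
    names = ["CultureA", "CultureB", "CultureC"]
    print(to_phylip(names, pairwise_euclidean(coords)))
```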

E. Q-residuals and delta scores (output of SplitsTree)

  • delta_kin.csv
  • delta_sub.csv
  • delta_rel.csv
  • delta_iso.csv

F. Generated datasets for hypothesis testing (Austronesian Outliers)

  • EA_Kin_repli.csv
  • EA_Sub_repli.csv
  • Pul_Rel_repli.csv
  • Pul_Iso_repli.csv

H. Pruned linguistic phylogeny (see Gray et al. (2009), Science, for the full tree)

  • prunedtree_Pul_gray.phy
  • prunedtree_EA_gray.phy

I. Files for imputation comparison

i. Deletion indices

  • KinDelIdx.csv
  • SubDelIdx.csv
  • RelDelIdx.csv
  • IsoDelIdx.csv

ii. Metrics

  • VBPCA_metrics.csv
  • Kin_Accu.csv
  • Sub_Accu.csv
  • Rel_Accu.csv
  • Iso_Accu.csv
  • Kin_MSE.csv
  • Sub_MSE.csv
  • Rel_MSE.csv
  • Iso_MSE.csv
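
The general idea behind this kind of comparison is to hide known entries, impute them, and score the imputed values only at the hidden positions. The sketch below illustrates that idea under stated assumptions: the deleted positions are given as (row, column) pairs, and accuracy is computed after thresholding at 0.5. Neither the index format of the *DelIdx.csv files nor the threshold is confirmed by this README, and the function names are hypothetical.

```python
def masked_mse(truth, imputed, deleted):
    """Mean squared error over the deleted (row, col) positions only."""
    errors = [(truth[r][c] - imputed[r][c]) ** 2 for r, c in deleted]
    return sum(errors) / len(errors)


def masked_accuracy(truth, imputed, deleted, threshold=0.5):
    """Share of deleted entries whose binarized imputation matches the truth.

    ASSUMPTION: values at or above `threshold` count as 1, others as 0.
    """
    hits = sum(
        (imputed[r][c] >= threshold) == (truth[r][c] >= threshold)
        for r, c in deleted
    )
    return hits / len(deleted)


if __name__ == "__main__":
    truth = [[1, 0], [0, 1]]              # toy binary feature matrix
    imputed = [[0.9, 0.2], [0.6, 1.0]]    # toy continuous reconstruction
    deleted = [(0, 0), (1, 0)]            # positions hidden before imputation
    print("MSE:", masked_mse(truth, imputed, deleted))
    print("Accuracy:", masked_accuracy(truth, imputed, deleted))
```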

III. Program files

A. Python notebooks

Notebooks with analysis functions and demonstrations of their use

  • ArchetypalAnalysis.ipynb
  • Dirichlet.ipynb
  • MCAR_test.ipynb

B. MATLAB scripts

These require the VBPCA package (https://users.ics.aalto.fi/alexilin/software/) and the raw cultural data reshaped so that columns are features and rows are samples.

  • VBPCACulturalCheck.m
  • testStat.m
  • GetMatrix.m
  • DeleteBootstrapMCAR.m
  • OHSpecial.m
