4.6 Data dictionary

Lastly, we need to create a data dictionary that describes how our data is stored and organized. To do this, open any Markdown or text editor (e.g. VS Code, Typora, or Notepad), create a new file, and save it to the data subfolder of your project. This time we will save the file as _DATA-DICTIONARY.md.

A data dictionary helps others understand the meaning of each element in your datasets within the broader context of the project. Typically you will have an individual readme file for each dataset. This file should include:

  • Date when the data dictionary was created and who created it
  • Date when the data dictionary was last updated and who updated it
  • A description of the raw data file
  • A description of each variable for all datasets including data type, units, number of levels if categorical, and a description of variable levels where relevant
  • The full name and definition of each variable, because variable names are often abbreviated in datasets

To see an example data dictionary click here.
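
If you prefer not to start from a blank file, the items listed above can be drafted automatically and then completed by hand. The following is a minimal sketch in Python, not a prescribed method: the function names (`infer_type`, `draft_data_dictionary`), the sample columns (`site`, `temp_c`, `n_ind`), the author name, and the table layout are all hypothetical choices made for illustration. It guesses a coarse data type for each column and lists the levels of categorical variables, leaving the definitions and units as TODO entries for you to fill in.

```python
import csv
import io
from datetime import date


def infer_type(values):
    """Guess a coarse data type from a column's non-empty string values."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "unknown"
    # Try the strictest type first: every value must parse for it to apply.
    for caster, label in ((int, "integer"), (float, "numeric")):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "categorical/text"


def draft_data_dictionary(csv_text, dataset_name, author, created=None):
    """Return a Markdown skeleton of a data dictionary for one dataset."""
    created = created or date.today().isoformat()
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    lines = [
        f"# Data dictionary: {dataset_name}",
        "",
        f"- Created: {created} by {author}",
        "- Last updated: TODO (date and name)",
        "- Raw data file: TODO (describe the raw data file)",
        "",
        "| Variable | Full name and definition | Type | Units | Levels |",
        "|---|---|---|---|---|",
    ]
    for col in columns:
        # Missing cells come back as None from DictReader; treat them as empty.
        values = [(row[col] or "") for row in rows]
        dtype = infer_type(values)
        levels = ""
        if dtype == "categorical/text":
            unique = sorted({v for v in values if v.strip()})
            levels = f"{len(unique)}: " + ", ".join(unique)
        lines.append(f"| {col} | TODO | {dtype} | TODO | {levels} |")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, used only to show the shape of the output.
    sample = "site,temp_c,n_ind\nA,12.5,3\nB,14.0,7\nA,,2\n"
    print(draft_data_dictionary(sample, "survey.csv", "J. Doe"))
```

The returned text can be pasted into _DATA-DICTIONARY.md in the data subfolder. The inferred types are only guesses based on the values present, so check them against what you know about the data, and replace every TODO with the full variable name, definition, and units.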