So what does this look like?
Say you’re in BIOL 116 and you’re working on your research project. Your research project involves:
- Preparing the beginning of a manuscript that states your research question, hypothesis, and proposed methods.
- Conducting your experiment and recording your data. This process might span more than one day.
- Updating your manuscript, describing any changes that were made to your methods.
- Organizing the results of your experiment and interpreting and visualizing your data.
- Updating your manuscript to include your results and your interpretation of these results, including a visual interpretation.
- Completing your manuscript by discussing the importance and/or limitations of the experiment, and finally writing a conclusion.
In this scenario, we have 1 project, 1 manuscript, 1 dataset, and at least 1 figure. In addition, our dataset is constructed from data collected over several days, and our manuscript is revised 3 times before final submission.
So, first we will come up with a project name, and then we will date our data and version our figures and manuscript. And we’ll see how this evolves over the course of several days.
We create the following file:
- Pither_BIOL116RProject_Lab-report_V0.docx
This is our manuscript, so it will get a version, but no date. Looking at it, we quickly see that this is a lab report (Lab-report), authored by someone with the last name Pither (Pither) associated with a BIOL 116 Research Project (BIOL116RProject), that it has only just been created (V0), and that I should expect it to open in Microsoft Word (docx).
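To make the reading of this name concrete, here is a minimal Python sketch that splits a versioned filename into the parts described above. The pattern and the function name `parse_versioned_name` are assumptions made for illustration; they simply encode the Author_Project_Description_V#.ext layout used in this example.

```python
import re

# Hypothetical pattern for names shaped like Author_Project_Description_V0.ext
VERSIONED = re.compile(
    r"^(?P<author>[^_]+)_(?P<project>[^_]+)_(?P<description>[^_]+)"
    r"_V(?P<version>\d+)\.(?P<ext>\w+)$"
)


def parse_versioned_name(filename):
    """Return the parts of a versioned filename as a dict, or None if it does not fit."""
    match = VERSIONED.match(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["version"] = int(parts["version"])  # store the version as a number
    return parts


print(parse_versioned_name("Pither_BIOL116RProject_Lab-report_V0.docx"))
# {'author': 'Pither', 'project': 'BIOL116RProject', 'description': 'Lab-report', 'version': 0, 'ext': 'docx'}
```

Because underscores separate the parts and hyphens join words within a part, a name like Lab-report stays together as one description.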
Let’s imagine that I will put my research question, hypothesis, and methods in this document and submit it.
Today, I conducted the first part of our experiment and collected some data. Now we have the following files:
- Pither_20210921_BIOL116RProject_ph-data.csv
- Pither_BIOL116RProject_Lab-report_V0.docx
We have not changed our manuscript, so there’s no change to the name. However, we have collected some data. We can easily see who collected this data (Pither), when it was collected (September 21, 2021), that it’s connected to the BIOL 116 Research Project (BIOL116RProject), and that it’s data related to pH exposure. Lastly, it is formatted as comma-separated values (csv), which can be opened by any spreadsheet program or text editor.
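If you would rather generate dated names than type them, a small helper can assemble them. This is only a sketch: the function name `dated_data_name` and its parameters are hypothetical, and it assumes the Author_YYYYMMDD_Project_Description.ext layout shown above.

```python
from datetime import date


def dated_data_name(author, collected_on, project, description, ext="csv"):
    """Build a data filename with the collection date written as YYYYMMDD."""
    return f"{author}_{collected_on:%Y%m%d}_{project}_{description}.{ext}"


print(dated_data_name("Pither", date(2021, 9, 21), "BIOL116RProject", "ph-data"))
# Pither_20210921_BIOL116RProject_ph-data.csv
```

Writing the date as year, month, day with zero padding is what lets the files sort chronologically by name.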
I continue to collect data over the next several days, and here is what my files now look like:
- Pither_20210921_BIOL116RProject_ph-data.csv
- Pither_20210922_BIOL116RProject_ph-data.csv
- Pither_20210923_BIOL116RProject_ph-data.csv
- Pither_20210924_BIOL116RProject_ph-data.csv
- Pither_BIOL116RProject_Lab-report_V0.docx
Again, we have not changed our manuscript, so there’s no change to the name. However, we have collected some more data related to pH. We have one file for each day, organized from the earliest day of collection to the most recent.
Today, I did two things. I have no more data to collect, so I updated my manuscript to include any modifications made to my original methods section and then submitted it. I also started to analyze my data; to do this, I merged all my data into a single file for analysis. Now my files look like this:
- Pither_20210921_BIOL116RProject_ph-data.csv
- Pither_20210922_BIOL116RProject_ph-data.csv
- Pither_20210923_BIOL116RProject_ph-data.csv
- Pither_20210924_BIOL116RProject_ph-data.csv
- Pither_BIOL116RProject_Analysis_V0.xlsx
- Pither_BIOL116RProject_Lab-report_V0.docx
- Pither_BIOL116RProject_Lab-report_V1.docx
At this stage, I have my data collated into a single document where I can work on it without affecting the original data. We can see from the extension (xlsx) that this is an Excel file. I also now have a V1 of my manuscript, because I added a new section to it; when I submit it, my TA knows that the file marked V1 should contain this updated section.
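The merge in this example was done in Excel, but the same idea can be sketched in Python with only the standard library. The sketch below is an alternative, not the method used above: it writes a merged CSV rather than an xlsx file, the function name `merge_daily_csvs` is hypothetical, and it assumes every daily file starts with the same header row. The daily files are only read, so the original data stays untouched.

```python
import csv
from pathlib import Path


def merge_daily_csvs(folder, pattern, out_path):
    """Concatenate the matching CSV files into one file and return the row count.

    Assumes each file begins with an identical header row. A source_file
    column is added so every row can be traced back to its daily file.
    """
    # Sorting by name puts YYYYMMDD-dated files in chronological order.
    paths = sorted(Path(folder).glob(pattern))
    header = None
    rows_written = 0
    with open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file)
        for path in paths:
            with open(path, newline="") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header + ["source_file"])
                elif file_header != header:
                    raise ValueError(f"{path.name} has a different header")
                for row in reader:
                    writer.writerow(row + [path.name])
                    rows_written += 1
    return rows_written
```

A call might look like `merge_daily_csvs(".", "Pither_*_BIOL116RProject_ph-data.csv", "merged.csv")`, where `merged.csv` is a placeholder output name; in practice the output would follow the same naming convention as the other files.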
Today, I built two visualizations using the data in my analysis document, one linear regression and one bar plot of frequency counts; I saved these as images to insert into my manuscript. I then updated my manuscript to include my results and these two figures and submitted V2 of my manuscript. Now my files look like this:
- Pither_20210921_BIOL116RProject_ph-data.csv
- Pither_20210922_BIOL116RProject_ph-data.csv
- Pither_20210923_BIOL116RProject_ph-data.csv
- Pither_20210924_BIOL116RProject_ph-data.csv
- Pither_BIOL116RProject_Analysis_V0.xlsx
- Pither_BIOL116RProject_Figure-freq-plot_V0.png
- Pither_BIOL116RProject_Figure-linear-reg_V0.png
- Pither_BIOL116RProject_Lab-report_V0.docx
- Pither_BIOL116RProject_Lab-report_V1.docx
- Pither_BIOL116RProject_Lab-report_V2.docx
We can start to see the advantage of naming conventions here. I can easily see which files are which, what they contain, and what their timeline of development is. My computer also sorts them into meaningful categories: my data files are grouped together and sorted by date, while my analyses, figures, and manuscripts are each grouped and sorted by version.
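The sorting behaviour is easy to check. The short Python sketch below takes the ten filenames in a deliberately jumbled order and sorts them as plain text, which is roughly what a file browser does when sorting by name; the variable names are only illustrative.

```python
# The same ten files, listed in no particular order.
files = [
    "Pither_BIOL116RProject_Lab-report_V2.docx",
    "Pither_20210923_BIOL116RProject_ph-data.csv",
    "Pither_BIOL116RProject_Figure-linear-reg_V0.png",
    "Pither_BIOL116RProject_Lab-report_V0.docx",
    "Pither_20210921_BIOL116RProject_ph-data.csv",
    "Pither_BIOL116RProject_Analysis_V0.xlsx",
    "Pither_20210924_BIOL116RProject_ph-data.csv",
    "Pither_BIOL116RProject_Lab-report_V1.docx",
    "Pither_BIOL116RProject_Figure-freq-plot_V0.png",
    "Pither_20210922_BIOL116RProject_ph-data.csv",
]

# A plain alphabetical sort is enough: dated data files come first in
# chronological order, then analyses, figures, and lab reports by version.
ordered = sorted(files)
for name in ordered:
    print(name)
```

One caveat of plain text sorting is that V10 would sort before V2, so a project expecting ten or more versions may want to pad the number (V02, V10).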
I got feedback that my linear regression model had an error in it. So I fixed this today, added the new figure into my manuscript, and wrote the discussion and conclusion sections. I’m now ready to submit. Here is what my files look like now (I will be submitting V3 of my manuscript):
- Pither_20210921_BIOL116RProject_ph-data.csv
- Pither_20210922_BIOL116RProject_ph-data.csv
- Pither_20210923_BIOL116RProject_ph-data.csv
- Pither_20210924_BIOL116RProject_ph-data.csv
- Pither_BIOL116RProject_Analysis_V0.xlsx
- Pither_BIOL116RProject_Figure-freq-plot_V0.png
- Pither_BIOL116RProject_Figure-linear-reg_V0.png
- Pither_BIOL116RProject_Figure-linear-reg_V1.png
- Pither_BIOL116RProject_Lab-report_V0.docx
- Pither_BIOL116RProject_Lab-report_V1.docx
- Pither_BIOL116RProject_Lab-report_V2.docx
- Pither_BIOL116RProject_Lab-report_V3.docx
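With several versions of the same file side by side, it can help to pick out the most recent of each. The following Python sketch does this from the names alone; the function name `latest_versions` and the pattern are assumptions based on the _V# suffix used in this example, and files without a version (the dated data files) are skipped.

```python
import re

# Hypothetical pattern: everything before _V<number>.<extension> is the base name.
VERSION_SUFFIX = re.compile(r"^(?P<base>.+)_V(?P<version>\d+)\.(?P<ext>\w+)$")


def latest_versions(filenames):
    """Return the highest-numbered version of each versioned file, sorted by name."""
    latest = {}
    for name in filenames:
        match = VERSION_SUFFIX.match(name)
        if match is None:
            continue  # dated data files carry no version number
        key = (match["base"], match["ext"])
        version = int(match["version"])  # compare as numbers, so V10 > V2
        if key not in latest or version > latest[key][0]:
            latest[key] = (version, name)
    return sorted(name for _, name in latest.values())
```

Applied to the twelve files above, this would return the analysis file at V0, the frequency plot at V0, the linear regression figure at V1, and the lab report at V3, which is the version being submitted.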