What? Why? Who? How? #
What? #
Academic writing in general, and scientific and social scientific writing in particular, requires that we use empirical data to mount an argument. There are many conventions that have become sacrosanct that we both need to use and find quite helpful for fulfilling the task of providing convincing data and a tight argument, grounded in a long line of other worthwhile research. Citations, figures, tables, equations, proofs, diagrams and prose are all key to this endeavor. Most undergraduate and graduate students, post docs, early, mid and late career faculty do most of their writing in What You See is What You Get style programs like Word or Google Docs. They are easy to use and you already know how to use them but they were not made with scientists and social scientists in mind. Anyone who has had to format a bibliography, update their in-text citations, re-number figures, or properly format a table in a Word document will quickly realize the limitations of Microsoft’s flagship Office program.
In this document, I am going to give you an introduction to some of the most helpful tools for adopting a plain text tool kit for conducting and writing your research, which was designed with researchers in mind who need to leverage all these conventions in their work. It will require that you actually write code, install programs, manipulate text and images to compose documents and then typeset them. If you are familiar with a programming language like python or R, you will be well positioned to take advantage of these tools. If you are not familiar with how your computer works, you will have to do a little extra work but I will attempt to make the burden of learning how to use these tools a little less onerous. But what is plain text?
A plain text file is a file where the contents are human readable, and do not contain any information other than the alphanumeric characters in the file itself. Plain text is primarily distinguished from a formatted text file, such as a Word Document. For example, if you open a Word Document that has italicized, bolded, underlined, different type faces, and margins, the instructions for how that text is formatted is not immediately rendered to you, just the formatted text is. In a plain text file containing the same information, the styling conventions are readable to you as they are encoded in the text along with the prose. The distinction is not easy to immediately capture but will become clear very quickly, the more important question to address is why would you be interested in learning anything about plain text files and software.
Why? #
I have colleagues who really don’t understand my obsession with plaintext software. Most think it is the product of a desire for esoteric and complicated things, a severe and somtimes productive neurosis. They think that it is actually just harder to use Markdown,
\(\LaTeX\)
, and pandoc
, that Word is enough, and that when something is poorly formatted or not particularly pretty then the words and argument have to speak for themselves. I used to be willing to concede the point that just using Word is easier, but it actually isn’t easier. Yes, you already know how to use Word, but Word wasn’t designed with scientists and social scientists in mind, people who have to communicate very complicated things in concise and elegant ways, taking advantage of all the contemporary tools we have at our disposal.
It is in fact easier to do what I will teach you in this document. No one wants to emulate a document that looks like it was written in Word. What more, one day the people who read your work and take it seriously might just want to replicate it. Maybe they’ll ask you how a particular figure was made, or what process you adopted to generate a particular regression table. If you have a PDF file for the submitted article version called my_fabulous_paper_vFINAL_2.pdf
and 20 different Word documents all with slightly different versions of the same name you will lament the times you thought writing in Word was “easier”.
To give just a brief introduction to why you should make the effort to switch to plaintext tools for research and writing I will say there are three reasons:
-
They Are Free: All the things I will outline here are things that are free to use, have a wide user base, and keep your data and writing in an accessible and non-proprietary format.
-
They Encourage Good Practices: Working with plaintext tools makes you interact with your data and writing files in a way that encourages good practices at the expense of bad ones. I will say more about these shortly so you will have to have sufficient faith in the project to get through chapters 1 and 2 to be truly convinced but convinced you certainly shall be.
-
They Make Your Work Accessible: You’ll make documents that are reproducible and allow you to distribute the means to replicate your work with ease.
These tools are not just for quantitative scientists and social scientists. They were originally written for scientists who needed to use a lot of math, but they are now quite useful to scientists and social scientists of any methodological commitment, particularly considering the other important feature of a plain text workflow, its synergy with reproducibility and openness.
Trivia
You might be surprised to learn that I am actually a qualitative sociologist who specializes in interviews and ethnographic data collection.
Several other social scientists have noted the usefulness and superiority of plaintext tools for research and writing. My extended introduction is hardly a substitute for some canonical texts on the subject. I myself became an acolyte when I found Kieran Healy’s The Plain Person’s Guide to Plain Text Social Science in my second year of graduate school. I recommend reading it as it outlines in a far more parsimonious and convincing way the why of why make the switch to a set of software that is a little more onerous at first to use but rewarding once it is mastered. The document you are looking at here is superior in only one regard: it takes you step by step through some of the more dizzying steps in the process of switching to this tool-kit, and is geared toward the graduate student who knows how to use their computer but hasn’t made a lot of effort to learn the more archaic and powerful tools already available to them.
Who? #
Adopting this work flow and taking advantage of this book in particular to do so, will be most helpful to undergraduate and graduate students as well as post docs and early career professors or researchers. Especially those students who are taking courses in computation, statistics, or higher level methods courses will find this work helpful. But I do not think that the benefits of the plain text tool kit should be solely confined to scholars and students in the quantitative side of the social sciences as this software will help you not only get things done, and make things look beautiful, but it will also help enhance your skills in scientific communication. Along the way, I imagine you will also learn a few things about data and project management as well that are applicable across the sciences and social sciences.
How? #
For complete beginners, people who are proficient at using their computer but have never opened up a terminal, review Chapter 1, “The Terminal”, and Chapter 3, “Project Organization”, before going to Chapter 4, “Installation”. If you are familiar with how to use the terminal then you can briefly checkout Chapter 3 before going to Chapter 4 to install the software that we are going to be using. Installing the software is the first big jump to adopting plaintext software for your research and writing and it is a little complicated.
Once you have the software installed you will be ready to start learning how to actually use it in
Chapter 5, “Document Composition”,, where I will show you all the most important conventions in LaTeX and Markdown before showing you how to typeset your first document. After that you can proceed to
Chapter 6, “Dynamic Documents” to learn how to use dynamics documents and RMarkdown in particular. At that point you will be ready to learn how to keep a record of everything you’ve done and make it available to others using git
and Github in
Chapter 7, “Version Control”.