Things I wish I knew before I started my Ph.D.
(Note: this is a long post)
I was invited to give a talk at the doctoral symposium of this year’s International Symposium on Software Reliability Engineering (ISSRE 2017).
It was quite a new experience since it gave me the opportunity to reflect upon my Ph.D. one year after completion and being on the other side of the fence after attending two symposia as a student.
The organizers did an excellent job of mixing the points of view of a junior researcher (aka me) and a senior one, Lionel Briand. This contributed to the success of the event, which drew double the expected number of participants and was praised by the conference general chair during the welcome speech.
I summarized my presentation in this blog post. Note that I organized my ideas around the 3P principle (Product, Process, and People) but intentionally skipped the Product (i.e., the Ph.D. dissertation) since I cannot give advice on other people’s topics and do not want to bore anyone with my own 😉. Finally, before digging in, please note that what’s reported here is based on a sample of n = 1; most of the ideas come from personal realizations that I hope will apply to the reader, too.
Understanding the final goal of a Ph.D. is the first step towards understanding the process and, in my experience, it was important to scope the problem as a whole. To achieve this, there is one thing which every student should get clear in his or her mind: your Ph.D. does not have to be big.
This post from Prof. Matt Might conveys the point. In visual terms, you need to push against the boundary of the red circle, in a very focused manner, until it gives way and you expand existing knowledge by just a little bit.
Setting the bar too high (i.e., aiming to make a big bump in the circle) can be detrimental for (at least) two reasons:
- you will have trouble understanding when your thesis is done
- you will feel you are not going to make it.
On the other hand, this is not an excuse for slacking off and working on a small (or, even worse, irrelevant) problem. I believe that one skill to develop as a scientist is not only the capacity to identify a problem but also knowing when to stop working on it and communicate the results. Lionel also covered the issue of selecting a worthy problem for a Ph.D., which generated some interesting discussion; see his slides here.
The next logical step in the process is to understand where the boundary lies. My suggestion here is to invest time in studying the literature. My process is to dedicate a single day of the week to deliberately reading the key papers in your field of research. These are not only the must-know-by-heart, highly cited papers but also recently published ones in areas close to your sub-field. For example, the area of research for my Ph.D. was unit testing and test-driven development, but I would also read papers about Agile software development in general, to be up to date with the larger scope of the topic.
To keep up with the literature, my approach is to set up one or more Google Scholar Alerts to get a digest of newly published papers that match a Scholar search string (or a specific author). What I mean by deliberate reading is the process of going through the paper, annotating it, and collating notes about:
- Motivation—Does the same motivation apply to my research? If yes, what is my specific perspective?
- Relevant literature—Identify the key citations in the paper (usually the ones that the authors build on, contrast with, or compare against) and snowball them to identify other relevant papers.
- Methodology—Can the methodology be applied to your problem? If that is the case, make sure you understand it and try to replicate it.
- Limitations—Do the same limitations apply to your study? Can your study be motivated by addressing one (or more) of such limitations? If that’s the case, be sure to understand them properly.
- Discussion—A good paper is characterised by a thorough discussion section (which is also the most difficult to write). Pay extra attention when reading this section to learn how to write your discussion section (not regarding contents, of course, but regarding form).
While you are busy figuring out where the boundary is, you should also stock up on the tools needed to push it. Usually, these are obtained by attending courses (for those who need to earn credits as part of their doctoral training), which I divide into two categories:
- Hands-on—these are the practical skills, specific to your niche of research, which you usually learn in engineering school or advanced courses.
- Heads-on—these courses teach less concrete concepts (like philosophy of science or research methods) but are very beneficial for writing the dissertation. Having a good understanding of research methods (in general, not only the one used in your research) and their philosophical assumptions helps to justify your choices of methods, sampling, addressing threats to validity, etc. Remember that you are a scientist first and an engineer later.
Writing is a big part of being a scientist, so much so that you should consider yourself a writer (or so I have been told 🙂). One skill a scientific writer should definitely possess is cogency. From the New Oxford American Dictionary:
cogency | ˈkəʊdʒ(ə)nsi | noun [mass noun] the quality of being clear, logical, and convincing; lucidity.
One of the most powerful pieces of advice I came across is to start writing your papers early, earlier than you think. I got this idea from a presentation by Simon Peyton Jones of Microsoft Research Cambridge. I would start writing a paper before doing the study itself. For example, I would start by writing the introduction, then jot down the background part (which forces you to understand the key references), the methodology and design I envision for the study, and the possible limitations. This approach dramatically helps to crystallize the idea of the paper.
Many times you want to cram as much stuff as possible into a single paper. In my experience, this is detrimental both to you as a writer, since you need to juggle different lines of thought (and eventually try to connect them), and to the readers (e.g., your reviewers), who can get lost in the manuscript without understanding the point you are trying to make.
Deciding on the main idea early on helps you to communicate it more clearly, and it also lets you focus and improve on it while doing the study. However, as I pointed out in my presentation, this is no excuse for the least publishable unit, i.e., trying to publish thin slices of findings to increase one’s publication count (sadly, a large part of career advancement in academia relies on bean counting).
In the quest for properly communicating the results of my work, there were two clear fallacies I fell for:
- Believing that I could formally communicate the rationale for my proposed approach.
- Believing that since I spent most of the time on a particular aspect of the work, it should be the focus of the paper.
I guess that the above originates from a naïve view of science in which everyone is supposed to understand everything given the least amount of words. According to the above definition of cogency, (scientific) writing can be recognized as an act of persuasion, an aspect that is easy to overlook. As Jones says in his presentation, you should spoon-feed the intuition of your ideas to the readers before formalizing them. Moreover, the best way to convey that intuition is through examples, which get the reader to understand the idea “as if you were jotting it on the whiteboard the first time it came to your mind.”
Regarding the second point, the temptation is to go over the whole journey that took you to write the paper, thinking that the time invested in a specific activity (e.g., deciding on the experiment design) should be proportional to the number of pages dedicated to reporting it. Do not assume that the readers want to know every detail; instead, choose the most direct path to the idea.
As a fresh Ph.D. student, I found myself naïvely believing that I (i.e., my work) was right while others were wrong. I soon discovered that scientific studies are like a blanket too short to cover all the parts of the bed and that there is no such thing as the perfect scientific study, but rather a study that strikes a good trade-off. Nowadays, I put much effort into identifying the shortcomings of my work and communicating them upfront to the readers. (It turns out that reviewers appreciate such effort, and in several instances they praised my “threats to validity” section.)
Alongside this, I figured that giving praise to other researchers is important too. Here, I follow another piece of advice from Jones and make an effort to acknowledge others’ work not only in the paper but also on a personal level. I would write to the authors of a paper asking for feedback on a paragraph where I cite them.
Related to this, I have also figured out the importance of reviews. Initially, you might be upset when getting a rejection and discard the comments as the work of an imbecile who didn’t understand your work. The important realization here is that if someone could not understand you, 80% of the time you are the culprit as you should have better explained your ideas in the first place (I leave the remaining 20% to an incompetent reviewer, not by his or her choice but due to how the peer-reviewing system currently works, at least in software engineering research). Note that your gratefulness for a good review (even if for a rejection) will increase with your understanding of how much effort goes into (properly) reviewing a paper.
Finally, it is important to understand that your writings (and presentations) will always have several audiences. These can be organized as in the picture below.
The first readers of your paper (after you and your inner circle of colleagues and, hopefully, supervisors) will be the reviewers. Mainly, they will be interested in the Form of the paper and check whether it fulfills the requirements for publication: is the paper coherent? Is it properly written? The reviewers will also check the Information contained in the paper. These are the bits that make your work worth communicating to the scientific community (e.g., novelty, validated evidence). Do not forget that reviewers are researchers too, and one of the reasons why they agreed to review your paper (without getting paid for it) is that they found it interesting (in the best-case scenario by reading the title or abstract, but hopefully not at random) and want to get early access to the Content.
In the bottom part, your audience will focus on specific content. Other researchers are unlikely to read the whole paper; mostly, they will look for easy access to content relevant to their research. They will hunt for graphics, tables, algorithms, and so on that quickly communicate Content. Similarly, practitioners are interested in another kind of Content, the kind they could apply in their work (e.g., a new development process, or a piece of software). The important takeaway is that you should make clear where each audience can get what they’re looking for in the paper. This could be as easy as explicitly stating where the information is and using an appropriate typeface.
Science is a network of people working together to expand human knowledge.
During your Ph.D., the most important person in this network is your supervisor.
For three/four years you will be academically married to your supervisor. Moreover, as in every good marriage, good communication is pivotal. The contents will be about tracking your progress and exchanging knowledge on your research topic—you get feedback from your supervisor, but after a while you will be the more knowledgeable one, needing to communicate this knowledge back to him/her.
The process I set up with my supervisor included communication at different levels and was adjusted according to the progress towards the defence. As suggested by Daniel M. Berry in his famous “How to finish that damn Ph.D.” talk, I preferred e-mail over face-to-face meetings to communicate and discuss important decisions. The rationale is that writing forces you to distill your thoughts and leaves a searchable trace for when you later need to justify a decision (for example, when writing a paper or the dissertation).
We also had weekly (or bi-weekly) 10–15 minute, Agile-style stand-ups to inform my supervisor about what I had done since the last one, what went well, what went badly, and my plan for improvement the next week.
Finally, we had “traditional” meetings (i.e., everything that would take more than 30 minutes) for specific occasions, such as an important deadline, paper reviews, teaching, etc.
One crucial thing about implementing this system, and in general any communication plan with your supervisor, is to make it easier for him/her to manage his/her time. For e-mails, I would allow one week for an answer before going into panic mode. Of course, you should experiment a bit with the time frame and agree upon one given your supervisor’s level of busyness.
Sometimes, during your career as a Ph.D. student, you will do well, other times very well, but sometimes you will do badly, too. When you get a bad assessment from your supervisor, however bad it may be, remember that business is business. Your work is being assessed, not you, so you shouldn’t take it personally (in case you constantly feel that you, as a person, are being assessed, then I recommend changing supervisor).
As I mentioned earlier in this post, being able to understand when you are done is one skill that you need to acquire as a researcher. Unlike in undergraduate school, do not expect that anyone, including your supervisor, will tell you when you are done. Ideally, you should be able to demonstrate to your supervisor that your work is self-contained and meets the standard for publication. Just do not wait to be told!
You, as a Ph.D. student, are the second most important person in your journey. Unfortunately, in recent years, there has been a series of reports about mental health problems among Ph.D. students. Perfectionism, the incapacity to admit ignorance, and difficulty in accepting failure are some of the causes that lead to such issues. As I have already mentioned, understanding when your work is done rather than making it perfect, being able to say “I don’t know,” asking for help from your supervisor, fellow Ph.D. students, or the university counseling service, and understanding what is being assessed are survival skills in grad school.
One of the symptoms I have experienced, and which seems to be particularly widespread, is imposter syndrome: the feeling (or, for some, the strong belief) of not being good at what you do despite evidence showing otherwise. I guess it is quite normal to go through a period, perhaps once you are in the thesis writing/defence preparation phase, when you see yourself as a fraud who does not deserve what you have achieved. The typical reaction is to attribute your success to luck or to your capacity to deceive other people into thinking you are competent. To some extent, showing doubt is a sign of maturity (especially for a scientist), but this could easily backfire and demotivate you.
As you are approaching the final year of your Ph.D. and the defence is in sight, you should give yourself a nudge and remember that there is life after the Ph.D. If you want to pursue an academic career, be aware that the hiring process (at least for faculty positions) is slow, really, really slow. Do not be surprised if it takes eight or nine months to hear something back from a university hiring committee. The advice I was given (and that I did not follow) was to start preparing an “application package” and apply for jobs approximately one year before the expected defence.
The next important people in your Ph.D. journey are the other researchers in your field. It is important, in my opinion, to meet and exchange ideas with others face-to-face and the best way to do this is by attending c̶o̶n̶f̶e̶r̶e̶n̶c̶e̶s̶ doctoral symposia. Many new Ph.Ds, when networking at conferences, want to talk to the big names in their field (which is ok) but under-estimate the importance of building a network of peers. A doctoral symposium is the perfect setting for this; you have fellow Ph.D. students who are working in your same field and who can give you much more attention than a senior researcher. This should be the first layer where you try to establish your collaborations (and also make friends 🙂). On the other hand, networking at conferences is especially good when you are towards the end, and you are looking for a job, or for thesis reviewers.
Do not forget that you do not live in a bubble during your Ph.D. I recommend interacting with other transversal communities, that is, circles of people who are not directly in your research field (or not in research at all) but are relevant to your research. In my case, I attend(ed) several events organized by the local Agile and start-up communities to get some inspiration for my research from practitioners, “validate” my findings by talking to them, and recruit participants for my experiments. In my experience, it is also beneficial to interact with Ph.D. students from other departments to get a different perspective on how research is done in their field.
Many exciting conferences are going on throughout the year, and you cannot possibly attend all of them. Here social media comes to the rescue. Twitter seems to be quite widespread in the (software engineering) research community, so use it (wisely) to ask quick questions and give credit to others, but also to advertise your research. My approach to Twitter is to have one list for researchers I often meet in person, one list for other researchers divided according to their sub-field, and one list for hashtags of the most important conferences. For longer content, a research blog sounded like a good idea to me but requires quite some motivation (which I do not seem to have, given the poor status of this blog).
You can find the annotated version of the presentation here. Hope it helps.