Jeremy Reilly


The Four F’s

I teach a 3rd year undergraduate course on data science. It is not your typical course of lectures, practicals and tutorials. Lectures are few and far between and it is more about the practice of data science than it is about book-learning. After a one-week data science bootcamp, where we work together on a sample project from start to finish, students are supported through a data science project of their own creation. It can be a shock to the system because they are more used to pre-defined projects, and following a precise specification, so the idea of defining their own from scratch is as daunting as it is exciting.

In my experience the greatest challenge they confront is not the coding, but rather the definition of the research question(s) they wish to pursue. Getting this right helps to shape the entire project, and usually leads to a good outcome, but getting it wrong can leave the student in a never-ending struggle for clarity and purpose, which rarely results in a stand-out project.

In this article, I try to capture some of the advice I give to my students early on: what to look for in a research topic, how to think about their research objective, and how to translate this into one or more appropriate research questions that will serve them well during their project. Although I have my 3rd years in mind as I write the article, I don’t think the advice is at all limited to them. Certainly, I frequently counsel my graduate students and other researchers on similar matters, as they confront many of the same challenges when establishing their research objectives. As such, I think that this article should be of interest to anyone working on data-driven projects or tasks.

It’s always nice to start with a catchy checklist. In marketing, they have The Four P’s (Product, Price, Promotion, Place). The best I can come up with is the Four F’s – Fascinating, Focused, Falsifiable, Feasible – okay so it doesn’t exactly roll off the tongue, but it does a good job of capturing what is important to think about when defining your research and formulating a research question.


Try to find a topic that fascinates you. If you don’t care about your chosen topic then nobody else will. Plus, you won’t find the work satisfying, you won’t enjoy doing it, and the result you produce will be mediocre at best. It doesn’t have to be a topic that is so important and compelling that everyone agrees that it needs to be addressed. Truth be told, such topics are few and far between anyway, and pursuing the obvious candidates runs the risk of your work being considered derivative!

I am forever encouraging my students to pursue their own niche interests. After all, these are the topics that excite them, and if they are excited then chances are others will be too. Pursuing your own interests also brings the added advantage of a topic in which you have some expertise, which can give you a valuable head-start. It usually makes it easier for you to intuit an interesting research question too, and your intuitions may be good enough to tell you whether your findings are reasonable at an earlier stage in the process than might otherwise be the case. This can give you time to adjust your research or replan as necessary.

Whatever topic you choose, take the time early on to reflect on why it is interesting to you and who else might be interested in it. This will help you to better appreciate your own motivations and will enable you to better motivate your work for others. Rest assured, even if you choose a very niche topic you will find that there are others who are interested in it; that’s the nature of our connected world. Your passion for it will shine through, which can be a catalyst for capturing the attention of others.

As an example, a few years ago I became more interested in running and started exploring marathon data collected online. It was something I was interested in for myself but I quickly found that the questions I was exploring were of interest to others too, and what started as a personal project outside of my core research has since emerged as one of the main research themes in my current work, resulting in numerous blog posts, scientific articles and even a few media invitations. I pursued the work because it was of interest to me and I wanted to know the answers to the questions I was asking. But when I talked about this work, when I wrote about it, others became interested too, revealing new opportunities and more research questions.

