Advice to an aspiring Data Analyst
Yesterday, we received an inspiring message from a young, aspiring data analyst named L. (full name withheld):
“Hello.
I am a Year 12 student at Didcot Sixth Form who are requiring us to gain work experience linked to the career pathway we wish to pursue, in my case is data analysis or areas surrounding my A Level subjects of Mathematics, Computer Science and Physics.
As you specialise in Data Science and analysis, I would like to enquire if you take in 16+ for work experience as I believe this will nurture my understanding of real life application of this chosen field of specialty.
The work experience week for my Sixth form is the 24th to 28th March 2025. Thank you and hope to hear from you.
L.”
I felt compelled to respond immediately. As I got carried away with my reply, I realized it might be valuable to share my thoughts more broadly. After all, there might be other 16-year-olds (or even older learners) searching for similar guidance online.
Below is my response, which I hope will serve as a useful starting point for anyone interested in pursuing a career in data analysis.
Dear L.,
I received your email, and I must say I was genuinely touched by it. When I was your age, I too sought help and guidance. Back in the 80s there was no public Internet, so we had to rely on local assistance, in-person training, and support.
Although we do not offer training for sixth formers, I feel it is my duty to respond to you and try to assist you in your quest to become involved in data analysis. I owe it not just to you but also to the people who helped me when I was your age.
Your A-levels
Data are the outcomes of physical phenomena, actions, transactions, and more. To analyze data thoroughly, you first need a solid understanding of how they are generated. You may find that the data you have been asked to analyze are the wrong ones, or that better data exist for your purpose. To that end, I suggest that for your A-levels you consider taking, and giving emphasis to, the following:
A. Physics
B. Applied Maths / Mathematics
C. Pure Maths / Further Mathematics
D. Statistics
E. Computer Science
- Physics: You will need Physics to better understand the world, the causality of things, and to apply your curiosity effectively.
- Applied Maths: This will help you tackle some of the difficult problems you will encounter in Physics.
- Pure Maths: This supports your understanding of both Applied Maths and Physics.
- Statistics: While statistics can sometimes be confusing (after all, it’s the science that can conclude we each ate half a chicken if you ate one and I ate none), it is an essential language for data analysis. Learning the language of statistics will help you communicate effectively with other data analysts.
- Computer Science: This subject will help you translate solutions into practical applications. Often, a solution may be computationally impossible to implement in a given timeframe. For example, consider electoral predictions: if elections are in 30 days and your solution takes 90 days to compute, it would be of no use. Computer Science teaches you how to retrieve, process, analyze, and present data using the best tools available.
Even if you decide not to write code yourself, you’ll need to communicate your analysis plan effectively with Computer Scientists. At times, you may find it more efficient to handle some tasks yourself. For example, when I was working on my initial electoral predictions, I learned CUDA (NVIDIA’s platform for programming its GPUs) to work with Oxford’s Emerald Supercomputer. Projects also often require confidentiality, which makes it impractical to share the details with others and is one more reason to be able to do the work yourself.
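To make the half-chicken joke from the Statistics point concrete, here is a minimal Python sketch. The names `chickens_eaten` and `average` are made up for illustration; the point is only that an average can be perfectly correct and still describe nobody.

```python
from statistics import mean

# Chickens eaten by each person: you ate one, I ate none.
chickens_eaten = {"you": 1, "me": 0}

# The arithmetic mean: (1 + 0) / 2.
average = mean(list(chickens_eaten.values()))
print(f"Average chickens per person: {average}")

# The average hides who actually ate; the individual values do not.
for person, count in chickens_eaten.items():
    print(f"{person} ate {count}")
```

A good analyst reports the average but also looks at the individual values behind it.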
Out-of-school study
Please visit this link to download one of the best books on applied Data Analysis using state-of-the-art databases: Graph Databases. At Oxford, we have used graph databases since around 2010, but they were largely unknown to most businesses until recently. In my native Greece, for example, they are not even taught at universities. By studying them, you will gain a significant advantage. The book and the Neo4j demos will help you understand what cutting-edge Data Analysis involves.
I happen to know the author of the book, and I can assure you he knows what he is talking about. Follow the instructions to download the Neo4j desktop, which comes with its own sample application, and start experimenting.
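Before you install anything, it may help to see the kind of question a graph makes easy. The sketch below is plain Python, not Neo4j and not its query language; the `knows` data and the `friends_of_friends` helper are hypothetical and only illustrate the idea of following relationships from one record to the next.

```python
# A tiny, made-up "who knows whom" graph stored as an adjacency mapping.
knows = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Dave"},
    "Carol": {"Dave", "Eve"},
    "Dave": set(),
    "Eve": set(),
}


def friends_of_friends(graph, person):
    """Return people two hops away from `person`, excluding direct contacts."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        # Follow each relationship one step further.
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


result = sorted(friends_of_friends(knows, "Alice"))
print(result)
```

A graph database is built to answer this sort of "follow the connections" question directly, even when there are millions of nodes instead of five.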
Although Graph Databases are a state-of-the-art tool, you will still need to familiarize yourself with the language most data analysts use to query their data: SQL. It is one of the simplest languages to start with, yet deceptively difficult to master. Despite being over 50 years old, its core has remained largely unchanged, which is a testament to its value. Also, learn regular expressions, which have similarly stood the test of time.
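As a small taste of regular expressions, here is a minimal sketch using Python's built-in `re` module. The sample lines and the name `date_pattern` are made up for illustration; the pattern simply picks ISO-style dates out of free text.

```python
import re

# Hypothetical free-text lines; the order numbers and dates are invented.
lines = [
    "Order 1001 shipped on 2025-03-24",
    "Order 1002 delayed, no date yet",
    "Order 1003 shipped on 2025-03-28",
]

# Four digits, a dash, two digits, a dash, two digits.
date_pattern = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Keep the first date found on each line, skipping lines without one.
dates = [m.group(0) for line in lines if (m := date_pattern.search(line))]
print(dates)
```

The same pattern syntax works, with small variations, in most programming languages and many database tools.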
To learn SQL, I recommend downloading Microsoft’s SQL Server 2022 (Developer Edition is free) along with SQL Server Management Studio (SSMS) to manage it. Install their sample databases, such as Northwind (a simulation of a trading company’s fundamental data) and AdventureWorks.
If you are using a Mac, you can still run Microsoft’s SQL Server (for example, in a Linux container using Docker), but you will need other tools to manage it, because SQL Server Management Studio runs only on Windows. I recommend Microsoft SQL Server because their documentation, onboarding, and training material are superior. Unfortunately, MariaDB and MySQL, while excellent for many of my projects, fall short in this respect.
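If you would like to try a first query before installing anything, the sketch below uses SQLite, which ships with Python, in place of SQL Server. The `orders` table and its rows are hypothetical and are not taken from Northwind or AdventureWorks; the `SELECT ... GROUP BY` statement itself is standard SQL of the kind you would also write against SQL Server.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Create a small, made-up orders table and fill it with three rows.
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount   REAL NOT NULL
    );
    INSERT INTO orders (customer, amount) VALUES
        ('Alfa', 120.0),
        ('Beta', 80.0),
        ('Alfa', 30.0);
""")

# Count the orders and total the amounts for each customer.
query = """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC;
"""
rows = conn.execute(query).fetchall()
for customer, n_orders, total in rows:
    print(customer, n_orders, total)

conn.close()
```

Once this feels comfortable, the sample databases mentioned above offer far richer tables to practise on.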
Closing thoughts
That’s all from me for now.
I truly wish you success. In two years, I hope you will be admitted to your local university :), so I can offer you part-time work while you study. When it comes to choosing a college, please consider applying to Exeter College.
Carpe Diem!
Best regards,
Dimitris Vayenas
Founder of Oxford Metadata