Pandas is a Python library used extensively for data analysis and manipulation. Recently I’ve been using pandas with large DataFrames (>50M rows), and through the PyDataUK May Talks and various Stack Overflow threads I have discovered several tips that have been incredibly useful in optimising my analysis.
This tutorial is part 1 of a series. It introduces pandas and some of its useful features by exploring the Palmer Penguins dataset.
In this article, we will go through:
Do you want to learn to code but are unsure where to start? You’ve come to the right place. I’ll walk you through the reasons to learn Python to help you decide if it’s right for you, and provide a list of resources for learning the basics, setting up your environment, finding projects to get started with and knowing where to go for help and support throughout your coding journey.
Python is a versatile programming language that can be used for software development, data analysis, machine learning and even web development! …
In 2020, my focus is on reading more non-fiction, and whilst reading Atomic Habits by James Clear I summarised each chapter in one sentence. Taking notes in this way lets me easily digest and review the content and come back to it again and again.
The book is incredibly well written, with accompanying diagrams and examples to demonstrate important concepts, and at the end of each chapter James Clear summarises the key points.
If you are looking to create new habits or break bad ones, check out the chapter-by-chapter breakdown below for a brief insight…
Over the past few months, I’ve started to use Notion for my digital organisation. Whilst I like to think that I’m quite organised and tidy, I must admit my digital organisation is not up to scratch. On discovering Notion I fully immersed myself in tutorials and articles and tried different templates to find a setup that works for me. Since October I’ve been using this tool on a daily basis to monitor and track my tasks, projects and fitness, as well as to save documents and articles.
This post can also be found at https://kaparker.com/posts/technomads-datascience
On Monday 25th November we held our monthly TechNomads meetup where we focussed on Data Science. Joseph Allen & John Carney, Data Scientists based in Manchester, visited us at Liverpool Science Park to give an introduction to Data Science followed by a Q&A session. They provided fascinating insight drawn from their experience and shared plenty of tips and advice with us. The audience was made up of a mix of people working with data, including researchers, students and analysts as well as developers, aspiring data scientists and people working on side projects.
During a recent NLP project, I came across an article where word clouds were created in the shape of US Presidents using words from their inauguration speeches. Whilst I had used word clouds to visualise the most frequent words in a document, I’d not considered using them with a mask to represent the topic or subject. This got me thinking...
Several months ago, shortly before the final series of Game of Thrones and having just rewatched seasons 1–7, I was eagerly awaiting the finale, so much so that I went looking for any Game of Thrones data I…
There are different ways of scraping web pages using Python. In my previous article, I gave an introduction to web scraping by using the libraries:
selenium with Firefox web driver
One of the first tasks I was given in my job as a Data Scientist involved web scraping. Gathering data from websites using code was a completely alien concept to me at the time, but the web is one of the most logical and easily accessible sources of data. After a few attempts, web scraping became second nature to me and is now one of the many skills that I use almost daily.
In this tutorial I will go through a simple example of how to scrape a website to gather data on the top 100 companies in 2018 from Fast…