Nine Tools I Wish I Mastered before My PhD in Machine Learning

Despite its monumental role in advancing technology, academia is often ignorant of industrial achievements. By the end of my PhD I realised that there is a myriad of great auxiliary tools, overlooked…

Stop Using CSVs for Storage — This File Format Is 150 Times Faster

CSV is not the only data storage format out there. In fact, it’s likely the last one you should consider. If you don’t plan to edit the saved data manually, you’re wasting both time and money by…

Do You Use Apply in Pandas? There is a 600x Faster Way

By leveraging NumPy vectorization and appropriate data types, you can speed up complex computations in Pandas by a factor of up to 600.
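As a rough illustration of the idea in this teaser, the sketch below computes the same column twice: once with a row-wise `apply` and once with a vectorized NumPy expression. The DataFrame and its column names (`price`, `qty`) are made up for the example, and the actual speedup depends heavily on the data size and the computation, so no timing is claimed here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration only.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Row-wise apply: the lambda is called once per row in Python.
total_apply = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# Vectorized: a single multiplication over the underlying NumPy arrays.
total_vec = pd.Series(
    df["price"].to_numpy() * df["qty"].to_numpy(),
    index=df.index,
)

# Both approaches produce the same values.
print(np.allclose(total_apply, total_vec))
```

The vectorized version avoids the per-row Python function call, which is where most of the time in `apply(..., axis=1)` usually goes.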

No, [] And list() Are Different In Python

Python literals such as [] and {} initialise a list or dictionary faster than list() and dict(), because the calls must first look the name up through the local, enclosing, global, and built-in scopes, as the bytecode shows.
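The bytecode difference mentioned in this teaser can be inspected with the standard `dis` module. The following is a minimal sketch, not the linked article's own code; the helper name `opnames` is made up for the example, and the exact instruction names vary between Python versions.

```python
import dis

def opnames(source):
    """Return the bytecode instruction names for a Python expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

# The literal is built directly by a BUILD_LIST instruction.
literal_ops = opnames("[]")

# The call first loads the name `list` (a scope lookup), then calls it.
call_ops = opnames("list()")

print(literal_ops)
print(call_ops)
```

On the CPython versions I am aware of, the first list contains `BUILD_LIST` and no call instruction, while the second contains a `LOAD_NAME` followed by a call instruction. That extra name lookup and call is the overhead the teaser refers to.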

10 Most Practical Data Science Skills You Should Know in 2022

Many “How to Data Science” courses and articles, including my own, tend to highlight fundamental skills like Statistics, Math, and Programming. Recently, however, I noticed through my own experiences…

A Complete 15 Week Curriculum to Master SQL for Data Science

As I work more and more in the corporate world as a data scientist, I am increasingly convinced that mastering SQL is essential to have a successful career. That’s why, if you’ve been following my…

Simulating Traffic Flow in Python

Although traffic doesn’t always flow smoothly, cars seamlessly crossing intersections and turning and stopping at traffic signals can look quite magnificent. This contemplation got me thinking of how…

5 Online Data Science Courses You Can Finish in 1 Day

Don’t we all wish that our days were longer than 24 hours? So that we can fit more stuff in every day? But, unfortunately, time today is — and always has been — very tricky to handle, especially if…

GPT-4 Will Have 100 Trillion Parameters — 500x the Size of GPT-3

OpenAI will release GPT-4 in the next few years. It will have around 100 trillion parameters.

All the Datasets You Need to Practice Data Science Skills and Make a Great Portfolio

Every time I start a project to learn a new topic, I spend a significant amount of time finding a suitable dataset for it. As a result, I have quite a lot of datasets that…