Evolution of Natural Language Processing

Understand, with easy-to-grasp visuals, how NLP has evolved into the huge and awe-inspiring Transformer models, such as BERT and GPT-3, in use today.

3 Simple Outlier/Anomaly Detection Algorithms every Data Scientist needs

In statistics, an outlier is a data point that differs significantly from other observations. From the figure above, we can clearly see that while most points lie in and around the linear hyperplane…

Struggling with data imbalance? Semi-supervised & Self-supervised learning help!

Let me introduce to you our latest work, which has been accepted by NeurIPS 2020: Rethinking the Value of Labels for Improving Class-Imbalanced Learning. This work mainly studies a classic but very…

How to Practice SQL on AWS for Free

One of the required skills of a data scientist is working with databases using SQL. You might argue that it is the job of a data engineer, but data scientist roles are inclined toward being full-stack…

Google Colab 101 Tutorial with Python — Tips, Tricks, and FAQ

Google Colab is a project from Google Research: a free, Jupyter-based environment that allows us to create Jupyter [programming] notebooks to write and execute Python [1] (and other Python-based…

PyCaret 2.2 is here — What’s new?

PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows. It is an end-to-end machine learning and model management tool that speeds up the…

Escalating your database

When a project grows, one of the most common bottlenecks is on the database side. This can happen because the program's complexity has increased due to new requirements or, more often, due to an…

Four Types of Random Sampling Techniques Explained with Visuals

The secret to minimizing biased data! “Four Types of Random Sampling Techniques Explained with Visuals” is published by Terence S in Towards Data Science.

Improve your SQL with these templates for formatting and documentation

To save time for everyone who reads my code (including myself), I have tried applying two templates to my queries and found that they are very helpful. To demonstrate their usage, I will go through…

Pandas vs SQL — Compared with Examples

How the same tasks can be done using either Pandas or SQL: a comparison based on the tasks of a typical data analysis process.