Deep Dive into Transformers by Hand ✍︎

It is a Tesla Cybertruck, and I have tried to explain that name to my son many times, but he insists on calling it Robo-Truck. Now every time I look at Robo-Truck and hear that name, it reminds me of…

Write-Audit-Publish for Data Lakes in Pure Python (no JVM)

In this blog post, we provide a no-nonsense reference implementation for Write-Audit-Publish (WAP) patterns on a data lake, using Apache Iceberg as an open table format, and Project Nessie as a data…

Enhancing Readability of Python Code via Annotations

Code clarity is both a virtue and a necessity. If you write clear and readable code, other developers will be able to understand it, users will understand how to use it, and even the future you will…

How to Train a Decision Tree Classifier… In SQL

When it comes to machine learning, I’m an avid fan of attacking data where it lives. 90%+ of the time, that’s going to be a relational database, assuming we’re talking about supervised machine…

In Defense of LLMs in Data Science: What ChatGPT Can and Can’t Do for Your Data Science Career

ChatGPT can take your data science game to the next level, if you know how to use it. Take your data science career to the next level with Generative AI.

Learn AI Together — Towards AI Community Newsletter #19

Good morning, AI enthusiasts! After a much-needed break last week, we are back with exciting collaboration opportunities, some of our best articles written by AI experts worldwide, and fun…

How to Use Synthetic and Simulated Data Effectively

Using synthetic data isn’t exactly a new practice: it’s been a productive approach for several years now, providing practitioners with the data they need for their projects in situations where…

Introduction to DBpedia

Have you heard of Wikipedia? You most likely have. We love Wikipedia and its extensive information. There are times when we need to search for information on a topic and compile it into a table. For…

Using a Multimodal Document ML Model to Query Your Documents

This article will discuss the Alibaba document understanding model, recently released with model weights and datasets. It is a powerful model capable of performing various tasks such as document…

Moirai: Time Series Foundation Models for Universal Forecasting

Future of Predictive Analytics: Explore Moirai, Salesforce's New Foundation Model for Advanced Time Series Forecasting