Better Visualizations, Advanced ETL Techniques, RAG Pain Points, and Other February Must-Reads

By Jessie Hobb On Feb 29, 2024

February might be the shortest month, but it certainly didn’t feel that way here at TDS, where our authors have been on top of their game, sharing strong contributions on timely topics, including some of the longest and most-read articles of the year so far.

Now that most of us have settled into the flow of things in 2024, we see our readers focusing slightly less on career moves and more on core skills and concrete solutions to common issues. Our most-read and most-discussed articles of the past month reflect that, and below you’ll find a representative sample of our February standouts.

Monthly Highlights

The Math Behind the Adam Optimizer
In a clear, accessible, and widely shared explainer, Cristian Leo unpacks the mathematical inner workings of the Adam (Adaptive Moment Estimation) optimizer and, along the way, helps us understand why it’s become such a popular choice among deep learning practitioners.
12 RAG Pain Points and Proposed Solutions
While retrieval-augmented generation continues to make waves as a powerful option for boosting LLMs’ performance, its shortcomings are becoming clearer, too. Wenqi Glantz offers a useful resource for anyone who’s felt stuck implementing a RAG system recently, compiling 12 common pitfalls as well as suggested workarounds.
Data Visualization 101: Playbook for Attention-Grabbing Visuals
For anyone looking to create “clearer, sharper and smarter visuals” (and who isn’t, really?), the latest data-visualization guide by Mariya Mansurova is essential reading: it uses numerous concrete examples (in Plotly) to show key design principles in action.

Advanced ETL Techniques for Beginners
If you’re an early-stage data engineer who’d like to give your data-ingestion skills a boost, 💡Mike Shakhomirov’s new tutorial is one you should definitely explore (and bookmark): it covers typical ingestion patterns and provides code snippets you can use to start tinkering on your own.
Advanced Retrieval-Augmented Generation: From Theory to LlamaIndex Implementation
Interested in diving further into the exciting world of RAG? Leonie Monigatti explains the nitty-gritty details of pre-retrieval, retrieval, and post-retrieval optimizations, before walking us through the process of transforming a “naive” RAG pipeline into an advanced one.
Top Evaluation Metrics for RAG Failures
We turn to RAG one final time for Amber Roberts’s most recent contribution: a handy resource on troubleshooting unexpected or underwhelming performance, and on applying robust response and retrieval evaluation metrics to ensure all the pieces in your pipeline are working in harmony.
Building a Data Platform in 2024
We were thrilled to welcome back Dave Melillo, who returns to this topic three years after first tackling it; his new post reevaluates the key components of efficient data platforms. He shares valuable insights based on his experience navigating the data challenges of various industries and working with both “large corporations and nimble startups.”

An Extra Dose of Python

Some of our most popular posts in the past few weeks covered the always-timely topic of Python programming for data and ML professionals. In case you missed them:

How should you go about learning Python as a total beginner? Egor Howell offers a clear and practical roadmap.
If you’re not familiar with the @property decorator yet, you definitely will be by the time you finish reading Siavash Yasini’s comprehensive introduction.
Anyone into AI app-building should take a look at Naomi Kriger’s hands-on tutorial on creating a speech-to-text-to-speech program with the pyttsx3 library.
Taking inspiration from Robert C. Martin’s time-tested Clean Code book, Patrick Brus outlines the core principles behind writing—you guessed it—clean and effective Python code.
For even more Python tutorials and project walkthroughs, don’t miss our recent roundup of advanced and niche use cases.

Our latest cohort of new authors

Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions, including Sarthak Handa, Vadim Arzamasov, Mahyar Aboutalebi, Ph.D. 🎓, James W, Mohammed Mohammed, Kirsten Jiayi Pan, Matthew Chak, Ugur Yildirim, Mikayil Ahadli, Hamza Gharbi, Sami Abboud, Matthew Gunton, Eivind Kjosbakken, Eva Revear, Nithhyaa Ramamoorthy, Rami Krispin, Kennedy Selvadurai, PhD, Vassily Morozov, Patrick Beukema, Thomas Rouch, Ritanshi Agarwal, Rohan Nanda, Nikolaus Correll, Mert Ersoz, Dani Lisle, Roberta Rocca, Adil Rizvi, Matthew Turk, Celia Banks, Ph.D., Skylar Jean Callis, Ryan McDermott, Anand Subramanian, Aayush Agarwal, P.G. Baumstarck, Jose D. Hernandez-Betancur, Khin Yadanar Lin, and Daniel Kang, among others.

Thank you for supporting the work of our authors! If you’re feeling inspired to join their ranks, why not write your first post? We’d love to read it.

Until the next Variable,

TDS Team

Originally published in Towards Data Science on Medium.