Data Science Portfolios, Speeding Up Python, KANs, and Other May Must-Reads

Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.

With May drawing to a close and summer right around the corner for those of us in the Northern Hemisphere, it’s time once again to look back at the standout articles we’ve published in the past month: those stories that resonated the most with learners and practitioners across a wide swath of data science and machine learning disciplines.

We were delighted to see a particularly eclectic lineup of posts strike a chord with our readers. It’s a testament to the diverse interests and experiences that TDS authors bring to the table, as well as to the increasing demand for well-rounded data professionals who can write clean code, stay up-to-date with the latest LLMs, and—while they’re at it—know how to tell a good story about (and through) their projects. Let’s dive right in.

Monthly Highlights

Python One Billion Row Challenge — From 10 Minutes to 4 Seconds
Given Python’s longstanding reputation for slowness, you’d think it wouldn’t stand a chance in the popular “one billion row” challenge. Dario Radečić’s viral post aims to show that with some flexibility and outside-the-box thinking, you can still squeeze impressive time savings out of your code.
N-BEATS — The First Interpretable Deep Learning Model That Worked for Time Series Forecasting
Anyone who enjoys a thorough look into a model’s inner workings should bookmark Jonte Dancker’s excellent explainer on N-BEATS, the “first pure deep learning approach that outperformed well-established statistical approaches” for time-series forecasting tasks.
Build a Data Science Portfolio Website with ChatGPT: Complete Tutorial
In a competitive job market, data scientists can’t afford to be coy about their achievements and expertise. A portfolio website can be a powerful way to showcase both, and Natassha Selvaraj’s patient guide demonstrates how you can build one from scratch with the help of generative-AI tools.

Photo by Tim Mossholder on Unsplash

A Complete Guide to BERT with Code
Why not take a step back from the latest buzzy model to learn about those precursors that made today’s innovations possible? Bradney Smith invites us to go all the way back to 2018 (or several decades ago, in AI time) to gain a deep understanding of the groundbreaking BERT (Bidirectional Encoder Representations from Transformers) model.
Why LLMs Are Not Good for Coding — Part II
Back in the present day, we keep hearing about the imminent obsolescence of programmers as LLMs continue to improve. Andrea Valenzuela’s latest article serves as a helpful “not so fast!” interjection, as she focuses on their inherent limitations when it comes to staying up-to-date with the latest libraries and code functionalities.
PCA & K-Means for Traffic Data in Python
What better way to round out our monthly selection than with a hands-on tutorial on a core data science workflow? In her debut TDS post, Beth Ou Yang walks us through a real-world example (traffic data from Taiwan, in this case) of using principal component analysis (PCA) and K-means clustering.

KANs in the Spotlight

If we had to name the topic that created the biggest splash in recent weeks, KANs (Kolmogorov-Arnold Networks) would be an easy choice. Here are three excellent resources to help you get acquainted with this new type of neural network, introduced in a widely circulated paper.

Kolmogorov-Arnold Networks: The Latest Advance in Neural Networks, Simply Explained
For a clear and accessible primer on KANs, you can’t do better than Theo Wolf’s easy-to-follow post.
Kolmogorov-Arnold Networks (KANs) for Time Series Forecasting
Looking at KANs from the perspective of a more specialized use case, Marco Peixeiro shows how they can be applied in the context of time series forecasting.
Understanding Kolmogorov–Arnold Networks (KAN)
Finally, for a more complete (but still reader-friendly) paper walkthrough, look no further than Hesam Sheikh’s debut TDS article.

Our Latest Cohort of New Authors

Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions, including Eyal Aharoni and Eddy Nahmias, Hesam Sheikh, Michał Marcińczuk, Ph.D., Alexander Barriga, Sasha Korovkina, Adam Beaudet, Gurman Dhaliwal, Ankur Manikandan, Konstantin Vasilev, Nathan Reitinger, Mandy Liu, Beth Ou Yang, Maicol Nicolini, Alex Shpurov, Geremie Yeo, W Brett Kennedy, Rômulo Pauliv, Ananya Bajaj, 林育任 (Yu-Jen Lin), Sumit Makashir, Subarna Tripathi, Yu-Cheng Tsai, Nika, Bradney Smith, Katia Gil Guzman, Miguel Dias, PhD, Bào Bùi, Baptiste Lefort, Sheref Nasereldin, Ph.D., Marcus Sena, Atisha Rajpurohit, Jonathan Bennion, Dunith Danushka, Bernd Wessely, Barna Lipics, Henam Singla, Varun Joshi and Gauri Kamat, and Yu Dong.

Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.

Until the next Variable,

TDS Team

Data Science Portfolios, Speeding Up Python, KANs, and Other May Must-Reads was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
