Do you remember that one data science course you signed up for but never got around to finishing? Well, you’re not alone.
Most data science beginners enroll in one or more courses: free or paid. But because data science courses typically cover a wide range of topics—from programming to data analysis, visualization, and more—it takes several weeks to work through them. And even if they start strong, most learners start feeling overwhelmed after the first few modules and fail to make progress. Enter Kaggle (micro)courses.
Kaggle’s micro-courses are a good alternative if you find longer courses hard to get through. They are great resources for learning data science skills (Python, pandas, machine learning, and more) without feeling overwhelmed. Each course is designed to take only a few hours and includes tutorial and practice components. Now let’s go over some beginner-friendly courses and what they cover.
Python is one of the most widely used languages in data science. Besides helping you in your data career, Python is also helpful if you want to break into software engineering at some point. The Python course on Kaggle will help you learn the following:
- Python basics (syntax and variables)
- Functions
- Booleans and conditionals
- Lists, loops, and list comprehensions
- Strings and dictionaries
- Working with external libraries
If you feel like you need an even simpler intro to programming before diving into Python, you can check out the Intro to Programming course.
Because the subsequent courses on pandas and data visualization require you to be comfortable with the contents of this course, don’t skip the Python course if you are new to programming with Python.
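To give you a feel for the topics listed above, here is a minimal sketch that touches functions, conditionals, lists, loops, comprehensions, strings, dictionaries, and importing from a library. It is not taken from the Kaggle course; the sentence, the `is_long_word` helper, and the variable names are made up for illustration, and `Counter` comes from Python's standard library rather than a third-party package.

```python
# Illustrative sketch only: the data and names below are hypothetical.
from collections import Counter  # importing from a library (standard library here)


def is_long_word(word, min_length=5):
    """Return True if the word has at least min_length characters."""
    return len(word) >= min_length


sentence = "kaggle micro courses make learning python less overwhelming"
words = sentence.split()  # string -> list of words

# List comprehension with a condition
long_words = [w for w in words if is_long_word(w)]

# Dictionary comprehension mapping each word to its length
word_lengths = {w: len(w) for w in words}

# Loop with a boolean check
for w in words:
    if w.startswith("p"):
        print(f"Found a word starting with 'p': {w}")

# Count how often each letter appears, ignoring spaces
letter_counts = Counter(sentence.replace(" ", ""))

print(long_words)
# ['kaggle', 'micro', 'courses', 'learning', 'python', 'overwhelming']
```

If most of this reads naturally to you, you are likely ready for the pandas course.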
Link: Learn Python
Once you’re familiar with basic Python, you can learn pandas, a powerful data analysis and manipulation library.
Through a series of short lessons and hands-on coding exercises, the pandas course will help you learn to perform the following operations on pandas DataFrames:
- Creating, reading, and writing
- Indexing, selecting, and assigning
- Renaming and combining
- Summary functions and maps
- Grouping and sorting
- Data types and missing values
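The operations above can be sketched on a tiny DataFrame. This is only an illustration, not material from the course: the cities, products, and prices are made up, and the CSV round trip goes through an in-memory buffer instead of a real file.

```python
import io

import numpy as np
import pandas as pd

# Creating: a small DataFrame with hypothetical data and one missing price
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi", "Chennai"],
    "product": ["tea", "coffee", "tea", "coffee", "tea"],
    "price": [2.5, 3.0, np.nan, 3.5, 2.0],
})

# Data types and missing values: fill the gap with the mean of the known prices
df["price"] = df["price"].fillna(df["price"].mean())

# Indexing, selecting, and assigning
tea = df.loc[df["product"] == "tea", ["city", "price"]]
df["price_cents"] = (df["price"] * 100).round().astype(int)

# Renaming
df = df.rename(columns={"product": "item"})

# Summary functions and maps
mean_price = df["price"].mean()
df["price_centered"] = df["price"].map(lambda p: p - mean_price)

# Grouping and sorting
by_city = df.groupby("city")["price"].mean().sort_values(ascending=False)
print(by_city)

# Writing and reading: round trip through an in-memory CSV buffer
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
df_roundtrip = pd.read_csv(buffer)
```

Here `by_city` lists Delhi first, then Pune, then Chennai, because the missing Delhi price was filled with the overall mean before grouping.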
Link: Learn Pandas
Now that you know how to analyze data with Python and pandas, it’s time to build on that by learning how to visualize your data.
The Data Visualization course covers the fundamentals of creating helpful plots and charts using the Python library Seaborn. The course covers the following:
- Line charts
- Bar charts and heat maps
- Scatterplots
- Histograms and density plots
- Choosing plot types
You also need to work on a final project to apply what you learned.
Link: Learn Data Visualization
SQL is the single most essential data science skill that you can learn. To understand why SQL is super important for data science, read “Why SQL is the Language to Learn for Data Science” by KDnuggets contributor Nate Rosidi.
The Intro to SQL course will teach you how to query datasets with SQL using the BigQuery Python client. It covers SQL fundamentals, filtering, and writing readable SQL queries:
- Getting started with SQL and BigQuery
- Select, from, and where
- Group by, having, and count
- Order by
- As and with
- Joining data
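The clauses listed above can be tried out locally before you touch BigQuery. The sketch below is an assumption-laden stand-in: it uses Python's built-in `sqlite3` module with an in-memory database instead of the BigQuery client the course uses, and the `owners` and `pets` tables are made up. The SQL itself is standard, though dialect details differ between SQLite and BigQuery.

```python
import sqlite3

# In-memory SQLite database with two small, hypothetical tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pets (
        id INTEGER PRIMARY KEY, name TEXT, animal TEXT, owner_id INTEGER
    );
    INSERT INTO owners VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO pets VALUES
        (1, 'Rex', 'dog', 1),
        (2, 'Tom', 'cat', 1),
        (3, 'Milo', 'dog', 2),
        (4, 'Luna', 'dog', 3),
        (5, 'Kiwi', 'bird', 3);
""")

# SELECT, FROM, WHERE, ORDER BY
dog_names = [row[0] for row in conn.execute(
    "SELECT name FROM pets WHERE animal = 'dog' ORDER BY name"
)]
print(dog_names)  # ['Luna', 'Milo', 'Rex']

# GROUP BY, HAVING, COUNT, with AS to name the aggregated column
popular = conn.execute("""
    SELECT animal, COUNT(*) AS num_pets
    FROM pets
    GROUP BY animal
    HAVING COUNT(*) > 1
    ORDER BY num_pets DESC
""").fetchall()
print(popular)  # [('dog', 3)]

# WITH (a common table expression) combined with a JOIN
pets_per_owner = conn.execute("""
    WITH pet_counts AS (
        SELECT owner_id, COUNT(*) AS num_pets
        FROM pets
        GROUP BY owner_id
    )
    SELECT o.name, c.num_pets
    FROM owners AS o
    INNER JOIN pet_counts AS c ON o.id = c.owner_id
    ORDER BY o.name
""").fetchall()
print(pets_per_owner)  # [('Asha', 2), ('Ben', 1), ('Chen', 2)]

conn.close()
```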
Link: Learn Intro to SQL
Now that you are comfortable with SQL basics, you can take the Advanced SQL course to develop your SQL skills further. This course builds on the Intro to SQL course and covers the following topics on combining data from multiple tables and performing more complex operations:
- Joins and unions
- Analytic functions
- Nested and repeated data
- Writing efficient queries
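Here is a small sketch of three of these ideas: a LEFT JOIN, a UNION, and an analytic (window) function. As before, it swaps in Python's built-in `sqlite3` module for BigQuery, and the `customers` and `orders` tables are invented. It assumes your Python is linked against SQLite 3.25 or newer, which is when window functions were added. Nested and repeated data is specific to BigQuery and is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer_id INTEGER, day INTEGER, amount REAL
    );
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES
        (1, 1, 1, 10.0),
        (2, 1, 2, 20.0),
        (3, 2, 1, 5.0),
        (4, 1, 3, 30.0);
""")

# LEFT JOIN keeps customers that have no matching orders (Chen)
orders_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.id) AS num_orders
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()
print(orders_per_customer)  # [('Asha', 3), ('Ben', 1), ('Chen', 0)]

# UNION stacks two result sets and drops duplicate rows
union_rows = conn.execute("""
    SELECT name AS label FROM customers WHERE id = 1
    UNION
    SELECT name FROM customers WHERE id <= 2
    ORDER BY label
""").fetchall()
print(union_rows)  # [('Asha',), ('Ben',)]

# Analytic function: running total of order amounts per customer
running_totals = conn.execute("""
    SELECT customer_id, day, amount,
           SUM(amount) OVER (
               PARTITION BY customer_id ORDER BY day
           ) AS running_total
    FROM orders
    ORDER BY customer_id, day
""").fetchall()
for row in running_totals:
    print(row)

conn.close()
```

The running total for customer 1 grows from 10.0 to 30.0 to 60.0 across the three days, while customer 2 has a single row with 5.0.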
Link: Learn Advanced SQL
If you’ve already worked your way through the above courses, you should be comfortable with programming and data analysis with Python and SQL. You’re now ready to get started with machine learning.
The Intro to Machine Learning course covers:
- How ML models work
- Basic data exploration
- Model validation
- Underfitting and overfitting
- Random forests
You can also make a submission to a beginner-friendly Kaggle competition.
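The workflow these lessons describe can be sketched with scikit-learn: hold out validation data, compare decision trees of different sizes to see underfitting and overfitting, then try a random forest. The dataset below is synthetic and made up for illustration, and the `validation_mae` helper is a hypothetical name, so treat this as a sketch rather than the course's own code.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data (hypothetical): two features, noisy target
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 2))
y = 3.0 * X[:, 0] + 5.0 * np.sin(X[:, 1]) + rng.normal(0, 1.0, size=300)

# Model validation: keep some rows aside that the model never trains on
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)


def validation_mae(max_leaf_nodes):
    """Fit a decision tree of limited size and return its validation MAE."""
    model = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes, random_state=0)
    model.fit(X_train, y_train)
    return mean_absolute_error(y_val, model.predict(X_val))


# Very small trees tend to underfit; very large ones can overfit
for n_leaves in [2, 10, 50, 500]:
    print(f"max_leaf_nodes={n_leaves:>3}  validation MAE={validation_mae(n_leaves):.2f}")

# A random forest averages many trees and often works well with default settings
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
forest_mae = mean_absolute_error(y_val, forest.predict(X_val))
print(f"random forest validation MAE={forest_mae:.2f}")
```

Comparing the printed errors across tree sizes is the same idea the course uses to pick a sensible model complexity.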
Link: Learn Intro to Machine Learning
The Intermediate Machine Learning course builds on the Intro to Machine Learning course and teaches you how to handle missing values, work with categorical variables, and avoid the tricky problem of data leakage when training machine learning models.
The topics covered include:
- Missing values
- Categorical variables
- ML pipelines
- Cross validation
- XGBoost
- Data leakage
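Several of these topics fit together naturally in a scikit-learn pipeline, sketched below on made-up housing-style data. Imputation and one-hot encoding live inside the pipeline, so during cross-validation they are fit only on each training fold, which is one way to avoid a common form of data leakage. The course covers XGBoost; this sketch uses a random forest instead so that it depends only on scikit-learn, and all column names and numbers are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic data (hypothetical): two numeric columns and one categorical column
rng = np.random.default_rng(0)
n_rows = 200
X = pd.DataFrame({
    "area": rng.uniform(50, 200, size=n_rows),
    "rooms": rng.integers(1, 6, size=n_rows).astype(float),
    "city": rng.choice(["Pune", "Delhi", "Chennai"], size=n_rows),
})
y = 1000 * X["area"] + 5000 * X["rooms"] + rng.normal(0, 5000, size=n_rows)

# Simulate missing values in 10% of the "area" entries
missing_idx = X.sample(frac=0.1, random_state=0).index
X.loc[missing_idx, "area"] = np.nan

numeric_cols = ["area", "rooms"]
categorical_cols = ["city"]

# Missing values and categorical variables are handled per column type
preprocessor = ColumnTransformer(transformers=[
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Bundling preprocessing with the model keeps each CV fold's fit self-contained
pipeline = Pipeline(steps=[
    ("preprocess", preprocessor),
    ("model", RandomForestRegressor(n_estimators=50, random_state=0)),
])

# Cross-validation: scikit-learn returns negative MAE, so flip the sign
scores = -cross_val_score(
    pipeline, X, y, cv=5, scoring="neg_mean_absolute_error"
)
print("MAE per fold:", scores.round(0))
print("Mean MAE:", scores.mean().round(0))
```

If you have the `xgboost` package installed, its scikit-learn-style regressor can typically be swapped in as the `"model"` step without changing the rest of the pipeline.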
Link: Intermediate Machine Learning
I hope you found this round-up of courses helpful.
These courses are all free, and it only takes a few hours to learn an essential data science skill. So you can start out on your data science journey one micro-course at a time. Happy learning!
Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she’s working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more.