The previous nine tips covered the more apparent bits of advice to beginner data scientists.
The next round of interview tips deals with more nuanced aspects of painting yourself as the best candidate for the job.
Building on my previous article, these additional tips will further improve your chances in beginner data science job interviews.
Creating a data science project portfolio is one of the best ways to showcase how good a data scientist you are.
It can be difficult for beginners to choose suitable projects for their portfolio. Here are some data science project ideas for a start. You can also dig into DataCamp’s suggestions or the data projects we have on StrataScratch.
Domain knowledge means you’re knowledgeable about a specific industry, sector, or subject area. This knowledge includes the intricacies, challenges, terminology, processes, and nuances of that particular domain.
It has to be reflected in your coding skills, as you’ll use them to solve problems for a particular company within a particular industry.
When you practice coding, ideally do it on actual questions asked by the company you’re interviewing with. I mentioned StrataScratch and LeetCode in the first article.
Of course, you can also practice on challenges that don’t come directly from interviews. But when you choose them, try to find data science interview questions and datasets from the relevant industry. Say you’re interviewing at Meta (tech industry) and Pfizer (pharmaceutical industry). These companies work with completely different data, which behaves differently. Naturally, the questions will be different, too. So, for Meta, use tech/social media data, and for Pfizer, pharmaceutical data.
That way, you’re also ensuring that you’re improving your domain knowledge. You’ll possibly run into specific data you’re unfamiliar with, so you’ll have to learn about it and its importance within the industry.
Now, you’re connecting coding with domain knowledge!
Data storytelling means you can communicate the insights from your data projects clearly and understandably. Think about why you started a certain project and what you achieved; there’s always a story in there.
By creating a story from your project, you’ll make data more accessible to non-technical people. In turn, you’ll have more influence on decision-making.
Here are some tips for showcasing this skill.
Create a Narrative: Any good story has an arc: exposition, a problem, rising action, climax, falling action, and resolution. Use the same arc when telling your story with data.
You could start with the business context, e.g., “The company launched five new products in the last three years.” Then comes the problem: you noticed that sales are increasing, but customer retention is not. The rising action is you going deeper into the data to find the reasons for the retention issue. Here, your story should dig into the technical aspects of the project: what you did and why. The climax is when you find one product with high sales but also high return rates. The falling action is when you discuss potential reasons for the high returns. In the resolution, you recommend product improvements, but don’t let your story end there. Directly relate the outcome to what your project did and quantify its achievements: talk about the sales increase of that product, how much money it brought to the company, etc.
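To make the climax of such a story concrete, here is a minimal Python sketch of how you might surface a product with high sales but a high return rate. The order records, the `return_rates` and `flag_problem_products` helpers, and both thresholds are made up for illustration; real data would need its own definitions of "high".

```python
from collections import defaultdict

# Hypothetical order records: (product, units_sold, units_returned).
orders = [
    ("A", 120, 6),
    ("B", 300, 90),
    ("C", 80, 4),
    ("B", 250, 70),
    ("A", 100, 5),
]


def return_rates(records):
    """Sum units sold and returned per product and compute the return rate."""
    sold = defaultdict(int)
    returned = defaultdict(int)
    for product, units_sold, units_returned in records:
        sold[product] += units_sold
        returned[product] += units_returned
    return {
        product: {"sold": sold[product], "return_rate": returned[product] / sold[product]}
        for product in sold
    }


def flag_problem_products(stats, min_sold, max_return_rate):
    """Return products that sell well but come back unusually often."""
    return sorted(
        product
        for product, s in stats.items()
        if s["sold"] >= min_sold and s["return_rate"] > max_return_rate
    )


stats = return_rates(orders)
# Assumed thresholds: at least 200 units sold, more than 15% returned.
flagged = flag_problem_products(stats, min_sold=200, max_return_rate=0.15)
print(flagged)  # ['B']
```

In an interview, a number like "product B sells the most but roughly 29% of it comes back" is the kind of finding that anchors the rest of the story.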
Use Clear Visualizations: Use visualizations that support your story.
In your project on sales trends over the past years, don’t just show a table with monthly sales figures. Instead, use a line graph to visually represent the ups and downs in sales. This way, the audience will grasp the trend. For significant spikes in sales, use a bar chart to break down the sales by product or product categories, highlighting which products drove the spike.
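As a rough sketch of that idea, the Python below prepares the two views (a monthly trend and a per-product breakdown of the biggest month) and plots them with matplotlib. The sales records and helper names are hypothetical, and treating the highest-selling month as "the spike" is a simplifying assumption. matplotlib is imported inside `plot_story` so that the data preparation can be reused without it.

```python
from collections import defaultdict

# Hypothetical records: (month, product, sales amount).
sales = [
    ("2023-01", "A", 100), ("2023-01", "B", 80),
    ("2023-02", "A", 110), ("2023-02", "B", 85),
    ("2023-03", "A", 105), ("2023-03", "B", 240),
    ("2023-04", "A", 115), ("2023-04", "B", 90),
]


def monthly_totals(records):
    """Total sales per month, ordered chronologically."""
    totals = defaultdict(int)
    for month, _product, amount in records:
        totals[month] += amount
    return dict(sorted(totals.items()))


def spike_month(totals):
    """Assumption: the spike is simply the month with the highest total."""
    return max(totals, key=totals.get)


def breakdown(records, month):
    """Sales per product within a single month."""
    by_product = defaultdict(int)
    for record_month, product, amount in records:
        if record_month == month:
            by_product[product] += amount
    return dict(sorted(by_product.items()))


def plot_story(records, path="sales_story.png"):
    """Draw a line chart for the trend and a bar chart for the spike."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    totals = monthly_totals(records)
    peak = spike_month(totals)
    parts = breakdown(records, peak)

    fig, (trend_ax, spike_ax) = plt.subplots(1, 2, figsize=(10, 4))
    trend_ax.plot(list(totals), list(totals.values()), marker="o")
    trend_ax.set_title("Monthly sales")
    spike_ax.bar(list(parts), list(parts.values()))
    spike_ax.set_title(f"Sales by product, {peak}")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_story(sales)` would write both charts to one image; with this toy data, the bar chart shows that product B drove the March spike.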
Avoid Jargon And Simplify Complex Concepts: Use technical terminology only when necessary. The point is to ‘sell’ (sometimes even literally) your idea and project to business people, so simplify complex concepts for them. Don’t say, “The heteroscedasticity in the residuals indicated that our linear regression model might not be the best fit.” Instead, say, “The patterns in our data suggested that our initial model might not be capturing all the information effectively.” Much better!
We all make mistakes. They are a necessary part of the learning process. Interviewers don’t seek a perfect candidate; they’re looking for someone who is willing and able to learn.
Let the interviewer get to know that side of you. If you honestly share your failures and what you learned from them, it will build trust and demonstrate your resilience to setbacks.
Here are some tips on how to talk about this.
Avoid the Blame Game: Avoid blaming everyone and everything for your mistakes. Of course, give context about circumstances out of your control, but don’t play the victim. Take responsibility for your part, show what you learned from these circumstances, and talk about what you should’ve done differently.
Emphasize the Learning, Not the Failure Part: Talking about the failures should only serve to present how and what you learned from them, so focus on that.
Talk From Experience: Find a real example from your previous job. Even if it’s not in data science, it may be applicable if it shows your focus on learning and self-awareness. If you don’t have work experience, talk about mistakes you made in your data projects and what you learned.
Talk About the Steps You Took: This relates to what you did to correct your mistake or minimize its impact, e.g., changing the data, tweaking the algorithm, or ditching the project altogether and starting anew.
Here’s what the conversation between you (Y) and the interviewer (I) might look like.
I: Can you tell me about a time when a project or task didn’t go as planned and how you handled it?
Y: Of course! During my previous role as a data scientist, I was responsible for a project aimed at predicting customer churn. I chose the k-nearest neighbor algorithm based on my initial understanding and ran with it. However, the results were not as accurate as I had hoped.
I: How come? What did you do when you realized that?
Y: There were some data inconsistencies, and the deadline was very tight, so my EDA wasn’t very thorough. Even so, I realize now that I should’ve done a more detailed EDA. After I found out about the inconsistencies, I collaborated with the data quality team to understand them better. I also explored other algorithms and evaluated them. Finally, I switched to the XGBoost algorithm, which improved the model’s prediction accuracy significantly. I learned not to underestimate the importance of EDA. I’m also glad that I wasn’t afraid to admit mistakes and start from scratch, keeping in mind that we would have no use for a model we couldn’t trust.
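If you want to back up an answer like this with something tangible, a quick consistency check is one option. The sketch below is only an illustration of the kind of EDA step the answer refers to: the customer records, field names, and the `consistency_report` helper are all hypothetical, and a real project would check far more than these four things.

```python
from collections import Counter

# Hypothetical customer records with a few deliberate problems.
customers = [
    {"id": 1, "plan": "Basic", "tenure_months": 12, "churned": "yes"},
    {"id": 2, "plan": "basic ", "tenure_months": None, "churned": "no"},
    {"id": 3, "plan": "Premium", "tenure_months": -3, "churned": "Yes"},
    {"id": 3, "plan": "Premium", "tenure_months": 7, "churned": "no"},
]


def consistency_report(rows):
    """Count a few common data problems before any modelling starts."""
    id_counts = Counter(row["id"] for row in rows)
    plans = {row["plan"] for row in rows}
    normalized_plans = {plan.strip().lower() for plan in plans}
    return {
        "missing_tenure": sum(row["tenure_months"] is None for row in rows),
        "negative_tenure": sum(
            row["tenure_months"] is not None and row["tenure_months"] < 0
            for row in rows
        ),
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        # More raw spellings than normalized ones means inconsistent labels.
        "inconsistent_plan_labels": len(plans) - len(normalized_plans),
    }


report = consistency_report(customers)
print(report)
```

Even a small report like this gives you specifics to mention, such as duplicate IDs or impossible values, instead of a vague "the data had issues".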
Data science is more than mindless data handling and code writing. It also involves being able to translate your work into mere mortals’ language via data storytelling and visualizations.
You will showcase this by talking about it in the interview. You need to make sure you can talk the talk but also prove that you can walk the walk. The best way to do this is by having a solid data project portfolio, where your coding, storytelling, and visualization skills will be apparent.
While working on the projects, you’ll make mistakes. Don’t hide them. Talk about them openly and seek feedback from your interviewer.
It boils down to two simple things: be competent, and be honest about how you got there. Easier said than done!
But with the tips I gave you in this article, I’m sure you’ll do well in your next data science interview!
Nate Rosidi is a data scientist who also works in product strategy. He’s an adjunct professor teaching analytics and the founder of StrataScratch, a platform helping data scientists prepare for their interviews with real interview questions from top companies. Connect with him on Twitter: StrataScratch or LinkedIn.