Browsing Tag: Karun

Top 5 FAQs on Writing a Great Data Science Resume | by Karun Thankachan | Sep, 2022

Jessie Hobb Sep 28, 2022

Answers to the most common questions on writing a DS/ML resume. Fig 1. Photo by Christina @ wocintechchat.com on Unsplash. The following are some of the most common questions I have received on writing a data science resume. The suggestions provided will help you tailor your resume to better match a set of job requirements and get past the initial resume screening more often. These suggestions have personally helped me put together a concise resume that's custom-made for each role I have applied to, and increased…

Three Must Dos for a Successful Data Science Career | by Karun Thankachan | Aug, 2022

Jessie Hobb Aug 15, 2022

Opinion. Key things to do to fast-track your data science career. Photo by Joshua Earle on Unsplash. Data science was declared the sexiest job of the 21st century. With the excitement around AI and the publicity data science has received, an increasing number of people enter the field every year. With such competition, it might seem difficult to stand out and achieve success in data science. This post covers some tried and tested actions to fast-track your data science career, and no, it's not another technical…

What? When? How?: ExtraTrees Classifier | by Karun Thankachan | Aug, 2022

Jessie Hobb Aug 9, 2022

What is ExtraTrees Classifier? When to use it? How to implement it? Photo by Eunice Lituañas on Unsplash. Tree-based models have increased in popularity over the last decade, primarily due to their robust nature. Tree-based models can be used on any type of data (categorical/continuous), can be used on data that is not normally distributed, and require little if any data transformation (they can handle missing value/scale issues etc.). While Decision Trees and Random Forest are often the go-to tree-based models, a lesser known one…

Every statistical test to check feature dependence | by Karun Thankachan | Jul, 2022

Jessie Hobb Jul 11, 2022

Correlation and hypothesis tests for different datatypes and assumptions. This post covers the statistical tests to detect dependence between pairs of features, irrespective of their datatype, and which test to use based on the properties of your data. Photo by Nicholas Cappello on Unsplash. You will often find yourself using statistical tests during exploratory data analysis. In a supervised setting, it could be to see if there is a dependence between feature and target variables, so as to decide if the dataset can be…

Should you use pandas correlation function? | by Karun Thankachan | Jul, 2022

Jessie Hobb Jul 7, 2022

Limitation of pandas corr() and how to combat it. Photo by Chris Liverani on Unsplash. Correlation is defined as the association between two random variables. In statistics, it normally refers to the degree to which a pair of variables are linearly related. Aside: A mandatory warning that must be mentioned when talking about correlation is “Correlation does not imply causation”. Check out this article for more about this. You will often find yourself using correlation during exploratory data analysis. In a supervised setting, it could…

Introduction to Adaptive Learning | by Karun Thankachan | Jul, 2022

Jessie Hobb Jul 7, 2022

Using machine learning and data science to personalize education. Photo by Alexandre Van Thuan on Unsplash. Data Science (DS) and Machine Learning (ML) are often leveraged to build personalised products. Applying personalization to education is the focus of the field of Adaptive Learning. It’s a relatively less commercialised application of DS/ML that has attracted top researchers in the field, including Tom Mitchell (yes, THE Tom Mitchell), to run companies dedicated to solving it. In this post we will dive into…

Rise of Spark for Big Data Science | by Karun Thankachan | Jun, 2022

Jessie Hobb Jun 25, 2022

Apache Spark has become the go-to solution when dealing with big data. Let's have a look at three reasons behind the popularity of Spark. As the amount of data available for processing and analytics increased, we saw a slow but definite shift to distributed systems (check out my article on the rise of distributed systems, specifically Hadoop, here). However, data science and machine learning for ‘big data’, as of the early 2000s, still proved challenging. The then cutting-edge solutions such as Hadoop relied on MapReduce, which fell…