Techno Blender
Digitally Yours.
Browsing Tag

Cornellius

Synthetic Data to Help Fraud Machine Learning Modelling | by Cornellius Yudha Wijaya | Sep, 2022

Synthetic data could help mitigate fraud cases. Photo by Maxim Berg on Unsplash. Fraud cases are common in every industry and cause massive financial losses. Small or big, every business faces the fraud problem whether it likes it or not, as long as there are people with bad intentions. Many efforts have been made in machine learning fraud detection research to mitigate the problem, yet there is still no perfect solution. That is understandable because every business has different requirements, and data is…

3 Unique Python Packages for Time Series Forecasting | by Cornellius Yudha Wijaya | Sep, 2022

Some of the time series packages you could add to your arsenal. Photo by Ralph Hutter on Unsplash. Time series forecasting is a statistical method that analyzes historical data with a time component and makes predictions based on it. Some classic examples of time series forecasting methods are Moving Average, ARIMA, and Exponential Smoothing. These methods have been used for a long time and are still useful because their results are easy to explain, although their predictions can be less accurate. In…

Top Python Packages for Feature Engineering | by Cornellius Yudha Wijaya | Sep, 2022

Know these packages to improve your data workflow. Photo by Markus Spiske on Unsplash. Feature engineering is the process of creating new features from existing data. Whether we simply add two columns together or combine more than a thousand features, the process already counts as feature engineering. The feature engineering process is inherently different from data cleaning. While feature engineering creates additional features, data cleaning might change or remove existing features. Feature engineering is…

Integrate Bias Detection in Your Data Science Skill Set | by Cornellius Yudha Wijaya | Aug, 2022

Don’t forget that bias could impact your project. Photo by Christian Lue on Unsplash. When we talk about bias in the data science world, we refer to a machine learning model’s error in learning from the input data, which leaves it unable to give objective predictions. To use a human analogy, bias in machine learning could mean the model favours specific predictions or conditions over others. Why do we need to be concerned with bias? Model bias is a potential problem in our data science project because the relationship between the…

31 Unique Python Packages To Improve Your Data Workflow | by Cornellius Yudha Wijaya | Jul, 2022

Various Python packages for data people. Photo by Alexander Schimmeck on Unsplash. Data is a vast field with a big community supporting its technology development. Furthermore, Python has avid supporters who help make the data world more accessible and bring value to the data workflow. Various Python packages have been developed to help data people in their work. In my experience, many useful data Python packages lack recognition or are still growing in popularity. That is why, in this article, I want to introduce you to several…

3 Python Packages for Automatic Dataset-Labeling Process | by Cornellius Yudha Wijaya | Jun, 2022

Data labeling is crucial for the machine learning project's success. Photo by Murat Onder on Unsplash. Data science projects involve a lot of data collection, cleaning, and processing. We do all these steps to ensure that the dataset quality is good enough for machine learning training. However, one specific part of the data science project could make or break it: labeling. Every data science project is developed to solve a specific business problem, for example, churn, propensity-to-buy, fraud,…

4 Python Packages to Learn Causal Analysis | by Cornellius Yudha Wijaya | Jun, 2022

Learn cause and effect analysis with these packages. Photo by fabio on Unsplash. Causal Analysis is a field within experimental statistics to prove and establish cause-and-effect relationships. In statistics, using statistical algorithms to infer causality within a dataset under strict assumptions is called Exploratory Causal Analysis (ECA). ECA, in turn, is a way to prove causation with more controllable experimentation rather than relying only on correlation. We often need to prove the counterfactual: a different…

3 Python Packages for Interactive Data Analysis | by Cornellius Yudha Wijaya | Jun, 2022

Explore data in a more interactive way. Photo by Towfiqu barbhuiya on Unsplash. Data analysis is a staple activity for any data person and is required to understand what we are working on. To help the data analysis process, we use the Python language for an easier workflow. However, sometimes we want a more interactive way to explore data. Some have developed Python packages to interactively…

4 Python Packages to Create Interactive Dashboards | by Cornellius Yudha Wijaya | May, 2022

Use these packages to improve your data science project. Photo by Luke Chesser on Unsplash. A data science project is inherently a storytelling project for your audience. It doesn’t matter how good your project is; if the other party doesn’t understand your data insights and findings, no action can be taken. One way to present your project to the audience is by creating an interactive dashboard. Why interactive? Because an audience remembers an action significantly better than a static insight. That is why, if possible,…

Top 5 Browser Extensions for Data Scientists | by Cornellius Yudha Wijaya | May, 2022

These extensions would help your work tremendously. Photo by Glenn Carstens-Peters on Unsplash. In the modern era, most data science work is done in the browser via Jupyter Notebook or a similar browser-based notebook. Some of the work can be done outside the browser, but we keep returning to the browser-based notebook. Because we spend most of our time in the internet browser, I want to introduce my top browser extensions that help with data science work. What are the extensions? Let’s get…