Techno Blender
Digitally Yours.
Browsing Tag: Deaconu

How Data Leakage affects model performance claims | by Georgia Deaconu | Jan, 2023

This year has seen several important scientific advancements enabled by machine-learning-driven research. Along with the enthusiasm came some concern about the reproducibility issues encountered in ML-based science. Several methodological problems have been identified, of which data leakage seems to be the most widespread. Generally, data leakage can skew results and lead to overly optimistic conclusions. There are several different ways in which data leakage can occur. The objective of this post is to present…

Monitoring Databricks jobs through calls to the REST API | by Georgia Deaconu | Oct, 2022

Monitoring jobs that run in a Databricks production environment requires not only setting up alerts in case of failure but also being able to easily extract statistics about job running times, failure rates, the most frequent failure causes, and other user-defined KPIs. The Databricks workspace UI provides a fairly easy and intuitive way to visualize the run history of individual jobs. The matrix view, for instance, allows for a quick overview of recent failures and shows a rough comparison in terms of run times…