Techno Blender
Browsing Tag

Feb

February Edition: Let’s Talk About the Chatbot | by TDS Editors | Feb, 2023

(Yes, you know the one) Photo by Fleur on Unsplash. Two months after OpenAI released ChatGPT into the world, here’s one thing we can probably all agree on: it’s been a lot. How have we at TDS fared since the chatbot’s arrival? Why, thanks for asking! Imagine a roller coaster ride, set within a hall of mirrors, and lit by strobe lights: that’ll give you a rough idea of the experience from an editor’s perspective. We’ve never seen a single topic grab the collective attention of our community with as much intensity, and it’s been…

Prediction Performance Drift: The Other Side of the Coin | by Valeria Fonseca Diaz | Feb, 2023

We know the causes, let’s talk about the types. Two sides of prediction performance drift (Image by author). The world of machine learning has moved and grown so fast that in less than two decades we are already at the next stage. The models are built, and now we need to know if they provide accurate predictions in the short, medium, and long term. So many methods, theoretical approaches, schools of thought, paradigms, and digital tools are in our pockets when it comes to building our models. Now then, we want to understand…

Understanding KL Divergence. A guide to the math, intuition, and… | by Aparna Dhinakaran | Feb, 2023

Image by author. A guide to the math, intuition, and practical use of KL divergence, including how it is best used in drift monitoring. The Kullback-Leibler divergence (relative entropy) is a statistical measure from information theory that is commonly used to quantify how one probability distribution differs from a reference probability distribution. While it is popular, KL divergence is sometimes misunderstood. In practice, it can also sometimes be difficult to know when to use one statistical distance…
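
To make the definition in this excerpt concrete, here is a minimal sketch of the standard discrete form, KL(P || Q) = Σ p_i · log(p_i / q_i). It is not taken from the article: the function name `kl_divergence`, the example bin proportions, and the choice of which distribution plays P (production) and which plays Q (reference) are all assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions.

    Hypothetical helper for illustration: `p` and `q` are sequences of
    probabilities over the same bins, each assumed to sum to 1.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            # By convention, a bin with zero mass in P contributes nothing.
            continue
        if q_i == 0.0:
            # P has mass where Q has none: the divergence is infinite.
            return math.inf
        total += p_i * math.log(p_i / q_i)
    return total


# Made-up bin proportions for a reference window and a production window.
reference = [0.5, 0.3, 0.2]
production = [0.4, 0.4, 0.2]

print(kl_divergence(production, reference))
print(kl_divergence(reference, production))
```

The two printed values differ because KL divergence is not symmetric, which is one reason the choice of reference distribution matters. The early return of infinity for an empty reference bin is also why binned inputs are often smoothed before the measure is computed.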

Think in SQL — Avoid Writing SQL in a Top to Bottom Approach | by Chengzhi Zhao | Feb, 2023

Write Clear SQL by Comprehending the Logical Query Processing Order. Photo by Jeffrey Brandjes on Unsplash. You might find writing SQL challenging due to its declarative nature. Especially for engineers familiar with imperative languages like Python, Java, or C, SQL requires switching gears and a shift in mindset. Thinking in SQL is different from thinking in any imperative language, and it should not be learned and developed the same way. When working with SQL, do you write in a top-to-bottom approach? Do you start developing in SQL with the…

How to Use Map Functions for Data Science in R | by Rory Spanton | Feb, 2023

Learn powerful functional programming tools from the tidyverse. Photo by Z on Unsplash. All data scientists need to repeat code. Whether you’re fitting a model to multiple datasets or changing many values at once, running the same code many times over is essential. There are many ways to repeat code. But while most programmers use loops, there are more succinct, readable, and efficient alternatives. Enter the map family of functions from the purrr package. In this article, I’ll explain what mapping means, and how to use the…

Variance Reduction in Experiments — Part 1: Intuition | by Murat Unal | Feb, 2023

The intuition behind variance reduction and why it is important in randomized experiments. Photo by Mars Plex on Unsplash. This is the first part in a series of two articles where we are going to dive deep into variance reduction in experiments. In this article we are going to discuss why variance reduction is necessary and build an intuition behind its mechanism. In the second part we are going to evaluate the latest method in this space, MLRATE, and compare it to other well-established methods such as CUPED. Let’s…

Anomaly Detection using Sigma Rules (Part 2) Spark Stream-Stream Join | by Jean-Claude Cote | Feb, 2023

A class of Sigma rules detects temporal correlations. We evaluate the scalability of Spark’s stateful symmetric stream-stream join to perform temporal correlations. Photo by Naveen Kumar on Unsplash. Following up on our previous article, we evaluate Spark’s ability to join a start-process event with its parent start-process event. In this article, we evaluate how Spark’s stream-stream join can scale, specifically how many events it can hold in the join window. During our research, we evaluated a few approaches: Full join. Doing…

How Industry Data Scientists Make Their Work Count | by TDS Editors | Feb, 2023

It wasn’t that long ago that business leaders would earn nods of admiration merely by referring to their companies as “data-driven” or “data-informed.” These days, leveraging data in the decision-making process of your organization is no longer cutting edge; it’s the default.Still, translating all those terabytes (petabytes?) of information into concrete strategies and measurable decisions remains a challenge for many. This is where entrepreneurial data practitioners can make a real contribution to the success of their…

7 amazing Bay Area things to do this weekend, Feb. 3-5

February already? Where has the year gone? Fortunately, we have so many fun things to do this weekend, it’ll feel like time is slowing down. Hilarious musicals and hot wings, anyone? As with everything these days, be sure to double check websites for any last-minute health guidelines. Meanwhile, if you’d like to have this Weekender lineup delivered to your inbox every Thursday morning for free, just sign up at www.mercurynews.com/newsletters or www.eastbaytimes.com/newsletters. 1. SEE & HEAR: ‘Mean Girls’ hits town…

Data Sharing Challenges: Privacy and Security Concerns | by Louise de Leyritz | Feb, 2023

Navigating privacy and security when implementing data sharing. Privacy & security: the biggest challenges for data sharing (Image from Castor). Data sharing can bring many benefits to a company but also comes with its own set of problems. Two major issues that companies often struggle with are privacy and security. We will discuss these concepts in this third article of a series dedicated to data sharing. No one really likes to talk about these topics. I’ll be the first to admit that they’re not the most exciting things…