Browsing Tag

Aparna

Microsoft appoints Aparna Gupta as global delivery center leader

Stella Barnes, Nov 21, 2023

Applying Large Language Models to Tabular Data to Identify Drift | by Aparna Dhinakaran | Apr, 2023

Jessie Hobb, Apr 25, 2023

Image created by author using DALL-E 2. Can LLMs reduce the effort involved in anomaly detection, sidestepping the need for parameterization or dedicated model training? Follow along with this blog’s accompanying Colab. This blog is a collaboration with Jason Lopatecki, CEO and Co-Founder of Arize AI, and Christopher Brown, CEO and Founder of Decision Patterns. Recent advances in large language models (LLMs) are proving to be a disruptive force in many fields (see: Sparks of Artificial General Intelligence: Early Experiments…

Boosting Tabular Data Predictions with Large Language Models | by Aparna Dhinakaran | Apr, 2023

Jessie Hobb, Apr 6, 2023

Image by author. What happens when you unleash GPT-4 on a tabular Kaggle competition to predict home prices? Follow along with this blog’s accompanying Colab. This blog is a collaboration with Jason Lopatecki, CEO and Co-Founder of Arize AI, and Christopher Brown, CEO and Founder of Decision Patterns. There are two distinct groups in the ML ecosystem. One works with highly organized data collected in tables: the tabular-data-focused data scientist. The other works on deep learning applications including vision, audio, large…

How to Understand and Use Jensen Shannon Divergence | by Aparna Dhinakaran | Mar, 2023

Jessie Hobb, Mar 2, 2023

Image by author. A primer on the math, logic, and pragmatic application of JS divergence, including how it is best used in drift monitoring. In machine learning systems, drift monitoring can be critical to delivering quality ML. Some common use cases for drift analysis in production ML systems include: detecting feature changes between training and production to catch problems ahead of performance dips; detecting prediction distribution shifts between two production periods as a proxy for performance changes (especially useful in…

Understanding KL Divergence. A guide to the math, intuition, and… | by Aparna Dhinakaran | Feb, 2023

Jessie Hobb, Feb 2, 2023

Image by author. A guide to the math, intuition, and practical use of KL divergence, including how it is best used in drift monitoring. Kullback-Leibler (KL) divergence, also called relative entropy, is a statistical measure from information theory that is commonly used to quantify how one probability distribution differs from a reference probability distribution. While it is popular, KL divergence is sometimes misunderstood. In practice, it can also sometimes be difficult to know when to use one statistical distance…

Demystifying NDCG. How to best use this important metric… | by Aparna Dhinakaran | Jan, 2023

Jessie Hobb, Jan 25, 2023

Image by author. How to best use this important metric for monitoring ranking models. Ranking models underpin many aspects of modern digital life, from search results to music recommendations. Anyone who has built a recommendation system understands the many challenges that come from developing and evaluating ranking models to serve their customers. While these challenges start in data preparation and model training and continue through model development and model deployment, often what tends to give data scientists and…

A Quickstart Guide To Uprooting Model Bias | by Aparna Dhinakaran | Jan, 2023

Jessie Hobb, Jan 19, 2023

Image by author. In today’s world, it is all too common to read about AI acting in discriminatory ways. From real estate valuation models that reflect the continued legacy of housing discrimination to models used in healthcare that amplify inequities around access to care and health outcomes, examples are unfortunately easy to find. As machine learning (ML) models get more complex, the true reach of this issue and its impact on marginalized groups are likely not fully known. Fortunately, there are a few simple steps that ML…

Measuring Embedding Drift. Approaches for measuring… | by Aparna Dhinakaran | Dec, 2022

Jessie Hobb, Dec 8, 2022

Approaches for measuring embedding/vector drift for unstructured data, including for computer vision and natural language processing models. Image by author. Data drift in unstructured data like images is complicated to measure. The metrics typically used for drift in structured data, such as population stability index (PSI), Kullback-Leibler divergence (KL divergence), and Jensen-Shannon divergence (JS divergence), allow for statistical analysis on structured labels, but do not extend to unstructured data. The general…

Three Pitfalls To Avoid With Embeddings | by Aparna Dhinakaran | Jul, 2022

Jessie Hobb, Jul 20, 2022

Image by author. Written in collaboration with Francisco Castillo Carrasco, data scientist at Arize AI. Introduction: Let’s say that you have read a very helpful post demystifying embeddings and you’re really excited. Your social media company can certainly use them, so you fire up your notebook and start typing away. As the clock ticks, excitement turns to frustration and you wonder: how do people even do this? There are a few gotcha moments with embeddings. No post could ever cover every scenario, but this one will attempt to…

The Three Types of Observability Your System Needs | by Aparna Dhinakaran | Jun, 2022

Jessie Hobb, Jun 16, 2022

Image by author. This article is written in partnership with Kyle Kirwan, Co-founder and CEO at Bigeye. In 1969, humans first stepped on the moon thanks to a lot of clever engineering and 150,000 lines of code. Among other things, this code enabled engineers at mission control to have the full view of the mission and make near-real-time decisions. The amount of code was so small that engineers were able to thoroughly debug the software, and its performance was nearly flawless. Today’s search engines, on the other hand,…
