Techno Blender
Digitally Yours.

Building a Smart Wordle Solver with Java | by Daniel García Solla | Dec, 2022

An advanced String processing strategy to efficiently solve Wordle. (Photo by Brett Jordan on Unsplash.) Introduction: Wordle is a video game created in October 2021 by Josh Wardle in which the player has six attempts to guess a word that usually changes daily. In each attempt, you enter a word and receive color-coded feedback from the program. This format assigns one of three possible colors to every character in your guess. From left to right, if the letter is not contained in the word you want to…
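The excerpt is cut off before it finishes describing the feedback rules, and the article itself works in Java. As a rough illustration only, the sketch below scores a guess under the commonly played Wordle rules (green for the right letter in the right place, yellow for a letter that occurs elsewhere, gray otherwise). The function name `wordle_feedback` and the `G`/`Y`/`-` encoding are assumptions made for this example, not taken from the article.

```python
from collections import Counter


def wordle_feedback(guess: str, target: str) -> str:
    """Return one symbol per letter: G (green), Y (yellow), - (gray).

    Hypothetical helper for illustration; not the article's Java code.
    """
    if len(guess) != len(target):
        raise ValueError("guess and target must have the same length")

    feedback = ["-"] * len(guess)

    # Letters of the target that were NOT matched exactly are the only
    # ones still available to produce a yellow.
    remaining = Counter(t for g, t in zip(guess, target) if g != t)

    # First pass: exact matches are green.
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            feedback[i] = "G"

    # Second pass, left to right: a misplaced letter is yellow only
    # while unmatched copies of it remain in the target.
    for i, (g, t) in enumerate(zip(guess, target)):
        if g != t and remaining[g] > 0:
            feedback[i] = "Y"
            remaining[g] -= 1

    return "".join(feedback)
```

Counting the unmatched target letters first keeps repeated letters honest: guessing "speed" against "abide" marks only one of the two e's as yellow, because the target contains a single e.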

Measuring Embedding Drift. Approaches for measuring… | by Aparna Dhinakaran | Dec, 2022

Approaches for measuring embedding/vector drift for unstructured data, including for computer vision and natural language processing models. (Image by author.) Data drift in unstructured data like images is complicated to measure. The metrics typically used for drift in structured data, such as population stability index (PSI), Kullback-Leibler divergence (KL divergence), and Jensen-Shannon divergence (JS divergence), allow for statistical analysis on structured labels, but do not extend to unstructured data. The general…
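For readers unfamiliar with the three structured-data metrics named in this teaser, here is a minimal sketch of their textbook definitions over two binned distributions. The function names are chosen for this example, and the inputs are assumed to be probability vectors of equal length that sum to 1. This does not show the embedding-drift approaches the article goes on to discuss.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) = sum p_i * ln(p_i / q_i).

    Assumes q_i > 0 wherever p_i > 0; bins with p_i == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def psi(p, q):
    """PSI = sum (p_i - q_i) * ln(p_i / q_i).

    Assumes every bin is non-empty in both distributions.
    """
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


# Example: a reference distribution and a shifted production distribution.
reference = [0.5, 0.5]
production = [0.25, 0.75]
drift_scores = {
    "kl": kl_divergence(reference, production),
    "js": js_divergence(reference, production),
    "psi": psi(reference, production),
}
```

PSI is the symmetrized form of KL divergence, so `psi(p, q)` equals `kl_divergence(p, q) + kl_divergence(q, p)`. JS divergence is also symmetric and, with natural logarithms, is bounded above by ln 2.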

PRegEx: Regular Expressions in Plain English in Python | by Frank Andrade | Dec, 2022

Creating regular expressions in Python has never been easier. (Photo by Pixabay on Pexels.) Memorizing metacharacters in regular expressions (regex) isn’t hard, but building one that matches a complex text pattern is sometimes challenging. What if we could build regex using plain English? Well, now you can write easy-to-understand regular expressions using a Python library named PRegEx. This library can gently introduce beginners to the world of regex and even help out those who already know regex. Here’s how it works. Installing…

The Ins and Outs of Clustering Algorithms | by TDS Editors | Dec, 2022

Solving a data science problem often starts with asking the same simple questions over and over again, with the occasional variation: Is there a relationship here? Do these data points belong together? What about those other ones over there? How do the former relate to the latter? Things can (and do) become complicated very quickly, especially when we try to detect subtle patterns and relationships while dealing with large datasets. This is where clustering algorithms come in handy with their power to divide a messy pile of…

Main Differences & Use Cases Comparison

Have you ever wondered how Alexa, ChatGPT, or a customer care chatbot can understand your spoken or written comment and respond appropriately? NLP and NLU, two subfields of artificial intelligence (AI), facilitate understanding and responding to human language. Both of these technologies are beneficial to companies in various industries. Although NLP and NLU can be confused with each other, they are not the same and the differences between them make one capacity more essential than another for specific use cases (see…

A Callable Float? Python Fun and Creativity | Kozak

PYTHON PROGRAMMING: To learn to be creative, we’ll implement callable floating-point numbers in Python. (Photo by Kai Gradert on Unsplash.) Among the built-in data types in Python, we have a number of types representing numbers, the most important being int and float. Like everything in Python, their instances are objects; and as objects, they have their own attributes and methods. For example, this is what instances of the float type offer: As you see, float numbers offer many different methods to use. What they do not offer is a…
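The excerpt stops before showing how the article builds its callable float, so the following is only one possible sketch. It subclasses `float` and adds a `__call__` method; the class name `CallableFloat` and the choice that calling the number applies a function to it (defaulting to `round`) are assumptions made here for illustration.

```python
import math


class CallableFloat(float):
    """A float whose instances can be called (hypothetical design).

    Calling the number applies `func` to it, forwarding any extra
    arguments: x(round, 2) is the same as round(x, 2).
    """

    def __call__(self, func=round, *args, **kwargs):
        return func(self, *args, **kwargs)


x = CallableFloat(3.14159)

rounded_default = x()           # round(x)
rounded_two = x(round, 2)       # round(x, 2)
floored = x(math.floor)         # math.floor(x)
still_a_float = x + 1           # arithmetic returns a plain float
```

Because `CallableFloat` inherits from `float`, it keeps all the usual numeric behavior. Note that arithmetic results come back as plain `float` objects, which are not callable; preserving the subclass through arithmetic would require overriding the arithmetic methods as well.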

4 Data Preprocessing Operations with Scikit-learn | by Soner Yıldırım | Dec, 2022

Help the algorithm by making the data proper. (Photo by Harry Grout on Unsplash.) Data preprocessing is a fundamental step in a machine learning pipeline. It depends on the algorithm being used but, in general, we cannot or should not expect algorithms to perform well with the raw data. Even well-structured models might fail to produce acceptable results if the raw data is not processed properly. Some might consider using the term data preparation to cover data cleaning and data preprocessing operations. The focus of this article…

Strength in Numbers: Why Does a Gradient Boosting Machine Work So Well? | by Paul Hiemstra | Dec, 2022

Where we learn why solving complex problems using simple basis functions is such a powerful concept. Gradient boosting algorithms such as xgboost are among the best-performing models for tabular data. Together with other models such as Random Forests, gradient boosting falls under the category of ensemble models. The name derives from a core feature of this category: they do not fit a single large model, but a whole ensemble of models that together comprise the model. Ensemble models are strongly linked to the concept of a…

4 Things to Do When Applying Cross-Validation with Time Series | by Vitor Cerqueira | Dec, 2022

A few practical recommendations for getting better forecasting performance estimates. (Photo by Thought Catalog on Unsplash.) This article is about evaluating forecasting models using cross-validation. You’ll learn a few good practices for applying cross-validation with time series. Primer on Cross-Validation: You shouldn’t use the same data to train and test a model. Why is that? A model learns as many patterns as it can. Some patterns capture the true relationship between past and future observations. But the model also learns…
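The article's four recommendations are not visible in this excerpt. As general background only, the sketch below shows one common way to keep training and test data separate for a time series: an expanding-window split in which every test block comes strictly after the data used for training. The function name and its parameters are assumptions made for this example.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) pairs in time order.

    Hypothetical helper: each test block follows its training data,
    and the training window grows by `test_size` with each split.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")

    for k in range(n_splits):
        test_start = first_test_start + k * test_size
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


# Example: 10 observations, 3 splits, 2 test observations per split.
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
```

Unlike shuffled k-fold cross-validation, no split here trains on observations that occur after the ones it is tested on, so the estimate reflects how the model would be used to forecast.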

Understanding Probability Distributions using Python | by Reza Bagheri | Dec, 2022

An intuitive and comprehensive guide to probability distributions. (Image source: https://pixabay.com/vectors/bayesian-statistics-bell-curve-2889576/) A probability distribution describes the probabilities of the values that a random variable can take. It is an important concept in statistics and probability theory, and every book on this topic discusses probability distributions along with their properties. However, they emphasize the mathematical properties of these distributions rather than the intuition behind them, and…