Techno Blender
Digitally Yours.

Unlock the Latest Transformer Models with Amazon SageMaker | by Heiko Hotz | Dec, 2022

A quick tutorial on extending and customising AWS’ Deep Learning Containers
Image by author using Midjourney
AWS Deep Learning Containers (DLCs) have become a popular choice for training and deploying Natural Language Processing (NLP) models on Amazon SageMaker (SM), thanks to their convenience and ease of use. However, sometimes the latest versions of the Transformers library are not available in the prebuilt DLCs. In this blog post, we will extend these DLCs to train & deploy the latest Hugging Face models on AWS.…

Spectral Entropy — An Underestimated Time Series Feature | by Ning Jia | Dec, 2022

Time series are everywhere. As data scientists, we have various time series tasks, such as segmentation, classification, forecasting, clustering, anomaly detection, and pattern recognition. Depending on the data and the approaches, feature engineering could be a crucial step in solving those time series tasks. Well-engineered features can help understand the data better and boost models’ performance and interpretability. Feeding raw data to a black-box deep learning network may not work well, especially when data is…

A Reinforcement Learning-Based Inventory Control Policy for Retailers | by Guangrui Xie | Dec, 2022

Build a Deep Q Network (DQN) model to optimize the inventory operations for a single retailer
Photo by Don Daskalo on Unsplash
Inventory optimization is an important aspect of supply chain management, which is concerned with optimizing the inventory operations of businesses. It uses mathematical models to answer key questions like when to place a replenishment order to fulfill customers’ demand for a product and how much to order. The major inventory control policies adopted in the supply chain industry nowadays are…

Image Classification with No Data? | by Clement Wang | Dec, 2022

Theoretical review of machine learning algorithms using less data
Photo by R. Makhecha on Unsplash
You want to build a machine learning model without much data? Machine learning is known to be data-hungry, while gathering and annotating data takes time and is expensive. This article presents some methods to build an efficient image classifier with much less data!
Introduction
1. Transfer learning
2. Leveraging unlabeled data
3. Few-shot learning
4. Weakly supervised learning and text-based zero-shot…

Functional Data Analysis: A Solution to the Curse of Dimensionality | by Donato Riccio | Dec, 2022

Using gradient boosting and FDA to classify ECG data in Python
Photo by Markus Spiske on Unsplash
The curse of dimensionality refers to the challenges and difficulties that arise when dealing with high-dimensional datasets in machine learning. As the number of dimensions (or features) in a dataset increases, the amount of data required to accurately learn the relationships between the features and the target variable grows exponentially. This can make it difficult to train a high-performing machine learning model on a…

How Can an Aspiring Data Scientist Find and Work on Real-World Projects? | by Arunn Thevapalan | Dec, 2022

I’ve done some courses online; what next?
Image by Author via Canva
As an aspiring data scientist, one of the best ways to gain experience and improve your skills is to work on real-world projects. This will not only allow you to apply what you have learned in a practical setting, but it will also allow you to learn new things and develop a portfolio that you can use to showcase your skills to potential employers. The truth is that almost all of us know the value of real-world projects. You’ll always work on these when you get a…

10 Quick Pandas Tricks to Energize your Analytics Project

A data analytics task starts with importing the dataset into a pandas DataFrame, and the required dataset is often available in .csv format. However, when you read a csv file with a large number of columns into a pandas DataFrame, you can see only a few column names, followed by an ellipsis (. . .) and a few more column names, as shown below.
    import pandas as pd
    df = pd.read_csv("Airbnb_Open_Data.csv")
    df.head()
Display limited number of columns in Jupyter-Notebook | Image by Author
The main purpose behind .head() is to take a sneak-peek into…

Data Science Anti-Patterns You Should Know | by Samuel Flender | Dec, 2022

Eliminate your recurring pain points by understanding the underlying patterns
Photo by Jeremy Bishop on Unsplash
Anti-patterns are common yet counter-productive responses to recurring problems. Because they’re ineffective, they perpetuate recurring pain points without ever resolving the underlying, systematic issues. Anti-patterns exist pretty much anywhere people come together to solve problems: in software development, project management, and yes, also in data science. Knowledge about anti-patterns is the best way to…

Building a simple AI-powered, human-in-the-loop system to manage wildlife camera trap images & annotations | by Abhay Kashyap | Oct,…

Towards sustainable tech solutions for nonprofits with purpose and empathy
Cougar in Purisima Creek Redwoods Preserve by Felidae Conservation Fund
In this post, I briefly chronicle my journey designing and building an AI-powered, human-in-the-loop system to manage Felidae Conservation Fund’s camera trap images & annotations. If you’re interested in the role of technology in wildlife conservation, building computer vision applications or working with nonprofits, you might find this post useful. If you want a high-level…

Large-Scale Knowledge Graph Completion on Graphcore IPUs | by Daniel Justus | Dec, 2022

Winner of the OGB-LSC Knowledge Graph competition at NeurIPS 2022
Accurate predictions through fast experiments, careful tuning, and a large ensemble
Rendering of the computational graph of a Knowledge Graph Embedding model. Image by author.
Machine learning methods for representing graph-structured data keep growing in importance. One of the central challenges that researchers in the field are facing is the scalability of models to large datasets. As part of the NeurIPS 2022 Competition Track Programme, the Open Graph…