
How to Explore a Dataset of Images with Graph Theory | by Anthony Cavin | May, 2022



Combine feature extraction, similarity measure, and nearest neighbor graph

Example of a k-nearest neighbor graph (image by author)

When you start working on a dataset of pictures, you’ll probably be asked questions such as: are the pictures good? Are there any anomalies? A quick-and-dirty solution is to look at the images one by one and sort them by hand, but that can be tedious work depending on how many pictures you get.

For example, in manufacturing, you could receive thousands of pictures from a production line that assembles batteries of different types and sizes, and you would have to go through all of them manually to arrange them by type, size, or even color.

The more efficient option is to go the computer vision route and find an algorithm that can automatically arrange and sort your images; that is the goal of this article.

But how can we automate what a person does, i.e. compare pictures with one another and sort them based on similarities? There is no universal answer to that question. Still, we can divide the problem into easier sub-problems and tackle them one by one: extract relevant features from each image, define a measure of similarity between features, and sort the images by similarity.

To approach each of the sub-problems, we will use the following methods in this article:

  1. The Histogram of Oriented Gradients (HOG) to extract features
  2. The Wasserstein distance to measure the similarity between histograms
  3. The K-Nearest Neighbor Graph (K-NNG) to sort images by similarity

The first step in any computer vision problem is extracting useful features from the pictures. In our case, we want to compare images and sort them based on similarities, so we need features that can distinguish between similar and not-so-similar images. A feature that is often used in object detection is the Histogram of Oriented Gradients (HOG).

The HOG descriptor represents an image as a vector obtained by concatenating histograms of gradient orientations computed over small cells of the image. I recommend reading this article to understand the working principle of the HOG algorithm.

To extract the HOG feature from an image, we can use the hog() function from the Scikit-image library. Here is a visual example of the gradients extracted from a black-and-white picture:

Example of histogram of oriented gradient
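
As a minimal sketch of this step (the filename cup.png is a placeholder, not a file from the original dataset), the extraction could look like this:

```python
# Sketch: extract a HOG feature vector with scikit-image.
# "cup.png" is a placeholder filename for one of your own pictures.
import matplotlib.pyplot as plt
from skimage import io
from skimage.feature import hog

image = io.imread("cup.png", as_gray=True)  # load the picture as grayscale

# visualize=True also returns an image of the dominant gradient orientations
features, hog_image = hog(
    image,
    orientations=9,           # number of orientation bins per cell
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    visualize=True,
)

print(features.shape)         # one long feature vector per image
plt.imshow(hog_image, cmap="gray")
plt.show()
```

Each image is thus reduced to a single one-dimensional vector that we can compare with the vectors of the other images.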

Now that we have extracted features from our images, we need to define a measure of similarity between those features. A common choice in machine learning is the Euclidean distance, but in our case we are going to use the Wasserstein metric, which gives a distance between two distributions.

The Wasserstein metric, also known as the earth mover’s distance, is a distance metric between two probability distributions. It is based on the idea of measuring the amount of “work” required to transform one distribution into the other.

Example of Wasserstein Distance (Image by author)
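
SciPy ships an implementation of this distance. As a toy illustration (the numbers below are made up and unrelated to the dataset):

```python
# Sketch: the earth mover's distance between two tiny 1-D samples.
import numpy as np
from scipy.stats import wasserstein_distance

u = np.array([0.0, 1.0, 3.0])   # values of the first distribution
v = np.array([5.0, 6.0, 8.0])   # values of the second distribution

# Minimal "work" needed to move the mass of u onto v:
# each of the three points travels 5 units, so the distance is 5.0.
print(wasserstein_distance(u, v))
```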

There are several reasons to use the Wasserstein metric. It is a natural metric on the space of probability measures, and it stays informative even when the two distributions barely overlap, a situation in which bin-by-bin comparisons such as the Euclidean distance tell us little.

Finding a good distance measure between feature vectors is an essential step in many machine learning algorithms. In our case, the Wasserstein metric measures the distance between two histograms and therefore provides a similarity measure between two pictures.
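
The article does not spell out the exact computation, but one plausible sketch is to treat each HOG vector as a histogram over its bin indices and compute the distance for every pair of images (hog_features below stands for the list of vectors extracted in the earlier snippet):

```python
# Sketch: pairwise Wasserstein distances between HOG feature vectors.
# Assumes `hog_features` is a list of equally sized, non-negative 1-D vectors,
# one per image.
import numpy as np
from scipy.stats import wasserstein_distance

def pairwise_wasserstein(hog_features):
    """Return an n x n symmetric matrix of distances between HOG vectors."""
    n = len(hog_features)
    bins = np.arange(len(hog_features[0]))   # bin indices act as the support
    distances = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # the HOG vectors are used as (unnormalized) histogram weights
            d = wasserstein_distance(bins, bins, hog_features[i], hog_features[j])
            distances[i, j] = distances[j, i] = d
    return distances
```

With 500 images this amounts to roughly 125,000 distance computations, which is where the computational cost mentioned later comes from.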

The HOG descriptor extracts features from the pictures, and the Wasserstein metric gives a distance between each pair of features. The next step is to plot the data and explore the dataset. For this task, the k-nearest neighbor graph is a natural choice for its simplicity and explainability.

In simple terms, the k-nearest neighbor graph is a graph where each node is connected to its k nearest neighbors. The K-NNG is helpful for finding patterns in data. For example, if you have a dataset of people’s heights and weights, you can use the K-NNG to find out whether there are any patterns in the data. For this article, we will use the cup dataset, which consists of 500 pictures of a cup in different positions and rotations.

Cup dataset with different positions and rotations (image by author)

To discover the global structure of the data, we can use a “spring” layout in which the spring force between two adjacent nodes is proportional to the distance between them. With this layout, we obtain a drawing that separates the nodes, as in the picture that follows.

30-nearest neighbor graph of the cup dataset (image by author)
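
A minimal sketch of this step with NetworkX, building on the distance matrix from the previous snippet (the value k = 30 mirrors the figure above, and the inverse-distance edge weighting is an assumption, since spring_layout pulls heavily weighted edges closer together):

```python
# Sketch: build a k-nearest neighbor graph and draw it with a spring layout.
# Assumes `distances` is the n x n matrix computed in the previous snippet.
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np

def knn_graph(distances, k=30):
    """Connect every node to its k nearest neighbors."""
    n = len(distances)
    graph = nx.Graph()
    graph.add_nodes_from(range(n))
    for i in range(n):
        neighbors = np.argsort(distances[i])[1 : k + 1]   # index 0 is the node itself
        for j in neighbors:
            # similar images should attract each other, so weight by inverse distance
            graph.add_edge(i, int(j), weight=1.0 / (distances[i, j] + 1e-9))
    return graph

graph = knn_graph(distances, k=30)
positions = nx.spring_layout(graph, weight="weight", seed=0)
nx.draw(graph, positions, node_size=10, width=0.2)
plt.show()
```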

We can easily separate different types of data with this method, and we can also explore the underlying structure of the dataset. In the cup example, we can distinguish not only five horizontal positions but also the rotation of the cup in each picture.

Encoded rotation of the first position (image by author)

The HOG feature is a powerful feature descriptor that can be used in various computer vision applications. However, it is also a high-dimensional feature vector, which can be difficult to work with.

We used the Wasserstein metric to compute the distance between each pair of histograms, which is computationally expensive but a natural choice when comparing distributions.

To improve this approach, we could reduce the dimension of the HOG feature. For example, we can use a smaller number of bins, a coarser binning scheme, or a smaller range of values, but we will cover this in another article.

Finally, with the K-Nearest Neighbor Graph, we get to know the data. This includes understanding what the data represents and the overall structure of the dataset, which is a good starting point for iterating on and refining an exploratory analysis.


