Image Recommender
A big data project that uses transfer learning to recommend images based on the similarity of their visual characteristics.
Concept
An image is a large array of pixel values across 3 colour channels, so computing a similarity measure such as Euclidean or cosine distance directly on raw pixels is computationally expensive. Hence, we reduce each image to a compact representation that still retains its essential information, so that similarity can be measured on the simplified form.
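To make the cost argument concrete, here is a minimal sketch of the two distance measures named above, together with a size comparison between a raw image and a compact embedding. The 224 x 224 input size and the embedding length of 1024 are illustrative assumptions, not values taken from this project, and the function names are hypothetical.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two vectors."""
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    """1 minus the cosine of the angle between two vectors."""
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Assumed sizes, for illustration only.
raw_size = 224 * 224 * 3   # values in a flattened 224 x 224 RGB image
embedding_size = 1024      # assumed length of a compact embedding
reduction_factor = raw_size / embedding_size

# Small hand-checkable vectors.
u = np.array([0.0, 0.0])
v = np.array([3.0, 4.0])
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

print(raw_size, embedding_size, reduction_factor)
print(euclidean_distance(u, v))
print(cosine_distance(x, y))
```

Every pairwise comparison touches each element of both vectors once, so under these assumed sizes a comparison on embeddings does roughly 147 times less work than one on raw pixels.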
Code design:
Datasets
This project uses images from the COCO dataset.
Image Preprocessing
- Preprocess each image using OpenCV to compute its colour histogram, i.e. the distribution of pixel intensities in each colour channel.
- Use a pretrained model such as MobileNet (transfer learning) to extract the image embeddings.
- Save this information, including the image path, in an SQLite database, with a .csv file as a backup.
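The histogram and storage steps above can be sketched as follows. This is only an illustration under stated assumptions: it computes the histogram with NumPy on a synthetic image to stay dependency-light (OpenCV's `cv2.calcHist` serves the same purpose on real images), and the table name `images`, its columns, and the helper names are hypothetical rather than the project's actual schema. The MobileNet embedding and the .csv backup are omitted.

```python
import sqlite3
import numpy as np

def colour_histogram(image, bins=8):
    """Per-channel histogram of an H x W x 3 uint8 image, normalised to sum to 1."""
    channels = []
    for c in range(3):
        counts, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        channels.append(counts)
    hist = np.concatenate(channels).astype(np.float32)
    return hist / hist.sum()

def save_features(conn, path, hist):
    """Store the image path and its histogram (as raw float32 bytes)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images (path TEXT PRIMARY KEY, histogram BLOB)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO images VALUES (?, ?)", (path, hist.tobytes())
    )
    conn.commit()

def load_features(conn):
    """Read every stored histogram back into a {path: array} dictionary."""
    rows = conn.execute("SELECT path, histogram FROM images").fetchall()
    return {path: np.frombuffer(blob, dtype=np.float32) for path, blob in rows}

# Synthetic stand-in for a decoded image; a real pipeline would read files from disk.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

conn = sqlite3.connect(":memory:")  # in-memory database for the example
hist = colour_histogram(fake_image)
save_features(conn, "example.jpg", hist)
loaded = load_features(conn)
print(hist.shape, loaded.keys())
```

Storing the vector as a BLOB keeps one row per image; the dtype has to be remembered when reading it back, which is why `load_features` hard-codes `float32`.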
Find image similarity
- Once all the images are preprocessed, the ImageRecommender function takes an input image and returns the 5 most similar images from the database. Similarity is calculated using Euclidean distance.
- Use a dimensionality reduction technique such as t-SNE or UMAP to visualize the high-dimensional data and inspect the clustering patterns among all images.
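The retrieval step can be sketched as a nearest-neighbour search over the stored embeddings. The function below is a hypothetical stand-in, not the project's ImageRecommender; the random 16-dimensional vectors and the file names are synthetic placeholders for real embeddings and paths.

```python
import numpy as np

def recommend(query, embeddings, paths, k=5):
    """Return the k stored images closest to the query by Euclidean distance."""
    distances = np.linalg.norm(embeddings - query, axis=1)
    nearest = np.argsort(distances)[:k]
    return [(paths[i], float(distances[i])) for i in nearest]

# Synthetic database: 100 random vectors standing in for image embeddings.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(100, 16))
paths = [f"img_{i:03d}.jpg" for i in range(100)]

# A query that is a slightly perturbed copy of image 7.
query = embeddings[7] + 0.001
results = recommend(query, embeddings, paths)

for path, distance in results:
    print(path, round(distance, 4))
```

A brute-force scan like this is linear in the number of stored images, which is usually acceptable for a modest database; an approximate nearest-neighbour index would be one option if the collection grows much larger.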
Result
- ImageRecommender with 5 similar output images
- t-SNE plot for all the images in the database
- t-SNE plot for human face images vs all other instances