Machine Learning Assignment Help: INFO3406 Image Similarity Matching and Classification

Requirement

Cluster the images: after preprocessing them, the k-means algorithm is sufficient for the task.
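As an illustration of the k-means step, here is a minimal sketch in plain Python. It assumes each preprocessed image has already been flattened into a sequence of numbers; the function names, the fixed iteration cap and the random initialisation from the data points are illustrative choices, not requirements of the assignment.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Cluster `points` into `k` groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points drawn from the data.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result is stable
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments
```

On 3072-dimensional image vectors a pure-Python loop like this will be slow; it is meant to show the two alternating steps rather than to be used as-is on the full dataset.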

Instructions

Part I

Download the CIFAR-10 dataset. It consists of 60000 32x32 colour images in 10 physical object classes. The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. You can use the test batch ONLY for testing, NOT for training. Your algorithm should be able to classify a query image of the same size.
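For reference, a batch can be read roughly as follows. This sketch assumes the "CIFAR-10 python version" archive, in which each batch file is a pickled dictionary whose b"data" entry holds one 3072-value row per image (1024 red values, then 1024 green, then 1024 blue, each in row-major order) and whose b"labels" entry holds the class indices. The function names are hypothetical. Unpickling the real files requires NumPy to be installed, because the data is stored as a NumPy array.

```python
import pickle


def load_batch(path):
    """Load one batch file; returns (rows, labels)."""
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]


def row_to_pixels(row):
    """Convert one 3072-value row into a 32x32 grid of (r, g, b) tuples."""
    red, green, blue = row[0:1024], row[1024:2048], row[2048:3072]
    return [
        [
            (int(red[y * 32 + x]), int(green[y * 32 + x]), int(blue[y * 32 + x]))
            for x in range(32)
        ]
        for y in range(32)
    ]
```

Calling load_batch on each of the five training batch files and concatenating the results gives the 50000 training images; the test batch should be loaded separately and kept out of training.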

Part II

Download the CIFAR-100 dataset. This dataset is similar to CIFAR-10, except that it has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. You can use the testing images ONLY for testing, NOT for training. For a query image of the same size, your classifier should be able to:

  1. Infer the superclass (e.g. aquatic mammals)
  2. Infer the class (e.g. beaver)
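One possible way to produce both outputs is to predict the class and then look up its superclass. The sketch below assumes the "CIFAR-100 python version" files, where every training image carries both a b"fine_labels" entry (class) and a b"coarse_labels" entry (superclass), so the class-to-superclass table can be built from the training labels alone. The function names are hypothetical, and training a separate superclass classifier is an equally valid design.

```python
def build_superclass_map(fine_labels, coarse_labels):
    """Build {class index: superclass index} from paired training labels."""
    mapping = {}
    for fine, coarse in zip(fine_labels, coarse_labels):
        # Each class must belong to exactly one superclass.
        if mapping.setdefault(fine, coarse) != coarse:
            raise ValueError(f"class {fine} maps to more than one superclass")
    return mapping


def infer_both(predicted_fine, mapping):
    """Return (superclass, class) for an already predicted class index."""
    return mapping[predicted_fine], predicted_fine


# Toy labels for illustration only, not read from the real dataset.
toy_map = build_superclass_map([4, 30, 4, 1], [0, 0, 0, 1])
superclass, cls = infer_both(4, toy_map)
```

With this design the superclass is only as accurate as the class prediction, which is worth discussing in the report if the two accuracies are compared.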

Submission instructions

You need to submit an electronic version (report + code + plagiarism cover sheet) via eLearning. All files should be zipped together in a single file. The zip file should be named 0123456.zip, where 0123456 is your SID. In case of a pair submission, put both SIDs separated by an underscore: 0123456_0789123.zip. Only one of the two students needs to submit.

Programming language

You are encouraged to write the program in Python. Alternatively, you can also use Matlab, Java, or C++ but we need to be able to test your code on the University machines. You need to include instructions on how to compile (if necessary) and run your code.
You should write your own code to calculate the similarity scores and to perform the classification. However, if you are running an optimization algorithm, you can use off-the-shelf libraries such as nlopt or scipy.optimize. You are NOT allowed to use sophisticated libraries such as scikit-learn.
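To make "your own code" concrete, here is a minimal sketch of two hand-written similarity measures and a nearest-neighbour classifier built on one of them, using only the standard library. The choice of Euclidean distance and cosine similarity, the 1-nearest-neighbour rule and all function names are assumptions for illustration; the assignment does not prescribe any of them.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbour_label(query, train_vectors, train_labels):
    """Label of the training vector closest to `query` (Euclidean distance)."""
    best = min(
        range(len(train_vectors)),
        key=lambda i: euclidean_distance(query, train_vectors[i]),
    )
    return train_labels[best]
```

For example, nearest_neighbour_label([9, 9], [[0, 0], [10, 10]], [3, 7]) returns 7, because the query lies closer to the second training vector.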

Report

  • Similarity metrics you used
  • Your attempts to make the classifier robust (invariant to translation, mirror, rotation, etc.)
  • What images will be/not be properly classified? For instance, a bird perched on a tree will be accurately classified while a flying bird will not. Identify the reasons for misclassification and plausible corrective measures.
  • Accuracy score and confusion matrix or precision/recall to evaluate the accuracy of your classification. Use test images for this purpose.
  • Speed-accuracy trade-off
  • References in the IEEE style.
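The evaluation figures listed above can be computed without any library. The following is a minimal sketch, assuming labels are integers from 0 to num_classes - 1 and that rows of the matrix index the true class while columns index the predicted class; the function names are illustrative.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[t][p] counts test images of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of images on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def precision_recall(matrix, cls):
    """Precision and recall for one class; 0.0 when the denominator is zero."""
    true_positives = matrix[cls][cls]
    predicted_as_cls = sum(row[cls] for row in matrix)  # column sum
    actually_cls = sum(matrix[cls])                     # row sum
    precision = true_positives / predicted_as_cls if predicted_as_cls else 0.0
    recall = true_positives / actually_cls if actually_cls else 0.0
    return precision, recall
```

Large off-diagonal entries show which pairs of classes are confused with each other, which is a useful starting point for the misclassification discussion.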

Evaluation

Query images (height×width×colour_channels=32×32×3 pixels, uint8 RGB, PNG image files) will be used for evaluation. Note that these images may NOT be from the test set or training set.
The evaluators will name files as “img00”, “img01”, “img02”, etc. and save in a folder named “INFO3406_assignment1_query”. Your program should be able to query all images and output a single csv file that only contains the output labels. For example, the output may be “0, 2, 1, 3, 6, 6, etc”, where each number corresponds to a class.
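The query loop could look roughly like the sketch below. It makes several assumptions: the files are visited in the numeric order of the digits following "img", the labels are written as a single comma-separated row (following the example output above), and classify_file is a hypothetical callback of your own that decodes one image file and returns its integer label. Decoding PNG files is not shown because it needs an image library.

```python
import csv
import os
import re


def query_index(name):
    """Numeric part of a name such as 'img07' or 'img07.png'."""
    return int(re.match(r"img(\d+)", name).group(1))


def classify_folder(folder, classify_file, output_path):
    """Classify every img* file in `folder` and write the labels as one CSV row."""
    names = [n for n in os.listdir(folder) if re.match(r"img\d+", n)]
    names.sort(key=query_index)  # numeric order, so img100 follows img99
    labels = [classify_file(os.path.join(folder, n)) for n in names]
    with open(output_path, "w", newline="") as f:
        csv.writer(f).writerow(labels)
    return labels
```

A call might look like classify_folder("INFO3406_assignment1_query", my_classifier, "output.csv"), where my_classifier and the output file name are placeholders.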