
Enhanced Fashion Attribute Learning Framework


To learn more useful information about fashion items, such as prices or points of sale, customers can submit a fashion image to a system that searches for similar images. In this way, customers not only find the fashion items they want but also receive additional helpful information. Because of the characteristics of fashion data, such a system must understand fashion images at a fine-grained attribute level, which makes attribute learning an important part of building an effective retrieval system. Traditionally, each attribute is trained individually. However, fashion attributes are constantly renewed, vary widely in category, and exhibit inner-group correlations. To improve performance, we propose a deep local multi-task learning framework with an imbalanced data problem solver (the LMTL-IDPS framework) over given attribute groups, which exploits inner-group correlations, offers a solution to the imbalanced data problem, and allows flexible system updates. To evaluate the effectiveness of this framework, we ran experiments on the DeepFashion Attribute dataset with a pre-trained Nasnet model. The experiments show that our local multi-task learning (local MTL) framework achieves better performance (14.2% higher recall) than single-task learning (STL) in fashion attribute learning. They also show that the model that accounts for imbalanced data achieves better performance (5.14% higher recall on attributes with fewer samples) than models that do not.

A. Attribute learning
Attribute learning describes learning methods for recognition tasks at a fine-grained level. There are two main approaches: STL and MTL.
Attribute single-task learning. In attribute STL, each attribute has its own learning model, so the number of models equals the number of attributes. Moreover, because each attribute is treated separately, the inner-group correlations are underused. Many works use STL, such as [14], [15], [16], and [17]. At that time, MTL still faced many challenges; the shared CNN proposed in [4] paved the way for this learning method.
Attribute multi-task learning. In MTL, samples are reconstructed by merging datasets into a one-hot (multi-label) binary vector representation. Although MTL can yield better performance than STL, the model cannot be reused when the number of attributes changes. This lack of re-usability makes MTL a poor choice for dynamic attributes such as fashion ones. Motivated by these challenges, we propose attribute local MTL, which can be treated as a grouping method for MTL that improves its re-usability (a label-construction sketch is given below).
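As an illustration only (not taken from this paper), the sketch below contrasts the flat multi-hot label vector used in conventional MTL with the per-group label vectors used in local MTL; the group names and attribute lists are hypothetical.

```python
import numpy as np

# Hypothetical attribute vocabulary split into groups (e.g. texture, part, style).
ATTRIBUTE_GROUPS = {
    "texture": ["floral", "striped", "dotted"],
    "part":    ["v-neck", "long-sleeve"],
    "style":   ["casual", "formal"],
}

def flat_mtl_labels(positive_attrs):
    """Conventional MTL: one multi-hot vector over ALL attributes.
    Adding or removing an attribute changes the vector length, so the
    trained model cannot be reused."""
    vocab = [a for attrs in ATTRIBUTE_GROUPS.values() for a in attrs]
    return np.array([1.0 if a in positive_attrs else 0.0 for a in vocab])

def local_mtl_labels(positive_attrs):
    """Local MTL: one multi-hot vector PER attribute group.
    Updating one group only requires retraining that group's head."""
    return {
        group: np.array([1.0 if a in positive_attrs else 0.0 for a in attrs])
        for group, attrs in ATTRIBUTE_GROUPS.items()
    }

sample = {"floral", "v-neck", "casual"}
print(flat_mtl_labels(sample))   # one length-7 vector
print(local_mtl_labels(sample))  # three short vectors, one per group
```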

B. Multi-task learning approaches
Over the years, many approaches have illustrated different ways of performing MTL (see Fig. 1, 2 and 3).
Features with SVM classifier approaches. In these approaches, features are trained and then used as inputs to independent SVMs for prediction. For example, Kumar et al. [3] used AdaBoost to select a feature space for each attribute, and the authors of [7] proposed learning off-the-shelf CNN features with the FaceNet and VGG-16 architectures and then applying an SVM classifier per attribute. A minimal sketch of this feature-plus-SVM pattern is given below.
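The following sketch is not the cited authors' code; it only illustrates the general pattern under assumed placeholder data: extract off-the-shelf CNN features with a pre-trained VGG-16 and fit one independent binary SVM per attribute.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from sklearn.svm import LinearSVC

# Pre-trained VGG-16 without its classification head, used as a fixed feature extractor.
backbone = VGG16(weights="imagenet", include_top=False, pooling="avg")

def extract_features(images):
    """images: float array of shape (N, 224, 224, 3) with values in [0, 255]."""
    return backbone.predict(preprocess_input(images), verbose=0)

# Placeholder data: images plus one binary label column per attribute.
train_images = np.random.rand(32, 224, 224, 3) * 255
train_labels = {"floral": np.random.randint(0, 2, 32),
                "v-neck": np.random.randint(0, 2, 32)}

features = extract_features(train_images)

# One independent SVM per attribute, as in the feature-plus-SVM approaches.
svms = {attr: LinearSVC().fit(features, y) for attr, y in train_labels.items()}
predictions = {attr: clf.predict(features) for attr, clf in svms.items()}
```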

Shared block approaches. These approaches apply separately trained DCNNs per attribute, followed by additional deep layers (called a shared block). In facial attribute detection, [6] used a modified AlexNet network as the shared block and a VGG-16 for each individual facial attribute. In the fashion field, Abdulnabi et al. [2] used shared feature learning for all attributes, then divided the attributes into smaller groups and applied an ImageNet pre-trained CNN model to each group. Although these approaches perform better than the previous ones, they consume more resources as the number of per-attribute networks grows. A sketch of a shared backbone with per-group heads is given below.
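For illustration only (this is not the architecture of [2] or [6]), the sketch below builds a shared convolutional backbone with one sigmoid head per hypothetical attribute group, which is the general shape of these shared-block designs; the mobile Nasnet variant is used here just to keep the sketch small.

```python
from tensorflow.keras import Input, Model, layers
from tensorflow.keras.applications import NASNetMobile

# Hypothetical attribute groups and their sizes.
GROUP_SIZES = {"texture": 3, "part": 2, "style": 2}

inputs = Input(shape=(224, 224, 3))

# Shared backbone: an ImageNet pre-trained CNN with global average pooling.
backbone = NASNetMobile(weights="imagenet", include_top=False, pooling="avg")
shared = backbone(inputs)

# One small head per attribute group; each head is a multi-label sigmoid classifier.
outputs = {
    group: layers.Dense(size, activation="sigmoid", name=group)(shared)
    for group, size in GROUP_SIZES.items()
}

model = Model(inputs=inputs, outputs=outputs)

# Per-head binary cross-entropy, so each group is trained as its own task.
model.compile(optimizer="adam",
              loss={group: "binary_crossentropy" for group in GROUP_SIZES})
model.summary()
```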

C. Imbalanced data problem
Imbalanced data is the problem where the total number of samples in one class (positive) is far smaller than the total number in another class (negative). In attribute learning, the imbalance appears when some attributes have especially few instances compared to others.
Data sampling. Sampling-based approaches such as data augmentation are considered a solution to the imbalanced data problem. The best approach we are aware of is SMOTE [9], which automatically generates more data (upsampling) from the original dataset. However, to some degree, these methods increase overfitting during training.
Architecture, loss function, and metric configuration. Another group of approaches, of which [5] is an example, changes the architecture, loss function, or metric to deal with the imbalanced data problem. These approaches yield better performance, but they are usually hard to implement and to reconfigure later. Thus, they are not always the best option in a dynamic retrieval system whose attributes vary widely in category.
Threshold and output-based configuration. These methods find the best thresholds based on the model output. SVM-based thresholding has been suggested, but Quan Zou et al. [10] proposed using the Matthews Correlation Coefficient (MCC) to handle imbalanced data in classification. Although SVM shows better performance, MCC takes less resources and processing time. Inspired by [10] and [5], we propose a solution for MTL adapted to a retrieval system: an end-to-end DCNN for training and MCC-based thresholding of the prediction score vectors to obtain the final predictions (a sketch of this thresholding step is given after Section D).

D. Deep CNN architecture
Deep CNNs outperform handcrafted features (SIFT, HOG, ...) on large-scale data. With the help of transfer learning, training time can be reduced [18]. However, transfer learning can only be applied with available architectures, which were not designed to solve the imbalanced data problem, so overall performance decreases when such datasets are used. Pre-trained VGG and AlexNet are used in many attribute learning systems, for example FaceNet [7] or Han et al.'s work [6] for facial attribute learning. In the fashion field, Abdulnabi et al. [2] used an ImageNet [1] pre-trained CNN model to solve grouped multi-task attribute learning. However, many high-performing architectures (such as ResNet [12] or Nasnet [13]) have not yet been applied. In our method, the Nasnet architecture is used as an end-to-end shared-block architecture.
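As a rough illustration of the MCC-based thresholding described in Section C (under assumed variable names and placeholder data, not this paper's implementation), the sketch below scans candidate thresholds for one attribute's sigmoid scores and keeps the threshold that maximises the MCC.

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef

def best_mcc_threshold(scores, labels, candidates=None):
    """Pick the decision threshold that maximises the Matthews Correlation
    Coefficient for one attribute.

    scores: 1-D array of sigmoid outputs in [0, 1] for this attribute.
    labels: 1-D array of ground-truth 0/1 labels.
    """
    if candidates is None:
        candidates = np.linspace(0.05, 0.95, 19)
    mccs = [matthews_corrcoef(labels, (scores >= t).astype(int)) for t in candidates]
    best = int(np.argmax(mccs))
    return candidates[best], mccs[best]

# Placeholder data for one imbalanced attribute (few positives, many negatives).
rng = np.random.default_rng(0)
labels = (rng.random(1000) < 0.05).astype(int)
scores = np.clip(0.3 * labels + rng.normal(0.2, 0.15, 1000), 0, 1)

threshold, mcc = best_mcc_threshold(scores, labels)
print(f"chosen threshold = {threshold:.2f}, MCC = {mcc:.3f}")
```

Because the threshold is chosen per attribute from the output scores alone, this step can be rerun whenever attributes are added or removed, without touching the trained network.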
