Forecasting Visual Style in Fashion


What is the future of fashion? Tackling this question from a data-driven vision perspective, we propose to forecast visual style trends before they occur. We introduce the first approach to predict the future popularity of styles discovered from fashion images in an unsupervised manner. Using these styles as a basis, we train a forecasting model to represent their trends over time. The resulting model can hypothesize new mixtures of styles that will become popular in the future, discover style dynamics (trendy vs. classic), and name the key visual attributes that will dominate tomorrow’s fashion. We demonstrate our idea applied to three datasets encapsulating 80,000 fashion products sold across six years on Amazon. Results indicate that fashion forecasting benefits greatly from visual analysis, much more than textual or meta-data cues surrounding products.
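To make the idea of unsupervised style discovery concrete, one common way to extract style "mixtures" from per-item attribute vectors is nonnegative matrix factorization. The sketch below is purely illustrative and not the paper's exact formulation; the function name `discover_styles`, the multiplicative update rule, and all parameters are our assumptions:

```python
import numpy as np

def discover_styles(A, k, iters=200, seed=0):
    """Illustrative sketch (not the paper's method): factor a nonnegative
    item-attribute matrix A (n_items x n_attrs) into per-item style
    mixtures W (n_items x k) and style signatures H (k x n_attrs)
    via standard NMF multiplicative updates."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    W = rng.random((n, k)) + 0.1
    H = rng.random((k, m)) + 0.1
    eps = 1e-9  # avoids division by zero in the updates
    for _ in range(iters):
        # Lee-Seung multiplicative updates for the Frobenius objective
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Each row of `H` can then be read as a style's signature over visual attributes, and each row of `W` as how strongly an item mixes those styles, which also lets a single garment exhibit multiple styles.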

Retrieval and recommendation. There is strong practical interest in matching clothing seen on the street to an online catalog, prompting methods to overcome the street-to-shop domain shift [28, 20, 18]. Beyond exact matching, recommendation systems require learning when items “go well” together [19, 38, 33] and capturing personal taste [7] and occasion relevance [27]. Our task is very different: rather than recognizing or recommending garments, our goal is to forecast the future popularity of styles based on visual trends.

Attributes in fashion. Descriptive visual attributes are naturally amenable to fashion tasks, since garments are often described by their materials, fit, and patterns (denim, polka-dotted, tight). Attributes are used to recognize articles of clothing [5, 29], retrieve products [18, 13], and describe clothing [9, 11]. Relative attributes [32] are explored for interactive image search with applications to shoe shopping [24, 44]. While an attribute vocabulary is often defined manually, useful clothing attributes are also discoverable from noisy meta-data on shopping websites [4] or from neural activations in a deep network [40]. Unlike prior work, we use inferred visual attributes as a conduit to discover fine-grained fashion styles from unlabeled images.

Learning styles. Limited work explores representations of visual style. Different from recognizing an article of clothing (sweater, dress) or its attributes (blue, floral), styles entail the higher-level concept of how clothing comes together to signal a trend. Early methods explore supervised learning to classify people into style categories, e.g., biker, preppy, Goth [21, 38]. Since identity is linked to how a person chooses to dress, clothing can be predictive of occupation [35] or one’s social “urban tribe” [26, 31]. Other work uses weak supervision from meta-data or co-purchase data to learn a latent space imbued with style cues [34, 38]. In contrast to prior work, we pursue an unsupervised approach for discovering visual styles from data, which has the advantages of i) facilitating large-scale style analysis, ii) avoiding manual definition of style categories, iii) allowing the representation of finer-grained styles, and iv) allowing a single outfit to exhibit multiple styles. Unlike concurrent work [16] that learns styles of outfits, we discover styles for individual garments and, more importantly, predict their popularity in the future.

Discovering trends. Beyond categorizing styles, a few initial studies analyze fashion trends. A preliminary experiment plots the frequency of attributes (floral, pastel, neon) observed over time [41]. Similarly, a visualization shows the frequency of garment meta-data over time in two cities [33]. The system in [39] predicts when an object was made. The collaborative filtering recommendation system of [15] is enhanced by accounting for the temporal dynamics of fashion, with qualitative evidence it can capture popularity changes of items in the past (i.e., Hawaiian shirts gained popularity after 2009). A study in [10] looks for correlation between attributes popular in New York fashion shows versus what is seen later on the street. Whereas all of the above center around analyzing past (observed) trend data, we propose to forecast the future (unobserved) styles that will emerge. To our knowledge, our work is the first to tackle the problem of visual style forecasting, and we offer objective evaluation on large-scale datasets.

Text as side information. Text surrounding fashion images can offer valuable side information. Tag and garment type data can serve as weak supervision for style classifiers [34, 33]. Purely textual features (no visual cues) are used to discover the alignment between words for clothing elements and styles on the fashion social website Polyvore [37]. Similarly, extensive tags from experts can help learn a representation to predict customer-item match likelihood for recommendation [7]. Our method can augment its visual model with text, when available. While adding text improves our forecasting, we find that text alone is inadequate; the visual content is essential.

In the fashion industry, predicting trends is, due to its complexity, frequently compared to weather forecasting: sometimes you get it right and sometimes you get it wrong. In this work, we show that our vision-based fashion forecasting model gets it right more often than not. We propose a model that discovers fine-grained visual styles from large-scale fashion data in an unsupervised manner. Our model identifies unique style signatures and provides a semantic description for each based on key visual attributes. Furthermore, based on user consumption behavior, our model predicts the future popularity of the styles and reveals their life cycle and status (e.g., in or out of fashion). We show that vision is essential for reliable forecasts, outperforming text-based representations. Finally, fashion is not restricted to apparel; it is present in accessories, automobiles, and even home furniture. Our model is generic enough to be employed in other domains where a notion of visual style is present.
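As a minimal illustration of forecasting a style's future popularity from its past consumption, consider a one-step-ahead exponential smoothing forecast over a style's popularity time series. This is a generic baseline sketch, not the paper's forecasting model; the name `forecast_popularity` and the `alpha` parameter are our own:

```python
def forecast_popularity(series, alpha=0.5):
    """Illustrative baseline (not the paper's model): one-step-ahead
    exponential smoothing. `series` is a style's observed popularity per
    time step; `alpha` weights recent observations against the running
    level. Returns the forecast for the next time step."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level
```

A rising forecast relative to the recent level would mark a style as trending, while a flat trajectory would suggest a classic.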
