Topic Modeling Amazon Product Reviews
For Diesel S.p.A.
Freddie


This report applies topic modeling to a dataset of Amazon reviews of Diesel products to derive detailed marketing insights and suggest ways to improve Diesel's business performance on Amazon. Related documents can be found on GitHub.
Preprocessing Steps

To start, I tokenized the review data from sentences into words so that I could remove punctuation and unnecessary characters. I then lemmatized the tokens, converting each word to its root form (for example, shoes to shoe, and purchased or purchases to purchase), which reduces the total number of unique words in the dictionary. Next, I used the vectorizer to convert all words to lowercase and to remove the built-in English stop words such as am, is, and are, together with my brand name, diesel. These words carry hardly any information and mostly add clutter.
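The steps above can be sketched in a few lines of standard-library Python. This is only an illustration: the stop-word list and the lemma lookup table are tiny hypothetical stand-ins for the vectorizer's built-in list and for a real lemmatizer, and lowercasing is done up front for simplicity.

```python
import re

# Hypothetical stop-word list: a few common English words plus the brand name.
STOP_WORDS = {"am", "is", "are", "the", "a", "and", "i", "my", "these", "diesel"}

# Toy lemma lookup standing in for a real lemmatizer.
LEMMAS = {"shoes": "shoe", "purchased": "purchase", "purchases": "purchase"}

def preprocess(review):
    """Tokenize, lowercase, lemmatize, and drop stop words from one review."""
    # Keep runs of letters and digits; punctuation is discarded.
    tokens = re.findall(r"[a-z0-9]+", review.lower())
    # Map each token to its root form when the lookup knows it.
    lemmas = [LEMMAS.get(tok, tok) for tok in tokens]
    return [tok for tok in lemmas if tok not in STOP_WORDS]

print(preprocess("I purchased these Diesel shoes, and my purchases are great!"))
# prints ['purchase', 'shoe', 'purchase', 'great']
```

Note how purchased and purchases collapse into a single dictionary entry, which is the reduction in unique words that lemmatization is meant to achieve.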

I kept only words that occur at least three times in the data (which also filtered out most ASINs), and a token had to consist of at least three letters or digits to qualify as a word. Finally, I segmented the data by keeping only reviews from super reviewers, that is, reviewers who wrote more reviews than 95 percent of all reviewers. At this point, everything was ready to build a topic model.
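One way to read the super-reviewer rule is "a reviewer who wrote more reviews than at least 95 percent of all reviewers". The sketch below implements that reading in standard-library Python. The field names reviewer_id and text, the function name, and the toy data are all hypothetical, and the exact cutoff rule used in the report may differ.

```python
from bisect import bisect_left
from collections import Counter

def super_reviewer_ids(reviews, quantile=0.95):
    """Return IDs of reviewers who wrote more reviews than `quantile` of all reviewers."""
    counts = Counter(r["reviewer_id"] for r in reviews)
    ordered = sorted(counts.values())
    supers = set()
    for reviewer, n_reviews in counts.items():
        # Number of reviewers with strictly fewer reviews than this one.
        fewer = bisect_left(ordered, n_reviews)
        if fewer / len(ordered) >= quantile:
            supers.add(reviewer)
    return supers

# Toy data: 39 one-time reviewers and one prolific reviewer ("r39").
reviews = [{"reviewer_id": f"r{i}", "text": "nice watch"} for i in range(39)]
reviews += [{"reviewer_id": "r39", "text": "great jeans"} for _ in range(5)]

supers = super_reviewer_ids(reviews)
kept = [r for r in reviews if r["reviewer_id"] in supers]
print(supers, len(kept))
```

In the toy data only r39 qualifies, so the five reviews from that reviewer are the ones kept for modeling.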
Model Steps

I used Latent Dirichlet Allocation (LDA) for this topic modeling project because it is a popular statistical model and I believed it could help me attribute each review to a dominant topic among a set of latent (unobserved) groups. In addition, inspired by my professor, I found that LDA in Python can produce an interactive visualization that lets me explore the most frequent words in each topic efficiently.

During the modeling process, I wanted to know how to judge whether a model is good, so I looked it up online. Setting aside semantic associations between words, a model with higher log-likelihood and lower perplexity is generally considered better. A critical parameter of an LDA model is the number of topics, but I kept it fixed because I think fifteen is a fair value based on my prior knowledge of the brand. Another parameter is the learning decay, which controls the learning rate in the online learning method. Its value can be set between 0.5 and 1, so I used a grid search to find the best LDA model by varying the learning decay. For fifteen topics, the best-performing model has a learning decay of 0.7.
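The search over the learning decay can be sketched with scikit-learn's GridSearchCV, which by default ranks candidates by the estimator's own score method (for LDA, an approximate log-likelihood). The library choice, the toy reviews, the candidate grid, the three-fold split, and the two-topic setting are all assumptions made for illustration; the report fixes the number of topics at fifteen.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV

# Toy reviews standing in for the real dataset.
docs = [
    "watch looks great big dial",
    "love watch band hard",
    "great watch big face recommend",
    "jeans fit tight sitting",
    "jeans perfect fit soft denim",
    "jeans run small larger size",
    "boots soft leather comfortable",
    "shoes present love comfortable",
    "jacket soft leather great price",
]
X = CountVectorizer().fit_transform(docs)

# Number of topics is held fixed; only the learning decay varies.
base_model = LatentDirichletAllocation(
    n_components=2, learning_method="online", random_state=0
)
search = GridSearchCV(
    base_model,
    param_grid={"learning_decay": [0.5, 0.7, 0.9]},
    cv=3,
)
search.fit(X)

best = search.best_estimator_
print("Best learning decay:", search.best_params_["learning_decay"])
print("Approximate log-likelihood:", best.score(X))
print("Perplexity:", best.perplexity(X))
```

Perplexity is the exponential of the negative per-word log-likelihood, so on the same data a higher log-likelihood goes hand in hand with a lower perplexity. The scikit-learn documentation gives (0.5, 1.0] as the range of learning_decay that guarantees asymptotic convergence.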
Interactive Topic Model Visualization

Topic Description

Topic 1: Customers say the jeans fit too well. Some feel they are too small to sit in comfortably.
Topic 2: Generally, customers feel that all kinds of Diesel products, such as watches, jeans, and jackets, look great and are well priced.
Topic 3: Diesel watches are indeed designed to be bigger than typical watches, for example the Diesel Watches Men's Black Not-So-Basic Basic Analog Black Dial Watch.
Topic 4: Customers who purchased a watch love its look and would recommend it to others.
Topic 6: Returning customers who say the product is worth every dollar they paid.
Topic 7: Customers who bought Diesel shoes or received them as a present simply love them.
Topic 8: Diesel has products such as jackets (Diesel Big Girls' Jimella) and boots (Diesel Men's The Pit Boot) with soft skin or leather.
Topic 9: Customers say they like the design of Diesel shirts and that size Large is a perfect fit.
Topic 10: Some customers say Diesel bags can be used for years. Others say Diesel bags should have smaller pockets.
Topic 11: Diesel watch bands are hard, but the watches still look sexy.
Topic 12: A small group of people even buy underwear from Diesel, such as the Diesel Men's Divine Rainbow Striped Boxer Trunk. Their comments include "very comfortable" and "the fabric breathes."
Topic 13: Customers like the Amazon packaging, which arrived with several boxes.
Topic 14: Customers like the denim jeans and belts because they are simply cool.
Topic 15: A representative review reads: "Quality and color were as described, and the wife loves it! Uses it for casual occasions/weekend or just alternate it for work. Very spacious for all her needs."
Marketing & Product Insights

All the reviews I analyzed are from super reviewers. In contrast to regular reviewers, I think super reviewers are more capable of commenting on a product rationally. Indeed, after filtering the product reviews down to top reviewers only, the comments about a product are more coherent, more reasonable, and less sentimental.
Attributes that people like about our products

Diesel customers are most impressed by the watches. Diesel watches are designed to be bigger than typical watches, as topic 3 shows, but that is what makes them cool, attractive, or even collectible. Topics 7 and 8 show that Diesel shoes and boots are soft and comfortable. Topics 5 and 9 illustrate well-fitted shirts and jeans. In general, the second-biggest cluster, topic 2, shows that Diesel offers customers all kinds of products, such as jeans, jackets, and boots, with fashionable design, high quality, durable materials, and affordable prices.
Attributes that people dislike about our products

The biggest cluster, topic 1, shows that customers think Diesel jeans fit too well. This is not always a merit, since I have seen many reviews complaining that the jeans feel uncomfortable or too tight for sitting. To my knowledge, Diesel is an Italian retail clothing company. Since there might be a slight difference in body shape between Americans and Europeans, I would recommend that the merchant state size suggestions on Amazon.
Topic 11 suggests that the watch band is hard, which probably makes the watch uncomfortable to wear, so I would recommend that Diesel use a leather band.
Purchase Occasions

From topic 6, I can see that there are many regular or returning customers who love Diesel products. It is always great to keep current customers while attracting new ones. I suggest that Diesel launch and advertise new products around holidays such as Halloween and the Day of the Dead, while keeping black as the main color of Diesel products.
Product Development/Improvement Ideas

Topic 10 suggests that Diesel should add some smaller pockets to its bags, specifically the Diesel Men's Scuba Messenger Bag.
Topic 11 shows that consumers would like a better material for watch bands, such as leather or metal instead of hard rubber.
Pricing Suggestions

Topic 2 shows that, in general, customers think Diesel products are worth purchasing because they are fashionable, durable, and good-looking. A significant number of reviews emphasize that Diesel products not only draw attention from the people around the wearer but also last forever. Some reviewers compare Diesel with Levi's but, without doubt, Diesel offers higher quality at a similar price. From my perspective, the true value of Diesel products deserves a higher price. Therefore, raising prices to an entry-level luxury tier could be a wise business decision.
What's Not in the Data

For a clothing company, the topic I most expected to see is color. Clothing colors reflect an individual's emotions and mood, so they are pivotal when it comes to selecting clothes. Of particular importance are the colors customers pick when dressing for specific situations, such as an interview, a party, or exercise. Diesel could explore more clothing styles for different purposes.
Besides the regular customers who purchase Diesel as daily wear, we still need more information about other potential purchase purposes. It would help the manufacturer to know whether customers are purchasing Diesel products as birthday or anniversary gifts. We also need more information about recipients and buyers. For example, parents might purchase shoes and clothes for their sons or daughters, wives might be willing to buy an expensive outfit for their husbands, and boyfriends might purchase bags and apparel for their girlfriends. This information could help Diesel attract potential customers and expand its market share.
Diving Further into the Data

All reviews from super reviewers should be taken seriously. According to this data, Diesel performs very well on Amazon. I have only a few small suggestions. First, label the recommended size on the product page for jeans and pants. Second, use a metal or leather band for Diesel watches instead of hard rubber. Third, launch and advertise new products with more colors and styles for specific situations or holidays. Finally, Diesel should raise its prices to an entry-level luxury tier to increase market share.