Classifying Web Search Queries in Order to Identify High-revenue Generating Customers

Open Access
Ortiz-Cordova, Adan
Area of Honors:
Information Sciences and Technology
Bachelor of Science
Document Type:
Thesis Supervisors:
  • Jim Jansen, Thesis Supervisor
  • Xiaolong Zhang, Honors Advisor
Keywords:
  • web queries
  • web searching
  • sponsored search
  • k-means clustering
  • unsupervised machine learning
Search engines refer the majority of visitors to many websites, which makes search traffic important for most online businesses and an understanding of that traffic critical to their success. Understanding search engine traffic means understanding the intent behind query terms and the corresponding behavior of the searchers who submit them. In this research, we analyze 712,643 query keywords from a popular Spanish music website whose business model relies on contextual advertising. We use a k-means clustering algorithm to group referral keywords that share similar onsite customer behavior, using attributes such as click-through rate and revenue. We identify six clusters of consumer keywords, ranging from a large number of low-impact users to a small number of high-impact users. We demonstrate how online businesses can use this clustering-based segmentation to provide a more tailored consumer experience. The implication is that businesses can segment customers effectively in order to develop better business models and increase advertising conversion rates.
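As an illustration of the clustering step described in the abstract, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to keyword-level behavior features. It is not the thesis's implementation. The keywords, their click-through rates and revenue figures, the function name `kmeans`, and the choice of two clusters instead of six are all hypothetical, chosen only to keep the example small. A real analysis would normally standardize the features first, since revenue and click-through rate are on different scales; that step is omitted here.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialize centroids from k distinct randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical referral keywords with made-up features:
# (click-through rate, revenue per visit). Not data from the thesis.
keyword_features = {
    "letras de canciones": (0.010, 0.02),
    "musica gratis": (0.020, 0.03),
    "canciones romanticas": (0.015, 0.01),
    "comprar guitarra": (0.200, 1.50),
    "entradas concierto": (0.250, 1.80),
    "clases de piano": (0.220, 1.60),
}

keywords = list(keyword_features)
points = [keyword_features[kw] for kw in keywords]
centroids, labels = kmeans(points, k=2)

for kw, lab in zip(keywords, labels):
    print(f"cluster {lab}: {kw}")
```

In this toy data the first three keywords behave like a low-impact segment and the last three like a high-impact segment, so the two groups end up in separate clusters. Which group receives which cluster number depends on the random initialization.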