Clustering Netflix Shows Based on Features Using K-means and Hierarchical Algorithms to Identify Content Patterns
Abstract
This study explores clustering patterns within Netflix's movie catalog by applying K-means and hierarchical clustering algorithms. The primary objective is to identify distinct content groups based on features such as movie duration, release year, and content rating. The dataset, which includes 5,185 movies, was preprocessed by handling missing values, one-hot encoding categorical variables, and standardizing numerical features. Four distinct clusters were identified, each with its own characteristics. Cluster 0 consists primarily of longer, family-friendly movies rated TV-14, while Cluster 1 contains shorter, mature movies rated TV-MA. Cluster 2 covers a diverse range of TV-MA movies with moderate durations, and Cluster 3 comprises longer, adult-oriented movies rated R. These findings offer insight into Netflix's content strategy and highlight the platform's ability to cater to different audience segments based on content type and viewer preferences. The results suggest that Netflix could use these clustering patterns to improve its recommendation system and content acquisition strategy. However, the study is limited by the absence of user-specific data and its reliance on basic metadata features. Future research could integrate additional features, such as user ratings, and apply deep learning techniques for more sophisticated clustering.
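The preprocessing and K-means steps summarized in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the toy records, the function names, the choice to drop incomplete rows (the abstract does not say how missing values were handled), and the use of k=3 on six records are all assumptions made for illustration. The study itself reports four clusters on 5,185 movies, and the hierarchical clustering step is omitted here.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy records: (duration in minutes, release year, content rating).
movies = [
    (95, 2019, "TV-MA"),
    (88, 2020, "TV-MA"),
    (130, 2015, "TV-14"),
    (142, 2012, "TV-14"),
    (118, 2008, "R"),
    (125, 2010, "R"),
    (None, 2021, "TV-MA"),  # incomplete record, dropped below
]

def drop_missing(records):
    # Assumption: missing values are handled by discarding incomplete rows.
    return [r for r in records if None not in r]

def one_hot(values):
    # One binary column per distinct category, in sorted order.
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def standardize(column):
    # Rescale to zero mean and unit (population) standard deviation.
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]

def build_features(records):
    durations = standardize([r[0] for r in records])
    years = standardize([r[1] for r in records])
    ratings = one_hot([r[2] for r in records])
    return [[d, y] + row for d, y, row in zip(durations, years, ratings)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    # Lloyd's algorithm: assign each point to its nearest centroid,
    # then move each centroid to the mean of its assigned points.
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids

cleaned = drop_missing(movies)
features = build_features(cleaned)
labels, centroids = kmeans(features, k=3)

for record, label in zip(cleaned, labels):
    print(label, record)
```

Standardizing duration and release year before clustering keeps either feature from dominating the Euclidean distance simply because of its scale. On a dataset of realistic size, a library implementation such as scikit-learn's would normally replace the hand-written loop.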
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with International Journal for Applied Information Management agree to the following terms: Authors retain copyright and grant the International Journal for Applied Information Management right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) the work for any purpose, even commercially, with an acknowledgement of the work's authorship and initial publication in International Journal for Applied Information Management. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in International Journal for Applied Information Management. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).