Clustering Netflix Shows Based on Features Using K-means and Hierarchical Algorithms to Identify Content Patterns

B Herawan Hayadi; Eko Priyanto

doi:10.47738/ijaim.v5i2.102

Published: Jul 1, 2025

DOI: https://doi.org/10.47738/ijaim.v5i2.102

Keywords:

Clustering, Content Strategy, K-Means, Netflix, Recommendation System

B Herawan Hayadi

Primary School Teacher Education, Universitas Bina Bangsa, Serang, Indonesia

Eko Priyanto

Ma'arif University of Nahdlatul Ulama, Kebumen, Indonesia

Abstract

This study explores clustering patterns within Netflix's movie catalog by applying K-means and hierarchical clustering algorithms. The primary objective is to identify distinct content groups based on features such as movie duration, release year, and content rating. The dataset of 5,185 movies was preprocessed by handling missing values, one-hot encoding categorical variables, and standardizing numerical features. Four distinct clusters were identified, each with its own characteristics. Cluster 0 primarily consists of longer, family-friendly movies rated TV-14, while Cluster 1 contains shorter, mature movies rated TV-MA. Cluster 2 represents a diverse range of TV-MA movies with moderate durations, and Cluster 3 consists of longer, adult-oriented movies rated R. These findings offer insight into Netflix's content strategy, showing how the catalog serves different audience segments by content type and viewer preference. The results suggest that Netflix can use these clustering patterns to improve its recommendation system and content acquisition strategy. However, the study is limited by the absence of user-specific data and its reliance on basic metadata features. Future research could integrate additional features such as user ratings and apply deep learning techniques for more sophisticated clustering.
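The abstract names the pipeline steps but not their implementation, so the following is only a minimal sketch of how such a workflow might look, assuming pandas and scikit-learn. The data are synthetic stand-ins for the catalog, and the column names (`duration_min`, `release_year`, `rating`), the median and "Unknown" fill rules, and the Ward linkage are all illustrative assumptions, not the authors' choices. Only the number of clusters (four) comes from the abstract.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

NUMERIC = ["duration_min", "release_year"]  # hypothetical column names
CATEGORICAL = ["rating"]                    # hypothetical column name

# Synthetic stand-in data; the real study used 5,185 Netflix movies.
rng = np.random.default_rng(0)
n = 200
movies = pd.DataFrame({
    "duration_min": rng.normal(100, 25, n).round(),
    "release_year": rng.integers(1990, 2022, n).astype(float),
    "rating": rng.choice(["TV-14", "TV-MA", "R", "PG-13"], n),
})
movies.loc[::25, "duration_min"] = np.nan  # simulate missing durations
movies.loc[::40, "rating"] = np.nan        # simulate missing ratings


def fill_missing(df):
    """Handle missing values (assumed rule: median / 'Unknown')."""
    df = df.copy()
    for col in NUMERIC:
        df[col] = df[col].fillna(df[col].median())
    df[CATEGORICAL] = df[CATEGORICAL].fillna("Unknown")
    return df


def encode_and_scale(df):
    """Standardize numeric features and one-hot encode categorical ones."""
    numeric = StandardScaler().fit_transform(df[NUMERIC])
    dummies = pd.get_dummies(df[CATEGORICAL], dtype=float).to_numpy()
    return np.hstack([numeric, dummies])


clean = fill_missing(movies)
X = encode_and_scale(clean)

# Four clusters, as reported in the abstract.
kmeans_labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
hier_labels = AgglomerativeClustering(n_clusters=4, linkage="ward").fit_predict(X)

# Agreement between the two partitions (1.0 means identical groupings).
agreement = adjusted_rand_score(kmeans_labels, hier_labels)

# Per-cluster profile of the kind the abstract describes in words.
profile = clean.assign(cluster=kmeans_labels).groupby("cluster").agg(
    n_movies=("duration_min", "size"),
    mean_duration=("duration_min", "mean"),
    mean_year=("release_year", "mean"),
    top_rating=("rating", lambda s: s.mode().iat[0]),
)
print(profile)
print(f"Adjusted Rand index (K-means vs. hierarchical): {agreement:.2f}")
```

Profiling each cluster by mean duration, mean release year, and most common rating mirrors the way the abstract characterizes Clusters 0 to 3. On random synthetic data the resulting groups carry no real meaning; the sketch only illustrates the mechanics.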

How to Cite

[1]

B. H. Hayadi and E. Priyanto, “Clustering Netflix Shows Based on Features Using K-means and Hierarchical Algorithms to Identify Content Patterns”, Int. J. Appl. Inf. Manag., vol. 5, no. 2, pp. 98–110, Jul. 2025.

Issue

Vol. 5 No. 2 (2025): Regular Issue: July 2025

Section

Articles

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Authors who publish with International Journal for Applied Information Management agree to the following terms: Authors retain copyright and grant the International Journal for Applied Information Management right of first publication, with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike License (CC BY-SA 4.0) that allows others to share (copy and redistribute the material in any medium or format) and adapt (remix, transform, and build upon the material) the work for any purpose, even commercially, with an acknowledgement of the work's authorship and initial publication in International Journal for Applied Information Management. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in International Journal for Applied Information Management. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).

2776-8007 (Online)
Published by: Bright Institute
Website: ijaim.net
Email: agung@ijaim.net (managing editor); support@ijaim.net (technical issues)
