Hybrid cluster analysis of customer segmentation of sea transportation users.

Author: Cahyana, Bambang Eka
  1. Introduction

    As a maritime nation and the largest archipelagic country in the world, Indonesia possesses a vast sea area that offers abundant resources. On the one hand, marine resources are valuable assets that can be used for the prosperity of the nation. On the other hand, islands separated by sea make inter-island travel a challenge for Indonesian citizens. As most of the Indonesian population belongs to the lower-middle economic level, affordable sea transport is preferred over air transport, which is relatively more expensive.

    Responding to the high demand for sea transport, PT Pelindo I offers inter-island sea transportation services in regions including South Sumatera, Riau and Aceh. The company has 16 ports, including passenger ports and container ports, ranging from prime class to Class V. PT Pelindo I thus plays a key role in the economic and social activities of the surrounding communities. As a service provider, it is important for PT Pelindo I to keep improving its service quality. In this research, customer satisfaction with the company's service was measured using a customer satisfaction questionnaire.

    In addition, customer segmentation is a major focus of this research, so a segmentation analysis was also performed. This is in line with market segmentation theory, which stresses the importance of analyzing consumer characteristics (Kotler and Keller, 2013). Evaluating customer segmentation is necessary because the results give PT Pelindo I valuable insights for making managerial decisions.

    The characteristics of the population in this research had to be determined comprehensively. As few research studies involve only a single variable (Solimun, 2010), multivariate analysis, in the form of cluster analysis, was used to analyze the population characteristics.

    Cluster analysis is a multivariate statistical procedure that groups entities into relatively homogeneous groups (Latan, 2014). The segmentation facilitates the exploration and interpretation of the measured characteristics. This allows the questionnaire results to serve as a representative reference for measuring the customer characteristics of PT Pelindo I.

    Prior research motivated the researcher to apply hybrid cluster analysis in determining the customer segmentation of PT Pelindo I. This method allowed the researcher to perform hierarchical and non-hierarchical cluster analysis together (Sukbekti, 2017). Hence, clusters can be determined from the results of the analysis based on customer characteristics. Further, the results of this analysis can serve as a benchmark for PT Pelindo I in improving its service quality to match the characteristics and needs of its customers.

  2. Materials and methods

    2.1 Cluster analysis

    Cluster analysis is a statistical technique that sorts research objects or variables into groups whose members are considered similar in properties and characteristics (Hair et al., 2010). In practice, cluster analysis is used to segment consumers (respondents) into several groups (clusters) based on the similarity of their attributes.

    Cluster analysis in this research was done to group research objects based on the similarity of their characteristics. An ideal clustering offers:

    Internal homogeneity (within a cluster): similarities among the members of a cluster.

    External heterogeneity (between clusters): differences between clusters.
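    One way to make these two properties concrete is to compare the average distance between objects that share a cluster with the average distance between objects in different clusters. The following is a minimal sketch under stated assumptions, not the measure used in this research: the two clusters are made-up illustrative points, Euclidean distance is assumed, and the function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_within_distance(clusters):
    """Average distance between pairs of objects that share a cluster."""
    dists = [euclidean(p, q)
             for members in clusters
             for p, q in combinations(members, 2)]
    return sum(dists) / len(dists)


def mean_between_distance(clusters):
    """Average distance between pairs of objects from different clusters."""
    dists = [euclidean(p, q)
             for c1, c2 in combinations(clusters, 2)
             for p in c1 for q in c2]
    return sum(dists) / len(dists)


# Illustrative (made-up) data: two compact, well-separated clusters.
clusters = [
    [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)],
    [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)],
]
within = mean_within_distance(clusters)
between = mean_between_distance(clusters)
print(f"within: {within:.3f}, between: {between:.3f}")
```

    A small within-cluster average together with a large between-cluster average is what the two criteria above describe.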

    There are several methods of segmentation in cluster analysis as follows:

    Hierarchy method: segmentation starts from the two or more objects that are most similar and then proceeds to the other objects, forming a kind of "tree" that shows a clear level (hierarchy) between objects, from the most similar to the least similar. The diagram used to display this hierarchical process is called a "dendrogram."

    Non-hierarchy method: segmentation starts by determining the intended number of clusters (two, three, etc.). Once the number of clusters is determined, the clustering process is carried out without following the hierarchy process. This method is commonly called the "K-means cluster." K-means clustering is effective and efficient for grouping large numbers of objects (more than 100).

    Hybrid method: a combination of the hierarchical and non-hierarchical methods that draws on the advantages of both to determine the best clusters.

    2.2 Hybrid cluster method

    2.2.1 K-means method. K-means clustering is a statistical analysis that is useful for grouping many objects into groups based on certain variables when the background characteristics of the objects are not clearly defined (Hair et al., 2010).

    K-means is a clustering algorithm that divides research data into groups. The algorithm accepts data without class labels as input. The steps of the clustering algorithm are summarized as follows:

    1. Randomly determine the initial centroids according to the intended number of final clusters.

    2. Assign each object to a cluster based on its closeness to the centroids.

    3. After all objects have been grouped into clusters, recalculate the centroid of each cluster.

    4. Repeat (iterate) Steps 2 and 3 until the centroid of every cluster converges.

    The clustering process in this research was conducted by measuring the closeness of each data point to its centroid. The Minkowski distance can be used to measure the distance between two data points as follows:

    $$d(x_i, x_j) = \left( \sum_{k=1}^{p} \left| x_{ik} - x_{jk} \right|^{g} \right)^{1/g}$$ (1)

    where:

    g = 1 gives the Manhattan distance; g = 2 gives the Euclidean distance; g → ∞ gives the Chebyshev distance; $x_i$, $x_j$ = the two data points whose distance is to be measured, with $x_{ik}$ the k-th coordinate of $x_i$; and p = the dimension of the data.
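    The three special cases of the Minkowski distance can be checked on a small example. The following is a minimal sketch in Python; the function name and the two sample points are assumptions made for illustration and are not taken from the questionnaire data.

```python
import math


def minkowski_distance(x_i, x_j, g):
    """Minkowski distance of order g between two points of dimension p."""
    if len(x_i) != len(x_j):
        raise ValueError("both data points must have the same dimension p")
    diffs = [abs(a - b) for a, b in zip(x_i, x_j)]
    if math.isinf(g):
        # Limit g -> infinity: the largest coordinate difference (Chebyshev).
        return max(diffs)
    return sum(d ** g for d in diffs) ** (1.0 / g)


# Hypothetical two-dimensional points (p = 2).
a = (1.0, 2.0)
b = (4.0, 6.0)

manhattan = minkowski_distance(a, b, 1)          # |1-4| + |2-6|
euclidean = minkowski_distance(a, b, 2)          # sqrt(3^2 + 4^2)
chebyshev = minkowski_distance(a, b, math.inf)   # max(3, 4)
print(manhattan, euclidean, chebyshev)
```

    For these points the coordinate differences are 3 and 4, so the Manhattan, Euclidean and Chebyshev distances are 7, 5 and 4, respectively.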

    The centroid point can be updated using the following formula.

    $$u_k = \frac{1}{N_k} \sum_{q=1}^{N_k} x_q$$ (2)

    where:

    $u_k$ = centroid point of cluster-K;

    $N_k$ = the number of data in cluster-K; and

    $x_q$ = data number-q in cluster-K.
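    The K-means steps and the centroid update can be sketched together in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure or software used in this research: it assumes squared Euclidean distance (g = 2), fixes the random seed for repeatability, stops when cluster assignments no longer change (which implies the centroids no longer move), and runs on six made-up points. All function names are hypothetical.

```python
import random


def squared_euclidean(p, q):
    """Squared Euclidean distance; enough for finding the nearest centroid."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Centroid update: coordinate-wise mean of the N_k points in a cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(data, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct observations at random as initial centroids.
    centroids = rng.sample(data, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every object to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_euclidean(x, centroids[c]))
            for x in data
        ]
        # Step 4: stop once the assignments (and so the centroids) are stable.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return labels, centroids


# Made-up data: three points near (1, 1.5) and three near (8.5, 8.5).
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(data, k=2)
print(labels, centroids)
```

    Because the initial centroids are random, different seeds can number the clusters differently; on data as well separated as this, the grouping itself stays the same.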

    2.2.2 Hierarchy method. Hierarchical clustering is a well-known clustering algorithm. The technique builds a sequence of partitions by:

    1. treating each research object as its own initial cluster;

    2. measuring the distance between every pair of clusters;

    3. combining the two clusters with the smallest distance into one cluster; and

    4. repeating Steps 2 and 3 until all objects are grouped into one single cluster.

    The overall results of the hierarchical clustering algorithm are displayed in a tree-shaped graphic called a dendrogram. The dendrogram is obtained by successively joining each cluster with the cluster closest to it until a single cluster is formed. In this research, the average linkage method was used.

    The average linkage method defines the distance between two clusters as the average distance between all members of one cluster and all members of the other cluster, based on the following formula (Hair et al., 2010):

    $$d_{IK} = \frac{1}{N_I N_K} \sum_{a} \sum_{b} d_{ab}$$ (3)

    where:

    $d_{IK}$ = average linkage distance between cluster-I and cluster-K; $d_{ab}$ = distance between object a in cluster-I and object b in cluster-K; and $N_I$, $N_K$ = the number of items in cluster-I and cluster-K, respectively.
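    The hierarchical steps combined with average linkage can be sketched as follows. This is a minimal illustration under stated assumptions rather than the software procedure used in this research: it assumes Euclidean distance between objects, uses four made-up points, and records each merge as a tuple so that the list of merges carries the information a dendrogram would display. The function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def average_linkage(cluster_i, cluster_k, points):
    """Mean of all distances d_ab with a in cluster I and b in cluster K."""
    total = sum(euclidean(points[a], points[b])
                for a in cluster_i for b in cluster_k)
    return total / (len(cluster_i) * len(cluster_k))


def agglomerate(points):
    # Every object starts as its own cluster (stored as a tuple of indices).
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Measure the distance between every pair of clusters and
        # select the pair with the smallest average linkage distance.
        i, k = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: average_linkage(
                       clusters[pair[0]], clusters[pair[1]], points))
        dist = average_linkage(clusters[i], clusters[k], points)
        merges.append((clusters[i], clusters[k], dist))
        # Replace the two selected clusters with their union, then repeat.
        merged = clusters[i] + clusters[k]
        clusters = [c for j, c in enumerate(clusters) if j not in (i, k)]
        clusters.append(merged)
    return merges


# Made-up data: two pairs of nearby points, far from each other.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
merges = agglomerate(points)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

    Reading the merges from first to last corresponds to reading a dendrogram from the bottom up: the merge distance grows as less similar clusters are joined.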

    2.2.3 Hybrid method. The...
