“K-means is a widely used method in cluster analysis. In my understanding, this method does NOT require ANY assumptions, i.e., give me a data set and a pre-specified number of clusters, k, then I just apply this algorithm which minimize the SSE, the within cluster square error.” – David Robinson, Data Scientist at StackOverflow.
Introduction
K-means clustering is one of the most reliable unsupervised learning data mining algorithms, used as a workaround to the famous clustering problem that had previously been considered computationally hard. The algorithm was initially proposed by Hugo Steinhaus in 1956 and by Stuart Lloyd in 1957, and the name "k-means" was coined by James MacQueen in 1967. Since then, the k-means clustering algorithm has been used to solve a variety of computational problems, such as minimizing the sum of squared errors (SSE) in regression analysis, graphical image segmentation, building recommender engines, and maintaining efficient software features that provide public safety and anti-theft protection over the Internet.
In this article, we'll discuss how to use the k-means clustering algorithm to implement a simple recommender engine that performs user-to-item collaborative filtering, recommending articles published on a social media website to a community of users according to their personal preferences.
Background
In this section, we'll provide an understanding of the basic representation of the classical k-means clustering algorithm used to solve many computational problems. In particular, we'll thoroughly discuss the essentials of the k-means clustering procedure as well as its application to producing recommendations.
K-Means Clustering Algorithm Fundamentals
In general, the k-means algorithm provides a solution to the classification problem by splitting a dataset up into $k$ clusters, each containing a number of the most similar data items (or just "observations") grouped around the nearest "mean", which in turn serves as a "prototype" of that cluster. In other words, the k-means clustering algorithm selects items from the entire dataset that have the closest distance to a particular nearest mean and builds a new cluster from them. The procedure selects the most similar items for each nearest mean, whose value is computed beforehand for each new cluster being created.
In this case, for each nearest mean's value we compute a separate cluster containing the data items most similar to that mean. Another important condition of the clustering process is that each item selected by the k-means procedure from the given dataset must have the closest distance to (i.e., be the most similar to) the other items included in the same cluster. The main idea behind k-means clustering is that if two different items have the minimal distance (i.e., the greatest similarity) to the same nearest mean, then the actual distance between those items also tends to a minimum.
The value of each item in the given dataset is basically represented by a set of attributes that describe that item. The set of attributes for a given item normally has a numerical representation, in which each attribute is a real value. That's why the entire set of attributes associated with a data item is denoted as a vector $\vec{x}_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ in $n$-dimensional Euclidean space, where $n$ is the actual number of attributes that describe the data item.
In turn, the value of the nearest mean computed for each new cluster is also represented as a numerical real vector of attributes $\vec{u} = (u_1, u_2, \ldots, u_n)$ having a particular distance to the vector of attributes of each item in the $n$-dimensional Euclidean space. The vector of attributes of a nearest mean contains the attribute values that are the most common to all items in the cluster. As we've already mentioned above, the nearest mean itself provides a generalization of the items selected into a cluster and serves as a prototype of the entire cluster. For example, if we have two items belonging to the same cluster, each with its own vector of attributes, then the vector of attributes $\vec{u}$ associated with the nearest mean of these two items consists of values that have the closest distance to the corresponding attribute values of both items.
The distance between a particular data item in the given dataset and the nearest mean is computed using an objective function, such as the Euclidean or Manhattan (taxicab) distance, whose value normally represents the similarity between an item and the nearest mean based on the distance between their vectors of attributes. For the k-means clustering algorithm, the Euclidean distance function is most preferable compared to the other distance functions. The Euclidean distance is normally computed as the square root of the sum of squares of the partial distances between the particular attributes that describe the nearest mean and an item, and can be formulated as follows:
$$d(\vec{u}, \vec{x}_i) = \sqrt{\sum_{j=1}^{n} \left( x_{ij} - u_j \right)^2} \qquad (1)$$
where $d$ is the distance between the vectors of attributes of the nearest mean and of an item in a cluster; $x_{ij}$ is the value of attribute $j$ of item $i$; $u_j$ is the value of attribute $j$ of the current nearest mean; $n$ is the actual number of attributes.
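As a quick illustration, here's a minimal C# sketch of formula (1); the method name and the plain double[] representation of the attribute vectors are illustrative assumptions, not part of the engine's code shown later:

using System;

class DistanceSketch
{
    // A minimal sketch of formula (1): the Euclidean distance between an
    // item's vector of attributes and a centroid's vector of attributes.
    public static double Euclidean(double[] item, double[] centroid)
    {
        double sum = 0;
        // Accumulate the squared partial distances over all attributes.
        for (int j = 0; j < item.Length && j < centroid.Length; j++)
            sum += Math.Pow(item[j] - centroid[j], 2.0);
        return Math.Sqrt(sum);
    }
}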
Basically, there's a variety of methods by which we can compute the nearest mean's vector of attributes for each particular cluster. In most cases, the nearest mean's attribute vector is computed as the vector that represents the center of mass of a certain subset of items. This center of mass is also called the "centroid" of a newly built cluster. Particularly, the centroid's vector of attributes is obtained by computing the average ("mean") of the attribute values of all items within a cluster. To compute the average vector of attributes, we perform by-component addition of the attribute vectors of all items in the cluster. The following formula illustrates the computation of a single centroid attribute value, which is one component of the centroid's vector:
$$u_j = \frac{1}{m} \sum_{i=1}^{m} x_{ij} \qquad (2)$$
where $u_j$ is the value of attribute $j$ in the vector of attributes of the given centroid; $x_{ij}$ is the value of attribute $j$ of item $i$; $m$ is the actual number of items in the cluster.
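The following short C# sketch performs the by-component averaging of formula (2); again, the names used here are illustrative assumptions:

using System;
using System.Collections.Generic;

class CentroidSketch
{
    // A sketch of formula (2): each centroid attribute u_j is the mean of
    // attribute j over all m items of the cluster.
    public static double[] Compute(IList<double[]> items, int nAttribs)
    {
        double[] centroid = new double[nAttribs];
        for (int j = 0; j < nAttribs; j++)
        {
            double sum = 0;
            foreach (double[] item in items)
                sum += item[j];               // by-component addition
            centroid[j] = sum / items.Count;  // average ("mean") value
        }
        return centroid;
    }
}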
Generally, the step-by-step k-means clustering procedure can be formulated as follows:
0. Build an initial cluster from the entire dataset of items: Before we begin, we need to assign all items in the given dataset to an initial cluster and append it to the array of clusters. We also have to select a set of multiple centroids for the initial cluster. Notice that, unlike the succeeding new clusters produced by the k-means procedure, the initial cluster should contain more than one centroid in order to perform efficient clustering;
1. Randomly initialize the centroids of the initial cluster using either the Forgy or the Random Partition method: To perform the initialization, we generate a vector of attributes for each centroid by assigning random values to each of its attributes. At this step we also assign initial values to the loop counter variables;
2. For the current cluster retrieved from the array of clusters, find the items with the minimal distance to each centroid and produce the new clusters: During this step we retrieve a cluster from the array and proceed with the next steps to create one or more new clusters based on the set of centroids belonging to the current cluster;
3. For the current centroid, compute the actual distance to each item using the Euclidean distance formula (1): At this point we compute the distances between the vector of attributes associated with the current centroid and the vector of attributes of each item;
4. Choose the items having the closest distance to the current centroid and produce a new cluster by assigning each item with the minimal distance to the newly created cluster: To do this, we basically use formula (1) to find all items whose vectors of attributes satisfy the minimal-distance criterion with respect to the current centroid's vector of attributes;
5. Compute the new value of the centroid based on formula (2) and assign it to the new cluster produced at step 4: During this essential step we compute the new centroid's vector of attributes as the center of mass, obtained by by-component summation of the attribute vectors of the items contained in the newly built cluster. After that, we assign the new centroid to that cluster. If we cannot compute a new centroid value for the new cluster, go to the next step of the algorithm; otherwise proceed with step 7;
6. Output the newly built cluster: At this point we assume that no further clustering is possible for that particular cluster, which is why we pass it to the output. Similarly, we perform the output for each new cluster that meets the condition checked in the previous step 5. After we've produced the output, go to step 8 and proceed with the next centroid;
7. Append the new cluster obtained at the previous steps to the array of clusters and proceed with the next step of the algorithm: At this step we insert the newly obtained cluster into the array of clusters, which is basically used as a clusters queue during the recursive clustering, in which each cluster in the array is separated into a number of new clusters based on the values of one or more of its centroids. Finally, we proceed with the next step of the algorithm;
8. Select the next centroid for which we'll be performing the actual clustering, and check whether this is the final centroid of the current cluster retrieved from the array of clusters: If so, proceed with step 9; otherwise perform steps 3-7 to produce the rest of the new clusters for each centroid of the current cluster;
9. Select the next cluster from the array of clusters and check whether this is the final cluster in the clusters queue, or whether there's no way to re-compute a new centroid for the currently processed cluster: If so, terminate the k-means clustering process; otherwise return to step 2;
Fig. 1. The k-means clustering process producing three new clusters based on their centroids; each item is represented by a vector of attributes $(x_1, x_2)$ in two-dimensional Euclidean space.
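For comparison, here is a compact C# sketch of a conventional Lloyd-style k-means iteration (an assignment step followed by an update step). It is a simplified illustration under assumed helper names, reusing the DistanceSketch and CentroidSketch helpers from the sketches above; it is not the recursive, queue-based variant described in steps 0-9:

using System;
using System.Collections.Generic;
using System.Linq;

class LloydSketch
{
    // A simplified Lloyd-style k-means loop, shown only to contrast with
    // the recursive, queue-based procedure described in steps 0-9 above.
    public static int[] Cluster(double[][] items, double[][] centroids, int maxIter)
    {
        int[] assignment = new int[items.Length];
        for (int iter = 0; iter < maxIter; iter++)
        {
            // Assignment step: attach each item to its nearest centroid.
            for (int i = 0; i < items.Length; i++)
            {
                int best = 0; double bestDist = double.MaxValue;
                for (int c = 0; c < centroids.Length; c++)
                {
                    double d = DistanceSketch.Euclidean(items[i], centroids[c]);
                    if (d < bestDist) { bestDist = d; best = c; }
                }
                assignment[i] = best;
            }
            // Update step: recompute each centroid as the mean of its members.
            for (int c = 0; c < centroids.Length; c++)
            {
                List<double[]> members =
                    items.Where((item, i) => assignment[i] == c).ToList();
                if (members.Count > 0)
                    centroids[c] = CentroidSketch.Compute(members, centroids[c].Length);
            }
        }
        return assignment;
    }
}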
Using K-Means Clustering to Produce Recommendations
Beyond the classical k-means clustering algorithm itself, in this article we will explain the algorithm through an example: implementing a simple recommender engine that recommends articles to the users who visit a social media website.
To produce recommendations, the k-means procedure performs clustering over two types of objects – "users" and "items" (i.e., articles) – each associated with its own vector of attributes. In particular, the k-means recommender engine implements the user-to-item collaborative filtering approach and recommends a certain number of items to a specific user according to that user's personal preferences. By performing clustering, the k-means procedure actually "filters" a large number of articles, described by attributes whose values correspond to the set of preferences of a particular user. The main idea of the k-means clustering procedure is that it attempts to find the similarity between user and item objects that have different semantics but reside in the same $n$-dimensional Euclidean space, where the correlation between them is represented by the actual distance between the users' and items' vectors of attributes. Those vectors of attributes are used as sets of simple latent parameters by which we establish the relationship between the user and item objects. The similarity measure, represented by the Euclidean distance between a user's and an item's vectors of attributes, is computed based on the most similar attributes (i.e., those having the closest distances to one another). In other words, we obtain the similarity between users and items by computing the sum of squares of partial distances between particular attributes. This process generally parallels either the sum-of-squared-errors (SSE) minimization performed with the ordinary least squares (OLS) method in regression analysis, or the singular value decomposition (SVD) computation in latent semantic analysis, both of which are capable of providing user-to-item recommendations.
To do this, the k-means clustering algorithm aims to find the items that are most similar to a specific user, whose preferences are represented by the user's own vector of attributes, and computes clusters in which that user, associated with a number of items, serves as the nearest "mean" (or "centroid") of the entire cluster. In this particular case, the similarity of users and items is represented by the Euclidean distance (i.e., "metric") between the two vectors of the user's and the item's attributes respectively. For that purpose, we use slightly different formulas to perform the similarity computation:
$$d_w(\vec{u}, \vec{x}_i) = \sqrt{\sum_{j=1}^{n} r_j \left( x_{ij} - u_j \right)^2} \qquad (3)$$
where $d_w$ is the weighted measure of similarity between a user and an item; $x_{ij}$ is the value of attribute $j$ of item $i$; $u_j$ is the value of attribute $j$ of the specific user; $r_j$ is the coefficient of relevance between attribute $j$ of the user's and item's vectors of attributes; $n$ is the actual number of attributes.
Unlike the conventional Euclidean distance formula (1) discussed above, which computes the plain distance between two vectors of attributes, this measure of similarity is obtained as a weighted distance: a sum of squares of partial distances between the user's and the item's particular attributes, each multiplied by a specific weight coefficient $r_j$. In this case we deal with a distance-based similarity whose value depends on more than one parameter. In general, the weight of a specific attribute in formula (3) can itself be represented as a sum of extra parameters, $r_j = \sum_q p_{jq}$, where $p_{jq}$ is an extra parameter of attribute $j$ for a specific user or item.
In our case, besides the partial distance between the user's and item's attributes, we use a single extra parameter – the relevance between those attributes. Since we implement a contextual data model to produce recommendations, we employ the measure of lexicographical relevance as this parameter of formula (3):
$$r_j = \frac{1}{L} \sum_{p=1}^{L} \big[\, c_p^{\,item} = c_p^{\,user} \,\big] \qquad (4)$$
where $r_j$ is the value of the lexicographical relevance between attribute $j$ of a specific user and item; $c_p^{\,item}$ is the character at position $p$ in the string representation of attribute $j$ of the item; $c_p^{\,user}$ is the character at position $p$ in the string representation of attribute $j$ of the user; $L$ is the actual length of the smaller of the two strings representing attribute $j$; and $[\cdot]$ equals 1 when the condition holds and 0 otherwise.
To compute the relevance of the user's and item's attributes, we normally use a trivial linear search algorithm: we iterate over the two strings that represent attribute $j$ and, for each position $p$, compare the characters located at that position. If the characters $c_p^{\,item}$ and $c_p^{\,user}$ exactly match, we add the value $1/L$ to the total estimated relevance $r_j$. The parameter $L$ is the actual length of the smaller of the two strings representing the user's or item's attribute $j$. More about the lexicographical representation of users' and items' attributes, as an essential part of the contextual data representation model, is discussed in the next section of this article.
Also, to provide a more flexible distance-based similarity computation using formulas (3) and (4), it's recommended to modify formula (3) with another extra parameter $k$, which represents the actual number of the user's and item's attributes that exactly match (i.e., have relevance $r_j = 1$). The main reason we use this parameter in formula (3) is that the similarity of a specific user and item, represented by the actual distance between their vectors of attributes, largely depends on the number of attributes that are lexicographically equal. In this case, formula (3) takes the following form:
$$d_w(\vec{u}, \vec{x}_i) = \frac{1}{k} \sqrt{\sum_{j=1}^{n} r_j \left( x_{ij} - u_j \right)^2} \qquad (5)$$
where $k$ is the number of attributes that exactly match.
A large number of lexicographically matching attributes ensures that the Euclidean distance between the two vectors of the user's and item's attributes is typically small, and the measure of similarity between that user and item accordingly grows. However, there's a special case in which a specific user and item are entirely dissimilar (i.e., their vectors of attributes don't have any attributes that exactly match) and the value of the extra parameter $k$ is zero. In that case, we assume the value of $k$ to be 0.01, which makes the resulting distance between the user's and item's vectors of attributes large. In other words, using this value of $k$ yields a typically large distance between a dissimilar user and item, so that the most dissimilar items never end up with a closer distance to a specific user than items having a greater distance-based similarity.
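For instance, for the attribute strings "C# " and "C++ " the shorter length is $L = 3$ and only the first characters match, so the relevance is $1/3$. A condensed C# sketch of formula (4) follows; it mirrors the Relevance() method of the full listing below:

using System;

class RelevanceSketch
{
    // Formula (4): the fraction of positions, taken over the shorter
    // string's length, at which the two attribute strings hold the same
    // character. A value of 1.0 means the strings match over the entire
    // compared length.
    public static double Relevance(string a, string b)
    {
        int len = Math.Min(a.Length, b.Length);
        double rel = 0;
        for (int p = 0; p < len; p++)
            rel += (a[p] == b[p]) ? 1.0 / len : 0;
        return rel;
    }
}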
Using the code
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace KMeans
{
class Attrib : ICloneable
{
private string m_Name;
private double m_Value;
public Attrib(string name, double value)
{
Name = name; Value = value;
}
public object Clone()
{
Attrib TargetAttrib = (Attrib)this.MemberwiseClone();
TargetAttrib.Name = Name; TargetAttrib.Value = Value;
return TargetAttrib;
}
public string Name
{
get { return m_Name; }
set { m_Name = value; }
}
public double Value
{
get { return m_Value; }
set { m_Value = value; }
}
}
class IAttribList : ICloneable, IEnumerable<Attrib>
{
private List<Attrib> m_AttribList = null;
public IAttribList()
{
if (m_AttribList == null)
m_AttribList = new List<Attrib>();
}
public void Add(Attrib attrib_item)
{
m_AttribList.Add((Attrib)attrib_item.Clone());
}
public object Clone()
{
IAttribList TargetAttribList = new IAttribList();
foreach (Attrib attrib in m_AttribList)
TargetAttribList.Add(attrib);
// Return the copied list directly; calling Clone() on the copy here
// would recurse endlessly for any non-empty list.
return TargetAttribList.Count() > 0 ? TargetAttribList : null;
}
public Attrib this[int iIndex]
{
get { return m_AttribList[iIndex]; }
set { m_AttribList[iIndex] = value; }
}
public int Count() { return m_AttribList.Count(); }
public IEnumerator GetEnumerator()
{
return m_AttribList.GetEnumerator();
}
IEnumerator<Attrib> IEnumerable<Attrib>.GetEnumerator()
{
return (IEnumerator<Attrib>)this.GetEnumerator();
}
}
class Item : ICloneable
{
private bool m_IsUser;
private bool m_Exists;
private string m_ItemText;
private double m_Distance;
private IAttribList m_AttribList = null;
public Item(string item_text, IAttribList attributes,
double distance, bool is_user, bool exists)
{
if (AttribList == null)
AttribList = (IAttribList)attributes;
IsUser = is_user; Exists = exists;
ItemText = item_text; Distance = distance;
}
public IAttribList GetAttribList()
{
return m_AttribList;
}
public object Clone()
{
Item TargetItem = (Item)this.MemberwiseClone();
IAttribList TargetAttribList = new IAttribList();
foreach (Attrib attrib in this.AttribList)
TargetAttribList.Add(attrib);
if (TargetAttribList.Count() > 0)
TargetItem.AttribList = (IAttribList)TargetAttribList.Clone();
TargetItem.IsUser = this.IsUser; TargetItem.Exists = this.Exists;
TargetItem.ItemText = this.ItemText; TargetItem.Distance = this.Distance;
return TargetItem;
}
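// Computes the lexicographical relevance of two attribute strings
// (formula 4): the fraction of character positions, taken over the
// shorter string's length, at which both strings match exactly.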
private double Relevance(string attrib1, string attrib2)
{
double nRelevance = 0;
int nLength = attrib1.Length < attrib2.Length ? attrib1.Length : attrib2.Length;
for (int iIndex = 0; iIndex < nLength; iIndex++)
nRelevance += (attrib1[iIndex] == attrib2[iIndex]) ? (double)1 / nLength : 0;
return nRelevance;
}
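// Computes the weighted distance between this item and another one
// (formulas 3 and 5): each squared partial distance is weighted by the
// attributes' relevance, and the square root is divided by the number
// of exactly matching attributes nCount (or by 0.01 when none match).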
public double EuclDW(Item item)
{
int nCount = 0;
int iIndex = 0; double nDistance = 0;
while (iIndex < item.AttribList.Count() && iIndex < AttribList.Count())
{
double nRel = Relevance(item.AttribList[iIndex].Name, AttribList[iIndex].Name);
if (nRel == 1) nCount++;
nDistance+= Math.Pow(item.AttribList[iIndex].Value - AttribList[iIndex].Value, 2.0) *
((double)((nRel > 0) ? nRel : 1));
iIndex++;
}
return Math.Sqrt(nDistance) * ((double)1 / ((nCount > 0) ? nCount : 0.01));
}
public string ItemText
{
get { return m_ItemText; }
set { m_ItemText = value; }
}
public double Distance
{
get { return m_Distance; }
set { m_Distance = value; }
}
public bool IsUser
{
get { return m_IsUser; }
set { m_IsUser = value; }
}
public bool Exists
{
get { return m_Exists; }
set { m_Exists = value; }
}
public IAttribList AttribList
{
get { return m_AttribList; }
set { m_AttribList = value; }
}
}
class IItemsList : ICloneable, IEnumerable<Item>
{
private List<Item> m_ItemsList = null;
public IItemsList()
{
if (m_ItemsList == null)
m_ItemsList = new List<Item>();
}
public void Add(Item item)
{
m_ItemsList.Add(item);
}
public object Clone()
{
IItemsList TargetItems = new IItemsList();
foreach (Item item in m_ItemsList)
{
IAttribList TargetAttribList = new IAttribList();
foreach (Attrib attrib in item.GetAttribList())
TargetAttribList.Add(new Attrib(attrib.Name, attrib.Value));
if (TargetAttribList.Count() > 0)
TargetItems.Add(new Item(item.ItemText, TargetAttribList,
item.Distance, item.IsUser, item.Exists));
}
return TargetItems;
}
public Item this[int iIndex]
{
get { return m_ItemsList[iIndex]; }
set { m_ItemsList[iIndex] = value; }
}
public int Count() { return m_ItemsList.Count(); }
public void RemoveAt(int iIndex) { m_ItemsList.RemoveAt(iIndex); }
public IEnumerator GetEnumerator()
{
return m_ItemsList.GetEnumerator();
}
IEnumerator<Item> IEnumerable<Item>.GetEnumerator()
{
return (IEnumerator<Item>)this.GetEnumerator();
}
}
class ICluster : ICloneable
{
private IItemsList m_Items;
private IItemsList m_Centroids;
public ICluster(IItemsList cnts_list, IItemsList items_list)
{
Items = (IItemsList)items_list;
Centroids = (IItemsList)cnts_list;
}
public object Clone()
{
IItemsList TargetItemsList = new IItemsList();
IItemsList TargetCentroidsList = new IItemsList();
ICluster TargetCluster = (ICluster)this.MemberwiseClone();
foreach (Item centroid in Centroids)
TargetCentroidsList.Add(centroid);
foreach (Item item in Items)
TargetItemsList.Add(item);
TargetCluster.Items = TargetItemsList;
TargetCluster.Centroids = TargetCentroidsList;
return TargetCluster;
}
public IItemsList Items
{
get { return (IItemsList)m_Items; }
set { m_Items = (IItemsList)value; }
}
public IItemsList Centroids
{
get { return (IItemsList)m_Centroids; }
set { m_Centroids = (IItemsList)value; }
}
}
class IClustersList : ICloneable, IEnumerable<ICluster>
{
private List<ICluster> m_ClustersList = null;
public IClustersList()
{
if (m_ClustersList == null)
m_ClustersList = new List<ICluster>();
}
public void Add(ICluster cluster)
{
m_ClustersList.Add(cluster);
}
public object Clone()
{
IClustersList TargetClustersList = new IClustersList();
foreach (ICluster cluster in m_ClustersList)
{
IItemsList TargetCentroidsList = new IItemsList();
foreach (Item centroid in (IItemsList)cluster.Centroids.Clone())
TargetCentroidsList.Add(new Item(centroid.ItemText, (IAttribList)centroid.AttribList.Clone(),
centroid.Distance, centroid.IsUser, centroid.Exists));
IItemsList TargetItemsList = new IItemsList();
foreach (Item item in (IItemsList)cluster.Items.Clone())
TargetItemsList.Add(new Item(item.ItemText, (IAttribList)item.AttribList.Clone(),
item.Distance, item.IsUser, item.Exists));
TargetClustersList.Add(new ICluster((IItemsList)TargetCentroidsList.Clone(),
(IItemsList)TargetItemsList.Clone()));
}
return TargetClustersList;
}
public ICluster this[int iIndex]
{
get { return m_ClustersList[iIndex]; }
}
public int Count() { return m_ClustersList.Count(); }
public IEnumerator GetEnumerator()
{
return m_ClustersList.GetEnumerator();
}
IEnumerator<ICluster> IEnumerable<ICluster>.GetEnumerator()
{
return (IEnumerator<ICluster>)this.GetEnumerator();
}
}
class KMeans
{
private IItemsList m_Items;
private IItemsList m_Users;
private IClustersList m_Clusters;
private IClustersList m_TargetClusters;
private readonly System.Random rnd = new System.Random();
private double m_MinVal = 0;
private double m_MaxVal = 0;
private void Swap(ref IAttribList attribs, int indexA, int indexB)
{
Attrib temp = attribs[indexA];
attribs[indexA] = attribs[indexB];
attribs[indexB] = temp;
}
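// Rescales the attribute values of the loaded items (or users) into a
// common min-max interval, so that the distances computed over
// different vectors of attributes remain comparable.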
private void Normalize(IItemsList items_list, int n_attribs,
bool is_users, ref double min_val, ref double max_val)
{
if (min_val == 0 && max_val == 0)
{
min_val = (double)1 / n_attribs;
max_val = (double)n_attribs / (n_attribs + 1);
}
for (int iItem = 0; iItem < items_list.Count(); iItem++)
{
IAttribList AttribsTarget = items_list[iItem].AttribList;
for (int iAttrib = 0; iAttrib < AttribsTarget.Count(); iAttrib++)
if (AttribsTarget[iAttrib].Value > 1 || is_users == false)
AttribsTarget[iAttrib].Value = ((AttribsTarget[iAttrib].Value /
(n_attribs + 1)) - min_val) / (max_val - min_val) + 0.01;
}
}
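// Loads the articles file, one "<text> => <keywords>" line per article.
// Each token becomes an attribute: a token seen in a previously loaded
// item reuses that item's attribute value, otherwise the running token
// counter nAttrib provides the initial value.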
public int LoadItemsFromFile(string filename)
{
using (System.IO.FileStream fsFile = new System.IO.FileStream(filename,
System.IO.FileMode.Open, System.IO.FileAccess.Read, System.IO.FileShare.Read))
{
using (System.IO.StreamReader fsStream = new System.IO.StreamReader(
fsFile, System.Text.Encoding.UTF8, true, 128))
{
int iItem = 0;
int nAttrib = 1; string textBuf = "\0";
while ((textBuf = fsStream.ReadLine()) != null)
{
if (!String.IsNullOrEmpty(textBuf))
{
IAttribList TargetAttribList = new IAttribList();
string sItemPattern = " => "; string[] sItemTokens;
if ((sItemTokens = Regex.Split(textBuf, sItemPattern)) != null)
{
for (int iToken = 0; iToken < 2; iToken++)
{
string sPattern = " "; string[] sTokens;
if ((sTokens = Regex.Split(sItemTokens[iToken], sPattern)) != null)
{
foreach (string token in sTokens)
{
bool bExists = false; int nToken = 0;
int nIndex = iItem; double nCoeff = 0;
while (--nIndex >= 0 && bExists == false)
{
nToken = 0;
IAttribList attribs = m_Items[nIndex].AttribList;
while (nToken < attribs.Count() && bExists == false)
bExists = (attribs[nToken++].Name.Equals(token)) ? true : false;
}
nCoeff = (bExists == true) ?
m_Items[nIndex + 1].AttribList[nToken - 1].Value : nAttrib;
bool bInAttribList = false; int iAttrib = 0;
while (iAttrib < TargetAttribList.Count() && !bInAttribList)
bInAttribList = (token.Equals(TargetAttribList[iAttrib++].Name)) ? true : false;
if (bInAttribList == false)
TargetAttribList.Add(new Attrib(token, nCoeff));
nAttrib++; }
}
}
}
m_Items.Add(new Item(textBuf, TargetAttribList, 0, false, false));
iItem++; }
}
Normalize(m_Items, nAttrib, false, ref m_MinVal, ref m_MaxVal);
fsStream.Close();
}
fsFile.Close();
}
return m_Items.Count();
}
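// Loads the users file and builds each user's vector of attributes,
// reusing attribute values already assigned to matching tokens among
// the loaded items or the previously loaded users.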
public int LoadUsersFromFile(string filename)
{
using (System.IO.FileStream fsFile = new System.IO.FileStream(filename,
System.IO.FileMode.Open, System.IO.FileAccess.Read, System.IO.FileShare.Read))
{
using (System.IO.StreamReader fsStream = new System.IO.StreamReader(
fsFile, System.Text.Encoding.UTF8, true, 128))
{
int iItem = 0;
int nAttrib = 1; string textBuf = "\0";
while ((textBuf = fsStream.ReadLine()) != null)
{
if (!String.IsNullOrEmpty(textBuf))
{
string sPattern = " => "; string[] sTokens;
IAttribList TargetAttribList = new IAttribList();
if ((sTokens = Regex.Split(textBuf, sPattern)) != null)
{
foreach (string token in sTokens[1].Split(new char[] { ' ' }))
{
bool bExists = false; int nToken = 0;
int nIndex = 0; double nCoeff = 0;
while (nIndex < m_Items.Count() && bExists == false)
{
nToken = 0;
while (nToken < m_Items[nIndex].AttribList.Count() && bExists == false)
bExists = (m_Items[nIndex].AttribList[nToken++].Name.Equals(token)) ? true : false;
nIndex++;
}
if (bExists == false)
{
int nItem = iItem - 1;
bool bUserAttrib = false;
while (nItem >= 0 && bUserAttrib == false)
{
nToken = 0;
while (nToken < m_Users[nItem].AttribList.Count() && !bUserAttrib)
bUserAttrib = (m_Users[nItem].AttribList[nToken++].Name.Equals(token)) ? true : false;
nItem--;
}
nCoeff = (bUserAttrib == true) ? m_Users[nItem + 1].AttribList[nToken - 1].Value : nAttrib;
}
else nCoeff = m_Items[nIndex - 1].AttribList[nToken - 1].Value;
TargetAttribList.Add(new Attrib(token, nCoeff));
nAttrib++; }
m_Users.Add(new Item(textBuf, TargetAttribList, 0, true, false));
iItem++; }
}
}
Normalize(m_Users, nAttrib, true, ref m_MinVal, ref m_MaxVal);
fsStream.Close();
}
fsFile.Close();
}
return m_Users.Count();
}
public KMeans()
{
m_Items = new IItemsList();
m_Users = new IItemsList();
}
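// The core k-means procedure: picks the initial user centroids with a
// k-means++-style distance-weighted sampling, recursively splits the
// clusters queue until each user centroid has converged, and prints
// the per-cluster recommendations along with the clustering statistics.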
public void Compute(int nInitialCentroids, int nItemsPerCluster)
{
m_TargetClusters = new IClustersList();
while (m_TargetClusters.Count() < m_Users.Count())
{
m_Clusters = new IClustersList();
if (nInitialCentroids != m_Users.Count())
{
List<int> CentroidIndexes = new List<int>();
int nInitialCentroid = rnd.Next(0, m_Users.Count());
while (CentroidIndexes.Count() < nInitialCentroids)
{
double nDistance = 0, nDistanceSum = 0;
double nDistanceMin = 0; int nCntMin = -1;
for (int nItem = 0; nItem < m_Users.Count(); nItem++)
{
if (nItem != nInitialCentroid)
{
if ((nDistance = Math.Pow(m_Users[nItem].EuclDW(m_Users[nInitialCentroid]), 2.0)) >= 0)
{
if (nDistance < nDistanceMin || nCntMin == -1)
{
bool bFound = false; int iCntIndex = 0;
while (iCntIndex < CentroidIndexes.Count() && bFound == false)
bFound = (CentroidIndexes[iCntIndex++] == nItem) ? true : false;
if (bFound == false)
{
nDistanceMin = nDistance; nCntMin = nItem;
}
}
nDistanceSum += nDistance;
}
}
}
nDistanceSum = rnd.NextDouble() * nDistanceSum;
int nIndex = 0; double nSum = 0;
while (nIndex < m_Users.Count() && nSum < nDistanceSum)
{
int iTargetIndex = 0; bool bFound = false;
double nDist = Math.Pow(m_Users[nIndex++].EuclDW(m_Users[nCntMin]), 2.0);
while (iTargetIndex < CentroidIndexes.Count() && !bFound)
bFound = (CentroidIndexes[iTargetIndex++] == nIndex) ? true : false;
if (bFound == false)
nSum += nDist;
}
if (nCntMin != -1)
CentroidIndexes.Add(nCntMin);
}
IItemsList CentroidItems = new IItemsList();
for (int iIndex = 0; iIndex < CentroidIndexes.Count(); iIndex++)
CentroidItems.Add(m_Users[CentroidIndexes[iIndex]]);
m_Clusters.Add(new ICluster(CentroidItems, m_Items));
}
else m_Clusters.Add(new ICluster(m_Users, m_Items));
for (int iCluster = 0; iCluster < m_Clusters.Count(); iCluster++)
{
IItemsList Items = (IItemsList)m_Clusters[iCluster].Items.Clone();
IItemsList Centroids = (IItemsList)m_Clusters[iCluster].Centroids.Clone();
for (int iCentroid = 0; iCentroid < Centroids.Count(); iCentroid++)
{
IAttribList attribsA = Centroids[iCentroid].AttribList;
for (int iAttrib = 0; iAttrib < attribsA.Count(); iAttrib++)
for (int iItem = 0; iItem < Items.Count(); iItem++)
{
IAttribList attribsB = Items[iItem].AttribList;
for (int nAttrib = 0; nAttrib < attribsB.Count(); nAttrib++)
if (attribsB[nAttrib].Name.Equals(attribsA[iAttrib].Name))
if (iAttrib < attribsB.Count()) Swap(ref attribsB, iAttrib, nAttrib);
}
double nDistanceAvg = 0;
List<int> ItemsIndexes = new List<int>();
int nNeighbors = Items.Count() < nItemsPerCluster ? Items.Count() : nItemsPerCluster;
bool bDone = false;
while (ItemsIndexes.Count() < nNeighbors && !bDone)
{
nDistanceAvg = 0;
double nDistanceMin = 0; int nItemMin = -1;
for (int iItem = 0; iItem < Items.Count(); iItem++)
{
double nDistance = 0;
if (m_Clusters[iCluster].Items[iItem].Exists == false)
{
Item temp = new Item(null, attribsA, 0, false, false);
if ((nDistance = Items[iItem].EuclDW(temp)) >= 0)
{
if (((nDistance < nDistanceMin) ||
(nItemMin == -1)) && Items[iItem].ItemText != null && nDistance <= 1.0)
{
bool bExists = false;
int nItem = ItemsIndexes.Count() - 1;
while (nItem >= 0 && bExists == false)
bExists = (ItemsIndexes[nItem--] == iItem) ? true : false;
if (bExists == false)
{
nDistanceMin = nDistance; nItemMin = iItem;
}
}
}
}
nDistanceAvg += nDistance / Items.Count();
}
if (nItemMin > -1)
ItemsIndexes.Add(nItemMin);
else bDone = true;
}
for (int iIndex = 0; iIndex < ItemsIndexes.Count(); iIndex++)
m_Clusters[iCluster].Items[ItemsIndexes[iIndex]].Exists = true;
Centroids[iCentroid].Distance = nDistanceAvg;
IItemsList TargetItems = new IItemsList();
for (int iItem = 0; iItem < ItemsIndexes.Count(); iItem++)
{
if (m_Clusters[iCluster].Centroids.Count() <= 1)
Items[ItemsIndexes[iItem]].Distance =
Items[ItemsIndexes[iItem]].EuclDW(m_Clusters[iCluster].Centroids[0]);
Items[ItemsIndexes[iItem]].Exists = false;
TargetItems.Add(Items[ItemsIndexes[iItem]]);
}
int nMinAttribs = -1;
for (int iItem = 0; iItem < TargetItems.Count(); iItem++)
{
int nAttribCount = TargetItems[iItem].AttribList.Count();
if (nAttribCount < nMinAttribs || nMinAttribs == -1)
nMinAttribs = nAttribCount;
}
IAttribList attribs = new IAttribList();
for (int nAttrib = 0; nAttrib < nMinAttribs &&
nAttrib < Centroids[iCentroid].AttribList.Count(); nAttrib++)
{
double nAttribAvg = 0; int nCount = 0;
for (int iItem = 0; iItem < TargetItems.Count(); iItem++)
if (TargetItems[iItem].AttribList[nAttrib].Value > 0)
{
nAttribAvg += TargetItems[iItem].AttribList[nAttrib].Value;
nCount++;
}
attribs.Add(new Attrib(Centroids[iCentroid].AttribList[nAttrib].Name,
((nAttribAvg / (double)(nCount + 1)) - m_MinVal) / (m_MaxVal - m_MinVal) + 0.01));
}
bool bDiff = false; int nIndex = 0;
while (nIndex < attribs.Count() && nIndex < attribsA.Count() && bDiff == false)
bDiff = (attribs[nIndex].Value != attribsA[nIndex++].Value) ? true : false;
if (bDiff == true)
{
IItemsList TargetCentroids = new IItemsList();
TargetCentroids.Add(new Item(Centroids[iCentroid].ItemText,
attribs, nDistanceAvg, Centroids[iCentroid].IsUser, false));
m_Clusters.Add(new ICluster(TargetCentroids, TargetItems));
}
else
{
bool bExists = false; int iTargetCluster = 0;
while (iTargetCluster < m_TargetClusters.Count() && !bExists)
bExists = m_TargetClusters[iTargetCluster++].Centroids[0].
ItemText.Equals(Centroids[iCentroid].ItemText) ? true : false;
if (bExists == false)
{
IItemsList TargetUsers = new IItemsList();
TargetUsers.Add(new Item(Centroids[iCentroid].ItemText,
Centroids[iCentroid].AttribList, Centroids[iCentroid].Distance, true, true));
m_TargetClusters.Add(new ICluster((IItemsList)TargetUsers.Clone(),
(IItemsList)TargetItems.Clone()));
}
}
}
}
}
double nDiAvg = 0;
for (int iCluster = 1; iCluster < m_TargetClusters.Count(); iCluster++)
nDiAvg += m_TargetClusters[iCluster].Centroids[0].Distance / (m_TargetClusters.Count() - 1);
double nD0Avg = 0; int nClustersCount = 0;
for (int iCluster = 1; iCluster < m_TargetClusters.Count(); iCluster++)
for (int nCluster = iCluster + 1; nCluster < m_TargetClusters.Count(); nCluster++)
{
nD0Avg += m_TargetClusters[iCluster].Centroids[0].EuclDW(
m_TargetClusters[nCluster].Centroids[0]);
nClustersCount++;
}
nD0Avg /= nClustersCount; double nQ = ((m_TargetClusters.Count() - 1) * nDiAvg) / nD0Avg;
for (int iCluster = 0; iCluster < m_TargetClusters.Count(); iCluster++)
{
IItemsList ItemsList = m_TargetClusters[iCluster].Items;
IItemsList Centroids = m_TargetClusters[iCluster].Centroids;
Console.WriteLine("\nCluster={0}, Centroid=\"{1}\"", iCluster, Centroids[0].ItemText);
Console.WriteLine("-----------------------------------------------------------");
for (int iAttrib = 0; iAttrib < Centroids[0].AttribList.Count(); iAttrib++)
Console.WriteLine("\"{0,-30}\" => u({1},{2}) = {3}", Centroids[0].AttribList[iAttrib].Name,
iCluster, iAttrib, Centroids[0].AttribList[iAttrib].Value);
Console.WriteLine("-----------------------------------------------------------");
for (int iItem = 0; iItem < ItemsList.Count(); iItem++)
{
Console.WriteLine("\n(cluster={0}, item={1}, distance={2})\n{3}",
iCluster, iItem, ItemsList[iItem].Distance, ItemsList[iItem].ItemText);
Console.WriteLine("-----------------------------------------------------------");
for (int iAttrib = 0; iAttrib < ItemsList[iItem].AttribList.Count(); iAttrib++)
Console.WriteLine("\"{0,-30}\" => i({1},{2}) = {3}", ItemsList[iItem].AttribList[iAttrib].Name,
iCluster, iAttrib, ItemsList[iItem].AttribList[iAttrib].Value);
Console.WriteLine("-----------------------------------------------------------");
}
}
Console.WriteLine("\n===========================================================");
Console.WriteLine("Recommendations:");
Console.WriteLine("===========================================================\n");
for (int iCluster = 0; iCluster < m_TargetClusters.Count(); iCluster++)
{
IItemsList ItemsList = m_TargetClusters[iCluster].Items;
Console.WriteLine("{0}", m_TargetClusters[iCluster].Centroids[0].ItemText);
Console.WriteLine("======================================================");
for (int iItem = 0; iItem < ItemsList.Count(); iItem++)
Console.WriteLine("{0}", ItemsList[iItem].ItemText);
Console.WriteLine();
}
Console.WriteLine("KMeans Statistics:");
Console.WriteLine("===========================================================");
Console.WriteLine("The total number of clusters nClustersCount = {0}\n", m_TargetClusters.Count());
Console.WriteLine("Average distance between clusters nDiAvg = {0}", nDiAvg);
Console.WriteLine("Average distance within a cluster nD0Avg = {0}\n", nD0Avg);
Console.WriteLine("Average quality of KMeans clustering nQ = {0}\n", nQ);
}
}
class Program
{
static void Main(string[] args)
{
string[] filenames = new string[2] { @"ARTICLES.TXT", @"USERS.TXT" };
string[] sBanner = new string[2] { "K-Means++ Recommender Engine v.1.2.65535 (Euclidix) by Arthur V. Ratz",
"=====================================================================\n" };
for (int iBanner = 0; iBanner < 2; iBanner++)
Console.WriteLine(sBanner[iBanner]);
KMeans km = new KMeans();
int nItemsCount = km.LoadItemsFromFile(filenames[0]);
int nUsersCount = km.LoadUsersFromFile(filenames[1]);
int nInitialUsers = 0;
int nItemsPerCluster = 0, nItemsPerClusterMax = 0;
if (nItemsCount > 0 && nUsersCount > 0)
{
do
{
nItemsPerClusterMax = nItemsCount / nUsersCount;
Console.Write("Enter the number of articles per user [2-{0}]: ", nItemsPerClusterMax);
nItemsPerCluster = int.Parse(Console.ReadLine());
} while (nItemsPerCluster < 2 || nItemsPerCluster > nItemsPerClusterMax);
do
{
Console.Write("\nEnter the number of users (initial centroids) [1-{0}]: ", nUsersCount);
nInitialUsers = int.Parse(Console.ReadLine());
} while (nInitialUsers < 1 || nInitialUsers > nUsersCount);
}
km.Compute(nInitialUsers, nItemsPerCluster);
Console.Read();
}
}
}
Output
K-Means++ Recommender Engine v.1.2.65535 (Euclidix) by Arthur V. Ratz
=====================================================================
Enter the number of articles per user [2-9]: 5
Enter the number of users (initial centroids) [1-3]: 3
Cluster=0, Centroid="User0 => C++ Linux"
-----------------------------------------------------------
"C++ " => u(0,0) = 0,0641355115153619
"Linux " => u(0,1) = 0,135503210492002
-----------------------------------------------------------
(cluster=0, item=0, distance=0,0181865932348965)
ShaneMcDonald. Microsoft Visual C++ Static and Dynamic Libraries => C++ Visual-Studio Dev virtual-machine Virtualization
-----------------------------------------------------------
"C++ " => i(0,0) = 0,0676382057531689
"Microsoft " => i(0,1) = 0,175408304563312
"Visual " => i(0,2) = 0,17791458593099
"ShaneMcDonald. " => i(0,3) = 0,172902023195634
"Static " => i(0,4) = 0,182927148666345
"and " => i(0,5) = 0,15535805362189
"Dynamic " => i(0,6) = 0,187939711401701
"Libraries " => i(0,7) = 0,190445992769378
"Visual-Studio " => i(0,8) = 0,122776395842079
"Dev " => i(0,9) = 0,0400691107087137
"virtual-machine " => i(0,10) = 0,200471118240089
"Virtualization " => i(0,11) = 0,202977399607767
-----------------------------------------------------------
(cluster=0, item=1, distance=0,0216683871819057)
Nish Nishant. Using lambdas - C++ vs. C# vs. C++/CX vs. C++/CLI => C# C++ Windows Visual-Studio Dev QA Architect
-----------------------------------------------------------
"C++ " => i(0,0) = 0,0676382057531689
"Nishant. " => i(0,1) = 0,0876884566945908
"Using " => i(0,2) = 0,0901947380622685
"lambdas " => i(0,3) = 0,0927010194299463
"- " => i(0,4) = 0,095207300797624
"Nish " => i(0,5) = 0,0851821753269131
"vs. " => i(0,6) = 0,10021986353298
"C# " => i(0,7) = 0,102726144900657
"C++/CX " => i(0,8) = 0,107738707636013
"C++/CLI " => i(0,9) = 0,112751270371368
"Windows " => i(0,10) = 0,0726507684885244
"Visual-Studio " => i(0,11) = 0,122776395842079
"Dev " => i(0,12) = 0,0400691107087137
"QA " => i(0,13) = 0,127788958577435
"Architect " => i(0,14) = 0,130295239945112
-----------------------------------------------------------
(cluster=0, item=2, distance=0,0216683871819057)
Nish Nishant. 7 reasons C++ devs will love the VS 14 CTP => Visual C++ C++11 C++14 VC14.0 C++ Windows Dev Architect
-----------------------------------------------------------
"C++ " => i(0,0) = 0,0676382057531689
"Nishant. " => i(0,1) = 0,0876884566945908
"7 " => i(0,2) = 0,243077901490611
"reasons " => i(0,3) = 0,245584182858289
"Nish " => i(0,4) = 0,0851821753269131
"devs " => i(0,5) = 0,250596745593644
"will " => i(0,6) = 0,253103026961322
"love " => i(0,7) = 0,255609308329
"the " => i(0,8) = 0,0576130802824579
"VS " => i(0,9) = 0,260621871064355
"14 " => i(0,10) = 0,263128152432033
"CTP " => i(0,11) = 0,26563443379971
"Visual " => i(0,12) = 0,17791458593099
"C++11 " => i(0,13) = 0,273153277902744
"C++14 " => i(0,14) = 0,275659559270421
"VC14.0 " => i(0,15) = 0,278165840638099
"Windows " => i(0,16) = 0,0726507684885244
"Dev " => i(0,17) = 0,0400691107087137
"Architect " => i(0,18) = 0,130295239945112
-----------------------------------------------------------
(cluster=0, item=3, distance=0,032726256995075)
Pablo Aliskevicius. 4350% Performance Improvement with the StringBuilder for C++! => C++ Linux Windows STL Eclipse MFC Dev
-----------------------------------------------------------
"C++ " => i(0,0) = 0,0676382057531689
"Linux " => i(0,1) = 0,0701444871208466
"4350% " => i(0,2) = 0,0475879548117469
"Performance " => i(0,3) = 0,0500942361794247
"Improvement " => i(0,4) = 0,0526005175471024
"with " => i(0,5) = 0,0551067989147802
"the " => i(0,6) = 0,0576130802824579
"StringBuilder " => i(0,7) = 0,0601193616501357
"for " => i(0,8) = 0,0626256430178134
"C++! " => i(0,9) = 0,0651319243854911
"Pablo " => i(0,10) = 0,0425753920763915
"Aliskevicius. " => i(0,11) = 0,0450816734440692
"Windows " => i(0,12) = 0,0726507684885244
"STL " => i(0,13) = 0,0751570498562021
"Eclipse " => i(0,14) = 0,0776633312238798
"MFC " => i(0,15) = 0,0801696125915576
"Dev " => i(0,16) = 0,0400691107087137
-----------------------------------------------------------
(cluster=0, item=4, distance=0,0930085673936484)
Michael Chourdakis. One line template call for a dynamically loaded DLL function - Extensions to GetProcAddress() and Invoke() => C++ Win32
-----------------------------------------------------------
"C++ " => i(0,0) = 0,0676382057531689
"Chourdakis. " => i(0,1) = 0,343329156197721
"One " => i(0,2) = 0,345835437565398
"line " => i(0,3) = 0,348341718933076
"template " => i(0,4) = 0,350848000300754
"call " => i(0,5) = 0,353354281668431
"for " => i(0,6) = 0,0626256430178134
"a " => i(0,7) = 0,358366844403787
"dynamically " => i(0,8) = 0,360873125771465
"loaded " => i(0,9) = 0,363379407139142
"DLL " => i(0,10) = 0,36588568850682
"function " => i(0,11) = 0,368391969874498
"- " => i(0,12) = 0,095207300797624
"Extensions " => i(0,13) = 0,373404532609853
"to " => i(0,14) = 0,375910813977531
"GetProcAddress() " => i(0,15) = 0,378417095345209
"and " => i(0,16) = 0,15535805362189
"Invoke() " => i(0,17) = 0,383429658080564
"Michael " => i(0,18) = 0,340822874830043
"Win32 " => i(0,19) = 0,38844222081592
-----------------------------------------------------------
Cluster=1, Centroid="User1 => C# .NET"
-----------------------------------------------------------
"C# " => u(1,0) = 0,0935222110939783
".NET " => u(1,1) = 0,15019656028131
-----------------------------------------------------------
(cluster=1, item=0, distance=0,0110986410505081)
Rahul Rajat Singh. A Beginner's Tutorial - Type Casting and Type Conversion in C# => C# .NET
-----------------------------------------------------------
"C# " => i(1,0) = 0,102726144900657
".NET " => i(1,1) = 0,170395741827956
"Singh. " => i(1,2) = 0,137814084048146
"A " => i(1,3) = 0,140320365415823
"Beginner's " => i(1,4) = 0,142826646783501
"Tutorial " => i(1,5) = 0,145332928151179
"- " => i(1,6) = 0,095207300797624
"Type " => i(1,7) = 0,150345490886534
"Casting " => i(1,8) = 0,152851772254212
"and " => i(1,9) = 0,15535805362189
"Conversion " => i(1,10) = 0,160370616357245
"in " => i(1,11) = 0,162876897724923
"Rahul " => i(1,12) = 0,13280152131279
"Rajat " => i(1,13) = 0,135307802680468
-----------------------------------------------------------
(cluster=1, item=1, distance=0,0110986410505081)
Faisal(mfrony). Access Modifiers for Beginners => C# .NET1.1 .NET2.0 .NET .NET1.0
-----------------------------------------------------------
"C# " => i(1,0) = 0,102726144900657
".NET " => i(1,1) = 0,170395741827956
"Modifiers " => i(1,2) = 0,295709810211843
"for " => i(1,3) = 0,0626256430178134
"Beginners " => i(1,4) = 0,300722372947199
"Faisal(mfrony). " => i(1,5) = 0,290697247476488
".NET1.1 " => i(1,6) = 0,305734935682554
".NET2.0 " => i(1,7) = 0,308241217050232
"Access " => i(1,8) = 0,293203528844166
".NET1.0 " => i(1,9) = 0,313253779785588
-----------------------------------------------------------
(cluster=1, item=2, distance=0,0110986410505081)
V.Lorz. Adding JavaScript scripting support to one application: One easy way => C# .NET
-----------------------------------------------------------
"C# " => i(1,0) = 0,102726144900657
".NET " => i(1,1) = 0,170395741827956
"JavaScript " => i(1,2) = 0,428542722698764
"scripting " => i(1,3) = 0,431049004066441
"support " => i(1,4) = 0,433555285434119
"to " => i(1,5) = 0,375910813977531
"one " => i(1,6) = 0,438567848169475
"application: " => i(1,7) = 0,441074129537152
"One " => i(1,8) = 0,345835437565398
"easy " => i(1,9) = 0,446086692272508
"way " => i(1,10) = 0,448592973640186
"V.Lorz. " => i(1,11) = 0,423530159963408
"Adding " => i(1,12) = 0,426036441331086
-----------------------------------------------------------
(cluster=1, item=3, distance=0,0110986410505081)
Florian Rappl. An advanced introduction to C# - Lecture Notes Part 1 of 4 => C# .NET
-----------------------------------------------------------
"C# " => i(1,0) = 0,102726144900657
".NET " => i(1,1) = 0,170395741827956
"An " => i(1,2) = 0,719271361349382
"advanced " => i(1,3) = 0,72177764271706
"introduction " => i(1,4) = 0,724283924084737
"to " => i(1,5) = 0,375910813977531
"Florian " => i(1,6) = 0,714258798614026
"- " => i(1,7) = 0,095207300797624
"Lecture " => i(1,8) = 0,734309049555448
"Notes " => i(1,9) = 0,736815330923126
"Part " => i(1,10) = 0,501224882361418
"1 " => i(1,11) = 0,741827893658482
"of " => i(1,12) = 0,744334175026159
"4 " => i(1,13) = 0,746840456393837
"Rappl. " => i(1,14) = 0,716765079981704
-----------------------------------------------------------
(cluster=1, item=4, distance=0,0110986410505081)
B. Ahmed (KU-00). Asynchronous programming and Threading in C# (.NET 4.5) => C# .NET4.5 .NET
-----------------------------------------------------------
"C# " => i(1,0) = 0,102726144900657
".NET " => i(1,1) = 0,170395741827956
"(KU-00). " => i(1,2) = 0,852104273836302
"Asynchronous " => i(1,3) = 0,85461055520398
"programming " => i(1,4) = 0,857116836571658
"and " => i(1,5) = 0,15535805362189
"Threading " => i(1,6) = 0,862129399307013
"in " => i(1,7) = 0,162876897724923
"B. " => i(1,8) = 0,847091711100947
"(.NET " => i(1,9) = 0,869648243410046
"4.5) " => i(1,10) = 0,872154524777724
".NET4.5 " => i(1,11) = 0,87716708751308
"Ahmed " => i(1,12) = 0,849597992468624
-----------------------------------------------------------
Cluster=2, Centroid="User2 => Java Javascript"
-----------------------------------------------------------
"Java " => u(2,0) = 0,030550711996943
"Javascript " => u(2,1) = 0,5603509244
-----------------------------------------------------------
(cluster=2, item=0, distance=0,00499811776007409)
Dr. Song Li. A Simple Method to Upload Files by jQuery Ajax Calls => Java Javascript jQuery Ajax Dev Architect
-----------------------------------------------------------
"Java " => i(2,0) = 0,027537703870325
"Javascript " => i(2,1) = 0,556363072450329
"Li. " => i(2,2) = 0,639070357583694
"A " => i(2,3) = 0,140320365415823
"Simple " => i(2,4) = 0,64408292031905
"Method " => i(2,5) = 0,646589201686727
"to " => i(2,6) = 0,375910813977531
"Upload " => i(2,7) = 0,651601764422083
"Files " => i(2,8) = 0,654108045789761
"by " => i(2,9) = 0,656614327157438
"jQuery " => i(2,10) = 0,659120608525116
"Ajax " => i(2,11) = 0,661626889892794
"Calls " => i(2,12) = 0,664133171260472
"Dr. " => i(2,13) = 0,634057794848339
"Song " => i(2,14) = 0,636564076216016
"Dev " => i(2,15) = 0,0400691107087137
"Architect " => i(2,16) = 0,130295239945112
-----------------------------------------------------------
(cluster=2, item=1, distance=0,00803725509399426)
Sergey Zubarev. MPSC Lock Free Intrusive Linked Queue with state => Java Architect
-----------------------------------------------------------
"Java " => i(2,0) = 0,027537703870325
"Zubarev. " => i(2,1) = 0,581425886127106
"MPSC " => i(2,2) = 0,583932167494784
"Lock " => i(2,3) = 0,586438448862462
"Free " => i(2,4) = 0,588944730230139
"Intrusive " => i(2,5) = 0,591451011597817
"Linked " => i(2,6) = 0,593957292965495
"Queue " => i(2,7) = 0,596463574333173
"with " => i(2,8) = 0,0551067989147802
"state " => i(2,9) = 0,601476137068528
"Sergey " => i(2,10) = 0,578919604759428
"Architect " => i(2,11) = 0,130295239945112
-----------------------------------------------------------
(cluster=2, item=2, distance=0,0558027646300975)
U. Calakovic. Swing Based GUIs in JRuby => Ruby Java Windows Swing Dev
-----------------------------------------------------------
"Java " => i(2,0) = 0,027537703870325
"Calakovic. " => i(2,1) = 0,458618099110897
"Swing " => i(2,2) = 0,0350565479733582
"Based " => i(2,3) = 0,463630661846252
"GUIs " => i(2,4) = 0,46613694321393
"in " => i(2,5) = 0,162876897724923
"JRuby " => i(2,6) = 0,471149505949285
"Ruby " => i(2,7) = 0,473655787316963
"U. " => i(2,8) = 0,456111817743219
"Windows " => i(2,9) = 0,0726507684885244
"Dev " => i(2,10) = 0,0400691107087137
-----------------------------------------------------------
(cluster=2, item=3, distance=0,133988415037227)
Salar Hafezi. Understanding Recursion in Java through Basic Number Theory => Java Windows Design Dev
-----------------------------------------------------------
"Java " => i(2,0) = 0,027537703870325
"Hafezi. " => i(2,1) = 0,914761308028246
"Understanding " => i(2,2) = 0,917267589395924
"Recursion " => i(2,3) = 0,919773870763601
"in " => i(2,4) = 0,162876897724923
"Salar " => i(2,5) = 0,912255026660568
"through " => i(2,6) = 0,927292714866634
"Basic " => i(2,7) = 0,929798996234312
"Number " => i(2,8) = 0,93230527760199
"Theory " => i(2,9) = 0,934811558969668
"Windows " => i(2,10) = 0,0726507684885244
"Design " => i(2,11) = 0,629045232112983
"Dev " => i(2,12) = 0,0400691107087137
-----------------------------------------------------------
(cluster=2, item=4, distance=0,22911612749109)
Christopher Swiderski. JAX-WS: Using Apache CXF to Create a Bottom-Up Web Service, Web Service Client, and Securing the Web Service => Java Windows Apache Dev
-----------------------------------------------------------
"Java " => i(2,0) = 0,027537703870325
"Swiderski. " => i(2,1) = 0,789447239644359
"JAX-WS: " => i(2,2) = 0,791953521012036
"Using " => i(2,3) = 0,0901947380622685
"Apache " => i(2,4) = 0,796966083747392
"CXF " => i(2,5) = 0,79947236511507
"to " => i(2,6) = 0,375910813977531
"Create " => i(2,7) = 0,804484927850425
"a " => i(2,8) = 0,358366844403787
"Bottom-Up " => i(2,9) = 0,809497490585781
"Web " => i(2,10) = 0,812003771953458
"Service, " => i(2,11) = 0,814510053321136
"Service " => i(2,12) = 0,819522616056492
"Client, " => i(2,13) = 0,822028897424169
"and " => i(2,14) = 0,15535805362189
"Securing " => i(2,15) = 0,827041460159525
"the " => i(2,16) = 0,0576130802824579
"Christopher " => i(2,17) = 0,786940958276681
"Windows " => i(2,18) = 0,0726507684885244
"Dev " => i(2,19) = 0,0400691107087137
-----------------------------------------------------------
===========================================================
Recommendations:
===========================================================
User0 => C++ Linux
======================================================
ShaneMcDonald. Microsoft Visual C++ Static and Dynamic Libraries => C++ Visual-Studio Dev virtual-machine Virtualization
Nish Nishant. Using lambdas - C++ vs. C# vs. C++/CX vs. C++/CLI => C# C++ Windows Visual-Studio Dev QA Architect
Nish Nishant. 7 reasons C++ devs will love the VS 14 CTP => Visual C++ C++11 C++14 VC14.0 C++ Windows Dev Architect
Pablo Aliskevicius. 4350% Performance Improvement with the StringBuilder for C++! => C++ Linux Windows STL Eclipse MFC Dev
Michael Chourdakis. One line template call for a dynamically loaded DLL function - Extensions to GetProcAddress() and Invoke() => C++ Win32
User1 => C# .NET
======================================================
Rahul Rajat Singh. A Beginner's Tutorial - Type Casting and Type Conversion in C# => C# .NET
Faisal(mfrony). Access Modifiers for Beginners => C# .NET1.1 .NET2.0 .NET .NET1.0
V.Lorz. Adding JavaScript scripting support to one application: One easy way => C# .NET
Florian Rappl. An advanced introduction to C# - Lecture Notes Part 1 of 4 => C# .NET
B. Ahmed (KU-00). Asynchronous programming and Threading in C# (.NET 4.5) => C# .NET4.5 .NET
User2 => Java Javascript
======================================================
Dr. Song Li. A Simple Method to Upload Files by jQuery Ajax Calls => Java Javascript jQuery Ajax Dev Architect
Sergey Zubarev. MPSC Lock Free Intrusive Linked Queue with state => Java Architect
U. Calakovic. Swing Based GUIs in JRuby => Ruby Java Windows Swing Dev
Salar Hafezi. Understanding Recursion in Java through Basic Number Theory => Java Windows Design Dev
Christopher Swiderski. JAX-WS: Using Apache CXF to Create a Bottom-Up Web Service, Web Service Client, and Securing the Web Service => Java Windows Apache Dev
KMeans Statistics:
===========================================================
The total number of clusters nClustersCount = 3
Average distance within a cluster nDiAvg = 0,0487435885265023
Average distance between clusters nD0Avg = 41,496025364381
Average quality of KMeans clustering nQ = 0,0023493136076759
Points of Interest
The C#/.NET code above was developed especially for use in ASP.NET web application projects, to power the recommendation engine of online stores and other websites that promote their commercially distributed products or digital content over the Internet.
History
- December 11, 2016 - The first version of the article was published