sets

Member

11-03-2019 06:05 AM

Hello!

I have implemented the K-means clustering algorithm and I want to partition the data into 5 groups. I have attached my required.png file and my code, but the code doesn't generate the desired output every time. I am initializing my centroids randomly within the span of the generated data, and I terminate my loop when there is no change between the previously and currently calculated cluster centers.
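For readers without LabVIEW, the procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the attached VI: the function name `kmeans`, the `rng` and `max_iter` parameters, the restriction to 2-D points, and the policy of leaving an empty cluster's centroid where it was are all choices made for this sketch.

```python
import random

def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm on a list of (x, y) tuples; returns (centroids, labels)."""
    # Random initial centroids drawn uniformly within the span of the data.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    centroids = [(rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
                 for _ in range(k)]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: (p[0] - centroids[j][0]) ** 2
                                             + (p[1] - centroids[j][1]) ** 2)
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                # Assumed policy: an empty cluster keeps its old centroid.
                new_centroids.append(centroids[j])
        # Stop when no centroid moved since the previous iteration.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

Passing a generator such as `random.Random(1)` as `rng` makes a run repeatable, while a fresh `random.Random()` gives a different start each time, which is the behaviour described in the question.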

Rahulbala

Member

11-04-2019 12:45 AM

Hello

This is the basic problem with K-means clustering: it lacks consistency and it is not repeatable. We might get different outputs each time.

Then why is K-means clustering popular? The answer is simple: it is fast, and it is commonly the first algorithm taught in a course on unsupervised learning. Check this.

I was curious to know what was happening with the clustering method, so I tried implementing it myself. My version works a bit differently and gives the desired output roughly 8 out of 10 times (I can't give an accurate figure).

**At the first iteration, we have to make sure that the random centroids are taken only once.**

I have attached the VIs (Please download OpenG Array toolkit if you are not using it). Try exploring different clustering methods.

-Rahul

Hit KUDOS for Thanks

11-23-2019 12:36 AM

alexderjuengere

Active Participant

11-23-2019 04:26 PM - edited 11-23-2019 04:26 PM

@Rahulbala wrote:

This is the basic problem with K-means clustering: it lacks consistency and it is not repeatable. We might get different outputs each time.

this is only true if you use random values for initialisation.

@sets wrote:

I am initializing my centroids within the span of the generated data and it is all random.

have you tried to reproduce your results using the same initial values?
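One way to get the same initial values on every run while still drawing them at random is to seed the random number generator. The sketch below is a Python illustration, not the LabVIEW code from the thread; the helper `random_centroids` and its `seed` parameter are hypothetical names introduced here.

```python
import random

def random_centroids(points, k, seed):
    """Draw k initial centroids uniformly within the bounding box of the data."""
    rng = random.Random(seed)  # private generator, so the seed fully fixes the draw
    lo = [min(col) for col in zip(*points)]
    hi = [max(col) for col in zip(*points)]
    return [tuple(rng.uniform(a, b) for a, b in zip(lo, hi)) for _ in range(k)]

points = [(0.0, 0.0), (1.0, 4.0), (5.0, 2.0), (3.0, 3.0)]
first = random_centroids(points, k=5, seed=42)
second = random_centroids(points, k=5, seed=42)
other = random_centroids(points, k=5, seed=7)
print(first == second)  # same seed, same starting centroids
print(first == other)   # a different seed gives a different start
```

With the starting centroids fixed, the rest of K-means is deterministic, so any remaining run-to-run variation has to come from somewhere else, such as the input data being regenerated.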

Rahulbala

Member

11-23-2019 06:15 PM

Traditionally, random initialisation is part of the K-means clustering algorithm. Fixing the initial values will definitely give you the same result every time. What do you mean by the same initial values (for example, the same index values regardless of the input data)?

We can select better initial values by using techniques like the naive sharding centroid initialisation algorithm. This helps ensure that the initial values are good enough for clustering.
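As a rough illustration of the naive sharding idea mentioned above, here is a minimal Python sketch. It assumes the commonly described form of the method: sort the points by the sum of their coordinates, cut the sorted list into k roughly equal shards, and use each shard's mean as a starting centroid. The function name is hypothetical, and the sketch assumes there are at least k points.

```python
def naive_sharding_centroids(points, k):
    """Deterministic initial centroids via naive sharding (assumes len(points) >= k)."""
    # Sort the points by a composite value: the sum of their coordinates.
    ordered = sorted(points, key=sum)
    n = len(ordered)
    centroids = []
    for i in range(k):
        # Slice the sorted data into k roughly equal shards.
        shard = ordered[i * n // k:(i + 1) * n // k]
        # The centroid of a shard is the per-dimension mean of its points.
        centroids.append(tuple(sum(col) / len(shard) for col in zip(*shard)))
    return centroids

points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (5.0, 5.0), (5.0, 6.0)]
print(naive_sharding_centroids(points, 3))
# [(1.0, 1.5), (5.0, 5.5), (8.5, 9.0)]
```

Because nothing here is random, the same input always yields the same starting centroids. When features have very different scales, the plain coordinate sum may be dominated by one of them, so scaling the data first is worth considering.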

-Rahul

Hit KUDOS for Thanks

alexderjuengere

Active Participant

11-24-2019 06:32 AM

@Rahulbala wrote:

Fixing the initial values will definitely give you the same result every time. What do you mean by the same initial values (for example, the same index values regardless of the input data)?

k-means is going to converge to a solution or rather a local minimum. always.

but the quality of this solution may differ dramatically from trial to trial, because the local minimum found need not be the global optimum.

this is because not every randomly picked starting point for a centroid will converge to the actual centroid.
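The effect of the starting point can be seen in a tiny 1-D example. The Python sketch below is an illustration, not the VI from this thread: `kmeans_1d`, the data, and both starting vectors are made up for the demonstration. Both runs converge, but to solutions with very different within-cluster sums of squared distances.

```python
def kmeans_1d(data, centroids, max_iter=100):
    """Lloyd's algorithm in 1-D from the given starting centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assign every value to the index of its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda j: (x - centroids[j]) ** 2)
                  for x in data]
        # Recompute each centroid as the mean of its members (keep it if empty).
        updated = []
        for j, c in enumerate(centroids):
            members = [x for x, lab in zip(data, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else c)
        if updated == centroids:
            break
        centroids = updated
    # Within-cluster sum of squared distances: lower means tighter clusters.
    sse = sum(min((x - c) ** 2 for c in centroids) for x in data)
    return centroids, sse

data = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]
good = kmeans_1d(data, [0.0, 10.0, 20.0])
poor = kmeans_1d(data, [0.0, 1.0, 15.0])
print(good)  # ([0.5, 10.5, 20.5], 1.5)
print(poor)  # ([0.0, 1.0, 15.5], 101.0)
```

A common remedy is to run the algorithm from several starting vectors and keep the result with the lowest sum of squared distances.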

here, the actual centroids are given in required.png

I don't have the TSA Toolkit, so in [image] I had to change [image] to [image]

furthermore, you should use a For-Loop here:

so now we can easily look at the initial value vector and how it affects the found solution:

this instance converged in 3 steps to a not-so-good solution:

this instance converged in 3 steps to the optimal solution:

attached .vi is back-saved to LabVIEW 2010

Rahulbala

Member

11-24-2019 06:58 AM - edited 11-24-2019 06:59 AM

Great but even now I am getting randomised outputs.

-Rahul

Hit KUDOS for Thanks

alexderjuengere

Active Participant

11-24-2019 10:01 AM

@Rahulbala wrote:

Great but even now I am getting randomised outputs.

that's not the point.

sets already figured out on his own how to cope with this behaviour:

@sets wrote: