[Campaign-news] Thank You!

Kai Kohlhoff kohlhoff at stanford.edu
Wed Jan 27 13:47:17 PST 2010


Hi Marc,

Yes, it was a good evening, you guys are a pleasant crowd!

I agree with your next steps, and I will see that I stick to the format that you have already created.  I have been trying to simplify the code that we have and am really eager to put it into the repository.  There is still another project that I have to work on until tomorrow, but then I'll get to it.  

I was thinking that we should pull the distance kernels out of the current clustering code.  For proper modularity, these should be called separately, and in each iteration a distance matrix should be passed to the clustering kernel.  Also, the I/O could be put into separate subroutines.  It might be useful if, ultimately, a user could simply write C/C++ code with the GPU functionality hidden.

Something like:


#include "campaign.h"

campaign.checkPlatform();   // checks which, if any, GPU is present
data = campaign.readData(file, format);  // read data
data = campaign.preprocess(data, method);  //  use a selected method to preprocess data
clusters.init(data);		// extracts number of data points, dimensionality, copies data to GPU
for (int i = 0; i < N; ++i)	// N iterations, data is kept on GPU between kernel calls; alternatively use a convergence criterion
{
	distance = campaign.calcDists(data, clusters, metricType); // metricType = e.g. "manhattan", "euclidean"
	clusters = campaign.iterate(data, clusters, distance, algorithmType);  // One iteration of algorithmType = e.g. "kcenters", "kmeans", "birch"
}
campaign.printResults(clusters, format); // output clustering results


would be great to have.  If you like the idea, maybe we should start thinking about how we could get there.  I am not sure this could make it into our '0.5' version that Russ mentioned, but we could talk about this.
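
To make the separation of the distance step from the clustering step a bit more concrete, here is a minimal CPU-only sketch.  It assumes plain std::vector storage instead of GPU buffers.  The names calcDists and iterate are borrowed from the outline above, but the signatures, the Matrix alias, and the choice of squared Euclidean distance with a k-means update are assumptions for illustration only, not existing CAMPAIGN code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types and names for illustration; not the CAMPAIGN API.
using Matrix = std::vector<std::vector<double>>;  // one row per point or center

// Distance step: squared Euclidean distance from every point to every center.
// Returns a (points x centers) distance matrix.
Matrix calcDists(const Matrix& data, const Matrix& centers) {
    Matrix dist(data.size(), std::vector<double>(centers.size(), 0.0));
    for (std::size_t p = 0; p < data.size(); ++p) {
        for (std::size_t c = 0; c < centers.size(); ++c) {
            assert(data[p].size() == centers[c].size());  // same dimensionality
            for (std::size_t d = 0; d < data[p].size(); ++d) {
                const double diff = data[p][d] - centers[c][d];
                dist[p][c] += diff * diff;
            }
        }
    }
    return dist;
}

// Clustering step: one k-means iteration. Assignment uses only the supplied
// distance matrix; each non-empty center is then moved to the mean of its points.
std::vector<std::size_t> iterate(const Matrix& data, Matrix& centers,
                                 const Matrix& dist) {
    std::vector<std::size_t> assign(data.size(), 0);
    for (std::size_t p = 0; p < data.size(); ++p) {
        for (std::size_t c = 1; c < centers.size(); ++c) {
            if (dist[p][c] < dist[p][assign[p]]) assign[p] = c;
        }
    }
    const std::size_t dim = centers.empty() ? 0 : centers[0].size();
    Matrix sums(centers.size(), std::vector<double>(dim, 0.0));
    std::vector<std::size_t> counts(centers.size(), 0);
    for (std::size_t p = 0; p < data.size(); ++p) {
        ++counts[assign[p]];
        for (std::size_t d = 0; d < dim; ++d) sums[assign[p]][d] += data[p][d];
    }
    for (std::size_t c = 0; c < centers.size(); ++c) {
        if (counts[c] == 0) continue;  // leave empty clusters where they are
        for (std::size_t d = 0; d < dim; ++d) {
            centers[c][d] = sums[c][d] / static_cast<double>(counts[c]);
        }
    }
    return assign;
}
```

Because iterate only reads the distance matrix for the assignment, swapping in a different metric would mean replacing calcDists alone, which is the kind of modularity I have in mind.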

It makes sense to have something out asap.  It will be fun to increase the speed of our clustering code in subsequent iterations, but we should start getting people to use it.  I'll try to deposit the modules that I have at the end of the week.

Bill, I am looking forward to hearing about your profiling work during our next meeting.  Your findings will surely help me write more efficient code right from the onset.  

When should we have our next meeting?  Given that the last one has been a while, I suggest not having it more than three weeks from now.  How does February 19 sound to you?

Cheers,
Kai




On Jan 26, 2010, at 10:21 AM, Marc Sosnick wrote:

> Kai:
> 
> It was great seeing you last night.  Thanks for helping me round out the presentation at the meeting.  Sorry we didn't have more time to talk about our next steps during dinner, but it was quite convivial!
> 
> As we discussed, I had ideas as to what my next steps should be, and I just want to get your and Bill's agreement before I start.  These are in priority order:
> 
> 1) Now that we have a smoke test against which to test, take the current code and refactor each clustering method into a proper C++ class, with a .cpp and a .cu file.  This would also include scrubbing the current code of comments and optimizing code (not including optimizing memory handling) as if we were presenting it to the outside world.  This would significantly help us work toward our first release as per Russ' comments last night.
> 2) Take any new clustering algorithms you have and put them into the format that we've created up to now and as in (1).
> 3) Optimize memory handling and data structures.  This would be done in tandem with Bill's profiling work.
> 
> Let me know about those algorithms you have.  Don't worry about putting anything in the repository, we can always reorganize the repository as we see fit, so just go ahead.  Probably the best way would just be to create a subdirectory off trunk/dev, put your work in there, and run svn add directory_name from the parent directory of directory_name.
> 
> Again, many thanks!
> 
> Marc
> _______________________________________________
> Campaign-news mailing list
> Campaign-news at simtk.org
> https://simtk.org/mailman/listinfo/campaign-news

-----------------------------------------------------
Kai Kohlhoff, PhD
Stanford University
School of Medicine, Bioengineering
Stanford, CA 94305-5448, USA
T: ++1 (650) 724 1575
E: kohlhoff at stanford.edu
