[Campaign-news] Thank You!

whsu at sfsu.edu whsu at sfsu.edu
Mon Feb 1 19:00:22 PST 2010


Hi Kai (and Marc),

Sorry about spacing out on getting back to you. I'm busy with a gig on
the 19th; we should probably meet before the big meeting on the 22nd.
We've been meeting on Fridays, but I have another meeting on the 12th
in the afternoon as well. Would, say, Wed the 10th work?

I'm getting stuck on the profiling; I keep getting a cryptic error message
with the kcenters code, while the profiler works with some of the other
examples that I've tried. Perhaps it's because of the input redirection;
I'll try tweaking the code a bit.
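
If the redirection does turn out to be the culprit, one possible tweak is to
let the program take the input file as a command-line argument and fall back
to stdin otherwise, so the profiler can launch it without any shell
redirection. A minimal C++ sketch, assuming the input is whitespace-separated
floats (an assumption, not the confirmed kcenters input format; readValues
and loadInput are made-up names):

```cpp
// Hypothetical sketch: not the actual kcenters input code.
#include <cassert>
#include <fstream>
#include <iostream>
#include <istream>
#include <stdexcept>
#include <vector>

// Reads whitespace-separated floats from any stream until extraction fails
// (end of input or a token that is not a number).
std::vector<float> readValues(std::istream& in) {
    std::vector<float> values;
    float x;
    while (in >> x) values.push_back(x);
    return values;
}

// Reads from the named file when a path is given, otherwise from stdin,
// so the same binary works with or without shell redirection.
std::vector<float> loadInput(const char* path) {
    if (path == nullptr) return readValues(std::cin);
    assert(path[0] != '\0');  // an empty path is a caller error
    std::ifstream file(path);
    if (!file.is_open()) throw std::runtime_error("could not open input file");
    return readValues(file);
}
```

From main this could be called as loadInput(argc > 1 ? argv[1] : nullptr),
which keeps the redirected form working while giving the profiler a plain
argument to pass.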

Bill

Quoting Kai Kohlhoff <kohlhoff at stanford.edu>:

> Hi Marc,
>
> Yes, it was a good evening, you guys are a pleasant crowd!
>
> I agree with your next steps, and I will see that I stick to the   
> format that you have already created.  I have been trying to   
> simplify the code that we have and am really eager to put it into   
> the repository.  There is still another project that I have to work   
> on until tomorrow, but then I'll get to it.
>
> I was thinking that we should pull the distance kernels out of the   
> current clustering code.  For proper modularity, these should be   
> called separately in each iteration and a distance matrix should be   
> provided to the clustering kernel in each iteration.  Also, the I/O   
> could be put into separate subroutines.  It might be useful, if   
> ultimately a user could simply write C/C++-code and the GPU   
> functionality would be hidden.
>
> Something like:
>
>
> #include "campaign.h"
>
> campaign.checkPlatform();   // checks which, if any, GPU is present
> data = campaign.readData(file, format);  // read data
> // use a selected method to preprocess data
> data = campaign.preprocess(data, method);
> // extracts number of data points, dimensionality, copies data to GPU
> clusters.init(data);
> // N iterations, data is kept on GPU between kernel calls;
> // alternatively use a convergence criterion
> for (i = 0; i < N; i++)
> {
> 	// metricType = e.g. "manhattan", "euclidean"
> 	distance = campaign.calcDists(data, clusters, metricType);
> 	// one iteration of algorithmType = e.g. "kcenters", "kmeans", "birch"
> 	clusters = campaign.iterate(data, clusters, distance, algorithmType);
> }
> campaign.printResults(clusters, format); // output clustering results
>
>
> would be great to have.  If you like the idea, maybe we should start  
>  thinking about how we could get there.  I am not sure this could   
> make it into our '0.5' version that Russ mentioned, but we could   
> talk about this.
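
To make the proposed split concrete, here is a rough CPU-only C++ sketch of a
distance step that produces a (points x centers) matrix and a clustering step
that consumes only that matrix. Everything in it is an assumption made for
illustration: Matrix, Metric, calcDists and iterateKMeans are hypothetical
names rather than the CAMPAIGN API, k-means stands in for whichever algorithm
is chosen, and there is no GPU code.

```cpp
// CPU-only sketch; all names are hypothetical, not the CAMPAIGN API.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

enum class Metric { Manhattan, Euclidean };

// Row-major matrix: rows x cols values in one contiguous buffer.
struct Matrix {
    std::size_t rows, cols;
    std::vector<float> v;
    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), v(r * c, 0.0f) {}
    float& at(std::size_t i, std::size_t j) { return v[i * cols + j]; }
    float at(std::size_t i, std::size_t j) const { return v[i * cols + j]; }
};

// Distance step: returns a (points x centers) distance matrix.
Matrix calcDists(const Matrix& data, const Matrix& centers, Metric m) {
    assert(data.cols == centers.cols);  // same dimensionality required
    Matrix d(data.rows, centers.rows);
    for (std::size_t i = 0; i < data.rows; ++i)
        for (std::size_t c = 0; c < centers.rows; ++c) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < data.cols; ++k) {
                const float diff = data.at(i, k) - centers.at(c, k);
                sum += (m == Metric::Manhattan) ? std::fabs(diff) : diff * diff;
            }
            d.at(i, c) = (m == Metric::Manhattan) ? sum : std::sqrt(sum);
        }
    return d;
}

// Clustering step (one k-means iteration): assigns each point to its nearest
// center using only the precomputed distances, then recomputes each center
// as the mean of its points. Empty clusters keep their old center.
Matrix iterateKMeans(const Matrix& data, const Matrix& centers,
                     const Matrix& dist) {
    Matrix next(centers.rows, centers.cols);
    std::vector<std::size_t> count(centers.rows, 0);
    for (std::size_t i = 0; i < data.rows; ++i) {
        std::size_t best = 0;
        for (std::size_t c = 1; c < centers.rows; ++c)
            if (dist.at(i, c) < dist.at(i, best)) best = c;
        ++count[best];
        for (std::size_t k = 0; k < data.cols; ++k)
            next.at(best, k) += data.at(i, k);
    }
    for (std::size_t c = 0; c < centers.rows; ++c)
        for (std::size_t k = 0; k < data.cols; ++k)
            next.at(c, k) = count[c] ? next.at(c, k) / count[c]
                                     : centers.at(c, k);
    return next;
}
```

Because the clustering step never computes a distance itself, swapping the
metric only touches calcDists, which is the modularity argued for above.
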
>
> It makes sense to have something out asap.  It will be fun to   
> increase the speed of our clustering code in subsequent iterations,   
> but we should start getting people to use it.  I'll try to deposit   
> the modules that I have at the end of the week.
>
> Bill, I am looking forward to hearing about your profiling work   
> during our next meeting.  Your findings will surely help me write   
> more efficient code right from the onset.
>
> When should we have our next meeting?  Given that it has been a
> while since the last one, I suggest not having it more than three
> weeks from now.  How does February 19 sound to you?
>
> Cheers,
> Kai
>
>
>
>
> On Jan 26, 2010, at 10:21 AM, Marc Sosnick wrote:
>
>> Kai:
>>
>> It was great seeing you last night.  Thanks for helping me round
>> out the presentation at the meeting.  Sorry we didn't have
>> more time to talk about our next steps during dinner, but it was   
>> quite convivial!
>>
>> As we discussed, I had ideas as to what my next steps should be,   
>> and I just want to get your and Bill's agreement before I start.    
>> These are in priority order:
>>
>> 1) Now that we have a smoke test against which to test, take the   
>> current code and refactor each clustering method into a proper C++
>> class, with a .cpp and .cu file.  This would also include scrubbing  
>>  the current code of comments and optimizing code (not including   
>> optimizing memory handling) as if we were presenting it to the   
>> outside world.  This would significantly help us work toward our   
>> first release as per Russ' comments last night.
>> 2) Take any new clustering algorithms you  have and put them into   
>> the format that we've created up to now and as in (1).
>> 3) Optimize memory handling and data structures.  This would be   
>> done in tandem with Bill's profiling work.
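
As an illustration of step (1), one clustering method wrapped in a class
might look roughly like the sketch below. It is only a guess at a possible
shape: ClusteringMethod and KCenters are hypothetical names, the code is
CPU-only, and farthest-first traversal is used as a common k-centers
heuristic, not necessarily what the current code does.

```cpp
// Hypothetical layout: declarations in a header, host code in a .cpp file,
// GPU kernels in a .cu file. Everything shown here is CPU-only.
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Common interface each refactored clustering method could implement.
class ClusteringMethod {
public:
    virtual ~ClusteringMethod() = default;
    // data holds n points of dimension dim, stored point after point.
    // Returns, for each point, the index of its assigned cluster.
    virtual std::vector<std::size_t> cluster(const std::vector<float>& data,
                                             std::size_t dim,
                                             std::size_t k) = 0;
};

// k-centers via farthest-first traversal: start from point 0, then
// repeatedly promote the point farthest from all chosen centers.
class KCenters : public ClusteringMethod {
public:
    std::vector<std::size_t> cluster(const std::vector<float>& data,
                                     std::size_t dim,
                                     std::size_t k) override {
        assert(dim > 0 && data.size() % dim == 0);
        const std::size_t n = data.size() / dim;
        assert(k > 0 && k <= n);
        std::vector<std::size_t> label(n, 0);
        std::vector<float> minDist(n, std::numeric_limits<float>::max());
        std::size_t center = 0;  // first center: point 0 (arbitrary choice)
        for (std::size_t c = 0; c < k; ++c) {
            std::size_t farthest = center;
            float farthestDist = -1.0f;
            for (std::size_t i = 0; i < n; ++i) {
                const float d = sqDist(data, dim, i, center);
                if (d < minDist[i]) { minDist[i] = d; label[i] = c; }
                if (minDist[i] > farthestDist) {
                    farthestDist = minDist[i];
                    farthest = i;
                }
            }
            center = farthest;  // next center: point farthest from all so far
        }
        return label;
    }

private:
    // Squared Euclidean distance between points a and b.
    static float sqDist(const std::vector<float>& data, std::size_t dim,
                        std::size_t a, std::size_t b) {
        float sum = 0.0f;
        for (std::size_t j = 0; j < dim; ++j) {
            const float diff = data[a * dim + j] - data[b * dim + j];
            sum += diff * diff;
        }
        return sum;
    }
};
```

With a shared base class, the smoke test could run every method through the
same cluster() call.
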
>>
>> Let me know about those algorithms you have.  Don't worry about   
>> putting anything in the repository, we can always reorganize the   
>> repository as we see fit, so just go ahead.  Probably the best way
>> would be just to create a subdirectory off trunk/dev, put your
>> work in there, and do an svn add directory_name from the parent   
>> directory of directory_name.
>>
>> Again, many thanks!
>>
>> Marc
>
> -----------------------------------------------------
> Kai Kohlhoff, PhD
> Stanford University
> School of Medicine, Bioengineering
> Stanford, CA 94305-5448, USA
> T: ++1 (650) 724 1575
> E: kohlhoff at stanford.edu
>
>




