[Campaign-news] Thank You!
whsu at sfsu.edu
Mon Feb 1 19:00:22 PST 2010
Hi Kai (and Marc),
Sorry about spacing out on getting back to you. I'm busy with a gig on
the 19th; we should probably meet before the big meeting on the 22nd.
We've been meeting on Fridays, but I have another meeting on the 12th
in the afternoon as well. Would, say, Wed the 10th work?
I'm getting stuck on the profiling; I keep getting a cryptic error message
with the kcenters code, while the profiler works with some of the other
examples that I've tried. Perhaps it's because of the input redirection;
I'll try tweaking the code a bit.
Bill
Quoting Kai Kohlhoff <kohlhoff at stanford.edu>:
> Hi Marc,
>
> Yes, it was a good evening, you guys are a pleasant crowd!
>
> I agree with your next steps, and I will see that I stick to the
> format that you have already created. I have been trying to
> simplify the code that we have and am really eager to put it into
> the repository. There is still another project that I have to work
> on until tomorrow, but then I'll get to it.
>
> I was thinking that we should pull the distance kernels out of the
> current clustering code. For proper modularity, these should be
> called separately in each iteration and a distance matrix should be
> provided to the clustering kernel in each iteration. Also, the I/O
> could be put into separate subroutines. It might be useful if
> ultimately a user could simply write C/C++ code and the GPU
> functionality were hidden.
>
> Something like:
>
>
> #include "campaign.h"
>
> // check which GPU, if any, is present
> campaign.checkPlatform();
> // read data
> data = campaign.readData(file, format);
> // preprocess data with a selected method
> data = campaign.preprocess(data, method);
> // extract number of data points and dimensionality, copy data to GPU
> clusters.init(data);
> // N iterations; data is kept on the GPU between kernel calls
> // (alternatively, use a convergence criterion)
> for (int i = 0; i < N; ++i)
> {
>     // metricType = e.g. "manhattan", "euclidean"
>     distance = campaign.calcDists(data, clusters, metricType);
>     // one iteration of algorithmType = e.g. "kcenters", "kmeans", "birch"
>     clusters = campaign.iterate(data, clusters, distance, algorithmType);
> }
> // output clustering results
> campaign.printResults(clusters, format);
>
>
> would be great to have. If you like the idea, maybe we should start
> thinking about how we could get there. I am not sure this could
> make it into our '0.5' version that Russ mentioned, but we could
> talk about this.
>
> It makes sense to have something out asap. It will be fun to
> increase the speed of our clustering code in subsequent iterations,
> but we should start getting people to use it. I'll try to deposit
> the modules that I have at the end of the week.
>
> Bill, I am looking forward to hearing about your profiling work
> during our next meeting. Your findings will surely help me write
> more efficient code right from the outset.
>
> When should we have our next meeting? Given that the last one was
> a while ago, I suggest not having it more than three weeks from now.
> How does February 19 sound to you?
>
> Cheers,
> Kai
>
>
>
>
> On Jan 26, 2010, at 10:21 AM, Marc Sosnick wrote:
>
>> Kai:
>>
>> It was great seeing you last night. Thanks for helping me round
>> out the presentation at the meeting. Sorry we didn't have
>> more time to talk about our next steps during dinner, but it was
>> quite convivial!
>>
>> As we discussed, I had ideas as to what my next steps should be,
>> and I just want to get your and Bill's agreement before I start.
>> These are in priority order:
>>
>> 1) Now that we have a smoke test against which to test, take the
>> current code and refactor each clustering method into a proper C++
>> class, with a .cpp and .cu file. This would also include scrubbing
>> the current code of comments and optimizing code (not including
>> optimizing memory handling) as if we were presenting it to the
>> outside world. This would significantly help us work toward our
>> first release as per Russ' comments last night.
>> 2) Take any new clustering algorithms you have and put them into
>> the format that we've created up to now and as in (1).
>> 3) Optimize memory handling and data structures. This would be
>> done in tandem with Bill's profiling work.
>>
>> Let me know about those algorithms you have. Don't worry about
>> putting anything in the repository, we can always reorganize the
>> repository as we see fit, so just go ahead. Probably the best way
>> would just be to create a subdirectory off trunk/dev, put your
>> work in there, and do an svn add directory_name from the parent
>> directory of directory_name.
>>
>> Again, many thanks!
>>
>> Marc
>> _______________________________________________
>> Campaign-news mailing list
>> Campaign-news at simtk.org
>> https://simtk.org/mailman/listinfo/campaign-news
>
> -----------------------------------------------------
> Kai Kohlhoff, PhD
> Stanford University
> School of Medicine, Bioengineering
> Stanford, CA 94305-5448, USA
> T: ++1 (650) 724 1575
> E: kohlhoff at stanford.edu
>
>
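
Kai's proposal above, to pull the distance kernels out of the clustering code and hand a distance matrix to the clustering step in each iteration, could be prototyped on the CPU before any GPU work. The following is a minimal sketch under stated assumptions: `Metric`, `Dataset`, `calcDists`, and `kmeansIterate` are hypothetical names chosen for illustration, not the actual CAMPAIGN interface, and k-means stands in for whichever algorithm is plugged in.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; not the actual CAMPAIGN API.
enum class Metric { Manhattan, Euclidean };

struct Dataset {
    std::size_t dim;            // dimensionality of each point
    std::vector<float> values;  // row-major, numPoints * dim entries
    std::size_t size() const { return values.size() / dim; }
};

// Distance stage: returns a row-major numPoints x numCenters matrix.
std::vector<float> calcDists(const Dataset& data, const Dataset& centers,
                             Metric metric) {
    const std::size_t n = data.size(), k = centers.size(), dim = data.dim;
    std::vector<float> dist(n * k);
    for (std::size_t p = 0; p < n; ++p) {
        for (std::size_t c = 0; c < k; ++c) {
            float sum = 0.0f;
            for (std::size_t d = 0; d < dim; ++d) {
                const float diff =
                    data.values[p * dim + d] - centers.values[c * dim + d];
                sum += (metric == Metric::Manhattan) ? std::fabs(diff)
                                                     : diff * diff;
            }
            dist[p * k + c] =
                (metric == Metric::Euclidean) ? std::sqrt(sum) : sum;
        }
    }
    return dist;
}

// Clustering stage: one k-means iteration. Assignment uses only the
// precomputed distance matrix, so the metric can be swapped freely.
void kmeansIterate(const Dataset& data, Dataset& centers,
                   const std::vector<float>& dist) {
    const std::size_t n = data.size(), k = centers.size(), dim = data.dim;
    std::vector<float> sums(k * dim, 0.0f);
    std::vector<std::size_t> counts(k, 0);
    for (std::size_t p = 0; p < n; ++p) {
        std::size_t best = 0;  // index of the nearest center for point p
        for (std::size_t c = 1; c < k; ++c) {
            if (dist[p * k + c] < dist[p * k + best]) best = c;
        }
        ++counts[best];
        for (std::size_t d = 0; d < dim; ++d) {
            sums[best * dim + d] += data.values[p * dim + d];
        }
    }
    for (std::size_t c = 0; c < k; ++c) {
        if (counts[c] == 0) continue;  // leave empty clusters unchanged
        for (std::size_t d = 0; d < dim; ++d) {
            centers.values[c * dim + d] = sums[c * dim + d] / counts[c];
        }
    }
}
```

A driver would call `calcDists` and then `kmeansIterate` once per iteration, mirroring the loop in Kai's sketch. A GPU version could keep the same two-stage split while hiding device memory behind `Dataset`.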