
Saturday, February 10, 2007

How to collaborate with others?

I just read John's blog and found it really helpful, so I am copying and pasting his post here.

The full article can be found via the link below:
http://hunch.net/?p=251

2/10/2007
Best Practices for Collaboration
Filed under: Papers, Research — jl @ 1:51 pm

Many people, especially students, haven’t had an opportunity to collaborate with other researchers. Collaboration, especially with remote people, can be tricky. Here are some observations of what has worked for me on collaborations involving a few people.

1. Travel and Discuss. Almost all collaborations start with in-person discussion. This implies that travel is often necessary. We can hope that in the future we’ll have better systems for starting collaborations remotely (such as blogs), but we aren’t quite there yet.
2. Enable your collaborator. A collaboration can fall apart because one collaborator disables another. This sounds stupid (and it is), but it’s far easier than you might think.
1. Avoid Duplication. Discovering that you and a collaborator have been editing the same thing and now need to waste time reconciling changes is annoying. The best way to avoid this is to be explicit about who has write permission to what. Most of the time, a write lock is held for the entire document, just to be sure.
2. Don’t keep the write lock unnecessarily. Some people are perfectionists, so they have a real problem giving up the write lock on a draft until it is perfect. This prevents other collaborators from doing things. Releasing the write lock (at least while you sleep) is a good idea.
3. Send all necessary materials. Some people try to save space or bandwidth by not passing ‘.bib’ files or other auxiliary components. Forcing your collaborator to deal with the missing-subdocument problem is disabling. Space and bandwidth are cheap, while your collaborator’s time is precious. (Sending may be pass-by-reference rather than attach-to-message in most cases.)
4. Version Control. This doesn’t mean “use version control software”, although that’s fine. Instead, it means: have a version number for drafts passed back and forth. This means you can talk about “draft 3” rather than “the draft that was passed last Tuesday”. Coupled with “send all necessary materials”, this implies that you naturally back up previous work. (A minimal sketch of this practice appears after this post.)
3. Be Generous. It’s common for people to feel insecure about what they have done or how much “credit” they should get.
1. Coauthor standing. When deciding who should have a chance to be a coauthor, the rule should be “anyone who has helped produce a result conditioned on previous work”. “Helped produce” is often interpreted too narrowly: a theoretician should be generous about crediting experimental results, and vice versa. Potential coauthors may decline (and senior ones often do so). Control over who is a coauthor is best (and most naturally) exercised by the choice of whom you talk to.
2. Author ordering. Author ordering is the wrong thing to worry about, so don’t. The CS theory community has a substantial advantage here because they default to alpha-by-author ordering, as is understood by everyone.
3. Who presents. A good default for presentations at a conference is “student presents” (or suitable generalizations). This gives young people a real chance to get involved and learn how things are done. Senior collaborators already have plentiful alternative methods to present research at workshops or invited talks.
4. Communicate by default. Not cc’ing a collaborator is a bad idea. Even if you have a very specific question for one collaborator and not another, it’s a good idea to cc everyone. In the worst case, this is a few-second annoyance for the other collaborator. In the best case, the exchange answers unasked questions. This also prevents the “conversation shifts into subjects interesting to everyone, but oops! you weren’t cc’ed” problem.

These practices are imperfectly followed even by me, but they are a good ideal to strive for.
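
To make the “send all necessary materials” and “version control” advice concrete, here is a minimal sketch in Python (my own illustration, not part of John's post) that bundles a draft's .tex, .bib, and figure files into a single numbered archive, so a collaborator always receives a complete, versioned snapshot. The file patterns and the draft-v<N>.zip naming scheme are assumptions about the project layout, not a prescribed tool.

import zipfile
from pathlib import Path

def bundle_draft(version, src_dir=".", patterns=("*.tex", "*.bib", "*.sty", "figures/*")):
    """Bundle everything a collaborator needs into draft-v<version>.zip.

    The patterns are an assumed project layout; adjust them so that
    nothing (especially the .bib file) is left out of the archive.
    """
    src = Path(src_dir)
    archive = src / f"draft-v{version}.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for pattern in patterns:
            for path in sorted(src.glob(pattern)):
                if path.is_file():
                    # Store paths relative to the project root.
                    zf.write(path, path.relative_to(src))
    return archive

if __name__ == "__main__":
    # "Draft 3" becomes an unambiguous artifact, and keeping the
    # older archives doubles as a backup of previous versions.
    print(f"Created {bundle_draft(3)}")

Because each archive is complete and numbered, “draft 3” is unambiguous in discussion, and the accumulated archives give the natural backup of previous work that John mentions.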

Thursday, December 14, 2006

Some interesting papers from NIPS 2006

NIPS 2006 has just released its online proceedings:
http://books.nips.cc/nips19.html

Here are some interesting papers:


------------------------------------------------------------
Dirichlet-Enhanced Spam Filtering based on Biased Samples
Steffen Bickel, Tobias Scheffer

Denoising and Dimension Reduction in Feature Space
Mikio Braun, Joachim Buhmann, Klaus-Robert Müller

Sparse Multinomial Logistic Regression via Bayesian L1 Regularisation
Gavin Cawley, Nicola Talbot, Mark Girolami

Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyvers

Learning from Multiple Sources
Koby Crammer, Michael Kearns, Jennifer Wortman

Optimal Single-Class Classification Strategies
Ran El-Yaniv, Mordechai Nisenson

Clustering Under Prior Knowledge with Application to Image Segmentation
Mario Figueiredo, Dong Seon Cheng, Vittorio Murino

Data Integration for Classification Problems Employing Gaussian Process Priors
Mark Girolami, Mingjun Zhong

Correcting Sample Selection Bias by Unlabeled Data
Jiayuan Huang, Alex Smola, Arthur Gretton, Karsten M. Borgwardt, Bernhard Schölkopf

Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods
Matthias Seeger

---------------------------------------------------------------------
Also interesting:
Image Retrieval and Classification Using Local Distance Functions
Andrea Frome, Yoram Singer, Jitendra Malik

Branch and Bound for Semi-Supervised Support Vector Machines
Olivier Chapelle, Vikas Sindhwani, Sathiya Keerthi

Max-margin classification of incomplete data
Gal Chechik, Geremy Heitz, Gal Elidan, Pieter Abbeel, Daphne Koller

Bayesian Ensemble Learning
Hugh Chipman, Edward George, Robert McCulloch

Map-Reduce for Machine Learning on Multicore
Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Ng, Kunle Olukotun

Hierarchical Dirichlet Processes with Random Effects
Seyoung Kim, Padhraic Smyth

Ordinal Regression by Extended Binary Classification
Ling Li, Hsuan-Tien Lin