It seems that different people describe transfer learning using different terms:
multitask learning, domain adaptation, learning from multiple sources, and sample selection bias.
Here I try to distinguish these terms and discuss how they differ.
Multitask learning (MTL) has been studied a lot. Usually, the objective is to learn one model per task such that the overall performance across all tasks is optimized, with the models sharing some common structure.
A very strong limitation of this method is that
overall performance != performance on one specific task.
In my opinion, MTL actually solves the learning problem in an indirect way. There are lots of issues involved: how to find similar tasks, which tasks to trust, how much to trust each task, and what the trade-off is between data and tasks. All of these questions require additional knowledge about the domain or the data, which makes MTL rather limited. Actually, MTL can only help if very little data is available for each task. If there is no training data at all for a task, MTL cannot work.
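To make the "overall vs. one specific task" point concrete, here is one common way to write an MTL objective. This regularized form, with a shared part and a per-task part, is an assumption I am making for illustration; the symbols are mine and other formulations exist.

```latex
\min_{w_0,\, v_1, \dots, v_T} \;
\sum_{t=1}^{T} \frac{1}{n_t} \sum_{i=1}^{n_t}
  \ell\big(y_{ti},\, (w_0 + v_t)^\top x_{ti}\big)
\;+\; \lambda_1 \sum_{t=1}^{T} \lVert v_t \rVert^2
\;+\; \lambda_2 \lVert w_0 \rVert^2
```

Here w_0 is the part shared by all tasks, v_t is the task-specific part, n_t is the number of examples for task t, and lambda_1, lambda_2 control how strongly the tasks are tied together. The loss is summed over all T tasks, so a minimizer is free to give up some accuracy on the one task you care about in exchange for gains on the others.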
Domain adaptation is closer to the transfer learning setting. It can be viewed as involving two tasks: a source domain (the support task) and a target domain (the target task). The term was coined, as far as I can tell, in the NLP community. The goal is exactly the same as in transfer learning: to improve performance on the target task.
Learning from multiple sources can be a little tricky. There are actually two interpretations. The first assumes that all the sources are drawn from the same underlying distribution, but with varying levels of noise or uncertainty (if some additional information, such as a bound on the noise, can be obtained, then it is possible to take advantage of it). The second assumes that different sources have different, but related, underlying distributions. Thus, the former is more like data selection, that is, how to learn ONE model given some noisy data, while the latter is exactly like multitask learning (develop one model for each task, i.e. each source).
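For the first interpretation, here is a minimal sketch of how known noise levels could be exploited. It assumes every source measures the same scalar quantity with zero-mean Gaussian noise of known standard deviation; the function name `inverse_variance_pool` and all the numbers are made up for illustration.

```python
import numpy as np


def inverse_variance_pool(samples_per_source, noise_std_per_source):
    """Combine per-source sample means into ONE estimate.

    Each source mean has variance sigma_k^2 / n_k, so low-noise (or larger)
    sources receive proportionally more weight.
    """
    means = np.array([np.mean(s) for s in samples_per_source])
    var_of_means = np.array(
        [std ** 2 / len(s) for s, std in zip(samples_per_source, noise_std_per_source)]
    )
    weights = 1.0 / var_of_means
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return float(np.dot(weights, means)), weights


# Synthetic data: three sources observing the same value with different noise.
rng = np.random.default_rng(0)
true_value = 3.0
noise_std = [0.1, 1.0, 5.0]
sources = [true_value + rng.normal(0.0, s, size=200) for s in noise_std]

estimate, weights = inverse_variance_pool(sources, noise_std)
print("pooled estimate:", estimate)
print("source weights:", weights)
```

The cleanest source dominates the pooled estimate, which is the "data selection" flavor of the problem: one model, with each source trusted according to its noise level.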
Finally, transfer learning can also be connected to sample selection bias, as I mentioned in the last post. However, sample selection bias always deals with the case where only one biased sample is available, and asks how to obtain an unbiased model from it. This situation is closer to domain adaptation, so methods from one field can be applied effectively to the other.
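One standard way to correct a biased sample is importance weighting: reweight each training point by the ratio of target density to source density. The sketch below assumes both densities are known Gaussians, which is rarely true in practice (the ratio usually has to be estimated); the helper names and the toy data are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def weighted_linear_fit(x, y, weights):
    """Fit y ~ a*x + b by minimizing sum_i w_i * (y_i - a*x_i - b)^2."""
    X = np.column_stack([x, np.ones_like(x)])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef  # array [slope, intercept]


# Biased (source) sample: x ~ N(0, 1). Assumed target population: x ~ N(1, 0.5).
rng = np.random.default_rng(0)
x_src = rng.normal(0.0, 1.0, size=2000)
y_src = x_src ** 2 + rng.normal(0.0, 0.1, size=2000)  # true relation is nonlinear

# Importance weights w(x) = p_target(x) / p_source(x).
w = gaussian_pdf(x_src, 1.0, 0.5) / gaussian_pdf(x_src, 0.0, 1.0)

coef_unweighted = weighted_linear_fit(x_src, y_src, np.ones_like(x_src))
coef_weighted = weighted_linear_fit(x_src, y_src, w)
print("unweighted slope:", coef_unweighted[0])
print("weighted slope:", coef_weighted[0])
```

The unweighted fit describes the source region, where the best line is roughly flat, while the weighted fit approximates the best line where the target data lives. The same reweighting idea appears in both the sample selection bias and the domain adaptation literature, which is why methods carry over between them.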
"Our Days Are Numbered"