Hi Kang, firstly thank you for the interview. Let's start with your background.

Q - What is your 30 second bio?

A - My research focuses on business analytics and social computing, especially in the context of social networks and social media.

A - That dates back to my grad school days. I was involved in research projects that leveraged data from online social networks and social media. Such data not only reveals who is talking to whom i.
All these made me believe that the availability of such data will bring a brand new perspective to the study of people's social behaviors and interactions.
Q - What was the first data set you remember working with? What did you do with it?

A - My first research project using a real-world dataset was about collecting and analyzing data about humanitarian agencies and their networks.
The scale of the data was actually "tiny" (several megabytes), but the data did show us some interesting patterns on the topological similarities between different networks among these organizations e.

Kang, very interesting background and context - thank you for sharing!

A - It is about the opportunity to do better prediction. With larger-scale data from more sources on how people behave in a network context becoming available, there are a lot of opportunities to apply ML algorithms to discover patterns on how people behave and predict what will happen next.
It is also possible to derive new social science theories from dynamic data through computational studies. Besides, the education component is also exciting as industry needs a workforce with data analytics skills.
That's also why we at the University of Iowa have started a bachelor's program in Business Analytics and plan to roll out a Master's program in this area as well.

A - I want to better understand and predict social network dynamics at different scales: for example, dyadic link formation at the microscopic level, the flow of information and influence at the mesoscopic level, and how network topologies affect network performance at the macroscopic level.

Q - What Machine Learning methods have you found most helpful?

A - It really depends on the context and it is hard to find a silver bullet for all situations. I usually try several methods and settle on the one with the best performance.
As for conferences, I found the following helpful for my own research:

Improving our ability to make predictions is definitely very compelling! Now, let's discuss how this applies in some of your research.

Q - Your recent work on developing a "Netflix style" algorithm for dating sites has received a lot of press coverage

A - We try to address user recommendation for the unique situation of reciprocal and bipartite social networks e.
The idea is to recommend dating partners whom a user will like and who will like the user back. In other words, a recommended partner should match a user's taste, as well as attractiveness.

Q - How did Machine Learning help?

A - In short, we extended the classic collaborative filtering technique commonly used in item recommendation for Amazon.
A - People's behaviors in approaching and responding to others can provide valuable information about their taste, attractiveness, and unattractiveness.
Our method can capture these characteristics in selecting dating partners and make better recommendations.

Editor Note - If you are interested in more detail behind the approach, both Forbes' recent article and a feature in the MIT Technology Review are very insightful.
Here are a few highlights:

Recommendation Engine from MIT Tech Review - These guys have built a recommendation engine that not only assesses your tastes but also measures your attractiveness. It then uses this information to recommend potential dates most likely to reply, should you initiate contact.
The dating equivalent [of the Netflix model] is to analyze the partners you have chosen to send messages to, then to find other boys or girls with a similar taste and recommend potential dates that they've contacted but who you haven't. In other words, the recommendations are of the form: The problem with this approach is that it takes no account of your attractiveness.
If the people you contact never reply, then these recommendations are of little use. So Zhao and co add another dimension to their recommendation engine. They also analyze the replies you receive and use this to evaluate your attractiveness or unattractiveness. Obviously boys and girls who receive more replies are more attractive. When it takes this into account, it can recommend potential dates who not only match your taste but ones who are more likely to think you attractive and therefore to reply.
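The two-sided idea described above (match your taste, and also favor people likely to reply) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method from the paper: the function names, the use of Jaccard similarity, the choice to multiply the two scores, and the toy user IDs are all hypothetical.

```python
from collections import defaultdict

def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, contacted, replied_to, top_n=3):
    """Rank candidates for `user` by taste score times reply score.

    contacted[u]  : set of people u has sent a first message to
    replied_to[c] : set of senders that candidate c has replied to
    Both structures and the scoring rule are assumptions for illustration.
    """
    # Invert replied_to: for each sender, who replied to them?
    replies_received = defaultdict(set)
    for candidate, senders in replied_to.items():
        for sender in senders:
            replies_received[sender].add(candidate)

    my_contacts = contacted.get(user, set())
    my_repliers = replies_received.get(user, set())
    taste = defaultdict(float)  # would `user` like the candidate?
    reply = defaultdict(float)  # would the candidate reply to `user`?

    for other, their_contacts in contacted.items():
        if other == user:
            continue
        their_repliers = replies_received.get(other, set())
        # Similar taste: `other` contacted the same people as `user`.
        taste_sim = jaccard(my_contacts, their_contacts)
        # Similar attractiveness: `other` got replies from the same people.
        attract_sim = jaccard(my_repliers, their_repliers)
        for candidate in their_contacts - my_contacts:
            taste[candidate] += taste_sim
            if candidate in their_repliers:
                reply[candidate] += attract_sim

    # Keep only candidates with a positive score on both dimensions.
    scores = {c: taste[c] * reply[c] for c in taste}
    ranked = sorted((c for c in scores if scores[c] > 0),
                    key=lambda c: (-scores[c], c))
    return ranked[:top_n]

# Toy data, invented for illustration only.
contacted = {
    "u1": {"c1", "c2"},
    "u2": {"c1", "c2", "c3", "c4"},
    "u3": {"c2", "c5"},
}
replied_to = {
    "c1": {"u1", "u2"},
    "c2": {"u2"},
    "c3": {"u2"},
    "c4": set(),
    "c5": {"u3"},
}
print(recommend("u1", contacted, replied_to))  # ['c3']
```

A taste-only recommender would also surface c4 and c5 for u1 here. The reply score filters them out because c4 never replies and c5 only replied to someone whose reply history has nothing in common with u1's.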
Machine Learning from Forbes - "Your actions reflect your taste and attractiveness in a way that could be more accurate than what you include in your profile," Zhao says. The research team's algorithm will eventually "learn" that while a man says he likes tall women, he keeps contacting short women, and will unilaterally change its dating recommendations to him without notice, much in the same way that Netflix's algorithm learns that you're really a closet drama devotee even though you claim to love action and sci-fi.
Finally, for more technical details, the full paper can be found here.

Editor Note - Back to the interview!

A - We want to further improve the method with different datasets from either dating or other reciprocal and bipartite social networks, such as job seeking and college admission.
How to effectively integrate users' personal profiles into recommendation to avoid cold start problems without hurting the method's generalizability is also an interesting question we want to address in future research.

That all sounds great - good luck with the next steps!
Here we directly measure one's influence, i.

A - Sentiment analysis is the basis for our new metric. We developed a sentiment classifier using AdaBoost specifically for online health communities (OHCs) among cancer survivors. We did not use an off-the-shelf word list because sentiment analysis should be specific to the context.
Some words may have different sentiment in this context than usual. For example, the word "positive" may be a bad thing for a cancer survivor if the diagnosis is positive.

A - When finding influential users, the amount of contributions one has made matters, but how others react to one's contributions is also extremely valuable, because it is through such reactions that interpersonal influence is reflected and thus measured.
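To make the point about context-specific sentiment concrete, here is a small, self-contained sketch of AdaBoost trained on word-presence features. The interview only says that AdaBoost was used; the one-word decision stumps, the whitespace tokenizer, the generic lexicon, and the toy posts below are all assumptions made up for illustration, not the researchers' features or data.

```python
import math

def tokens(text):
    """Very rough tokenizer: lowercase, split on whitespace."""
    return set(text.lower().split())

def stump(word, polarity, doc):
    """Weak learner: predict `polarity` if the word is present, else the opposite."""
    return polarity if word in doc else -polarity

def train_adaboost(docs, labels, rounds=5):
    """Train AdaBoost over one-word stumps; labels are +1 or -1."""
    n = len(docs)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*docs))
    model = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for word in vocab:
            for polarity in (1, -1):
                err = sum(w for w, d, y in zip(weights, docs, labels)
                          if stump(word, polarity, d) != y)
                if best is None or err < best[0]:
                    best = (err, word, polarity)
        err, word, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, word, polarity))
        # Re-weight so misclassified posts count more in the next round.
        weights = [w * math.exp(-alpha * y * stump(word, polarity, d))
                   for w, d, y in zip(weights, docs, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model

def predict(model, doc):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump(word, polarity, doc)
                for alpha, word, polarity in model)
    return 1 if score >= 0 else -1

# Hypothetical off-the-shelf lexicon, for contrast.
GENERIC_LEXICON = {"positive": 1, "negative": -1}

def lexicon_predict(doc):
    return 1 if sum(GENERIC_LEXICON.get(w, 0) for w in doc) >= 0 else -1

# Toy posts invented for illustration (+1 = good news, -1 = bad news).
POSTS = [
    ("my biopsy came back positive", -1),
    ("the scan was negative today", 1),
    ("test result positive again", -1),
    ("results negative so relieved", 1),
    ("lymph nodes positive scared", -1),
    ("margins negative great news", 1),
]

model = train_adaboost([tokens(p) for p, _ in POSTS], [y for _, y in POSTS])
new_post = tokens("the biopsy was positive")
print(predict(model, new_post))   # -1 (bad news in this context)
print(lexicon_predict(new_post))  # 1 (the generic lexicon gets it wrong)
```

On data this small a single stump already separates the classes, so the boosting rounds add nothing. A real corpus would need many rounds and richer features.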
A - We would like to further investigate the nature of support in OHCs, so that we can build users' behavioral profiles and better design such communities to help their members.

Very interesting - look forward to following all of your different research paths in the future! Finally, it is advice time!

Q - What does the future of Machine Learning look like?

A - This is a tough question. I don't know the exact answer but I guess ML will develop along two directions.
The first would be on the algorithm side - better and more efficient algorithms for big data, as well as machine learning that mimics human intelligence at a deeper level. The second would be on the application side - how to make ML understandable and available to the general public?

Q - Any words of wisdom for Machine Learning students or practitioners starting out?

A - I am not sure whether my words are of real wisdom, but I'd say for a beginner, it is certainly important to understand ML algorithms. In other words, one must learn how to answer the question: "Now we have the data, what can we do with it?"
This is very valuable in the era of big data.

Kang - Thank you so much for your time! Really enjoyed learning more about your research and its application to real-world problems.
Kang can be found online at his research home page and on Twitter.

Readers, thanks for joining us! If you enjoyed this interview and want to learn more about what it takes to become a data scientist, what skills you need, and what type of work is currently being done in the field, then check out Data Scientists at Work - a collection of 16 interviews with some of the world's most influential and innovative data scientists, who each address all the above and more!