Computer Vision Resources

Published in ACM-W Manipal, Jan 29, 2018

Hey there!
We recently held an introductory session as part of ACM-W Manipal, where I spoke about Computer Vision. Here are links to some of the resources, research papers, ongoing projects and labs that will help you dive deeper into the subject!

OpenCV

OpenCV Python: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_tutorials.html
OpenCV C++: https://docs.opencv.org/2.4/doc/tutorials/tutorials.html

Note: OpenCV Python is generally easier to start with, as it usually requires fewer lines of code than C++.

Python Prerequisites:

Codecademy: https://www.codecademy.com/learn/learn-python
Udacity: https://in.udacity.com/course/introduction-to-python--ud1110 (Course duration: approx. 5 weeks)
edX: https://www.edx.org/course/introduction-computer-science-mitx-6-00-1x-11 (Course duration: approx. 9 weeks)
Coursera: https://www.coursera.org/learn/python (Course duration: 7 weeks)

Projects & Papers

MIT Vision Group: The group is working on predictive vision algorithms that anticipate future visual events or scenes.
Project : http://carlvondrick.com/tinyvideo/
Paper: http://carlvondrick.com/tinyvideo/paper.pdf
Stanford Vision Lab: One of the projects is Dense-Captioning Events in Videos. Events of both short and long duration are detected and described simultaneously in natural language. Moreover, context from past and future events is used to jointly describe all the events in a video.
Project Link: https://cs.stanford.edu/people/ranjaykrishna/densevid/
Paper Link: https://arxiv.org/pdf/1705.00754.pdf
Microsoft Research: Seeing Bot is a video captioning bot. It generates a natural language (audio) description of what it sees and hears, and sends it to the user.
Project Link: https://www.microsoft.com/en-us/research/publication/seeing-bot/
Paper link: https://www.microsoft.com/en-us/research/wp-content/uploads/2017/06/dp18-panA.pdf
Stanford Vision Lab: Determining a neighbourhood’s political leanings by its cars! Based on the cars detected in Google Street View images, the likelihood of a locality voting Democrat or Republican can be estimated.
Link: https://news.stanford.edu/2017/11/28/neighborhoods-cars-indicate-political-leanings/
Paper Link: http://www.pnas.org/content/114/50/13108.full.pdf
Facebook Research: Bringing Portraits to Life.
This work automatically animates a still portrait, so that the subject in the photo comes to life and expresses various emotions.
Project Link: https://research.fb.com/publications/bringing-portraits-to-life/
Paper Link: https://research.fb.com/wp-content/uploads/2017/11/elor2017_bringingportraits-1.pdf?

Interesting Projects

Some of the projects really caught my eye! Do watch the videos associated with them.

A Panorama of the Skies: This is a personal favourite! A conference room is transformed into a panoramic skyscape via the RoomAlive Toolkit.
Link: https://www.microsoft.com/en-us/research/project/a-panorama-of-the-skies/
IBM Research Takes Watson to Hollywood with the First “Cognitive Movie Trailer”.
Link: https://www.ibm.com/blogs/think/2016/08/cognitive-movie-trailer/
Facebook Research: Understanding the world around us and why Computer Vision?
Link: https://research.fb.com/category/computer-vision/

Additional Links pertaining to Research Ventures

NVIDIA CV Research
IBM Research
Disney Research (Ever thought of being an imagineer?)

Well-known Professors

Fei-Fei Li:
In charge of SAIL and the Stanford Vision Lab.
Research areas: mainly AI and CV.
Anil K Jain:
Professor at Michigan State University.
Research areas: CV, IP, ML and PR.
Jitendra Malik:
Professor at UC Berkeley.
Research areas: CV, CG and ML.
Andrew Zisserman:
Professor at the University of Oxford.
David Lowe:
Senior research scientist at Google Research.
Takeo Kanade:
Professor and scientist, CMU (the Robotics Institute).
Peter E. Hart

Note: In case you have any additional resources relevant to the domain, feel free to add them as comments :)