Intelligent Video Captioning
Brian Thomas
Telestream
Project's details
Intelligent Video Captioning
Closed captioning is the process of displaying text on a television, video screen, or other visual display to provide additional or interpretive information. It is typically used as a transcription of the audio portion of a program as it occurs, sometimes including descriptions of non-speech elements. There are legal requirements for broadcasters to create captions for their programs to increase their accessibility. A variety of tools are available to aid in the authoring of these captions, e.g. http://www.telestream.net/captioning/overview.htm.

Machine learning (ML) is the study of computer algorithms that improve automatically through experience. Machine learning is used regularly in the field of computer vision, where it can be applied to detect features in the video, for example faces.
The goal of this project is to create and integrate a library that uses machine learning to identify areas of a video that are suitable for displaying closed caption data. For example, you would not want to place captions over an area of the video image that includes a protagonist’s face.

The project is set up to mirror a typical R&D project in a software development company. The team receives this project description and is expected to meet with their technical mentors on a weekly basis. The mentors will help adjust the project in light of the discoveries made, with the goal of presenting a proof of concept to an executive team at the end of the project. The presentation would include enough information for the executives to make a go / no-go decision on productizing the development.

It is anticipated that the team would divide the work; for example, a four-person team might split the task …
• Algorithm Development (selection of appropriate areas for caption placement)
• Machine Learning training
• Integration with caption authoring software
• Webpage development (HTML / JavaScript) for proof of concept
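To make the caption-placement goal more concrete, here is a minimal Python sketch of one way the "selection of appropriate areas" step could work. It is an illustration under stated assumptions, not the project's algorithm: the names `Box`, `candidate_regions`, and `choose_caption_region` are hypothetical, the face boxes are assumed to come from some face detector that is not shown, and the three fixed horizontal bands (bottom, top, middle) are an arbitrary heuristic chosen for brevity.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixels: top-left corner (x, y), width w, height h."""
    x: int
    y: int
    w: int
    h: int

    def overlap_area(self, other: "Box") -> int:
        """Area in pixels shared by the two rectangles (0 if they do not intersect)."""
        dx = min(self.x + self.w, other.x + other.w) - max(self.x, other.x)
        dy = min(self.y + self.h, other.y + other.h) - max(self.y, other.y)
        return dx * dy if dx > 0 and dy > 0 else 0


def candidate_regions(frame_w: int, frame_h: int) -> List[Box]:
    """Hypothetical candidate caption bands, in order of preference.

    Assumption: captions conventionally sit at the bottom, so that band
    comes first, followed by the top and then the middle of the frame.
    """
    band_h = frame_h // 5          # each band covers a fifth of the frame height
    margin_x = frame_w // 10       # keep captions away from the left/right edges
    width = frame_w - 2 * margin_x
    bottom = Box(margin_x, frame_h - band_h, width, band_h)
    top = Box(margin_x, 0, width, band_h)
    middle = Box(margin_x, (frame_h - band_h) // 2, width, band_h)
    return [bottom, top, middle]


def choose_caption_region(frame_w: int, frame_h: int, faces: Sequence[Box]) -> Box:
    """Pick the candidate band that covers the fewest face pixels.

    min() returns the first minimal element, so ties are resolved in
    favour of the earlier (more preferred) candidate.
    """
    candidates = candidate_regions(frame_w, frame_h)
    return min(candidates, key=lambda c: sum(c.overlap_area(f) for f in faces))


if __name__ == "__main__":
    # A face near the bottom of a 1280x720 frame pushes the caption to the top band.
    face = Box(500, 560, 200, 150)
    print(choose_caption_region(1280, 720, [face]))  # Box(x=128, y=0, w=1024, h=144)
    # With no detected faces the default bottom band is kept.
    print(choose_caption_region(1280, 720, []))      # Box(x=128, y=576, w=1024, h=144)
```

A real implementation would likely score many more candidate regions, weigh other on-screen features such as burned-in text, and smooth the choice over time so captions do not jump between frames; those decisions are left to the team and mentors.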
• Create a web application demonstrating the automatic placement of captions over video.
• Present to company executives on the application / viability of the developed technology.
It is expected that the project team would have coding skills appropriate to a final-year computer science undergraduate and that, across the group, they would have some exposure to C++ and HTML / JavaScript. Knowledge of caption authoring would be gained during the course of the project. The mentors would be on hand to assist with questions related to caption authoring, C++, machine learning, etc.
30-60 min weekly or more
Open source project
Attachment | N/A
Yes
Team members | N/A
Rex Liu
N/A