Media & Entertainment Industry Trends, Technology and Research

Top 10 areas where Artificial Intelligence is leading automation in the Media Industry

Posted In Artificial Intelligence, Big Data, Machine Learning - By Nitin Narang on Tuesday, September 26th, 2017

In this era of data explosion, collecting data is not sufficient in itself. It needs to be processed, sliced and diced to gain insights for running and growing the business. Unfortunately, the majority of data available in the world today is unstructured and hidden, making it difficult to process without significant human participation. A large part of the data in the media industry falls squarely in this category, but that has started to CHANGE.

Take any video file: it holds a large amount of unstructured data interwoven in its fabric, which requires close human engagement to understand and decode. It needs human effort for even the most basic jobs of content management, processing, interpretation, quality checking etc. before it can be marked ready for distribution. Interestingly, AI and ML algorithms, and especially deep learning, are now approaching parity with human accuracy on a large part of these tasks, and can perform them at scale. AI is well positioned both to automate workflow activities and to generate tremendous insights from this hidden asset, “data”. As a result, the media industry is witnessing several winners in the domains of natural language processing (NLP), facial recognition, anomaly detection and more, where AI is bringing large scale automation with unmatched efficiencies. 2017 marks an important year in which AI is starting to harvest rich dividends in broadcasting, content management, post production, advertising and many more verticals. And they say it is only the beginning of the AI journey!

Predictive Analytics and Deep Learning

Predictive analytics rests on a critical assumption that future behavior is likely to be influenced by past trends, and in most cases this holds good for a period of time. At the foundation of these predictive models is a set of hypotheses that bind together a number of independent variables (say, for content personalization, variables like age, gender, financial status, education, content interest) to build statistical correlations. It is the collective strength and degree of these correlations that can predict future behavior. Read here to learn more on Predictive Analytics. More recently deep learning, which uses neural networks to bring human brain like analytical capabilities, is taking machine learning to yet higher cognitive levels. By simulating a human brain like response to a situation, deep learning brings a marked shift from old school brute force decision trees to something more real.
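To make the idea of "correlation strength driving a prediction" concrete, here is a minimal sketch in Python. It assumes a single independent variable and uses made-up numbers purely for illustration (the names past_hours and titles_watched are hypothetical, not from any real dataset); a real personalization model would combine many variables.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: how strongly two variables move together (-1 to 1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def fit_line(xs, ys):
    """Least-squares straight line through past observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: weekly viewing hours vs. titles watched.
past_hours = [1, 2, 3, 4, 5]
titles_watched = [2, 4, 6, 8, 10]

strength = pearson(past_hours, titles_watched)
slope, intercept = fit_line(past_hours, titles_watched)

# Predict future behavior from the past trend (6 viewing hours).
predicted = slope * 6 + intercept
```

The prediction is only as good as the assumption that the past trend continues, which is exactly the caveat noted above.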

Machine Learning focus areas in Media and Entertainment Industry

AI and ML have been in the academic and R&D world for the last several decades, and it is only in the last few years that real industry integration has started to make headway. AI brings technology to automate tasks which have been largely human intensive and offers benefits in scalability, speed of computation and repeatability. It has great potential to bring serious cost savings by automating existing tasks in content management and media operations, as well as improving customer engagement and experience. For example, AI can automate the complex job of audio/video sync, saving a tremendous amount of manual effort as well as cutting down on human errors. The following are the top ten AI transformation areas making inroads in the Media & Entertainment Industry.

1. Deep Video Analysis, Translation, Transcription and Tagging – AI took several years to perfect handwriting recognition and quickly moved on to natural language understanding (NLU). It is now accelerating beyond natural language and metadata processing to delve into deep content analysis. Transcription is becoming near real time, with machine led automation converting spoken audio into readable text. We have all seen the early arrivals with Alexa, Cortana and Google voice. Neural network trained systems are replacing traditional word for word conversion by adding a new dimension of contextual and intent relevance. It is expected that in the next 3 years, AI will completely take over transcription and translation activity and will be resident on daily use audio devices.

Deep video analysis is another interesting area, leading to a manifold expansion in video insights by learning scene changes, locational references, and voice, facial and object recognition. This intelligence is going a long way in enriching content taxonomy and appropriate tagging of the content, which is improving the accuracy of content linkage, search and association. Here AI is significantly changing the entire content management landscape with machine driven indexing, metadata-tagging, cataloging etc., turning manual processes into highly automated workflows. Video translation to multiple languages and dialects, along with multi lingual subtitles, is helping expand content’s addressable market to far greater audiences than ever before.

2. Voice based virtual assistants – In the last 2 years, voice assistants like Alexa and Google Home, and voice remotes like Siri and Roku, have started to displace chunky TV remotes by perfecting basic menu navigation. Coming next is intelligence for content search and discovery with the help of user follow up commands. AI using supervised learning algorithms is now powering virtual assistants to combine consumers’ knowledge graphs, geo coordinates, voice inputs and rich content metadata (cast, synopsis, quotes, locations etc.) to offer personalized recommendations. This ability of virtual assistants to understand linguistic features, emotions and user intent is making them smarter, more intuitive and more mature conversation systems, adding to a better customer experience. As individual digital relationships become more profound, virtual assistants are expected to play a dominant role in offering addressable video content delivery.

3. Optimized Video Encoding and Delivery – Video streaming got a major fillip with the introduction of adaptive bit rate (ABR) streaming. ABR encoding splits the original file into small chunks, each encoded at several bit rates, so that a client can be served the version that fits its available bandwidth (Read here to understand more on streaming). AI is going the extra mile by moving from fixed bitrate chunking to scene based encoding. By learning the complexity of scenes across multiple quality metrics, AI can determine the required level of compression: given a video for encoding, the system can determine frame level complexity and optimal compression parameters while keeping track of quality. Netflix mastered this technology a while back to generate refined encoded streams even at lower bitrates. This new encoding is radically transforming ways of providing uninterrupted video to the growing population of viewers in emerging economies, where a low bandwidth network on mobile phones is the most dominant platform for watching video. AI is also improving online media player performance by optimizing the required bitrate based on viewer location, network congestion, infrastructure metrics and bandwidth details.
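The basic ABR decision a player makes for each chunk can be sketched as follows. This is a simplified illustration, not how any particular player or Netflix implements it: the bitrate ladder, the pick_rendition function and the 0.8 safety margin are all assumptions chosen for the example.

```python
def pick_rendition(ladder_kbps, bandwidth_kbps, safety=0.8):
    """Pick the highest bitrate that fits within a safety fraction of the
    measured bandwidth; fall back to the lowest rung if nothing fits."""
    budget = bandwidth_kbps * safety
    candidates = [b for b in sorted(ladder_kbps) if b <= budget]
    return candidates[-1] if candidates else min(ladder_kbps)

# Hypothetical bitrate ladder (kbps) that every chunk is encoded at.
LADDER = [235, 750, 1750, 3000, 5800]

good_network = pick_rendition(LADDER, 4000)   # budget 3200 kbps
weak_network = pick_rendition(LADDER, 1000)   # budget 800 kbps
very_weak = pick_rendition(LADDER, 100)       # nothing fits, lowest rung
```

Scene based encoding changes what goes into the ladder (simple scenes need fewer bits for the same quality), while the player side selection stays roughly this simple.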

4. Visual Recognition – Facial and object recognition is an AI area which is heavy on visual processing. It deals with identification of individuals and objects in video and still images, as well as their relative changes over time. While this kind of visual processing comes naturally to humans, it has been an uphill task for machines to crunch large data variations to reach the desired level of accuracy. More recently, AI and machine learning are increasingly able to master visual perception – facial and pattern recognition – opening rich avenues in content editing and automated content creation. Ever wondered how Facebook and numerous photo apps do an amazing job with photo tagging of your friends? It is all AI and ML in the making.

5. Anomaly Detection – In the last several years, online video has grown disproportionately. YouTube, Facebook and online networks have further created unbounded opportunities for both amateurs and professionals to become content creators and reach mass audiences. Today, given the amount of video and images generated every second, it has become humanly impossible to monitor and flag inappropriate content (piracy, violence, adult, etc.). It is again machine learning services which are proving exemplary in this space, with most networks creating automated AI based detection tools at the point of upload. Google’s Cloud Vision API is one such service achieving great results in tagging content appropriately. While fake content creation has been an increasing threat from AI, it is the same AI technology coming to the rescue in restricting the malice.

6. Content Fingerprinting – Working on the principle of capturing sample content snippets to create a unique fingerprint for identification, content fingerprinting has come a long way in the media industry. As content continues to grow with multi-channel distribution, there are a number of applications where AI based fingerprinting technology is playing an important role. Some use cases are:

  • Finding exact and similar profile media with effective search; Shazam is a live example
  • Micro licensing of content with blockchain for payment and tracking against usage
  • Identification and tracking of consumer viewing behavior, measurement of commercials
  • Broadcast monitoring to validate an event occurrence
  • Content protection for audio, video and images, tracking unauthorized distribution

Read here for a detailed understanding of digital fingerprinting.
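The principle of turning content snippets into a compact, comparable fingerprint can be illustrated with a toy sketch. This is not Shazam's algorithm or any production scheme; the window size, the "did the energy rise" rule and the sample values are assumptions made only to show the idea.

```python
def fingerprint(samples, window=4):
    """Toy fingerprint: one bit per pair of adjacent windows, set to 1
    when the signal energy rises from one window to the next."""
    energies = [
        sum(s * s for s in samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]
    return [1 if later > earlier else 0
            for earlier, later in zip(energies, energies[1:])]

def hamming(fp_a, fp_b):
    """Number of differing bits; a small distance suggests the same content."""
    return sum(a != b for a, b in zip(fp_a, fp_b))

# Hypothetical audio samples.
original = [0, 1, 0, 1, 2, 3, 2, 3, 1, 0, 1, 0, 4, 5, 4, 5]
louder_copy = [2 * s for s in original]        # same content, higher volume
other_clip = [5, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2]

same_distance = hamming(fingerprint(original), fingerprint(louder_copy))
diff_distance = hamming(fingerprint(original), fingerprint(other_clip))
```

Because the bits depend only on whether energy rises or falls, the louder copy yields the same fingerprint as the original, which is the kind of robustness a fingerprint needs to survive re-encoding and redistribution.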

7. Video Quality Assessment – Video compression has been fundamental to achieving reasonable bitrates for video delivery. But compression, being lossy, introduces impairments and artifacts like blockiness. Video quality assessment has always been a critical process before content distribution and has grown manifold with multi channel distribution. Traditionally two standard methods, either standalone or in conjunction, are used for quality assessment: manual visual analysis by a human playing the content and checking for errors, and a more automated reference based evaluation using metrics like VQM, PSNR, MSE, SSIM and others. While the former needs significant human effort, the latter has its challenges with accuracy, its non-real time nature and its dependence on a reference model. AI and machine learning are changing it all by mastering non reference based video quality assessment. AI, using extensive feature sets and learning from error patterns, is able to offer near real-time quality assessment. This is an area of tremendous potential to automate quality control in video workflows and bring matchless efficiency in reducing content release timelines.
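For readers unfamiliar with the reference based metrics mentioned above, here is a minimal sketch of MSE and PSNR in Python. It assumes 8-bit pixel values (maximum 255) held in flat lists, and the pixel values are made up for illustration. Note how both functions need the original (reference) frame, which is exactly the dependence that non reference AI methods aim to remove.

```python
from math import inf, log10

def mse(reference, distorted):
    """Mean squared error between two equally sized lists of pixel values."""
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of pixels")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).
    Higher is better; identical frames give infinity."""
    error = mse(reference, distorted)
    if error == 0:
        return inf
    return 10 * log10(max_value ** 2 / error)

# Hypothetical 4-pixel "frames".
reference_frame = [100, 100, 100, 100]
compressed_frame = [110, 90, 110, 90]

frame_mse = mse(reference_frame, compressed_frame)
frame_psnr = psnr(reference_frame, compressed_frame)
```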

8. Virtual Reality and Augmented Reality – The AR/VR market holds great potential, but the technology has largely underperformed due to challenges in cost, content maturity and ease of use. While virtual reality (VR) specializes in creating a 360 degree immersive experience, augmented reality (AR) deals with the overlay of computer graphic elements on real world elements. For the most part VR/AR apps and services are still rough, and AI is bringing renewed energy by improving the quality of data and decision making. AI is helping with accuracy of images, better understanding of user input and intent, content correlation, contextualization, as well as content authoring to build a more immersive experience for users.

9. Post Production – A large number of creative processes are based on defined rules and techniques and hence can be mastered by machine learning algorithms. AI systems have the potential to automate the ground work required for various creative processes, from plot identification and scene selection to scripting and more. Heard about Morgan? This sci-fi, AI based movie released last September had something in common with the movie theme itself. The movie trailer, although finalized by human editors, was suggested by AI using IBM Watson. Here, Watson was trained to learn from trailers of a similar theme and select critical scenes from the movie, which were later stitched together to create the final trailer. It is a great example of how AI can help select scenes, insert visual effects and build a convincing trailer that looks human edited. Below are some more areas where AI is making an entry:

  • Structural and semantic analysis of video content to help create short form video snippets for news, video segmentation as well as special interest content for fan engagement.
  • Script proofing, content cleanup, scene sequencing and taking first pass at film editing. Given a script context, creating multiple scene performances with rating scores for selection
  • Video skimming in slow moving content capture to create informative only content

Recently IBM partnered with the U.S. Open to provide sports highlights by recognizing important match moments. AI’s ability for quick content identification and aggregation of related content in sports and news can completely transform the business of sports and news coverage as it exists today.

10. Content Production

Structural and object based analysis of content has opened new avenues where AI is helping with actual content development. Learning from minute details of how an on-screen character behaves, walks and talks, and from all possible moods of facial expression, AI systems can create virtual performances. It is amazing to see how a life like performance can be created – check this clipping of a speech by US President Obama which he never gave, leaving little to the imagination. AI is still taking baby steps in the world of content creation, and there are many areas where it can benefit the production processes:

  • Creating virtual human characters (digital only avatars) by learning from the features, expressions, persona and styles of popular celebrities
  • Automating computer graphics work in animation movies, replacing the human intensive work of character animation with far greater efficiency


Artificial Intelligence and Machine Learning have the potential to impact anything and everything which is based on a set of rules, and where a pattern can be established and learned by machines. AI and ML technology has its own unexplored territory and hurdles but is positioned for greater goals and holds the promise of unparalleled capabilities. With financial services, high tech and telecom rapidly adopting AI, the Media and Entertainment Industry is not far behind in automating its workflow processes.

It is the human talent of creativity, ingenuity and imagination which has always had a special place in the media industry, but it seems not all will remain the same as AI powered automation takes over...

About - Digital Media Technology Consultant. I have a passion for TV technology, digital convergence and the changing face of the Media and Entertainment industry.