Automatic summarization would be a particularly useful technology for Salesforce, which produces a variety of customer-service focused products. The company notes that the resulting summaries could be used by sales or customer service representatives to quickly digest emails and information, which would allow them to spend more time focused on their customers.
To that end, Salesforce is turning to machine learning to find ways to summarize longer blocks of texts, which it could eventually incorporate into its products. The company announced that it made two breakthroughs in natural language processing, introducing a new, “contextual word generation model,” and a “new way of training summarization models.” Together, the two advances allow researchers to automatically create summaries of longer texts that are accurate and readable. The company acquired a deep learning outfit MetaMind last year, which was behind the research.
The researchers explain that automatic text summarization works in two ways: extraction or abstraction. With extraction, computer can draw from preexisting wording in a text, but it’s not very flexible. Abstraction allows the computer to introduce new words, but the system has to understand the original article enough to be able to introduce the right words.
This is where deep learning neural networks come into play. They process numerous examples of sentences and words to spit out new representations of each phrase, which allows the system to interpret texts and introduce its own words. The researchers let their model to look back at the text it’s working off of for additional context. It also looks back at earlier generated examples, to ensure that it’s not repeating itself.
The results are pretty astonishing: the researchers provided several examples, showing the original article, a human-generated summary, and a summary generated by their own model, and in each case, the summaries are considerably shorter than the original text, but contain the essentials in a readable form. Despite their advances, there’s still considerable work to be done in this field: MIT Technology Review spoke with Kristian Hammond, a professor at Northwestern University, who noted that the advance “shows the limits of relying purely on statistical machine learning,” but that it’s a step in the right direction.
According to theverge