Industry Talks

Session 1: Domain-Driven Solutions, 9:05 – 10:30

1. Open Challenges in The Application of Dense Retrieval for Case Law Search

Pan Du, Hawre Hosseini, George Sanchez and Filippo Pompili (Thomson Reuters)

Abstract

Dense retrieval (DR) has gained remarkable success in general-purpose search engines and on public datasets in the past few years. Recently, some approaches have been proposed to explore its effectiveness in case law search tasks. However, most of them leverage small, curated datasets, which is different from real-world search applications for legal professionals. We endeavor to build a DR-based search engine for case law search on a large scale. Several attributes differentiate such domain-specific platforms from general-purpose search engines, raising unique challenges that have not been thoroughly studied in the literature.

First, in order to generate sufficient training data for these models, we draw on usage data from application logs. Our search application provides richer interactions than are typically available in web search datasets. Users not only click on documents but print, folder, highlight, etc., all within the app. Unfortunately, usage is also relatively sparse (compared to web-scale consumer datasets) due to the limited number of professional users as well as the nature of their work. This raises questions about how to explore the space of user interactions and derive patterns meaningfully correlated with relevance. At the same time, we must consider the nature of user behavior in a setting characterized by complex search intent relative to underspecified queries, expert users, multiple dimensions of relevance, and high penalties for low recall. These factors inform our interpretation of usage data, while suggesting various forms of bias that may need to be considered. Our presentation will explore these challenges, among others encountered in the effort to create high-quality silver data from a professional search application.
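
As a concrete illustration of turning such interactions into training signal, the sketch below maps the interaction types mentioned above (click, print, folder, highlight) to graded "silver" relevance labels. The weights and the aggregation rule are invented for illustration; they are not Thomson Reuters' actual scheme, and any real mapping would need to be validated against the biases discussed above.

    # Illustrative only: interaction types come from the text, weights are assumed.
    INTERACTION_WEIGHT = {"click": 1.0, "print": 3.0, "folder": 4.0, "highlight": 5.0}

    def silver_label(interactions):
        """Grade one (query, document) pair from its list of interaction events."""
        # Take the strongest signal observed; unknown event types contribute nothing.
        return max((INTERACTION_WEIGHT.get(e, 0.0) for e in interactions), default=0.0)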

We will also discuss challenges related to adapting DR models to our domain. One challenge is the tendency for legal documents to be long and complex, with a rich information structure. Due to the input length restriction of the neural language models used by DR, documents must be truncated to a certain length or segmented into multiple passages before encoding. Truncation would lead to considerable information loss that is detrimental to the search system's performance, especially on recall. So instead, we consider various approaches to segmentation, which means addressing the non-trivial gap between passage and document relevance.
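
One common way to bridge the passage/document gap is to score passages independently and aggregate. The sketch below segments a document into overlapping word windows, scores each against the query with a bi-encoder, and max-pools the passage scores into a document score. The model name, window and stride are placeholders, and this is a generic recipe rather than our production approach.

    # A minimal segmentation-and-aggregation sketch; not the production system.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

    def segment(doc, window=200, stride=100):
        """Split a long document into overlapping word windows."""
        words = doc.split()
        return [" ".join(words[i:i + window])
                for i in range(0, max(len(words) - window, 0) + 1, stride)]

    def doc_score(query, doc):
        passages = segment(doc)
        q_emb = model.encode(query, convert_to_tensor=True)
        p_embs = model.encode(passages, convert_to_tensor=True)
        # A document is scored as its best-matching passage (max-pooling),
        # one simple bridge between passage and document relevance.
        return util.cos_sim(q_emb, p_embs).max().item()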

Our overall aim is to present a number of open challenges, not all of which are specific to the legal domain, that must be addressed in order to fully leverage the power of state-of-the-art models in a wide variety of settings. We will focus on insights derived from our specific experience interpreting user behavior in a full-featured search application rather than the details of any particular solution. We hope to introduce at least one or two new problems and bring these issues to the community's attention for discussion.

Company profile

Thomson Reuters is a content-driven technology company and one of the world's most trusted providers of answers. Our customers operate in complex arenas that move society forward (law, tax, compliance, government, and media) and face increasing complexity as regulation and technology disrupt every industry. Our team brings together information, innovation and authoritative insight to keep customers up to speed on global developments. We're on a mission to help professionals advance their businesses and gain competitive advantage.

Presenter

Hawre Hosseini is an Applied Research Scientist at Thomson Reuters Labs. He has been doing research and development in Information Retrieval and Natural Language Processing since 2013 on a range of related problems and applications. Prior to joining TR, he was a PhD Candidate working mainly on understanding implicit and underspecified language. Hawre has also served as a co-founder and CRO of the first Kurdish NLP company, AsoSoft.

2. Dense Neural Retrieval for Scientific Documents at Zeta Alpha

Jakub Zavrel, Marzieh Fadaee, Artem Grotov and Rodrigo Nogueira (Zeta Alpha)

Abstract

For the use case of discovery, recommendation and semantic clustering of scientific documents, dense neural retrieval presents a number of distinct advantages, especially for bridging the lexical gap between searcher and documents, and for long, complex discovery queries. However, a well-tuned BM25 baseline has turned out to be hard to beat, and state-of-the-art neural search has also brought many engineering challenges. In this talk we will outline how we use dense retrieval in the Zeta Alpha production system, and describe our journey from a naive BERT-based retrieval system to our current situation, where dense retrieval outperforms classical keyword search. We will also present our latest experiments on using self-supervised learning and data augmentation for tuning dense retrieval representations.
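
As one example of the kind of self-supervised data augmentation involved, the sketch below fine-tunes a bi-encoder on title-to-abstract pseudo-pairs with in-batch negatives, a common recipe for scientific text. The model name and the two toy papers are placeholders; this is not Zeta Alpha's exact training setup.

    # Hedged sketch: self-supervised contrastive tuning with in-batch negatives.
    from sentence_transformers import SentenceTransformer, InputExample, losses
    from torch.utils.data import DataLoader

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder base model
    papers = [  # toy (title, abstract) pairs standing in for a scientific corpus
        ("Dense retrieval for science", "We study neural retrieval of papers."),
        ("BM25 remains a strong baseline", "Lexical matching is hard to beat."),
    ]
    examples = [InputExample(texts=[title, abstract]) for title, abstract in papers]
    loader = DataLoader(examples, shuffle=True, batch_size=2)
    loss = losses.MultipleNegativesRankingLoss(model)  # other pairs act as negatives
    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=0)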

Company Profile

Zeta Alpha is an Amsterdam-based R&D and product engineering lab with a passion for AI technology. Our ambition is to change how AI helps people to make better decisions in their work. Zeta Alpha is building a next generation neural discovery platform that allows our users to find documents by meaning, stay up-to-date on essential information, and organize their knowledge discovery work in teams. We make R&D teams in AI and Data Science more efficient and competitive, support decision-makers, and keep busy knowledge workers up to date with relevant information, insights, and connections. Our platform becomes your research assistant that streamlines how you and your team use and reuse knowledge.     

Zeta Alpha was founded in 2019 by AI technologist and entrepreneur Jakub Zavrel (previously founder and CEO of Textkernel) in close collaboration with Information Retrieval and NLP researchers from the University of Amsterdam. The company is privately funded and is part of the AI community at Amsterdam Science Park. Zeta Alpha currently has 10 employees.

Presenter

Jakub Zavrel is the founder and CEO of Zeta Alpha. He is an experienced AI, Machine Learning and NLP researcher, technologist, and entrepreneur, and a believer in building strong, independent, and long-term sustainable technology companies.

3. Annotating and Indexing Scientific Articles with Rare Diseases

Hosein Azarbonyad, Zubair Afzal, Max Dumoulin, George Tsatsaronis and Rik Iping (Elsevier)

Abstract

In Europe, 30 million people suffer from a rare (or orphan) disease, a disease that occurs in fewer than 1 per 2,000 people. Rare disease patients are entitled to the best possible health care, which makes the efficient organization of the respective clinical care and scientific literature imperative. The European Commission and member states have established a policy based on European Reference Networks specializing in ranges of diseases, which is intended to contribute to the efficient organization of the information around rare diseases. However, important questions, such as which centers of excellence could best treat patients for certain rare diseases, or which are the key research initiatives for the various rare diseases, require deep bibliometric and scientometric analysis that can be based on the efficient annotation and indexing of the respective scientific literature.

The primary challenge is the ability to automatically and efficiently identify which scientific articles and guidelines deal with which rare disease(s). With this work, we present a novel methodology to annotate and index any scientific text with taxonomic concepts that describe rare diseases from the OrphaNet taxonomy (orphadata.org). The technical challenges are several: first, some rare diseases are only rare in a specific part of the population; second, some rare diseases are conceptually very similar, and their differences are difficult to recognize when research around them is discussed in the context of a medical or clinical scientific article; third, the OrphaNet taxonomy, like any taxonomy, might be incomplete in certain areas, and its structure might not be homogeneous in granularity across all parts of the taxonomy; and, fourth, despite the great advances in the areas of Natural Language Processing (NLP) and Information Retrieval, polysemy and synonymy in the surface string appearance of rare diseases in text may still hinder the applicability of any annotation engine.

In this presentation we are going to discuss how Elsevier has used TERMite, a state-of-the-art annotation engine, to address some of these challenges, in combination with advanced NLP and Text Mining techniques. The core of our methodology relies on using our TERMite text analysis engine to create a vocabulary based on Orphanet. In turn, this vocabulary is used as a query in Scopus, Elsevier's scientific literature database that includes all relevant research papers worldwide and links out to many other document types as well. These datasets, created for each rare disease, can be the basis for bibliometric analyses using the wealth of metadata and reference linking that Scopus provides. As part of the presentation, we are going to demonstrate the results of such an analysis for the European landscape in rare diseases research, and we are going to highlight some directions for future research work that may address the open challenges.
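
To make the general idea tangible, the sketch below shows a deliberately simplified dictionary-style annotator: a synonym vocabulary per rare disease is matched against article text. The Orphanet codes and synonyms shown are illustrative, and this is not TERMite's API, which handles the polysemy, synonymy and context issues discussed above far beyond naive string matching.

    # Toy dictionary annotator; real systems must handle polysemy and context.
    import re

    vocabulary = {  # hypothetical excerpt of an Orphanet-derived vocabulary
        "ORPHA:586": ["cystic fibrosis", "mucoviscidosis"],
        "ORPHA:98896": ["duchenne muscular dystrophy", "dmd"],
    }

    def annotate(text):
        """Return the codes whose synonyms occur in the text (case-insensitive)."""
        lowered = text.lower()
        return [code for code, synonyms in vocabulary.items()
                if any(re.search(r"\b" + re.escape(s) + r"\b", lowered)
                       for s in synonyms)]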

Company Profile

ELSEVIER is a global information analytics business that helps institutions and professionals advance science, engineering, healthcare, and the life sciences, and improve research and professional performance. Based in the Netherlands, Elsevier is the world's leading publisher of science, engineering and health information and serves more than 30 million scientists, students, and health and information professionals worldwide. It is part of the RELX Group, known until 2015 as Reed Elsevier. Website: https://www.elsevier.com/

Presenters

Dr. Hosein Azarbonyad, Senior Machine Learning Scientist, Elsevier.  Hosein is currently working on developing and improving several NLP-based applications within Elsevier, including the ScienceDirect Topic Pages pipeline, submission systems, and indexing Scopus with the OrphaNet rare diseases taxonomy. Hosein has a PhD in AI from the University of Amsterdam. His PhD research was focused on NLP, exploratory search, and information retrieval. He holds an MSc degree in Information Science from the University of Tehran and a BSc degree in Computer Science from the University of Tabriz. Hosein has served as an organizing committee member in NN4IR@WSDM2018, ICTIR2017, and DIR2015.

Dr. Zubair Afzal, Director Data Science, Elsevier.  Zubair has a PhD in clinical text mining from Erasmus University Rotterdam and a Professional Doctorate in Engineering from Eindhoven University of Technology. He has an extensive research and development portfolio and is currently responsible for developing novel data science solutions for several products.

Max Dumoulin, VP Institutional offerings, Elsevier.  Max is responsible for development of new services and offerings in collaboration with Elsevier’s academic customers. He works within the Research Products group comprising products and platforms such as ScienceDirect, Scopus, SciVal and Pure. Max has extensive experience in product, marketing, sales and communications. He holds a Master of Science degree in Business Administration from Erasmus University Rotterdam and a Bachelor of Arts degree in Philosophy from the University of Amsterdam. He is based in Amsterdam. 

Dr. George Tsatsaronis, VP Data Science, Research Content Operations, Elsevier.  George Tsatsaronis is Vice President of Data Science at the Operations division of Elsevier, in Amsterdam, The Netherlands. Prior to joining Elsevier in 2016 he worked in academia for more than 10 years, doing research and teaching in the fields of machine learning, natural language processing and bioinformatics in universities in Greece, Norway and Germany. He has published more than 60 scientific articles in high impact peer review journals and conference proceedings in various areas of Artificial Intelligence, primarily natural language processing and text mining.

4. IR in HR: Matching, ranking, and representing jobs and job seekers

David Graus (Randstad)

Abstract

At Randstad, the global leader in the HR services industry, searching and matching is at the heart of what we do. Founded in 1960, we know from our heritage that real connections are not made from data and algorithms alone: they require human involvement. Last year, we helped more than two million job seekers find a meaningful job by combining industry-scale recommender and search systems with our distinct human touch. While many opportunities exist, employing AI in recruitment and HR is considered high-risk by the European Commission's proposed regulatory framework on AI, which will bring additional requirements, obligations, and constraints.

In this talk, I will explain some of the characteristics of, challenges, and opportunities in the HR domain from an IR perspective. I will share some of our own work in recommendations, algorithmic matching, algorithmic bias and knowledge graphs, and highlight some of the ongoing research in this domain. 

Company Profile

Randstad is the global leader in the HR services industry. By serving as a trusted human partner in today's technology-driven world of talent, we help people secure rewarding jobs and stay relevant in the ever-changing world of work. Randstad was founded in 1960 and is headquartered in Diemen, the Netherlands.

Presenter

David Graus is lead data scientist at Randstad Groep Nederland, where he leads the data science chapter with over a dozen data scientists who work on a wide variety of projects including recommender systems, information extraction from resume and vacancy data, and knowledge graphs. Prior to joining Randstad, David worked on news recommendations at FD Mediagroep, a Dutch media company that owns a newspaper and radio station. He obtained his PhD in Information Retrieval in 2017 at the University of Amsterdam, where he worked on semantic search and computational methods for automated understanding of large-scale textual digital traces under supervision of prof. dr. Maarten de Rijke. 

5. Neural Information Retrieval for Educational Resources

Carsten Schnober, Gerben de Vries and Thijs Westerveld (Wizenoze)

Abstract

The essence of our product is to provide high-quality, up-to-date educational content that matches the users’ current information requirements, which are defined by customized curricula. These custom curricula define hierarchically organized topic nodes per age group and subject down to the level of specific learning objectives.

For each node in each curriculum, selected search results from a large index containing both public and proprietary education sources are presented to the users. The output of the search engine is manually curated by experts to ensure the quality of the results. Optimizing the underlying search algorithms is key to minimizing the curation effort, and therefore to making the product scalable without compromising quality.

In this presentation, we show how we have developed a vector-based search engine to first support and eventually replace our current, keyword-based search algorithm. This includes hands-on descriptions of how we

  • fine-tuned a state-of-the-art neural information retrieval (NIR) model for our use case using Siamese BERT-Networks (Reimers & Gurevych, 2019),
  • evaluated the model performance in a practical setting,
  • performed in-depth analyses for understanding the impact of the model on the level of individual search requests,
  • implemented and deployed the model with limited resources.

Our findings show that NIR can outperform established, BM25-based information retrieval approaches in practical, real-world use cases with costs that are affordable for smaller organizations. We also compare different NIR models and investigate the impact of fine-tuning approaches on the model performance.

We will share details on the inner workings of our system, our data, and our use case. We show how we have been able to train an effective NIR model with limited resources, improving our search results significantly. We have performed further analyses on the query level, given that “embarrassing mistakes” on individual samples can be extraordinarily harmful in terms of user trust in an educational setting.

We demonstrate how our evaluation strategy involves domain experts in a systematic way. Qualitative analyses additionally make the improvements visible and build trust among users and stakeholders. Supporting the aggregate evaluation metrics with domain expertise and per-query analyses confirms that academic state-of-the-art NIR approaches are applicable in industry use cases and real-world settings.

Deploying models that are based on large, Transformer-based language models (Devlin et al., 2019) with millions or even billions of parameters is another challenge in settings that have high requirements in terms of speed and cost effectiveness. We show how we first deployed our model as a 2nd-stage reranker, gaining further confirmation of its quality and effectiveness, and lay out the route towards a first-stage vector-based retrieval system. For now, the new model has been deployed as an optional method running in parallel to our existing keyword-based retrieval system, which again requires finding an acceptable trade-off between the cost of running two models in parallel and ensuring a smooth transition between the two approaches.
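
A 2nd-stage reranker of this kind can be sketched in a few lines: the existing keyword engine supplies candidates, and a cross-encoder re-scores only the top k of them, which keeps the Transformer's cost bounded. The model name and cutoff below are placeholders, not our production configuration.

    # Minimal 2nd-stage reranking sketch over first-stage (keyword) candidates.
    from sentence_transformers import CrossEncoder

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # placeholder

    def rerank(query, candidates, k=50):
        top = candidates[:k]  # bound the expensive cross-encoder to k candidates
        scores = reranker.predict([(query, doc) for doc in top])
        order = sorted(range(len(top)), key=lambda i: -scores[i])
        return [top[i] for i in order]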

For the different transition stages, we have adopted suitable deployment methods. We will show the audience how we calculated speed and cost for different scenarios, and how we chose and implemented an optimal approach per situation.

In conclusion, we aim for the audience to learn about state-of-the-art research in Neural Information Retrieval, and its application to the domain of educational resources. Practitioners will gain insights on engineering, deployment, and evaluation, all supported by scientifically validated methodology.

Company Profile

Wizenoze has been a provider of EdTech solutions since 2013. The core product of the company is an API which provides curated educational content matching customized and generic curricula. The underlying application and the content are developed and maintained by a team comprising domain-expert curators, NLP and IR researchers, and software developers. The company is based in Amsterdam, Netherlands, and has offices in the UK and in India. Accordingly, the products of the company are available in Dutch and English.

Presenter

Carsten Schnober is a research engineer with a focus on Natural Language Processing (NLP), having an academic background and several years of industry experience as a researcher and software engineer. He holds a bachelor’s degree in Computational Linguistics from the University of Heidelberg, Germany, and an MSc in Speech & Language Processing from the University of Edinburgh. After his graduation in 2010, Carsten was an academic researcher at the Institute for the German Language in Mannheim, Germany, and at the Ubiquitous Knowledge Processing (UKP) Lab at the TU Darmstadt, Germany. Carsten has worked in the R&D labs of different companies as an NLP research engineer. His focus lies on making academic state-of-the-art algorithms from fields including NLP, machine learning, and information retrieval (IR) usable in real-world software applications. This involves research, customization, implementation, and deployment in large-scale production settings.

6. Learning from Controlled Sources 

Onno Zoeter (Booking.com)

Abstract

The classic supervised learning problem that is taught in machine learning courses and is the subject of many machine learning competitions is often too narrow to reflect the problems that we face in practice. Historical datasets typically reflect a combination of a source of randomness (for example, customers making browsing and buying decisions) and a controlling mechanism such as a ranker or highlighting heuristics (badges, promotions, etc.). Or there might be a selection mechanism (such as the decision to not accept transactions with high fraud risk) that influences the training data. A straightforward regression approach would not be able to disentangle the influence of the controller from the phenomenon under study. As a result, it risks making incorrect predictions as the controller is changed.

In practice, however, such problems are typically treated as a classic regression problem in a first iteration, and attempts to identify and correct these complications come as afterthoughts or are not undertaken at all. Ideally, there is a rigorous and flexible formalism that captures the correct framing of the problem from the very start, accompanied by a set of practical algorithms that work well in practice for each of the identified cases.

This research objective is the main goal of the Mercury Machine Learning Lab, a collaboration between the University of Amsterdam, Delft University of Technology and Booking.com. It brings together the fields of information retrieval, causality and reinforcement learning, where the topic is studied under the names of off-line evaluation, transferability and s-recoverability, and off-policy learning, respectively. This presentation will sketch the problem and discuss early results.
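
To give a flavor of the off-policy side of this problem, the sketch below shows the classic inverse-propensity-scoring (IPS) estimator: logged outcomes collected under one controller are reweighted to estimate the value of a new one. The log format and policy interface are illustrative, not Booking.com's systems, and IPS is only the simplest member of this family.

    # Minimal IPS sketch for off-policy evaluation under an assumed log format.
    def ips_estimate(logs, new_policy):
        """logs: list of (context, action, reward, logging_prob) tuples.
        new_policy(context, action) returns the new policy's action probability."""
        total = 0.0
        for context, action, reward, logging_prob in logs:
            # Reweight each logged reward by how much more (or less) likely
            # the new policy is to take the logged action than the old one was.
            total += reward * new_policy(context, action) / logging_prob
        return total / len(logs)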

Company Profile

Part of Booking Holdings Inc. (NASDAQ: BKNG), Booking.com’s mission is to make it easier for everyone to experience the world whenever it’s safe to do so again. By investing in the technology that helps take the friction out of travel, Booking.com seamlessly connects millions of travelers with memorable experiences, a range of transportation options and incredible places to stay – from homes to hotels and much more. As one of the world’s largest travel marketplaces for both established brands and entrepreneurs of all sizes, Booking.com enables properties all over the world to reach a global audience and grow their businesses. Booking.com is available in 44 languages and offers more than 28 million total reported accommodation listings, including more than 6.3 million listings of homes, apartments and other unique places to stay. No matter where you want to go or what you want to do, Booking.com makes it easy and backs it all up with 24/7 customer support.

Presenter

Dr. Onno Zoeter is principal data scientist at Booking.com and leads Booking's AI Research Lab (B.AIR). In B.AIR, Booking.com's machine learning scientists and engineers work together with top academic groups on fundamental problems in mission-critical systems. It is B.AIR's mission to make sure that all AI groups within Booking.com are aware of the state of the art and can apply breakthroughs early and reliably.

Session 2: Media, 11:00 – 12:00

1. Finding the Right Audio Content for You

Henrik Lindström, Daniel Doro and Jussi Karlgren (Spotify)

Abstract

Search is crucial to the Spotify experience. It acts as the entry point to Spotify's expansive audio catalog for our wide and varied audience, allowing users to find specific pieces of content with low effort as well as explore and discover new music and podcasts. There are numerous requirements on Spotify's search service that contrast with search in other application domains, both due to the data involved and the use cases we engage with, and this presentation will give an overview of what we have learned over the past years.

We have in recent years moved from helping our listeners find music to helping them find podcasts to listen to and enjoy. This entails a different approach to analyzing the catalog our creators have provided for us and the use cases our listeners engage in. Podcasts require us to reach into the content of the material in ways which are different from how we process music. Music search can often be resolved through metadata about items and with a similarity metric over the items; podcast material, by contrast, may be relevant to its audience by virtue of what is discussed in the item itself. This necessitates content analysis mechanisms, e.g. using speech and language technology, an appropriate semantic representation, and subsequent technology equipped to handle that richer representation. To encourage research in this direction, we have released the 100K English-Language Podcast Dataset, which has been well received by researchers in information access.

With hundreds of millions of users, we have hundreds of millions of different tastes to serve. Over the years we have invested in personalizing the search result. At the same time we want to help users discover new great content, and help creators build an audience. This means that we need to strike a balance between optimizing for what you already know about and allowing you to find new things.

Having such a large user base we enjoy the benefit of having vast amounts of data to evaluate and train our systems on. We will give an overview of how we’ve developed search satisfaction metrics, and how we leverage these to continuously improve the quality and impact of Spotify search.

We conclude by giving some examples of open questions that we invite the audience to work on in their research.

Company Profile 

Spotify is the world's most popular audio streaming service, providing hundreds of millions of listeners and millions of creators with a service for streaming music and podcasts over the Internet. Spotify's catalog contains over 82 million tracks and more than 3.6 million podcast titles, and serves 406 million monthly active listeners, including 180 million subscribers, across 184 markets. Spotify is based in Stockholm with a number of offices all over the world.

Presenter

Henrik Lindström is product lead for Search at Spotify. He has 15+ years of experience working with search systems, both on the engineering and the product side. Throughout his career Henrik has been an advocate for applying Machine Learning to solve real user needs. Prior to Spotify, Henrik worked at the National Library of Sweden designing and building national research and catalog systems. Henrik is also an experienced public speaker with keynotes given at both internal and external conferences. Henrik holds an MSc degree in Computer Science from the Royal Institute of Technology in Stockholm, Sweden.

2. Scaling Cross-Domain Content-Based Image Retrieval for E-commerce Snap and Search Application 

Isaac Chung, Minh Tran and Eran Nussinovitch (Clarifai)

Abstract

E-commerce companies are increasingly adopting Computer Vision based technologies to reduce shopper bounce rates and offer a more appealing shopping experience. Visual search recommendations in their mobile and web applications improve customer experience by reducing the time and effort needed for product searches. These efforts aim to grow the number of conversion opportunities, which would lead to increased revenue.

We achieve this using a combination of visual search and classification capabilities, working in tandem with a large e-commerce client. This allows users to use their own photos to search for products via the client's mobile app. This content-based image retrieval task presents a couple of problems. First, the search pool contains 2M+ images which span 3000+ products and 30+ top-level categories (TLC). These categories were not designed with visual search in mind, but rather with traditional shopping: e.g., socks can show up under the men's or women's top-level categories, and shelves could be in a bedroom or a garage. With a product catalogue this vast, seemingly similar-looking items with functionally different uses can be confused with one another without the appropriate, overarching context. Applying brute force search algorithms at this scale fails if the dimensions in the latent space are not carefully tuned. Second, the search pool images come from a different domain than the user-generated (query) images: those to be retrieved are often studio-taken photos with a white background, or photos of products staged in their intended environment, while the users' query images are smartphone photos, which can have noisy backgrounds as well as large variations in orientation and lighting. This introduces complexities in comparing between image representations.

Our approach addresses such problems by introducing an intermediate stage between the feature extractor and the approximate nearest neighbors (ANN) algorithm in the image retrieval system. This stage leverages the granular product hierarchy to perform a cascade-style search. The search pool is partitioned by TLC. A multi-class classifier, a convolutional neural network (CNN) trained on images from the same domain as the query data with their respective TLCs as labels, takes query images as input and outputs TLCs as predictions. Search is then performed in the search pool partition of the top prediction. With this, we are able to improve image retrieval metrics by an average of 69.7% compared to our baseline method, while limiting the overall latency increase to only 13%.
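
The cascade can be pictured as follows: a classifier routes the query embedding to a top-level category, and ANN search then runs only inside that category's partition. The index type, the names, and the assumed predict_tlc classifier below are illustrative, not Clarifai's implementation.

    # Hedged sketch of TLC-partitioned ANN search (cascade-style retrieval).
    import numpy as np
    import faiss

    partition_index = {}  # TLC -> ANN index over that partition's embeddings

    def build_partition(tlc, embeddings):
        index = faiss.IndexHNSWFlat(embeddings.shape[1], 32)  # HNSW graph index
        index.add(embeddings.astype("float32"))
        partition_index[tlc] = index

    def search(query_emb, predict_tlc, k=10):
        tlc = predict_tlc(query_emb)  # CNN classifier, assumed given
        index = partition_index[tlc]  # search only the predicted TLC's pool
        dists, ids = index.search(query_emb.reshape(1, -1).astype("float32"), k)
        return tlc, ids[0], dists[0]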

Company Profile

Clarifai is a leading provider of artificial intelligence for unstructured data. We help organizations transform their images, video, text, and audio data into actionable insights at scale by providing an AI ecosystem for developers, data scientists, and no-code operators. Clarifai supports the full AI development lifecycle, including dataset preparation, model training and deployment. Founded in 2013, Clarifai continues to grow with employees remotely based throughout the United States and Estonia.

Presenter

Isaac Chung joined Clarifai in the Fall of 2020 as a Machine Learning Engineer. He focuses on customer deliveries and professional services on commercial projects, especially on visual search applications. He holds a B.A.Sc in Engineering Science (Aerospace) from the University of Toronto. Before pursuing his M.A.Sc in Machine Learning, also from the University of Toronto, he spent 2 years in the aerospace industry at Safran Landing Systems.

Minh Tran joined Clarifai in 2017 and currently works as a Research Engineer. He focuses on customer deliveries and professional services on both commercial and public sector projects. Before Clarifai, he handled data curation at Twitter for two years.

3. Automated Fact Checking at Factiverse 

Vinay Setty (Factiverse)

Abstract

Information today is easier to manipulate than ever before, and algorithms are often used to accelerate the spread of misinformation. The available solution, manual fact-checking and rigorous research, requires a lot of resources. There are very few AI-driven tools to verify information, research and quickly understand the context of a claim. Journalists have to constantly balance being thorough and being quick to publish. That is what Factiverse aims to solve. We equip journalists and content creators with patented AI-driven solutions that help them do their best work faster and prevent the spread of misinformation.

In this talk I will discuss the typical pipeline for an automated fact-checking system at Factiverse. The pipeline consists of three steps: (1) detecting check-worthy claims, (2) retrieving the most relevant snippets for the claim and (3) predicting the veracity of the claim. I will describe the use of state-of-the-art deep neural architectures developed at Factiverse as part of this pipeline. I will also present results using several benchmarks from political debates and manual fact checking websites such as Politifact and Snopes.
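
Structurally, the pipeline composes as in the sketch below. Every component here is a toy stand-in (sentence splitting, a digit heuristic, keyword overlap), not Factiverse's patented models; it only illustrates how the three stages fit together.

    # Schematic pipeline with toy components; real models replace each function.
    def detect_claims(text):
        return [s.strip() for s in text.split(".") if s.strip()]

    def check_worthy(claim):
        return any(ch.isdigit() for ch in claim)  # toy proxy for a claim classifier

    def retrieve_evidence(claim, corpus):
        words = set(claim.lower().split())
        # Toy retrieval: pick the document with the largest keyword overlap.
        return max(corpus, key=lambda doc: len(words & set(doc.lower().split())))

    def predict_veracity(claim, evidence):
        return "needs review"  # a real system predicts supported/refuted/etc.

    def fact_check(text, corpus):
        return [(c, predict_veracity(c, retrieve_evidence(c, corpus)))
                for c in detect_claims(text) if check_worthy(c)]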

The talk is highly relevant for the ECIR participants since the technology at Factiverse includes retrieval and text classification techniques.

Company Profile

Factiverse AS was founded in December 2019 to address the challenges related to misinformation. The vision of Factiverse is to automate fake news detection using cutting-edge AI-based technology. The name Factiverse comes from our vision of a "Universe where facts matter!". Factiverse's core technology builds on long-time research by Associate Professor Vinay Setty at the University of Stavanger (UiS). In 2020, the founding team was expanded to include Maria Amelie, a well-known journalist, author and business developer, in the role of CEO. We have won several innovation awards and are part of Media City Bergen and the prestigious AI collab at the LSE and BBC News Lab.

More information can be found about Factiverse at https://www.factiverse.no.

Presenter

Vinay Setty is the CTO of Factiverse AS and an Associate Professor at the Department of Electrical Engineering and Computer Science at the University of Stavanger. Before that he was an Assistant Professor at Aalborg University in Denmark and a Postdoctoral Researcher at the Max Planck Institute for Informatics. Setty got his PhD from the University of Oslo, Norway. Setty's recent research areas mainly include information retrieval and text and graph mining using machine learning techniques. His text mining work deals with unstructured text, specifically news documents, for tasks such as fake news detection, fact checking, news ranking and news recommendation. The research conducted at UiS related to automated fact checking was the basis for founding Factiverse AS towards commercializing the research. He also actively publishes in the area of data mining and Information Retrieval, at conferences such as TheWebConf, SIGIR, VLDB, CIKM and WSDM.

4. Searching news media for repeated claims 

David Corney (Full Fact)

Abstract

After fact checkers have researched a misleading claim and published a fact check article, we monitor the media to find any repeats of that claim using a combination of IR, NLP and ML techniques. In this talk, I'll describe the process of fact checking and how technology can help, and present the various tools we've developed, with a focus on claim matching. I'll cover the components of the claim-matching tool (including BM25 and BERT) and how we're now using crowd-sourcing to expand our training data, and include examples of the impact our tools have had as well as our plans for the future. Instead of just presenting the final tool, I'll also discuss the process which led us there and some of the research dead-ends we found along the way.
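
The combination of BM25 and BERT mentioned above can be sketched as a two-stage matcher: BM25 narrows the fact-check archive to lexical candidates, and a bi-encoder re-scores them semantically. The toy archive, model name, and cutoffs are placeholders, not Full Fact's production pipeline.

    # Hedged two-stage claim-matching sketch (lexical BM25, then semantic BERT).
    from rank_bm25 import BM25Okapi
    from sentence_transformers import SentenceTransformer, util

    fact_checks = [  # toy archive of previously fact-checked claims
        "The claim that the policy costs 350 million a week is unsupported.",
        "There is no evidence that the vaccine alters DNA.",
    ]
    bm25 = BM25Okapi([fc.lower().split() for fc in fact_checks])
    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model
    fc_embs = encoder.encode(fact_checks, convert_to_tensor=True)

    def match(new_claim, k=1):
        lexical = bm25.get_scores(new_claim.lower().split())  # stage 1: BM25
        candidates = sorted(range(len(fact_checks)), key=lambda i: -lexical[i])[:10]
        q = encoder.encode(new_claim, convert_to_tensor=True)  # stage 2: BERT
        sims = util.cos_sim(q, fc_embs[candidates])[0]
        best = sims.argsort(descending=True)[:k].tolist()
        return [fact_checks[candidates[i]] for i in best]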

I believe this will be of interest to the ECIR Industry Day audience as it describes a real problem which requires a practical solution, how we researched and developed a novel tool, and how that tool is now used on a daily basis.

Company Profile

Bad information ruins lives. Full Fact is a team of independent fact checkers and campaigners who find, expose and counter the harm it does. We believe that anyone making serious claims in public should be prepared to get their facts right, back up what they say with evidence and correct their mistakes. Since 2015, we have been developing technology to help increase the speed, scale and impact of our and others' fact checking. Our tools collect and monitor the media; identify and semantically enrich claims; match new claims to previously fact checked claims; and automatically verify statistical claims.

Presenter

Dr David Corney is a data scientist specializing in natural language processing. For the last three years, he’s helped to bring AI into Full Fact’s tools to better support fact checkers and other colleagues. David has previously worked in academia (UCL, City) and for tech startups (Signal AI), where he developed tools to analyze news articles, tweets and research papers. He enjoys applying NLP and AI to real-world problems, and has straddled the academia-industry interface from both sides.

Boaster Session for Posters, 12:00 – 12:30

1. Dense Retrieval with Apache Solr Neural Search 

Alessandro Benedetti (Sease)

Abstract

Neural Search is an industry derivation from the academic field of Neural Information Retrieval. More and more frequently, we hear about how Artificial Intelligence (AI) permeates every aspect of our lives, and this also includes software engineering and Information Retrieval. In particular, the advent of Deep Learning introduced the use of deep neural networks to solve complex problems that could not be solved simply by an algorithm. For example, Deep Learning can be used to produce a vector representation of both the query and the documents in a corpus of information. Search, in general, comprises four primary steps:

  • generate a representation of the query that describes the information need
  • generate a representation of the document that captures the information contained in it
  • match the query and the document representations from the corpus of information
  • assign a score to each matched document in order to establish a meaningful document ranking by relevance in the results

With the Neural Search module, Apache Solr is introducing support for neural network based techniques that can improve these four aspects of search. This talk explores the first official contribution of Neural Search capabilities arriving in Apache Solr 9.1 in the first quarter of 2022: Approximate K-Nearest Neighbor Vector Search for matching and ranking.

You will learn:

  • how Approximate Nearest Neighbor (ANN) approaches work, with a focus on Hierarchical Navigable Small World Graph (HNSW)
  • how the Apache Lucene implementation works
  • how the Apache Solr implementation works, with the new field type and query parser introduced
  • how to run KNN queries and how to use them to rerank a first-stage pass (a minimal query sketch follows this list)
  • how the performance benchmarks compare with classic BM25 lexical retrieval and ranking
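
The sketch below shows roughly what running such a KNN query looks like over Solr's HTTP API, using the {!knn} query parser that ships with the neural search module. The collection name "ecir", the field name "vector" (assumed to be indexed as a DenseVectorField), and the vector values are all hypothetical.

    # Hedged sketch of a Solr KNN query; collection/field names are assumptions.
    import requests

    query_vector = [0.12, 0.43, -0.18, 0.91]  # would come from a neural encoder
    params = {
        "q": "{!knn f=vector topK=10}" + str(query_vector),
        "fl": "id,score",
    }
    response = requests.get("http://localhost:8983/solr/ecir/select", params=params)
    print(response.json()["response"]["docs"])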

Join us as we explore this new Apache Solr feature!

Reasons why your experiences should be interesting to the ECIR audience: I think it would be really beneficial for both an academic and industrial audience to explore this new contribution to Apache Solr: neural search is becoming increasingly popular over time, and having it available in Apache Solr is a solid enabler for real-world usage. In addition, drawing researchers' and practitioners' attention to the open-source community can facilitate future contributions and potentially help in reducing the gap between academia and industry (in the neural domain or not).

Company Profile

Sease's mission is to make research in Information Retrieval more accessible to an industry audience, transforming the best research principles, ideas and implementations from academia into real-world products. Firmly believing open source is the way, Sease puts strong effort into contributing code back to the community, supporting public mailing lists and evangelising R&D at world-class conferences. The focus of the company is to provide R&D project guidance and implementation, search consulting services, training and search solutions using open source software such as Apache Lucene/Solr and Elasticsearch.

Presenter

Alessandro Benedetti is director and R&D Software Engineer at Sease Ltd. His focus is on information retrieval, information extraction, natural language processing, and machine learning. At Sease, Alessandro works on Search/Machine Learning R&D and consultancies. When he isn't on client projects, he is actively contributing to the open-source community and presenting the applications of leading-edge techniques in real-world scenarios at meet-ups and conferences such as ECIR, the Lucene/Solr Revolution, ApacheCon, Haystack, FOSDEM and Open Source Summit.

2. Evaluating Ranking Models in Production: a View on Offline and Online Experiences

Alessandro Benedetti (Sease)

Abstract

Evaluation plays a key role in the field of information retrieval. Researchers and practitioners design and develop ranking models to represent the relationship between an information need expressed by a user (query) and information (search result) from the available resources (corpus). To validate any research paper on ranking innovation, it is fundamental to test the produced models by comparing their outcomes and calculating relevance metrics on a pre-defined ground truth (judgments).

What happens in the industry, where real users interact with the system, business interests affect the concept of relevance, and pre-defined relevance judgments are not available? This talk illustrates how companies in different domains approach the problem and implement offline and online testing/monitoring solutions. For each real-world application, this presentation describes:

  • how it is approached and implemented (A/B testing, interleaving, statistical significance calculations…; a minimal interleaving sketch follows this list)
  • how the implicit/explicit feedback is collected and used to estimate relevance (internal team of experts, users' interactions with the system, revenue/profit signals, sponsored results…)
  • how the experiments are designed and planned (how many models to compare at a time, which models to compare in the same test, how to test mobile/desktop/tablet platforms…)
  • what Open Source technologies are used to facilitate the tasks
  • most common pitfalls and solutions to mitigate them
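
As a taste of one of the online methods named above, the sketch below implements team-draft interleaving: two rankers' lists are merged, each shown document is credited to a team, and clicks on team A versus team B documents decide the winner. It is a textbook version, not any particular client's implementation.

    # Minimal team-draft interleaving sketch for online ranker comparison.
    import random

    def team_draft_interleave(rank_a, rank_b, k=10):
        rank_a, rank_b = list(rank_a), list(rank_b)  # do not mutate the inputs
        result, team_a, team_b = [], set(), set()
        while len(result) < k and (rank_a or rank_b):
            # The team with fewer picks goes next; ties are broken by coin flip.
            pick_a = len(team_a) < len(team_b) or (
                len(team_a) == len(team_b) and random.random() < 0.5)
            if (pick_a and rank_a) or not rank_b:
                doc, team = rank_a.pop(0), team_a
            else:
                doc, team = rank_b.pop(0), team_b
            if doc not in result:  # skip documents the other team already placed
                result.append(doc)
                team.add(doc)
        return result, team_a, team_b

At serving time, the interleaved list is shown to the user; a session counts as a win for ranker A if its team collects more clicks than ranker B's team.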

Reasons why your experiences should be interesting to the ECIR audience: I think it would be really beneficial for both an academic and industrial audience to explore how ranking model evaluation happens in real-world scenarios: which methods are most used and how they are implemented. From an academic perspective, it can be very useful to understand what's used in the industry and whether it aligns with expectations and the state of the art. Having an overview of real-world applications and challenges can also provoke new ideas and new research directions to be explored. For the industrial audience, it's going to be very valuable to observe how the problems they likely face every day are approached, mitigated, and solved in similar domains with open-source technologies. Is the best-known paper on online testing also the most used in production, or are simpler approaches implemented most of the time? Does it change between a big e-commerce company and a small bibliographic one? This talk tries to answer these questions and more.

Company Profile

Sease's mission is to make research in Information Retrieval more accessible to an industry audience, transforming the best research principles, ideas and implementations from academia into real-world products. Firmly believing open source is the way, Sease puts strong effort into contributing code back to the community, supporting public mailing lists and evangelising R&D at world-class conferences. The focus of the company is to provide R&D project guidance and implementation, search consulting services, training and search solutions using open source software such as Apache Lucene/Solr and Elasticsearch.

Presenter

Alessandro Benedetti is director and R&D Software Engineer at Sease Ltd. His focus is on information retrieval, information extraction, natural language processing, and machine learning. At Sease, Alessandro works on Search/Machine Learning R&D and consultancies. When he isn't on client projects, he is actively contributing to the open-source community and presenting the applications of leading-edge techniques in real-world scenarios at meet-ups and conferences such as ECIR, the Lucene/Solr Revolution, ApacheCon, Haystack, FOSDEM and Open Source Summit.

3. Recommendations in a Multi-Domain Setting: Adapting for Customization, Scalability and Real-Time Performance 

Emanuel Lacic and Dominik Kowald (Know-Center)

Abstract

Recommender systems have gained tremendous popularity in recent years among industry practitioners. Early recommender systems often considered only user-item interactions, but nowadays many application domains can leverage different contextual sources like textual meta-data, images or implicitly arising graph structures. Furthermore, practitioners who build modern recommender systems need to address scalability and real-time demands when providing recommendations in an online setting, since there is usually a trade-off between accuracy and runtime performance. When put into production, different challenges need to be addressed in order to continuously maintain the stability and health of a recommender system. A distributed architecture guided by design principles such as service isolation, support for data heterogeneity, algorithmic customization and fault tolerance is thus a necessity.

In this talk, we will show how to build a modern recommender system that can serve recommendations in real-time for a diverse set of application domains. We will share the experiences we gained in both research-oriented (e.g., Horizon 2020) and industry-oriented projects on how we build hybrid models based on a microservice architecture. This architecture utilizes popular algorithms from the literature such as Collaborative Filtering and Content-based Filtering, as well as various neural embedding approaches (e.g., Doc2Vec, Autoencoders, etc.). We will further show how we adapt our architecture to calculate relevant recommendations in real-time (i.e., after a recommendation is requested), since in many cases individual requests may target user sessions that are short-lived and context-dependent.
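
The customization aspect can be illustrated with a toy hybrid blend, where each domain supplies its own weights over interchangeable recommenders. The names and scoring functions below are invented for illustration and do not describe Know-Center's actual services.

    # Toy hybrid blend: per-domain weights over pluggable recommender scorers.
    def hybrid_recommend(user, candidates, recommenders, weights):
        """recommenders: {name: score_fn(user, item)}; weights: {name: float}."""
        def blended(item):
            return sum(w * recommenders[name](user, item)
                       for name, w in weights.items())
        return sorted(candidates, key=blended, reverse=True)

    # Hypothetical usage for a job-marketplace domain:
    ranking = hybrid_recommend(
        user="student_42",
        candidates=["job_1", "job_2", "article_7"],
        recommenders={"cf": lambda u, i: 0.7 if i == "job_1" else 0.2,
                      "content": lambda u, i: 0.5},
        weights={"cf": 0.6, "content": 0.4},
    )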

To showcase the applicability of such an approach, we will specifically focus on and present two real-world use-cases, namely providing recommendations for the domains of (i) job marketplaces, and (ii) entrepreneurial start-up founding. For the former, we tackle the problem of finding the right job for university students by guiding the students toward different types of entities that are related to their career, i.e., job postings, company profiles, and career-related articles. Here, for instance, we find that the online performance of the utilized approach also depends on the location context where the recommendations are displayed. For the latter, we will present how a recommender system can support academic entrepreneurs who want to go through the process of building a start-up from an initial innovation idea. In such a setting, a recommender system needs to suggest relevant experts that can provide feedback on an innovation idea, support potential co-founder and team member matching, allow accelerators, incubators, and innovation hubs to discover these innovations, and continuously provide relevant education materials until the innovation idea is mature enough to form a start-up. By adapting a recommender system for such diverse personalization scenarios, we observe that dynamic customization of the utilized recommender algorithms with respect to the underlying data structures is of key importance.

Taken together, we strongly believe that our experiences from both research- and industry-oriented projects should be of interest for the ECIR audience, especially for practitioners in the field of real-time multi-domain recommender systems.

Company Profile

With more than 20 years of experience in cutting-edge research, Know-Center GmbH is Austria’s leading research center for Data-Driven Business and Big Data Analytics. In its role as an innovation hub between science and industry, the Know-Center as a non-profit company offers application-oriented research in cooperation with academic institutions and partners from industry. The scientific strategy of the Know-Center is to combine approaches of Big Data Analytics with Human-Centered Computing to create cognitive computing systems that enable humans to use huge amounts of data. With over 130 employees, Know-Center has extensive experience in national as well as international collaborative R&D projects in Big Data, Machine Learning and Artificial Intelligence.

Presenter

Emanuel Lacic, MSc, is Operations Area Manager of the Social Computing area at the Know-Center. He is a former Marshall Plan fellow and has been working as a visiting researcher at the Computer Science department of the University of California, Los Angeles. His main research interests are in the fields of Recommender Systems, Deep Learning, as well as Social Network Analysis.

Dr. Dominik Kowald is Research Area Manager of the Social Computing area at the Know-Center and senior researcher at Graz University of Technology. He is also task lead in the H2020 AI4EU and TRUSTS projects. He is review editor of Frontiers in Big Data – Recommender Systems section, and his research interests are in Recommender Systems, Privacy, Fairness, and Biases in algorithms.

4. Event Detection Information System to Support Decision Making for Raw Materials Supply Chain Management

Elliot Maître, Max Chevalier, Bernard Dousset, Jean-Philippe Gitto and Olivier Teste (Scalian)

Abstract

Raw materials, such as metals or oil, are crucial in industrial supply chains. They are the foundation of manufactured products and thus are a critical element of most supply chains. A difficulty raw material buyers and supply chain managers face is that the availability and price of raw materials are greatly influenced by events happening around the world, such as geopolitical or weather-related events, and keeping up with all these events is complicated. Buyers fear missing events that will have critical impacts on raw material prices and availability.

In this project, the objective is to build an information system to help raw material buyers in their daily decision-making process. This information system is the result of a collaboration between purchasing experts and computer scientists, particularly in the fields of information retrieval and data management. The goal of this system is to provide raw material buyers with a curated list and description of potentially impactful events happening around the world, so they can analyze them, review them and make informed decisions. To detect these events, we use external sources such as social media or news feeds. Our event detection system had to satisfy several constraints:

  • It must perform the task of open-domain event detection, meaning that the events to detect are unspecified. Indeed, some events are unknown beforehand and unpredictable.
  • The system must be able to detect both large and small events. When working with specialized domains, some potentially impactful events do not necessarily attract a lot of attention. On the other hand, some really general and trending events can have a huge impact as well.
  • The system must be able to combine documents from different sources. An event depends on the sources that relay it.
  • It must track the evolution of events over time. Events are not stable over time, and as they unfold, their potential consequences evolve.

To meet these constraints, we built an event detection system based on several state-of-the-art algorithms. We filter spam and uninformative documents and target potentially important domains using filtering rules, then represent documents using Transformer-based language models and cluster the documents using similarity metrics and graph-based clustering algorithms. To track the evolution of events over time, we build cluster chains.
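
A stripped-down version of the clustering stage might look like the sketch below: documents are embedded with a Transformer model, pairs above a similarity threshold become edges, and connected components become candidate events. The model name and threshold are illustrative, and the production system described above is considerably more elaborate.

    # Hedged sketch of graph-based event clustering over document embeddings.
    import networkx as nx
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

    def cluster_events(docs, threshold=0.7):
        emb = model.encode(docs, convert_to_tensor=True)
        sim = util.cos_sim(emb, emb)
        g = nx.Graph()
        g.add_nodes_from(range(len(docs)))
        for i in range(len(docs)):
            for j in range(i + 1, len(docs)):
                if sim[i][j].item() >= threshold:  # similar pair -> edge
                    g.add_edge(i, j)
        # Each connected component is a candidate event (a cluster of documents).
        return [sorted(c) for c in nx.connected_components(g)]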

The objective of the presentation is to present both the scientific issues related to information retrieval and the issues related to collaborating with experts from other domains, as well as how we approached and solved them. We will highlight how we designed our event detection system to incorporate the buyers' feedback, keeping it as accurate and up to date as possible in a context of concept drift.

Company Profile

Scalian is an engineering and consulting company specialized in digital systems and in quality and performance management for industrial operations, working in numerous domains such as aeronautics, transportation, energy and defense. It is a multinational company employing more than 3000 people in eight countries. It has a research and development laboratory composed of expert researchers in different fields such as applied mathematics, computer science, industrial science and environmental science. The objective of the laboratory is to conduct research and make innovative propositions within the company's domains of expertise, in close collaboration with the employees and the clients of the company, in order to broaden its offer.

Presenter

Elliot Maître is a final-year computer science PhD student. His PhD is conducted in collaboration between IRIT (Institut de Recherche en Informatique de Toulouse) and Scalian, the industrial partner. He is due to defend his PhD during the second half of this year (2022) and has coordinated the collaboration between the industrial and academic sides of his scientific project. He holds an engineering degree, and he has collaborated with both academic researchers and industrial experts to complete the project, conducting it in such a way as to ensure that the interests of both parties were met.

5. iRM: An Explainable AI based Approach for Environmental, Social, and Governance Risk Management

Sayantan Polley, Subhajit Mondal and Arun Majumdar (Technology Risk Limited)

Abstract

1. Introduction and problem statement

Risks arising out of Environmental, Social and Governance (ESG) factors are currently a vibrant topic for corporate boardroom discussion. The pandemic and climate change related disasters across the globe are stark facts that are calling for actions beyond words. For corporate leadership, there are various risks due to a volatile geo-political environment coupled with an enhanced regulatory load. Companies running for profit have realized that ESG has the potential to deliver value to top-line revenue and bottom-line growth. There are various commercially available tools that benchmark a company's ESG-related initiatives. Often such tools help in tracking, estimating and forecasting carbon footprint and resource utilization efficiency. Data-driven and statistical approaches, such as "what-if analysis" based on machine learning and deep learning models, help in forecasting. For example, what bottom-line growth can be estimated by lowering expenses from electricity consumption, using renewable sources of energy, and bringing in supply chain efficiencies? Studies show that consumers are often ready to pay an increased price for greener products. But there are a couple of interesting challenges that arise in developing such software tools, which we attempt to address in our integrated Risk Management solution, iRM, based on explainable AI research. Two such aspects are the explainability of machine learning predictions and the ethical use of data in a multi-tenant cloud (software-as-a-service) solution. Although AI-based models are currently widespread in various commercial software applications, adoption by CXO-level users in companies is often hindered by the lack of explainability of black-box machine learning models. How does a machine learning model arrive at a decision, such that a non-AI user can trust its recommendations for a risk management use case such as ESG?

2. Proposed Solution

Our solution, iRM, consumes data securely from various source transaction systems and other publicly available sources of data. iRM works with a data lake scenario, combining structured and unstructured data to identify potential risks. Risk management key performance indicators are used along with industry benchmarks, rules and expert knowledge. We employ various SOTA explainable AI-based methods that attempt to explore causal relationships between the features. Managing unstructured social data, regulatory reports and documents brings an Information Retrieval (IR) task. We employ recent explainable-AI research on text to build a retrieval-based chatbot that supports users by explaining why certain items are relevant during free-text search.
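
As one example of the family of post-hoc methods in question, the sketch below computes SHAP feature attributions for a toy tabular risk model. It illustrates the general recipe only and does not claim to be the specific methods iRM employs; the feature matrix and model are invented.

    # Illustrative post-hoc explainability with SHAP on a toy risk model.
    import numpy as np
    import shap
    from sklearn.ensemble import RandomForestRegressor

    X = np.random.rand(100, 4)                   # toy ESG feature matrix
    y = X @ np.array([0.5, 0.1, 0.3, 0.1])       # toy risk score to predict
    model = RandomForestRegressor(n_estimators=50).fit(X, y)

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)       # per-feature contributions
    print(shap_values[0])                        # explanation for one prediction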

3. Open Questions and Discussion

The first challenge lies in the evaluation of explanations: ground-truth explanations are lacking, which is an open research question in the explainable AI community. We attempt to address the evaluation problem by capturing user inputs and presenting the statistics transparently to users. The evaluation of explainable AI brings a related second research question on the ethics of data usage in cloud-based shared software applications (SaaS): can we learn machine learning weights (e.g., regression or neural network parameters) from the data of company X and use them for company Y, even after obfuscating personally identifiable features and adhering to laws like the EU GDPR? The predictive performance of deep learning models often depends on the volume, quality and annotation of the data. For identifying and mitigating risks such as security role violations, fraudulent transactions and ESG anomalies, it would be beneficial for companies to collaborate. In particular, public sector organisations run on taxpayer money can be expected to collaborate as a community when it helps a common cause such as fraud detection, audit anomalies or ESG-related causes like climate change. Hence, in iRM we propose a transparent feature whereby companies can volunteer to contribute selected parts of their data, in order to leverage machine learning weights trained on a larger data volume contributed by others. Otherwise, each entity can decide to use machine-learned weights limited to its own data, with prediction quality restricted accordingly. We would like to use the ECIR Industry forum to brainstorm these ideas with members of the community.

Company Profile

Technology Risk Limited is a UK-based startup founded in 2018, specialising in enterprise risk management. The company provides boutique risk consulting services driven by in-house AI products to ensure that risks are mitigated in complex digital transformation projects. The company currently has over 30 employees across Europe, Africa, Asia and the USA, and serves customers across the globe, such as a leading global telecom player and the world's largest shipping and logistics company. The founders have advised FTSE 100 companies and governments across the globe and have worked for Big Four consulting firms and major software product developers.

Presenter

Arun Kumar Majumdar is the CEO and founder, driving the management consulting arm. He has worked with Big Four management consulting firms in Europe and Asia. His core expertise lies in SOX advisory, ERP audit, and risks and controls automation. He has advised more than 250 customers over the last 15 years in the UK, Ireland, the US, Europe and the Middle East, including FTSE 100 companies, European regulators and governments. 

Sayantan Polley is the co-founder of iRM, the product development subsidiary of Technology Risk Limited. He has worked for Oracle product development and Big Four consulting firms. He currently drives the product development initiative with Subhajit Mondal. iRM is based on an explainable AI approach, which is aligned with Sayantan's PhD topic [6], accepted at the ECIR 2022 doctoral consortium. Sayantan holds a master's degree in Data and Knowledge Engineering from Otto von Guericke University Magdeburg, Germany. Subhajit is currently a master's student in Data and Knowledge Engineering at Otto von Guericke University Magdeburg.

6. Challenges of Ranking for Push Notifications 

Yuguang Yue (Twitter)

Abstract

Social media, like other internet applications, is accessed via mobile phones by the majority of users. Mobile phone push notifications have been shown to be a powerful tool to boost engagement and, when well constructed, an important way to help users to stay up to date with relevant content.

Recommender systems for push notifications present unique challenges compared to other content-ranking problems. This presentation will introduce some of the intrinsic difficulties and present one approach we have pursued with success. Although some aspects may be unique to Twitter, our findings should be of general relevance, and we also hope to share research questions on an emerging recommender systems paradigm with the academic audience.

Push notifications have several characteristics that make ranking challenging and distinct from many existing ranking problems. Firstly, of all the candidates ranked, only a single candidate can be sent to the user and receive feedback, due to the display limitations of push notifications (in contrast to, e.g., search, where the user may give implicit feedback on multiple documents). Secondly, document relevance is highly personalized: users do not actively seek information, so there is limited context indicating the user's information need. Thirdly, user responses are non-stationary, which means a document may be relevant to the user now but not a short time later. Finally, at Twitter new content is created at a rapid pace, so approaches must generalize to novel content.

These properties combine to make the push notification problem challenging to solve, and make counterfactual evaluation and many ranking losses difficult to use. One can approach the push notification problem from either a learning-to-rank perspective or a contextual bandit perspective. However, in our setup most contextual bandit algorithms cannot be applied directly because of the problem's non-stationarity, so we focus on formulating push notifications under a learning-to-rank framework.

After discussing the challenges we will present one approach we have used with promising results in a large scale production experiment. Our approach was to develop a novel ranking loss specific to the properties of push notifications which weights a pairwise loss by the expected regret of misordering the pair. We will present results both in simulation and from our production experiment comparing this loss with pairwise, pointwise and other ranking losses.
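
The exact form of the loss is not spelled out above, so the following is a minimal sketch of one plausible instantiation under our own assumptions: a standard logistic pairwise loss in which each pair is weighted by its reward gap, i.e. the regret incurred by misordering the pair. Scores and rewards below are toy values.

```python
import numpy as np

def regret_weighted_pairwise_loss(scores, rewards):
    """Logistic pairwise loss where each pair is weighted by its reward
    gap (the regret of ranking the worse candidate above the better one).
    One plausible reading of the loss described in the talk, not its
    published form."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if rewards[i] > rewards[j]:
                regret = rewards[i] - rewards[j]            # cost of misordering
                margin = scores[i] - scores[j]
                loss += regret * np.log1p(np.exp(-margin))  # pairwise logistic term
    return loss

# Three notification candidates with model scores and expected rewards (toy).
print(regret_weighted_pairwise_loss(np.array([0.2, 1.1, -0.3]),
                                    np.array([0.0, 1.0, 0.1])))
```

Under this formulation, pairs with a larger reward gap contribute more to the loss, so training pushes hardest on exactly the orderings whose mistakes would cost the most.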

Our presentation will be relevant to audiences with an industrial background who want to learn about the implementation of push notification algorithms at industry scale, and will also present some of the continuing challenges and areas of further research for academics. The push notification problem is highly relevant to information retrieval and has demonstrated its importance in fields such as e-commerce, communication platforms and streaming services. We hope this presentation will share what we have learned as well as stimulate discussion and future research on this topic.

Company Profile

Twitter is a global platform for public self-expression and conversation in real time. Twitter allows people to consume, create, distribute and discover content and has democratized content creation and distribution. Founded in 2006, it is widely used by major public figures and has over 200 million users. Twitter has invested heavily in machine learning to better serve the people who use its platform.

Presenter

Yuguang joined Twitter in June 2021 and has been working as a research scientist at Twitter Cortex Applied Research – Recommender Systems. Prior to Twitter, he obtained his PhD degree from the University of Texas at Austin with emphases on Bayesian statistics and reinforcement learning. Prior to his PhD, he spent two years at the University of California, Los Angeles and four years at Fudan University in Shanghai. 

7. Neural Methods for Personalized, Product Query Suggestion: A Case Study

 Muthusamy Chelliah (Flipkart)

Abstract

Auto-completion, also known as query suggestion, returns the most frequent queries that match the prefix entered by the user, based on search logs, ranking such candidates through features extracted from past queries and the current user input. In an online shopping platform [Goyal 21], such features are based on category relationships between previous searches and plausible suggestions; the closer two nodes are in the category tree, the higher the similarity between them (e.g., Shoes – Sandals vs. Clothing – Footwear). Effectively exploiting such context in the query prefix is an active area of IR research; neural language models [Sordoni 15] and attention mechanisms [Dehghani 17] have been explored to model query sequences and user signals like clicks on suggested queries [Wu 18, Jiang 18].
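
To make the ranking signal concrete, here is a toy sketch, under our own assumptions, of prefix-based candidate retrieval re-ranked by a crude category-tree similarity to the user's session; the query log, category paths and blending rule are invented, not Flipkart's production logic.

```python
from collections import Counter

# Toy query log with frequencies, and a category path per past query.
query_log = Counter({"shoes for men": 90, "shoes sandals": 40, "shirt": 120})
query_category = {"shoes for men": ("Footwear", "Shoes"),
                  "shoes sandals": ("Footwear", "Sandals"),
                  "shirt": ("Clothing", "Shirts")}

def tree_similarity(cat_a, cat_b):
    """Crude proxy for 'closer in the category tree means more similar':
    the length of the shared prefix of two category paths."""
    shared = 0
    for a, b in zip(cat_a, cat_b):
        if a != b:
            break
        shared += 1
    return shared

def suggest(prefix, session_category, k=2):
    candidates = [q for q in query_log if q.startswith(prefix)]
    # Rank by category affinity to the session first, then raw frequency.
    return sorted(candidates,
                  key=lambda q: (tree_similarity(query_category[q], session_category),
                                 query_log[q]),
                  reverse=True)[:k]

# A user who has been browsing Sandals sees sandal-related queries promoted.
print(suggest("shoes", session_category=("Footwear", "Sandals")))
```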

In this talk, we first motivate personalization in the query suggestion system of Flipkart, the e-tail leader in India. Personalization takes the user's reformulation context (in- vs. cross-session behaviour, affinities (e.g., brand, product), important attributes, word addition/removal) into consideration, thus optimizing the search funnel and increasing attribute diversity in the index. Various actions in a user's session (e.g., browsing through products, filtering, product page views, cart adds, checkouts) indicate purchase intent. We extend the talk with potential future directions (e.g., diversification [Chen 17, Liao 19], leveraging session-level user behaviour/reformulation [Li 19], unseen prefixes [Wang 20]) drawing on state-of-the-art literature (e.g., positive/negative feedback and filtering noise in an e-commerce platform [Yang 21], as well as a generative approach [Chen 20], extreme multi-label ranking [Yadav 21] and ranking to capture long-term intent [Cheng 21] in Web search systems).

The intent of this presentation is thus to provide an overview of product query recommendation in a working system (i.e., Flipkart), to show how it could be further improved with recent results from a comparable platform (i.e., Alibaba), and to open up new avenues for applied research from an adjacent domain (i.e., Web search).

[Chen 17] Chen, Wanyu, et al. Personalized query suggestion diversification. SIGIR ‘17

[Dehghani 17] Dehghani, Mostafa, et al. Learning to attend, copy, and generate for session-based query suggestion. CIKM ‘17

[Chen 20] Chen, Ruey-Cheng, and Chia-Jung Lee. Incorporating Context Structures for Query Generation. EMNLP ‘20

[Cheng 21] Cheng, Qiannan, et al. Long Short-Term Session Search: Joint Personalized Reranking and Next Query Prediction. WebConf ‘21

[Goyal 21] https://tech.flipkart.com/building-personalized-autosuggestion-9e705d5bf5f8

[Jiang 18] Jiang, Jyun-Yu, and Wei Wang. RIN: Reformulation inference network for context-aware query suggestion. CIKM ‘18

[Li 19] Li, Ruirui, et al. Click feedback-aware query recommendation using adversarial examples. WebConf. ‘19

[Liao 19] Liao, Ziyang, and Keishi Tajima. Disjunctive Sets of Phrase Queries for Diverse Query Suggestion. Web Intelligence ‘19.

[Park 17] Park, Dae Hoon, and Rikio Chiba. A neural language model for query auto-completion. SIGIR ‘17

[Wang 20] Wang, Sida, et al. Efficient Neural Query Auto Completion. CIKM ‘20

[Wu 18] Wu, Bin, et al. Query suggestion with feedback memory network. WebConf. ‘18

[Yadav 21] Yadav, Nishant, et al. Session-aware query auto-completion using extreme multi-label ranking. KDD ‘21

[Yang 21] Yang, Yatao, et al. FINN: Feedback Interactive Neural Network for Intent Recommendation. WebConf ’21

Company Profile

Flipkart is an Indian e-commerce company with over 60% market share competing primarily with Amazon’s local subsidiary. In August 2018, U.S.-based retail chain Walmart acquired a 77% controlling stake in Flipkart. Research advances in Flipkart product search have been published in various IR Forums in recent years [Kumar 18, Maji 19, Maji 20].

[Kumar 18] Kumar, Rohan, et al. Did we get it right? Predicting query performance in e-commerce search. SIGIR eCom ‘18.

[Maji 19] Maji, Subhadeep, et al. Addressing Vocabulary Gap in E-commerce Search. SIGIR ‘19

[Maji 20] Maji, Subhadeep, et al. A Regularised Intent Model for Discovering Multiple Intents in E-Commerce Tail Queries. ECIR ‘20.

Presenter

Muthusamy Chelliah heads external research collaboration for Flipkart, and his current interests are ML applications in IR, NLP and data mining. He holds a PhD in Computer Science from Georgia Tech, Atlanta, and is passionate about catalyzing industry-relevant data science in global universities. He has delivered tutorials at RecSys ‘17, ECIR ‘19, WebConf ‘19, IJCAI ‘19, ACM Multimedia ‘19, CIKM ‘19 and ECIR ‘20 on topics spanning recommender systems, product reviews and question answering. He is also a coauthor of an accepted full paper at ECIR ‘22 on outfit recommendation. A sample of his video presentation from the ECIR ‘20 tutorial (45 min.) is at: https://drive.google.com/file/d/1nfkT5gz3KohdtbM6o9m0knTAnYkCgCG2/view?usp=sharing

8. Behind the Scenes: Building RecList in the Open with Students, Researchers and Partners 

Jacopo Tagliabue, Ciro Greco and Federico Bianchi (Coveo)

Abstract

Recommender systems (RSs) are possibly the most ubiquitous machine learning systems, personalizing the digital life of billions of people and acting as a constant reminder of our responsibilities as practitioners, users, and legislators. As with most ML systems, point-wise metrics (MRR, NDCG) over held-out data are typically used to predict their behavior “in the wild”; however, as even one bad prediction may cause reputational damage, aggregate metrics tend to overestimate performance and complicate error analysis. If managing one RS is hard, the problem is compounded when deploying hundreds of them, which is the case for a growing portion of the industry (including unicorns such as Algolia and Bloomreach, and public companies like Coveo and Yext). At the end of 2021, we introduced RecList, an open source package to help scale behavioral testing of RSs: RecList builds on existing ad hoc procedures, providing practitioners with a common lexicon and working code for in-depth testing and error analysis.
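
To illustrate the kind of behavioral testing RecList advocates, here is a minimal, library-agnostic sketch (deliberately not RecList's actual API): the same metric is reported per data slice rather than as a single aggregate, so a model that fails on, say, cold-start items cannot hide behind a good overall number.

```python
import numpy as np

def hit_rate(predictions, targets, k=10):
    """Fraction of cases where the target item appears in the top-k list."""
    return float(np.mean([t in p[:k] for p, t in zip(predictions, targets)]))

def slice_test(predictions, targets, slices):
    """Behavioral check in the spirit of RecList (not its actual API):
    report the metric per slice so error analysis is built in."""
    return {name: hit_rate([predictions[i] for i in idx],
                           [targets[i] for i in idx])
            for name, idx in slices.items()}

preds = [["a", "b"], ["c", "d"], ["e", "f"]]     # toy top-k recommendations
gold = ["b", "x", "e"]                           # toy ground-truth next items
slices = {"popular": [0, 1], "cold_start": [2]}  # hypothetical data slices
print(slice_test(preds, gold, slices))           # {'popular': 0.5, 'cold_start': 1.0}
```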

In this talk, we share our experience in completing an applied research project as a community-driven effort. RecList grew out of the very practical necessities involved in scaling RSs to hundreds of organizations, but quickly raised interesting research questions on representation learning and model evaluation. We discuss the four phases of building RecList “in the open”: first, the motivation phase, primarily driven by industry concerns; second, the design phase, in which academic researchers place the problem in the context of recent literature and highlight open challenges; third, the execution phase, in which external collaborators answer our “call for builders” and quickly help us extend the core functionalities in several directions; fourth, the feedback and iteration phase, in which other institutions – including researchers at eBay, NVIDIA, Tubi and the BBC – start discussing the package and provide new perspectives on use cases and functionalities.

Finally, we present RecList as an example of virtuous interaction between academia and startups, in which all parties benefit even without assuming Big Tech resources and computational budgets. In particular, we highlight the benefit of developing software tools with students and researchers, who provide in-depth expertise for topics outside of our comfort zone and much needed help in every stage of the process (from literature review, to running experiments); on the other hand, we discuss the many interesting questions raised by real-world deployments, and the abundance of research projects that can be tackled within the realm of industrial applications.

Company Profile

Coveo is a startup providing AI-driven solutions for ecommerce, service, and workplace use cases. Coveo raised more than 300M dollars in the last three years, and recently became a public company traded on the Toronto Stock Exchange, with global clients such as Tableau, Dell, Xero and Motorola. Coveo is supported by a network of accredited global partners, integrators and alliances, including Salesforce, ServiceNow and Sitecore.

Presenter

Jacopo Tagliabue was co-founder and CTO of Tooso, an NLP-focused company in San Francisco acquired in 2019. Jacopo is currently the Director of AI at Coveo, leading the company roadmap in AI, NLP and personalization. When not busy building products, he explores research topics at the intersection of language, reasoning and learning: a co-organizer of SIGIR eCom, he has had his work featured at top-tier conferences and in the general press. In previous lives, he managed to get a Ph.D., do scienc-y things for a pro basketball team, and simulate a pre-Columbian civilization.

Session 3 Posters 13:30 – 15:00

All presenters will participate in the poster session to facilitate discussion and socializing between the audience and speakers. All presenters physically attending the conference will present their posters in person. Each remote presenter will have a dedicated Zoom session in which the audience can interact with them. 

Neural Methods for Personalized, Product Query Suggestion: A Case Study Muthusamy Chelliah (Flipkart)

Behind the Scenes: Building RecList in the Open with Students, Researchers and Partners Jacopo Tagliabue, Ciro Greco and Federico Bianchi (Coveo)

The Impact and Importance of Keyphrases in Building NLP Products Mayank Kulkarni (Bloomberg)

Is now a good time? Using reinforcement learning to make optimal decisions about when to send push notifications Jonathan Hunt (Twitter)

Neural Information Retrieval for Educational Resources Carsten Schnober, Gerben de Vries and Thijs Westerveld (Wizenoze)

Searching news media for repeated claims David Corney (Full Fact)

Challenges of Ranking for Push Notifications  Yuguang Yue (Twitter)

iRM: An Explainable AI based Approach for Environmental, Social, and Governance Risk Management Sayantan Polley, Subhajit Mondal and Arun Majumdar (Technology Risk Limited)

Automated Summarization for Real Time Production Status: An Application of TQFS Nicholas Adams-Cohen, Shade El-Hadik, Emanuel Tapia and Carsten Witte (Accenture)

Session 4 Summarisation and recommendation 15:30 – 17:00

1. Serving Low-Latency Session-Based Recommendations at bol.com

Barrie Kersbergen, Olivier Sprangers and Sebastian Schelter (bol.com)

Abstract

Session-based recommendation targets a core scenario in e-commerce and online browsing. Given a sequence of interactions of a visitor with a selection of items, we want to recommend to the user the next item(s) of interest to interact with [1]–[3]. This machine learning problem is crucial for e-commerce platforms, which aim to recommend interesting items to buy to users browsing the site. 

Scaling session-based recommender systems is a difficult undertaking, because the input space (sequences of item interactions) for the recommender system is exponentially large, which renders it impractical to precompute recommendations offline and serve them from a data store. Instead, session-based recommenders must maintain state in order to react to online changes in the evolving user sessions, and compute next-item recommendations in real time with low latency [3], [4]. Recent research indicates that nearest-neighbor methods provide state-of-the-art performance for session-based recommendation, and even outperform complex neural network-based approaches in offline evaluations [2], [3]. It is however unclear whether this superior offline performance also translates to increased user engagement in real-world recommender systems. Furthermore, it is unclear whether the academic nearest-neighbor approaches scale to industrial use cases, where they have to efficiently search through hundreds of millions of historical clicks while adhering to strict service-level agreements for response latency.

We created a scalable adaptation of the state-of-the-art session-based recommendation algorithm VS-kNN [2]. Our approach minimises intermediate results, controls memory usage and prunes the search space with early stopping. Consequently, it drastically outperforms VS-kNN in terms of prediction latency, while retaining the prediction quality advantages over neural network-based approaches. Furthermore, we designed and implemented a real-world system around this algorithm, which is deployed in production at bol.com [5]. 
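
The following is a heavily reduced sketch of the underlying idea, under our own simplifying assumptions: an inverted index from items to historical sessions, neighbor collection that stops early once enough candidates are found, and an overlap-weighted vote over neighbor items. The weighting, pruning and memory-control details of the production algorithm are omitted.

```python
from collections import defaultdict

# Toy historical sessions (sequences of item ids); the production index
# is of course far larger and more compact.
history = [["a", "b", "c"], ["b", "c", "d"], ["a", "d"]]

# Inverted index: item -> historical sessions containing it.
item_to_sessions = defaultdict(list)
for sid, items in enumerate(history):
    for item in set(items):
        item_to_sessions[item].append(sid)

def recommend(session, k_neighbors=2, top_n=3):
    """Much-reduced session-kNN in the spirit of VS-kNN: collect neighbor
    sessions starting from the most recent item, stop early once enough
    neighbors are found, then let neighbors vote for next items."""
    neighbors, seen = [], set()
    for item in reversed(session):               # most recent items first
        for sid in item_to_sessions.get(item, []):
            if sid not in seen:
                seen.add(sid)
                neighbors.append(sid)
            if len(neighbors) >= k_neighbors:    # early stopping
                break
        if len(neighbors) >= k_neighbors:
            break
    scores = defaultdict(float)
    for sid in neighbors:
        overlap = len(set(history[sid]) & set(session))
        for item in history[sid]:
            if item not in session:
                scores[item] += overlap          # overlap-weighted vote
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend(["a", "b"]))                     # e.g. ['c', 'd']
```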

In order to tackle the scalability challenge, we leverage an offline data-parallel Spark job that generates a session similarity index. We replicate our index to all recommendation servers and colocate the session storage with the update and recommendation requests, so that only machine-local reads and writes are needed for maintaining sessions and computing recommendations. Our system currently computes recommendations on the product detail pages, e.g., the “others also viewed” recommendations on https://go.bol.com/p/9200000055087295.
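
A hypothetical shape for such an offline job in PySpark is sketched below; the schema, paths and index layout are our own illustrative assumptions, not bol.com's actual pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("session-index").getOrCreate()

# Toy click log: (session_id, item_id) pairs.
clicks = spark.createDataFrame(
    [("s1", "a"), ("s1", "b"), ("s2", "b"), ("s2", "c")],
    ["session_id", "item_id"])

# Per item, collect the historical sessions it occurs in, so the result
# can be shipped to every recommendation server as a machine-local index.
index = (clicks.groupBy("item_id")
               .agg(F.collect_set("session_id").alias("sessions")))

index.write.mode("overwrite").parquet("/tmp/session_index")  # then replicated
```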

For evaluation, we ran load tests on our system with 6.5 million distinct items in its index and found that it gracefully handles more than 1,000 requests per second, responding in less than 7 milliseconds at the 90th percentile while using only two vCPUs in total. Our system easily handles up to 600 requests per second during an A/B test on the e-commerce platform at bol.com, with very low response latencies of around 5 milliseconds at the 90th percentile. The session recommendations produced by our system significantly increase customer engagement, by 2.85% compared to the classical item-to-item recommendations produced by our legacy system.

We believe our work is interesting for the ECIR audience, as, to the best of our knowledge, we are the first to implement and evaluate a real-world system based on the recent nearest neighbor-based algorithms for session-based recommendation published by the IR community. We think our experiences will be valuable both to industry practitioners to learn about our system design and requirements, as well as to researchers who might take inspiration for future work incorporating additional requirements beyond predictive accuracy, such as scalability to datasets with millions of items and strict latency constraints for inference requests.

REFERENCES

[1] Q. Liu et al., “Stamp: short-term attention/memory priority model for session-based recommendation,” KDD, 2018.

[2] M. Ludewig, N. Mauro, S. Latifi, and D. Jannach, “Performance comparison of neural and non-neural approaches to session-based recommendation,” in RECSYS, 2019.

[3] B. Kersbergen and S. Schelter, “Learnings from a retail recommendation system on billions of interactions at bol.com,” ICDE, 2021.

[4] I. Arapakis, X. Bai, and B. B. Cambazoglu, “Impact of response latency on user behavior in web search,” in SIGIR, 2014.

[5] B. Kersbergen, O. Sprangers, and S. Schelter, “Serenade – low-latency session-based recommendation in e-commerce at scale,” SIGMOD, 2022 (to appear).

Company Profile

This talk details a production system from bol.com, which is the largest online retail platform in the Netherlands and Belgium. It offers merchandising products in categories such as music, film, electronics, toys, jewelry, watches, baby products, gardening, and DIY. The store serves over 12.6 million customers, offers over 33 million products, and has 7,000 pickup points in the Netherlands and Belgium. More than 47,000 retailers sell products on the platform. The main office employs 2,400 people.

Presenter

Barrie Kersbergen joined bol.com in 2010 as a data scientist. His main focus is solving business problems by improving processes with scalable recommendation systems. Examples of his work are the recommendation systems ‘others also viewed’, ‘often bought together’, ‘look further’, ‘visual recommendations’ and ‘search ranking’ functionality, that serve millions of items to millions of customers. In January 2021, he started as an external PhD candidate at the University of Amsterdam in the “AI for Retail Lab” where he conducts research on scaling up algorithms for session-based recommendation to industry workloads.

2. Accelerating autocomplete suggestions at Salesforce Search

Georgios Balikas, Guillaume Kempf and Marc Brette (Salesforce)

Abstract

Salesforce Search allows users to find their information and get relevant results quickly. It is a key component of Salesforce Customer Relationship Management (CRM) and is used massively: every day hundreds of thousands of organisations use it, and it serves hundreds of millions of queries. Each organisation is a Salesforce customer where search is deployed in a configurable and multi-tenant fashion: data reside on the same hardware (servers) but are strictly available only to the organisation that owns them.

*Context.* This presentation focuses on efforts to accelerate search autocomplete performance to meet latency service-level agreements. Data in Salesforce are organized in a set of standard objects (Accounts, Opportunities, ...), and organisations can add custom objects that fit their needs. An organisation can have hundreds of objects. Implementing autocomplete by exhaustively querying the objects and ranking the returned results is not feasible. Therefore, the problem is to rank the Salesforce objects so that only the top-N are queried for records matching the user's input. The solution needs to be fast and accurate, to generalize to new objects without frequent re-training, and to handle cold-prediction problems for new organisations and new users.

*Implemented solution.* Once the problem and the constraints are clearly defined, the presentation will focus on the experimental part. We will first describe the legacy system, which uses a statistics heuristic to select the top-N Salesforce objects. Then we will describe a deep learning (DL) model for the same task, and for both alternatives we will enumerate their advantages and disadvantages. We will then present insights from the offline evaluation we performed and share lessons learned from training the DL system and doing error analyses: we dealt with domain adaptation, data drift and selection bias in training data drawn from logs. The final solution we implemented in production to remove the data selection biases and improve performance is a hybrid: a DL model that uses the legacy system to be robust to drift. We will present a summary of an A/B experiment we ran during the first two weeks of March 2021 on over 600K users who clicked on 7 million records.
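
One plausible shape for such a hybrid is sketched below, under our own assumptions, since the exact combination rule is not disclosed above: the DL model's score is blended with the legacy statistic so that ranking degrades gracefully under drift or for cold-start organisations where the model has little signal.

```python
def legacy_score(org_stats, obj):
    """Stand-in for the statistics heuristic: e.g. how often records of
    this object type were clicked in this organisation (hypothetical)."""
    return org_stats.get(obj, 0.0)

def rank_objects(objects, org_stats, model_scores, alpha=0.7, top_n=5):
    """Blend DL score with the legacy statistic (illustrative rule only)."""
    def score(obj):
        return (alpha * model_scores.get(obj, 0.0)
                + (1 - alpha) * legacy_score(org_stats, obj))
    return sorted(objects, key=score, reverse=True)[:top_n]

objects = ["Account", "Opportunity", "CustomInvoice__c"]
org_stats = {"Account": 0.6, "Opportunity": 0.3}          # toy click statistics
model_scores = {"Account": 0.2, "CustomInvoice__c": 0.9}  # toy DL scores
print(rank_objects(objects, org_stats, model_scores, top_n=2))
```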

We believe that this presentation will benefit the ECIR audience because:

  • We discuss an in-production solution of an IR problem at a massive scale along with benchmark (offline) and A/B results. Salesforce CRM is a user-configurable (custom entities) and multi-tenant (data separation between customers) environment. These are interesting system and model design requirements whose impact we will describe on a well-understood problem like autocomplete.
  • We will report on every step from the conception of the idea, to data collection, model selection and evaluation, as parts of the path to production. We made several decisions and uncovered several issues through error analysis: these “lessons from the trenches” can serve both researchers and practitioners of IR.
  • We are confident we will give an engaging presentation. We have attended and presented in several editions of ECIR and SIGIR. We hope to tailor the presentation to the audience by abstracting the system architecture complexity while focusing on the machine learning and the information retrieval evaluation challenges.

Company Profile

Salesforce is an American cloud-based software company headquartered in San Francisco, California. It provides customer relationship management (CRM) services as well as enterprise applications focused on customer service, marketing automation, analytics, and application development.

Presenter

Georgios Balikas is a Senior Data Scientist at Salesforce Search. He works on building production models for machine learning applications such as named entity recognition, classification, ranking and question answering. He holds a PhD from the University of Grenoble Alpes, at the intersection of machine learning and natural language processing.

The work presented is done in collaboration with Guillaume Kempf and Marc Brette. 

Guillaume is a Lead Software Engineer at Salesforce Search, focusing on developing deep learning solutions for production and contributing to both the training and the serving parts of the modeling pipeline. Marc is an architect at Salesforce Search dealing with business integration, performance and relevance; his industry experience spans 25 years at Salesforce, EMC, and Xerox Research Center Europe.

3. Towards Intelligent Q&A for Shopify ecommerce shops with Opinew 

David Jasek and Tomasz Sadowski (Opinew)

Abstract

Ecommerce platforms such as Shopify allow customers to order any item available in traditional brick-and-mortar stores from the comfort of their homes and have it delivered directly to their door. With the pandemic and government restrictions negatively affecting traditional retail stores, an increasing number of customers and retailers are turning to ecommerce platforms to buy and sell products online. Product reviews are an important part of the ecommerce experience, allowing customers to benefit from the wisdom of the crowds when selecting products. However, for some products, browsing reviews can be tiresome, leading many services to introduce review helpfulness votes and predictors.

On the other hand, questions & answers sections in product description pages have become an increasingly popular feature of many large online stores, as they provide a simple and convenient format of conveying important product information to the customer. However, not all shops have large enough user traffic to have reliably populated Q&A sections and risk having either none or very few product-related questions and answers. This leaves customers stranded in a sea of product reviews that are generally easier for the retailer to collect or import using services like Opinew. This approach makes it hard for customers to find short and straightforward answers for potential questions that they might have about a product and they end up facing a choice between making an uninformed purchase decision or spending unnecessary time reading through pages of product reviews.

This research project proposes an automated Q&A generation approach, which uses the information available on the product pages in the form of product descriptions and customer reviews to generate product-related questions and answers that can then be added to the product pages by the retailer. The generated product Q&As summarise questions that customers can have about a product and show them to the customer in a short and easy to follow Q&A format.

This research project aims to address the research gap related to product-based question and answer generation. The project uses a variety of IR & NLP techniques combined in multiple stages. These stages include approaches like query generation, passage retrieval and question answering. We evaluate the effects of fine-tuning an existing doc2query question generation model on a dataset of product reviews and questions. Additionally, we evaluate multiple existing question answering models on product-related question answering tasks. This project also evaluates the intermediate steps between question generation and answering (question aggregation and passage retrieval for question answering) by testing different approaches for these tasks and evaluating their impact on the relevance of the generated content. Additional evaluation is also done to determine whether an automated system can generate product Q&As of similar quality as a user-based Q&A system and how such a system compares in terms of the relevance of generated questions and answers. Lastly, the proposed system will be integrated with online shops that deploy Opinew, to evaluate how the auto-generated Q&As affect retailer satisfaction and customer buying decisions.
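
A minimal sketch of such a two-stage generate-then-answer pipeline is shown below using the Hugging Face transformers library; the checkpoint names are publicly available stand-ins, not the models fine-tuned in this project, and the review text is invented.

```python
from transformers import pipeline

# Public checkpoints used purely for illustration.
question_gen = pipeline("text2text-generation",
                        model="doc2query/msmarco-t5-base-v1")
answerer = pipeline("question-answering",
                    model="deepset/roberta-base-squad2")

review = ("The blender is powerful and crushes ice easily, "
          "but it is quite loud at the highest setting.")

# Stage 1: generate a candidate customer question from a review passage.
question = question_gen(review, max_length=32)[0]["generated_text"]

# Stage 2: answer the generated question extractively from the same passage
# (a full system would first retrieve the best passage across all reviews).
answer = answerer(question=question, context=review)
print(question, "->", answer["answer"])
```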

This research project will be of interest to an ECIR audience as it presents an alternative way of using IR-related technologies in a real-world environment. For instance, while doc2query query generation has been used to improve adhoc search performance, in our approach, we use query generation to formulate the questions which will then be visible to the customer. Furthermore, we use a combination of passage retrieval and question answering transformer models to answer these generated questions using customer reviews and retailer-provided product descriptions. All these approaches are combined to create a single pipeline that covers a combination of IR, summarisation and question-answering tasks.

Company Profile

Opinew is an SME startup based in Glasgow, UK (with 13 employees) that provides review management software for over 10,000 sellers on the Shopify ecommerce platform. In 2021, our reviews influenced over $30M of online sales. We collect reviews and Q&A and display them in widget plug-ins on shops’ product pages. We are developing AI/ML/IR techniques to enhance the user experience for both retailers and customers. Indeed, AI-supported review management allows customers to more quickly identify products that are relevant and appropriate for them; further, it could allow retailers to identify the most appropriate questions and correct answers that can help customers discriminate among products. Opinew collaborates with IR researcher Craig Macdonald at the University of Glasgow.

Presenter

Tomasz Sadowski is the Founder and CEO of Opinew, which he started as a student project at the University of Glasgow in 2015. Bootstrapped by an Enterprise Fellowship from the Royal Society of Edinburgh in 2017, Opinew has grown to manage product reviews for over 10,000 Shopify merchants in over 150 countries. Aside from day-to-day CEO responsibilities Tomasz provides commercial/product guidance on the Q&A generation project.

David Jasek is a lead software engineer at Opinew and a final-year Computing Science master’s student at the University of Glasgow. David started working with Tomasz and Opinew over 3 years ago. While studying at the University of Glasgow, David developed an interest in data science and in using machine learning to automate and simplify tasks that would previously require a great deal of monotonous manual labour. With Tomasz’s help, David is able to combine his studies with work at Opinew by researching and developing new features exploiting the capabilities of the latest advances in machine learning and NLP.

4. Automated Summarization for Real Time Production Status: An Application of TQFS

Nicholas Adams-Cohen, Shade El-Hadik, Emanuel Tapia and Carsten Witte (Accenture)

Abstract

Automating customer communication with AI solutions is becoming essential for industries to keep up with real-time inquiries from a large customer base. To support their efforts in this area, one of the largest USA-based telecom companies presented us with the following business challenge: automatically generate a customer-facing status update using their electronic ticketing management system (ETMS) as input. With an ETMS, customer issues are logged with the creation of a “ticket,” which travels through various steps of the resolution process. At each stage, the ticket is updated with a new text document (or “comment”). In order to summarize the multiple documents of a single ticket into a single status update, our group developed and applied the Time-Series Query Focused Summarization (TQFS) model, a novel algorithm that (1) incorporates TF-IDF query relevance into a pre-trained abstractive model, (2) scales the events of a timeline horizontally, and (3) uses the vector product of the previous two outputs to produce our final time-sensitive abstractive model. Our model builds on Query Focused Summarization (QFS) methods by considering the time dimension, which is pivotal to summarizing multiple documents in a time series. One major advantage of our TQFS approach is that it avoids the need to gather large training sets or exhaust computational resources on retraining tasks, a business requirement in many cases. We validated our model with extensive computational experiments and through business SMEs, who confirmed the adequacy of the customer language communicated in real time.
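
A toy sketch of the three ingredients follows, under our own simplifying assumptions, since the exact relevance computation, timeline scaling and summarizer are not specified above.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

comments = ["Ticket opened: customer reports intermittent outage.",
            "Technician dispatched to the site.",
            "Line card replaced; service restored and confirmed."]
timestamps = np.array([0.0, 4.0, 9.0])   # hours since ticket creation (toy)
query = "current status of the outage"

# (1) TF-IDF relevance of each ticket comment to the status query.
vec = TfidfVectorizer()
tfidf = vec.fit_transform(comments + [query])
relevance = (tfidf[:-1] @ tfidf[-1].T).toarray().ravel()

# (2) Scale the timeline so recent events dominate (one simple choice).
recency = np.exp(-(timestamps.max() - timestamps) / 5.0)

# (3) Element-wise product of the two signals ranks the comments that
# would be fed to the pre-trained abstractive summarizer.
weights = relevance * recency
print(comments[int(weights.argmax())])
```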

Company Profile

Accenture is a management consulting and information technology (IT) services company embracing the power of change to create long-lasting value in every direction for our clients, people and communities. With more than 600,000 people worldwide, in 200 cities across 50 countries, Accenture provides services across strategy, consulting, interactive, technology, and operations. Accenture also operates more than 100 “innovation hubs,” developing digital and cloud-based solutions for a broad range of industries. Accenture now works with 91 of the Fortune Global 100. As of 2021, Accenture has made 19 consecutive appearances on Fortune’s list of the “World’s Most Admired Companies.”

Presenter

Shade EL-Hadik, AI Senior Manager (T&L Track). I lead technical sales and delivery teams to identify intelligence strategy and innovation opportunities for our clients, to augment their services with cognitive features, and to build new products which today are either unavailable or too expensive without the application of artificial intelligence. I am a certified AI solution architect with years of hands-on experience in integration, feature engineering, personalization, data modeling and visualization, software engineering, and solution scaling. I have led and delivered multiple mid- to large-scale projects in various industries using SDLC, microservices, containerization, CI/CD/CL, and cloud-native architectures, and have been involved in RFP and SOW responses, strategy definition, requirements gathering, fit-gap analysis, comparative platform studies, solution design, capacity planning and execution, deployment, testing, risk management, complex project management, and client relationship handling.

Nicholas Adams-Cohen, Artificial Intelligence Assoc Principal. I am a quantitative researcher focusing on developing and adapting novel AI methods. My research focuses on translating complex data to better understand human behaviour, which helps me influence strategy and create more efficient systems.

5. The Impact and Importance of Keyphrases in Building NLP Products 

Mayank Kulkarni (Bloomberg)

Abstract

Keyphrases are prevalent in a wide range of genres, including News Articles, Financial Documents, Scientific Articles, and Discussion Forums. Keyphrases go beyond traditional named entities, capturing phrases present directly in the source text, as well as concepts outside the source text. Automatically identifying these keyphrases not only serves as a strong foundation to facilitate downstream tasks, such as target-based sentiment analysis and improving search results based on user queries, but also to provide insight through trending keyphrases, keyphrase volume aggregation, and historical trends.

In this presentation, we will focus on Keyphrase Extraction and Keyphrase Generation, two key NLP tasks in the context of NLP in the financial domain. Keyphrase Extraction is similar to Named Entity Recognition. That is, given a source text, the task is to identify all text spans that are Keyphrases. On the other hand, Keyphrase Generation is a more generic task that entails inferring all keyphrases associated with a source text and also includes implicit keyphrases that might not appear verbatim in the input text (i.e., so-called ‘absent’ Keyphrases).
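
The contrast between the two formulations can be made concrete with a toy example; the tags and outputs below are hand-set for illustration and are not produced by Bloomberg's models.

```python
# Toy contrast between extraction (span labelling) and generation (seq2seq).
source = "Apple shares rallied after strong iPhone demand in China."
tokens = source.rstrip(".").split()

# Keyphrase EXTRACTION: BIO sequence labelling over tokens, as in NER.
# A trained tagger would predict these tags; here they are hand-set.
bio = ["B-KP", "I-KP", "O", "O", "O", "B-KP", "I-KP", "O", "B-KP"]

spans, current = [], []
for tok, tag in zip(tokens, bio):
    if tag == "B-KP":
        if current:
            spans.append(" ".join(current))
        current = [tok]
    elif tag == "I-KP":
        current.append(tok)
    else:
        if current:
            spans.append(" ".join(current))
        current = []
if current:
    spans.append(" ".join(current))
print(spans)  # ['Apple shares', 'iPhone demand', 'China']

# Keyphrase GENERATION: a seq2seq model instead decodes a keyphrase list
# that may include 'absent' phrases never appearing verbatim in the source,
# such as the topical label at the end here.
generated = ["Apple shares", "iPhone demand", "consumer electronics"]
print(generated)
```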

At ECIR 2020, we published the first neural models for Keyphrase Extraction using contextual embeddings, achieving state-of-the-art results at the time. More recently, we explored how to pre-train large language models by leveraging Keyphrases with a novel pre-training objective, in both discriminative and generative settings, and demonstrated significant gains over the state of the art in Keyphrase Extraction and Keyphrase Generation. At the same time, we also show gains on related span-based NLP tasks, such as Named Entity Recognition, Question Answering, Relation Extraction, and Summarization, using the same pre-trained language models. Our presentation will introduce the above tasks and approaches, and demonstrate that Keyphrases are pivotal in core NLP tasks for Bloomberg.

Company Profile

Bloomberg, the global business and financial information and news leader, gives influential decision makers a critical edge by connecting them to a dynamic network of information, people and ideas. The company’s strength – delivering data, news and analytics through innovative technology, quickly and accurately – is at the core of the Bloomberg Terminal. Bloomberg’s enterprise solutions build on the company’s core strength: leveraging technology to allow customers to access, integrate, distribute and manage data and information across organizations more efficiently and effectively. For more information, visit www.bloomberg.com.

Presenter

Mayank Kulkarni is a Senior Research Scientist in Bloomberg’s AI Engineering Group, where he is working on building models and scalable infrastructure for real-world solutions that require information extraction and sentiment analysis. His research interests are focused on Keyphrase Extraction, Keyphrase Generation, Named Entity Recognition, Summarization, Language Modeling, Dialogue Understanding, and Code & Natural Language Generation across various domains, ranging from Education to Social Media and News. Prior to joining Bloomberg, Mayank earned his Master’s in Computer Science at University of Florida, during which time he served as a graduate student researcher in the LearnDialogue Lab.

6. Is now a good time? Using reinforcement learning to make optimal decisions about when to send push notifications 

Jonathan Hunt (Twitter)

Abstract

Most recommender systems today are myopic: they use a machine learning model to optimize some objective based on the immediate response of the user (such as predicting whether the user will engage with some content). This is often misaligned with the true objective, such as creating user satisfaction, and the misalignment is typically mitigated using heuristics. There is significant interest in recommender systems that more directly optimize for long-term value. In this talk, we will present our work applying reinforcement learning to optimize decisions about when to send users mobile application push notifications.

Push notifications are an important paradigm for content consumption now that many users access information on mobile phones. However, users have high expectations of push notifications, since notifications may interrupt what they are doing. Sending a user too many or irrelevant notifications may result in users disabling notifications, uninstalling the application or simply learning to ignore notifications. We will outline evidence that deciding when to send a user a notification requires accounting for the effect of the recommender system’s actions now on the future.

We will describe a solution for deciding when to send a push notification using model-based reinforcement learning that we have developed at Twitter. This approach allows us to account for the effects that actions taken now have on the future. We will show results from a production experiment demonstrating that with this system we are able to send fewer notifications than the baseline heuristic approach while maintaining the same level of engagement with the app, and to increase the fraction of notifications that users choose to open.
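
One minimal way to picture such a decision rule, under our own assumptions rather than Twitter's actual formulation, is a one-step lookahead: a learned dynamics model predicts the next state and immediate reward for "send" versus "wait", a learned value model scores the resulting states, and the notification is sent only when sending wins on estimated long-term value.

```python
def should_send(state, value_model, dynamics_model, gamma=0.99):
    """One-step lookahead in the spirit of model-based RL for send/no-send
    decisions (the production system is far richer than this sketch)."""
    send = dynamics_model(state, action="send")
    wait = dynamics_model(state, action="wait")
    v_send = send["reward"] + gamma * value_model(send["state"])
    v_wait = wait["reward"] + gamma * value_model(wait["state"])
    return v_send > v_wait

# Toy stand-ins for the learned models: sending raises notification fatigue,
# and fatigued users yield less future engagement.
def dynamics_model(state, action):
    fatigue = state["fatigue"] + (1.0 if action == "send" else -0.2)
    reward = (0.3 - 0.1 * state["fatigue"]) if action == "send" else 0.0
    return {"state": {"fatigue": max(fatigue, 0.0)}, "reward": reward}

def value_model(state):
    return -0.05 * state["fatigue"]

print(should_send({"fatigue": 1.0}, value_model, dynamics_model))  # True
```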

This talk should be relevant to other industry groups interested in learning about successful approaches to optimizing when to send push notifications. While some aspects will be specific to Twitter, the underlying approach is general. We also hope that understanding the unique challenges of push notifications will provide academic groups with new ideas for research directions.

Company Profile

Twitter is a global platform for public self-expression and conversation in real time. Twitter allows people to consume, create, distribute and discover content and has democratized content creation and distribution. Founded in 2006, it is widely used by major public figures and has over 200 million users. Twitter has invested heavily in machine learning to better serve the people who use its platform.

Presenter

JJ Hunt is a research scientist at Twitter Cortex Applied Research – Recommender Systems. At Twitter he studies a variety of recommender systems problems, including push notifications, long-term value and advertising. His team collaborates closely with several product teams to put research into practice. Prior to Twitter, JJ researched deep reinforcement learning and other topics at DeepMind and Brain Corporation. Prior to discovering machine learning, he spent time as an unsuccessful neuroscientist and physicist, and holds a PhD from the University of Queensland (Australia) and a BSc Hons from Massey University (New Zealand) from those attempts.