In the modern information age, information overload affects almost every day-to-day activity. This is especially true in research and innovation (R&I), where the growing volume and velocity of information threaten to overburden scientists and engineers in all disciplines.
We will present the high-level architecture of the platform we have developed to analyze published research content at global scale (>100 million publications). We will also cover the data analysis pipeline, the main machine learning techniques, and some of the AI-driven services (e.g., content-based recommendation, similarity and trend analysis) we have incorporated to alleviate the impact of this information overload.
Along the way, we will go from data aggregation to information enrichment, knowledge discovery, and representation learning, and from NoSQL-based data storage and Lucene-based indexing to a microservices architecture and distributed big data analytics.
As a use case, we will try to identify the main concepts and topics in previous Voxxed Days presentations and “associate” them with worldwide research in computer science.
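
To give a flavor of what "associating" a talk with research content can mean, here is a minimal sketch of content-based similarity in plain Python. It is not the platform's implementation: it assumes a simple TF-IDF weighting with cosine similarity as a stand-in for the techniques covered in the talk, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_publications`) and the sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic tokens only (a deliberately naive tokenizer).
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumption of this sketch).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_publications(talk, publications):
    # Return (score, publication_index) pairs, most similar first.
    vectors = tfidf_vectors([talk] + publications)
    talk_vec, pub_vecs = vectors[0], vectors[1:]
    scores = [(cosine(talk_vec, v), i) for i, v in enumerate(pub_vecs)]
    return sorted(scores, reverse=True)


# Hypothetical sample texts, not real talk or publication titles.
talk = "reactive microservices and stream processing on kafka"
publications = [
    "stream processing in distributed microservices architectures",
    "protein folding prediction using deep learning",
]
for score, index in rank_publications(talk, publications):
    print(f"{score:.3f}  {publications[index]}")
```

In this toy setting the first publication shares terms with the talk and ranks first, while the second shares none and scores zero. At the scale mentioned above, such pairwise scoring would presumably be replaced by indexed retrieval and learned representations.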