The team developed machine learning algorithms to predict the emergence of new patentable technologies related to autonomous vehicles. The team’s insights will enable companies to focus their investments on the most relevant innovations.
Team: Ilias Atigui, Eliott Atlani, Djavan De Clercq, Benjamin Tan
Advisor: Lee Fleming
This project combined natural language processing and machine learning to automatically classify patents.
The objective of this study was twofold: to apply natural language processing to extract insights from patent data for feature engineering, and to use the latest methods in multi-label classification to automatically classify patents into their respective technology areas. Autonomous vehicle patent data was used as a case study.
- DATA QUERY: Extracted data from a SQL database hosted on Google Cloud
- NLP & ML: NLP & multi-label machine learning for automatic classification
- ACADEMIC PAPER: We have submitted our results for publication in a leading journal
Method
- Topic Modeling: This study used latent Dirichlet allocation (LDA) for topic modeling and feature engineering.
- Multi-label ML: We used algorithm adaptation & problem transformation to solve our multi-label problem.
- Web app visuals: We used R Shiny to develop an interactive web application that allowed customizable user input to explore our machine learning and NLP results.
- Full reproducibility: The full code behind our novel method of NLP-based multi-label classification will be provided online.
- Academic publication: We provided a full write-up of our results, which will be submitted to a leading journal.
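
The topic modeling and multi-label steps listed under Method can be illustrated with a minimal sketch. It is not the team's code: the use of scikit-learn, the toy abstracts, the label names, the number of topics, and the two classifiers are all assumptions chosen for illustration. Binary relevance (one classifier per label) stands in for problem transformation, and a k-nearest-neighbors classifier that accepts multi-label targets directly stands in for algorithm adaptation.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy corpus standing in for patent abstracts.
abstracts = [
    "lidar sensor detects obstacles around the vehicle",
    "camera and lidar fusion for obstacle detection",
    "path planning algorithm computes vehicle trajectory",
    "trajectory planning with lane change control",
    "steering control unit adjusts wheel angle",
    "brake and steering control for lane keeping",
]

# Hypothetical technology-area labels; one patent may carry several.
labels = [
    {"sensing"},
    {"sensing"},
    {"planning"},
    {"planning", "control"},
    {"control"},
    {"control", "planning"},
]

# Feature engineering: word counts -> LDA topic mixtures per document.
counts = CountVectorizer(stop_words="english").fit_transform(abstracts)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_features = lda.fit_transform(counts)  # shape: (n_documents, n_topics)

# Encode the label sets as a binary indicator matrix (one column per label).
binarizer = MultiLabelBinarizer()
y = binarizer.fit_transform(labels)

# Problem transformation (binary relevance): one binary classifier per label.
br_model = OneVsRestClassifier(LogisticRegression()).fit(topic_features, y)

# Algorithm adaptation: k-NN is fit on the multi-label targets directly.
knn_model = KNeighborsClassifier(n_neighbors=3).fit(topic_features, y)

br_pred = br_model.predict(topic_features)
knn_pred = knn_model.predict(topic_features)

# Map predicted indicator rows back to label names.
br_label_sets = binarizer.inverse_transform(br_pred)
```

On real patent data the topic count, the classifiers, and a held-out evaluation split would all need tuning; the sketch only shows how topic mixtures can serve as features for both families of multi-label methods.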