Digital Library

Title:      TECHNOLOGICAL AREAS DETECTION AND CLUSTERING FOR LARGE-SCALE OF PATENT TEXTS
Author(s):      Mustafa Sofean and Hidir Aras
ISBN:      978-989-8533-80-7
Editors:      Ajith P. Abraham, Jörg Roth and Guo Chao Peng
Year:      2018
Edition:      Single
Keywords:      Patent Analysis, Community Detection, Topic Modeling, Big Data, Text Mining
Type:      Full Paper
First Page:      51
Last Page:      58
Language:      English
Paper Abstract:      The technical field section of a patent document states the technological areas of the invention. Analyzing this text helps patent information professionals search for related patents and observe and track trending areas of invention. This work introduces methods for detecting technical field text in unstructured patent texts, extracting terms that denote technological areas, and identifying semantically meaningful communities/topics in a large collection of patent documents. Hybrid text mining techniques, combining machine learning, a rule-based algorithm, and heuristics, are used to identify the text of the technical field section. The most significant technical areas are extracted by a scalable analytics service that applies natural language processing techniques. Community detection approaches then aggregate the technical-area terms efficiently into communities/topics based on a network graph. The methods are built on top of a big-data architecture to handle large-scale patent text collections. A comparison with standard LDA clustering is presented; the results show that the communities produced by our methods do not share common terms and that the methods perform well in real applications.
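Illustration (not from the paper): the abstract does not say which community detection algorithm or graph construction the authors use. As a minimal sketch of the general idea, the Python below assumes that technical-area terms have already been extracted per patent, links terms that co-occur in the same document, and groups them with a simple weighted label propagation. The function names, the co-occurrence weighting, and the choice of label propagation are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Build a weighted undirected term graph (hypothetical construction).

    Nodes are terms; the weight of an edge is the number of documents
    in which both terms appear.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for terms in docs:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def label_propagation(graph, max_iters=20):
    """Group terms by weighted label propagation (a stand-in algorithm).

    Every term starts with its own label and repeatedly adopts the label
    with the largest total edge weight among its neighbours. Ties go to
    the alphabetically smallest label so the result is deterministic.
    """
    labels = {node: node for node in graph}
    for _ in range(max_iters):
        changed = False
        for node in sorted(graph):
            scores = defaultdict(int)
            for neighbour, weight in graph[node].items():
                scores[labels[neighbour]] += weight
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    communities = defaultdict(set)
    for node, label in labels.items():
        communities[label].add(node)
    return sorted(sorted(members) for members in communities.values())


# Toy input: one list of extracted technical-area terms per patent (made up).
example_docs = [
    ["battery", "electrode", "lithium"],
    ["battery", "electrode"],
    ["lithium", "electrode"],
    ["antenna", "wireless", "signal"],
    ["antenna", "signal"],
    ["wireless", "signal"],
]

if __name__ == "__main__":
    for community in label_propagation(build_cooccurrence_graph(example_docs)):
        print(community)
```

Because each term ends up with exactly one label, the resulting communities are disjoint by construction. This matches the abstract's claim that the groups do not share common terms, whereas LDA topics are distributions over the whole vocabulary and can overlap.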
   
