An information entropy and latent Dirichlet allocation approach to noise patent filtering
- Subject (keywords): Noise patent filtering, Information entropy, Latent Dirichlet allocation
- Publisher: ELSEVIER SCI LTD
- Publication year: 2021
- Series type: Journal
- Text language: English
Abstract/Summary
Defining valid patents in a particular technological field is an indispensable step in patent analysis. To minimise the risk of missing valid patents, domain experts manually exclude irrelevant patents, known as noise patents, from an initial patent set derived using a loose retrieval query. However, this task has become time-consuming and labour-intensive due to the increasing number of patents and rising complexity of technological knowledge. This study proposes a semi-automated approach to noise patent filtering based on information entropy theory and latent Dirichlet allocation. The proposed approach comprises four discrete steps: (1) structuring patents using a term-weighting method; (2) recommending noise patent seeds based on the information quantity of patents in terms of focal keyword groups; (3) measuring text similarities for patent clustering using latent Dirichlet allocation; and (4) identifying potential noise patent clusters with respect to the noise patent seeds. Our case study confirms that the proposed approach is valuable as a complementary noise patent filtering tool that will enable domain experts to focus more on their own knowledge-intensive tasks such as prior art analysis and research and development (R&D) strategy formulation.
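As a rough illustration of step (2), the sketch below scores a patent by the Shannon entropy of its keyword hits across focal keyword groups. The abstract does not give the paper's actual formula, so the function name `group_entropy`, the keyword groups, the sample tokens, and the rule of treating a patent with no focal-keyword hits as a noise seed candidate are all assumptions made for illustration only.

```python
import math
from collections import Counter


def group_entropy(tokens, keyword_groups):
    """Shannon entropy (in bits) of a patent's hits across keyword groups.

    Hypothetical helper: returns None when the patent contains no focal
    keyword at all, which this sketch treats as a noise seed candidate.
    """
    counts = Counter()
    for tok in tokens:
        for group, words in keyword_groups.items():
            if tok in words:
                counts[group] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    # H = sum over groups of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Assumed focal keyword groups for a made-up battery technology field.
groups = {
    "cell": {"battery"},
    "electrode": {"anode", "cathode"},
}

on_topic = ["battery", "anode", "battery", "cathode"]
off_topic = ["umbrella", "handle", "fabric"]

balanced_score = group_entropy(on_topic, groups)
noise_candidate = group_entropy(off_topic, groups) is None
```

In this toy setup a patent whose hits are spread across the focal groups gets a higher score, while a patent with no hits is flagged for expert review rather than discarded automatically, in keeping with the semi-automated framing of the abstract.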