Identifying relevant databases for multidatabase mining

Huan Liu, Hongjun Lu, Jun Yao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

37 Scopus citations

Abstract

Various tools and systems for knowledge discovery and data mining have been developed and are available for applications. However, when practitioners face a large collection of databases, an immediate question is where to start mining. In this paper, breaking away from the conventional data mining assumption that many databases be joined into one, we argue that the first step in multidatabase mining is to identify the databases most likely to be relevant to an application; without this step, the mining process can be lengthy, aimless, and ineffective. We therefore propose a relevance measure for identifying relevant databases for mining tasks whose objective is to find patterns or regularities about certain attributes. An efficient implementation for identifying relevant databases is described. Experiments are conducted to validate the measure’s performance and to show its promising applications.

Original language: English (US)
Title of host publication: Research and Development in Knowledge Discovery and Data Mining - 2nd Pacific-Asia Conference, PAKDD 1998, Proceedings
Editors: Xindong Wu, Ramamohanarao Kotagiri, Kevin B. Korb
Publisher: Springer Verlag
Pages: 210-221
Number of pages: 12
ISBN (Print): 3540643834, 9783540643838
State: Published - 1998
Externally published: Yes
Event: 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 1998 - Melbourne, Australia
Duration: Apr 15 1998 to Apr 17 1998

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 1394

Other

Other: 2nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 1998
Country/Territory: Australia
City: Melbourne
Period: 4/15/98 to 4/17/98

Keywords

  • Data mining
  • Multiple databases
  • Query
  • Relevance measure

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
