A major challenge for third generation data mining and knowledge discovery systems is the integration of different data/knowledge resources (which are highly diverse in representation and data format) and computer systems (tools for data integration, data mining and knowledge discovery) that are distributed across the network. First generation data mining systems supported a single algorithm or a small collection of algorithms designed to mine attribute-valued data. Today's second generation systems are characterized by high-performance interfaces to databases and data warehouses and by increased scalability and functionality; for example, they can mine larger and more complex data sets and provide increased flexibility by supporting a data mining schema and a data mining query language.
The emerging third generation data mining and knowledge discovery systems should be able to mine distributed and highly heterogeneous data found on intranets/extranets/grid and integrate efficiently with operational data/knowledge management and data mining systems. The key technologies that will make third generation data mining and knowledge discovery possible are meta-data (semantic annotations) describing the different information resources (data, human-coded knowledge, and machine-induced patterns and predictive models) and the data mining and knowledge discovery systems (pattern mining and model discovery tools), together with the implementation of data mining and knowledge discovery tools as services available on the web. Such service-oriented data mining and knowledge discovery systems will enable meta-level search of data/knowledge resources and systems. This in turn will enable the construction of knowledge discovery workflows (representing potentially repeatable sequences of data mining and data integration steps), resulting in improved pattern and model discovery.
Whereas contemporary search engines provide a means of locating data on the net, third generation data mining and knowledge discovery systems will provide a means of discovering patterns, associations, changes and anomalies in networked data, where each data source comes with its own structure, semantics, data formats, names, concepts, and access methods. Currently, the burden falls on the user to manually (via programs) convert between data formats, resolve conflicts, integrate data and interpret results in order to make viable use of this information.
The workshop is planned as a half-day event. Given the novelty of the topic, we will try to reserve one hour for a panel discussion on future research trends in this area. In addition, invited speakers will present the Taverna software tool, which allows users to integrate different software components, including web services, to construct scientific workflows for knowledge discovery.
The workshop calls for papers on the following topics:
- Theoretical framework for third generation data mining and knowledge discovery
- Inductive databases, Constraint-Based Data Mining and Inductive Queries
- Learning from data and knowledge (texts, ontologies, …)
- Service-oriented approaches to data mining
- Meta-level annotations and search for data mining services
- Data mining workflows/scenarios
- Data mining on the grid
- Applications of service-oriented data mining approaches in business, ecological modeling, medicine, health care, e-science, bioinformatics, …
Paper submission guidelines
Papers must be written in English and formatted according to the Springer-Verlag Lecture Notes in Artificial Intelligence guidelines. Author instructions and style files can be downloaded at http://www.springer.de/comp/lncs/authors.html. We recommend a maximum length of 12 pages in this format, including figures, title pages, references, and appendices. Shorter papers presenting new ideas or thought-provoking issues are also welcome.
We recommend the use of LaTeX for the preparation of your paper. It is also possible to use the LyX editor as a front end for LaTeX. Instructions can be found here.
Submission website
Papers should be submitted as PDF files using the submission site:
http://www.easychair.org/conferences/?conf=sokd08