Research Topics


Heterogeneous Linked Open Data Management

Preprocessing-based Approaches for Imbalanced Classification

In classification, class imbalance degrades the performance of many classification methods. We have studied resampling (over- and under-sampling) and metric learning for the class imbalance problem. Resampling is a widely accepted approach to class imbalance, while metric learning addresses the problem of an insufficient feature space. In addition, since metric learning methods themselves suffer from class imbalance, we have studied the combination of resampling and metric learning.
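As a rough illustration of the resampling idea, the following is a minimal sketch of random over-sampling, assuming a toy two-class dataset. The function name `random_oversample` and the sample data are hypothetical and chosen only for this example; it is not the specific method studied in this research, and metric learning is not shown.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Original samples belonging to this class.
        pool = [x for x, lab in zip(X, y) if lab == label]
        # Add copies until the class reaches the target size.
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Toy imbalanced data (hypothetical): four samples of class 0, one of class 1.
X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25], [0.3, 0.1], [0.9, 0.8]]
y = [0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

Under-sampling would instead discard samples from the larger class. Either way, the resampled data can then be passed to a classifier or a metric learning method.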

Japanese Legal Data Management

Faceted Search for Semi-structured Data

Semi-structured data such as XML have been widely used so that information can be reused not only within a service but also by external applications. Utilizing such semi-structured data is an important challenge, and this research focuses in particular on exploration over semi-structured data. Faceted search is one of the widely accepted exploratory search methods; therefore, this research applies faceted search to semi-structured data. To construct a faceted search system, this research works in four directions: (1) a framework for constructing a faceted search system over XML data, which is tree-structured semi-structured data, (2) an extension of the framework to graph-structured semi-structured data, (3) an automation scheme for extracting facet information from texts, and (4) utilization of textual contents in semi-structured data.
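To make the notion of faceted search over XML more concrete, the following is a minimal sketch that counts facet values and narrows a result set. It assumes each item is an XML element whose child elements serve as facets. The sample document, the tag names, and the functions `facet_counts` and `filter_items` are hypothetical and do not represent the framework developed in this research.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical sample data: each <book> is an item, and its child
# elements <genre> and <year> are treated as facets.
XML_DOC = """
<catalog>
  <book><title>A</title><genre>fiction</genre><year>2001</year></book>
  <book><title>B</title><genre>fiction</genre><year>2003</year></book>
  <book><title>C</title><genre>science</genre><year>2001</year></book>
</catalog>
""".strip()


def facet_counts(items, facet_tags):
    """Count how many items carry each value of each facet."""
    counts = defaultdict(Counter)
    for item in items:
        for tag in facet_tags:
            value = item.findtext(tag)
            if value is not None:
                counts[tag][value] += 1
    return counts


def filter_items(items, selections):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.findtext(tag) == value for tag, value in selections.items())
    ]


root = ET.fromstring(XML_DOC)
books = root.findall("book")

# Facet counts over the whole collection.
all_counts = facet_counts(books, ["genre", "year"])

# Narrow the collection by selecting one facet value, then recount.
fiction = filter_items(books, {"genre": "fiction"})
fiction_counts = facet_counts(fiction, ["genre", "year"])
```

Selecting a facet value and recounting the remaining values is the basic interaction loop of faceted search. Real semi-structured data would additionally require deciding which elements are items and which are facets, which this sketch simply hard-codes.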