"Sixth Framework Programme"


Workplan

The methodology underlying APrIL II is based on the assumption that, in order to understand the problem of probabilistic logic learning and its applicability, various representations as well as applications have to be studied. Various probabilistic representations (and corresponding inference and learning engines) are needed to cope with different types of problems. This situation is akin to that in traditional approaches to probabilistic representations and learning. Indeed, Bayesian networks and stochastic context-free grammars are just two examples of quite different (even complementary) representations. Furthermore, expressivity usually has to be balanced with efficiency. Also, within the field of machine learning, different settings such as supervised and unsupervised learning require different techniques. When time and action are important, Markov decision processes and reinforcement learning come into play. Nevertheless, despite the different settings, representations and algorithms, many of the underlying principles remain the same, such as maximum likelihood, Bayesian approaches, minimum description length, Expectation-Maximization (EM), gradient methods, MCMC, etc. The APrIL II project aims at
identifying the underlying principles of probabilistic logic learning through the investigation of different settings and representations for probabilistic logic learning.
This is also the underlying motivation for the workpackages WP 1 (Representation) and WP 2 (Learning).

A second methodological guideline comes from the application perspective. As it is our goal to obtain an appreciation of the applicability of probabilistic logic learning, APrIL II will
develop different types and classes of probabilistic logic learning systems and apply them in a variety of different applications.

The selected application domains all require probabilistic logic learning, but they differ considerably in the requirements they impose on it. Indeed, the metabolic pathways can be modelled in a kind of graph structure (related to Bayesian networks). On the other hand, proteins and genetic information possess a sequential nature, which may be better suited to modelling with approaches based on (hidden) Markov models or grammars. In addition, two different types of probabilistic logic learning systems will be applied. For the protein folding domain, general-purpose probabilistic logic learning methods will be applied, whereas for the metabolic pathways and haplotype applications, probabilistic logic learning components will be embedded into methods and systems that already exist for these applications. Furthermore, different classes of probabilistic logic learning, such as sequence-based, graph-based, discrete-structure-based and grammar-based learning, will be considered. This motivates the workpackages WP 3 (Systems) and WP 4 (Applications).

Although the APrIL II project will investigate different settings, types and classes of probabilistic logic learning, the resulting representations and algorithms are strongly connected to one another. This situation is akin to that of propositional probabilistic logic learning, where coherent principles, formalisms and algorithms have been developed. Indeed, the embedded systems to be developed within APrIL II will contain some of the core components of the general-purpose ones. Moreover, experience with the embedded components should provide valuable feedback for the general-purpose level. Finally, the different classes of probabilistic logic learning also form a coherent whole (cf. Section 7.7, WP 1 for more details on the relationship).

Finally, the insights obtained at all levels concerning probabilistic logic learning should allow us to identify a core theory of probabilistic logic learning (WP 5). This theory should abstract away from specific representations, learning approaches and settings as much as possible, and it should serve as the basis for further developments in the area.

In addition, there are the usual workpackages concerned with Dissemination (WP 6) and Management (WP 7).

APrIL II will have three milestones, one at the end of each year, each covering the probabilistic logic learning techniques as well as each of the applications:

Milestone A:
  • Problem formulation and data collection for each of the applications.
  • New and missing components of probabilistic logic representations and inference methods have been identified and designed.
  • Components of an initial theory of probabilistic logic learning are formulated.
Milestone B:
  • Experiments with real data and prototype probabilistic logic learning systems are running.
  • New and missing components of probabilistic logic learning algorithms have been identified and designed.
  • (Possibly embedded) prototypes of probabilistic logic learning systems have been implemented.
  • A refined theory of probabilistic logic learning is formulated.
Milestone C:
  • The applications have been turned into show-cases for probabilistic logic learning.
  • The systems are ready for use on other applications as well.
  • An integrated theory of probabilistic logic learning is available.
  • Final APrIL II Report.
Two key deliverables of the APrIL II project are 1) the APrIL book (D20), which will provide an overview of the field of probabilistic logic learning (based on overviews of the different workpackages), and 2) the APrIL repository (D19), which will contain publications, software and data sets on probabilistic logic learning and its applications.