Tutorials for ICML/COLT '97

Nashville, Tennessee

Tuesday July 8, 1997

Tutorials for both the International Conference on Machine Learning (ICML) and the Conference on Computational Learning Theory (COLT) will take place during the conferences, which will be co-located at Vanderbilt University during the week of July 6-12, 1997. There will be two 2-hour tutorial sessions separated by a short break. The cost of the tutorials is covered by the ICML and COLT registration fees, so there is no additional cost to attend. Below is a list of the tutorials that will be offered, along with a brief description of each (and, for most, a link to a web page with more information).

Please look these tutorial topics over and, upon or after registering for COLT and/or ICML, fill out a tutorial preference form (coming soon) to help us schedule and allocate resources. This step is optional, but very helpful to us.

ICML/COLT Tutorial Session 1 (1:30-3:30, July 8):

Inductive Logic Programming (Wilson 126)
The Probabilistic Causal Graph Model (Wilson 112)
Multiplicative Update Algorithms (Wilson 103)

ICML/COLT Tutorial Session 2 (4:00-6:00, July 8):

Learning in Planning (Wilson 112)
Game Theory (Wilson 113)
Fourier Analysis in Machine Learning (Wilson 103)
A (Very) Short Course in Computational Learning Theory (Wilson 126)

Inductive Logic Programming given by Luc De Raedt (Katholieke Universiteit Leuven, Belgium) and Nada Lavrac (Stefan Institute, Ljubljana, Slovenia)

Logic has always been a popular representation language for inductive concept learning. Recently, research into learning in first-order representations has been concentrated in the field of Inductive Logic Programming (ILP), which studies inductive learning within the framework of logic programming.

ILP significantly extends the usual attribute-value representation and consequently enlarges the scope of machine learning applications, which include inductive data engineering, scientific discovery, and knowledge discovery in databases. In KDD terms, propositional techniques learn from a single relation or table in a database, whereas ILP typically copes with multiple relational tables.

The tutorial aims at giving an overview of the field of ILP, with an emphasis on its foundations and basic techniques. It provides the background of ILP, discusses ILP as search, and presents a selection of ILP techniques, as well as the different issues addressed in ILP (imperfect data, bias, predicate invention, learnability, and ILP applications).
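To make the relational setting concrete, here is a minimal sketch (using the daughter/parent family example common in ILP textbooks, not taken from the tutorial itself) of how a candidate first-order clause is checked against positive and negative examples spread over multiple relations:

```python
# Background knowledge as two relations (two "tables" in KDD terms);
# the facts below are a hypothetical toy dataset.
parent = {("ann", "mary"), ("ann", "tom"), ("tom", "eve")}
female = {"ann", "mary", "eve"}

def covers(x, y):
    """Candidate clause: daughter(X, Y) :- female(X), parent(Y, X).

    An example daughter(x, y) is covered when the clause body
    succeeds against the background relations.
    """
    return x in female and (y, x) in parent

positives = [("mary", "ann"), ("eve", "tom")]
negatives = [("tom", "ann"), ("eve", "ann")]
```

An attribute-value learner would need the two relations joined into a single table before it could even represent this hypothesis; the ILP search instead moves through a space of clauses like `covers`, testing coverage as above.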

More Information

The Probabilistic Causal Graph Model by Richard Neapolitan (Northeastern Illinois University, USA)

The Bayesian network has become one of the most important architectures for representing and reasoning with uncertainty in artificial intelligence and expert systems. Built on the Bayesian network is the Probabilistic Causal Graph Model, which maintains that the causal relationships among a set of variables U can be represented by a directed acyclic graph D, in which each node is an element of U, together with a joint probability distribution on the variables in U that satisfies the Markov and faithfulness conditions for D. This tutorial describes the model and the intuition behind the Markov and faithfulness conditions. It then shows that much of the structure of D can be retrieved from the conditional independencies in the probability distribution on the variables in U; that is, it shows how causal relationships can be deduced from probabilistic information. Furthermore, it discusses how a good deal of structure can be retrieved even when only the distribution on a subset of U is observed. Finally, it cites research by Piaget that supports the hypothesis that humans learn causal structure in this same manner, and that perhaps the very concept of causality is a recapitulation of one's experience with time and correlation among variables.
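A small sketch of the Markov condition in action may help: under it, the joint distribution factors into a product of each variable's probability given its parents in D. The three-variable DAG and the numbers below are an illustrative toy example (the classic rain/sprinkler/wet-grass network), not material from the tutorial:

```python
from itertools import product

# Toy DAG: Rain -> WetGrass <- Sprinkler (hypothetical example).
# The Markov condition lets the joint factor as
# P(R, S, W) = P(R) * P(S) * P(W | R, S).
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.5, False: 0.5}
# P(WetGrass = True | Rain, Sprinkler)
p_wet_given = {(True, True): 0.99, (True, False): 0.9,
               (False, True): 0.9, (False, False): 0.05}

def joint(r, s, w):
    """Joint probability via the Markov factorization over the DAG."""
    pw = p_wet_given[(r, s)]
    return p_rain[r] * p_sprinkler[s] * (pw if w else 1.0 - pw)

# Sanity check: the factorization defines a proper distribution.
total = sum(joint(r, s, w) for r, s, w in product([True, False], repeat=3))
```

Structure recovery runs this logic in reverse: the conditional independencies implied by such a factorization (here, Rain independent of Sprinkler, but dependent given WetGrass) constrain which DAGs D could have generated the distribution.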

Multiplicative Update Algorithms by Manfred Warmuth (University of California, Santa Cruz, USA).

A new family of algorithms has recently been developed within the Computational Learning Theory community. These algorithms update their parameters by factors that use the gradient of the loss function in their exponent, and they behave radically differently from standard algorithms based on heuristics such as gradient descent. The goal of this tutorial is to familiarize people with this new family of algorithms, explain how it differs from existing algorithms, and demonstrate its practical importance.

We will show how the new family can be successfully applied to tasks as diverse as linear regression, density estimation with a mixture of Gaussians, and finding the best Hidden Markov Model for a given set of strings. Other likely applications include the training of neural networks. Thus the new family has the potential to revolutionize the way many large learning problems are approached.

Besides introducing the new family we will give a general framework for deriving, tuning, and bounding the performance of on-line algorithms.
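The flavor of the update can be sketched in a few lines. The following toy on-line linear regression loop uses an exponentiated-gradient-style rule (weights multiplied by factors with the loss gradient in the exponent, then renormalized); the learning rate, data, and normalization-to-the-simplex choice are illustrative assumptions, not prescriptions from the tutorial:

```python
import math

def eg_update(w, x, y, eta=0.1):
    """One multiplicative update for square loss.

    Each weight is scaled by exp(-eta * gradient), i.e. the gradient
    appears in the exponent of the update factor, and the weights are
    renormalized to stay on the probability simplex.
    """
    y_hat = sum(wi * xi for wi, xi in zip(w, x))
    grad = [2.0 * (y_hat - y) * xi for xi in x]   # d(loss)/dw_i
    w = [wi * math.exp(-eta * gi) for wi, gi in zip(w, grad)]
    z = sum(w)
    return [wi / z for wi in w]

# Target concept puts all weight on coordinates 0 and 2;
# the multiplicative updates shift mass onto them.
w = [0.25, 0.25, 0.25, 0.25]
data = [([1.0, 0.0, 1.0, 0.0], 1.0), ([0.0, 1.0, 0.0, 1.0], 0.0)] * 50
for x, y in data:
    w = eg_update(w, x, y)
```

Contrast this with additive gradient descent (`w_i -= eta * grad_i`): the multiplicative rule keeps weights positive and tends to concentrate mass on the few relevant coordinates quickly, which is the source of its distinctive behavior.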

Learning in Planning by Manuela Veloso (Carnegie Mellon University, USA)

Planning is a decision-making process that involves generating sequences of actions to achieve a set of goals from a given state. The main issues in the computational planning task are: How to acquire and represent the planning action model? How to generate plans in a computationally tractable way? How to create plans of good quality? And, finally, how to scale up to real-world problems?

Machine learning approaches can be applied to planning to automate the process of translating planning experience into reusable task knowledge, improving the planner's overall performance. Learning in planning goes beyond inductive data classification: it addresses the acquisition of knowledge that efficiently guides a decision-making process toward good-quality solutions based on its input data. Several such learning techniques have been developed and shown empirically to scale well in task and problem complexity. Learning in planning provides ground for both applied and theoretical research.

In this tutorial we will: overview several planning algorithms and introduce their learning opportunities; present methods that combine inductive and deductive learning to incrementally acquire control knowledge and improve plan quality; and briefly discuss learning approaches in multiagent planning systems.

Game Theory by Rakesh Vohra (Ohio State University, USA) and Dean Foster (University of Pennsylvania, USA)

Game Theory is a systematic way of analyzing strategic behavior, in which the outcome of one agent's actions depends on the actions of others. It might more properly be called interactive decision theory. The agents that interact are called players, and the set of rules governing their interactions is called the game.

Two questions animate the subject:

1) Given a game, what will the agents do?
2) If I am an agent in the game, what should I do?
Answers to both questions involve an equilibrium notion of some kind. Broadly speaking, the outcome of the game must be such that no player can do better by deviating. A particular equilibrium concept is traditionally justified by an appeal to the rationality of the players. This proves problematic because the level of rationality required is implausibly high, to the point that the conclusion appears to be assumed rather than deduced. In the last 10 years, an alternative (but intimately connected) approach to justifying equilibrium has been explored. It presumes that players are boundedly rational and that, in repeated interactions, they "learn" the equilibrium of the game.
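A simple instance of this learning approach is fictitious play, sketched below for matching pennies (a standard textbook zero-sum game; the example is illustrative and not drawn from the tutorial). Each boundedly rational player merely best-responds to the opponent's empirical frequency of past actions, yet the empirical play frequencies approach the mixed equilibrium (1/2, 1/2):

```python
# Matching pennies: player 1 wants to match the opponent's action,
# player 2 wants to mismatch. Actions are 0 and 1.

def best_response_p1(opp_counts):
    """Matcher: play the action the opponent has played more often."""
    return 0 if opp_counts[0] >= opp_counts[1] else 1

def best_response_p2(opp_counts):
    """Mismatcher: play the opposite of the opponent's modal action."""
    return 1 if opp_counts[0] >= opp_counts[1] else 0

c1 = [0, 0]  # counts of player 1's past actions (observed by player 2)
c2 = [0, 0]  # counts of player 2's past actions (observed by player 1)
for _ in range(10000):
    a1 = best_response_p1(c2)
    a2 = best_response_p2(c1)
    c1[a1] += 1
    c2[a2] += 1

freq1 = c1[0] / sum(c1)  # empirical frequency of action 0 for player 1
freq2 = c2[0] / sum(c2)
```

Neither player ever computes the equilibrium; it emerges from repeated myopic best responses, which is the sense in which bounded rationality can still "learn" equilibrium play.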

In this tutorial we will first describe the major equilibrium notions used in Game Theory, along with the arguments for and criticisms of them. Second, we will give an overview of the learning and evolutionary program in Game Theory. Finally, we will describe some applications of game-theoretic ideas to learning theory.

Slides for the Presentation

Fourier Analysis in Machine Learning by Jeffrey Jackson (Duquesne University, USA) and Christino Tamon (Clarkson University, USA)

Fourier analysis is a versatile tool for discovering useful properties of many interesting function classes, including perceptrons, decision trees, and Disjunctive Normal Form (DNF) expressions. For example, Fourier analysis applied to an arbitrary Boolean function f can provide a lower bound on the number of leaves in any decision tree representation of f. Fourier-derived properties such as this are central to a number of powerful learning algorithms, primarily but not exclusively in theoretical learning models.

Perhaps surprisingly, many Fourier-based results that might seem quite deep are conceptually simple to derive and understand once a few basics are mastered. Beginning with these basics, this tutorial will help the attendees develop a strong intuitive understanding of the Fourier method and several of the foundational Fourier learning results. Some open learnability problems and associated Fourier observations will also be presented.
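The basics in question fit in a few lines of code. The sketch below (an illustrative example, not course material) expands a Boolean function f: {-1,1}^n -> {-1,1} in the parity basis chi_S(x) = prod_{i in S} x_i, computing each coefficient as f_hat(S) = E_x[f(x) * chi_S(x)]:

```python
from itertools import product

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} over all subsets S of [n].

    Subsets are encoded as 0/1 tuples; f_hat(S) is the average of
    f(x) * chi_S(x) over the uniform distribution on {-1, 1}^n.
    """
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for S in product([0, 1], repeat=n):
        total = 0.0
        for x in points:
            chi = 1
            for i in range(n):
                if S[i]:
                    chi *= x[i]
            total += f(x) * chi
        coeffs[S] = total / len(points)
    return coeffs

# Two-bit AND in the +/-1 encoding: +1 only when both inputs are +1.
f_and = lambda x: 1 if x[0] == 1 and x[1] == 1 else -1
coeffs = fourier_coefficients(f_and, 2)
```

Running this recovers AND = -1/2 + x1/2 + x2/2 + x1*x2/2, and the squared coefficients sum to 1 (Parseval's identity for ±1-valued functions) — the kind of basic fact on which the learning results in the tutorial are built.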

Tentative topics include:

Basics: only typical prior knowledge assumed
Connections to perceptron learning, including a practical boosting-based algorithm
Standard Fourier algorithmic ideas for learning:
    Learning by approximating the "power spectrum"
    Goldreich/Levin (aka Kushilevitz/Mansour)
Fourier analyses:
    Decision trees
Open questions
More Information

A (Very) Short Course in Computational Learning Theory by Michael Kearns (AT&T Labs Research, USA)

This tutorial is intended for the ICML '97 attendee who is not intimately familiar with COLT research but would like to learn more. Rather than present an exhaustive survey of COLT results, I will instead choose a few illustrative but important topics and methods, and attempt to illuminate their core ideas in the most elementary manner possible, with all unnecessary technical detail and notation eliminated. The goal will not be rigor but intuition and understanding.

Topics will be chosen for the influence they have had in the COLT community, or for their potential interest to practitioners. Possible topics include:

Simple results in the PAC model and its variants
Learning curves and the VC dimension
Types of intractability results
Learning in the presence of noise and errors, and "agnostic" learning
Query models and algorithms
Mistake-bounded learning and multiplicative update algorithms