Tomáš Křen

Programmer  |  Researcher  |  Dreamer

Selected Publications

In this paper we describe a multi-objective genetic programming algorithm which can be used to create complete machine learning workflows. The algorithm is an extension of a single-objective one. In a series of test on four datasets, we show that the additional objectives can be used to search for smaller or faster models. The algorithm is also in some cases much faster than the single-objective one while obtaining results of similar quality.

Automated program discovery has been mainly approached via Genetic Programming, which represents the search space of programs implicitly by a collection of individuals. In this paper, we propose a way of representing program search space explicitly, with a top-down hierarchy of semi-constructed programs. Together with a bottom-up generation procedure, we maintain an overview of the overall search space structure, while being able to quickly get samples from the fringe. Moreover, having a type system with parametric polymorphism allows us to limit the state space size using type level programming while keeping a complete control over program sizes. Our approach can naturally be used for searching the space of programs using tree search methods. We back this claim by a simple experiment utilizing a Monte-Carlo Tree Search algorithm, though other search methods (such as A*) might be used as well.

Manual creation of machine learning ensembles is a hard and tedious task which requires an expert and a lot of time. In this work we describe a new version of the GP-ML algorithm which uses genetic programming to create machine learning workflows (combinations of preprocessing, classification, and ensembles) automatically, using strongly typed genetic programming and asynchronous evolution. The current version improves the way in which the individuals in the genetic programming are created and allows for much larger workflows. Additionally, we added new machine learning methods. The algorithm is compared to the grid search of the base methods and to its previous versions on a set of problems from the UCI machine learning repository.
International Journal on Artificial Intelligence Tools, Volume 26, Issue 05,2017

In this paper we present a new initialization method for genetic programming based on randomized exhaustive enumeration. It naturally enables complete sharing of sub trees among individuals which in turn allows an efficient reuse of computations. Moreover, it can be implemented as a random one pass initialization. We present experimental results on different instances of simple symbolic regression exploring the landscape of possible initializations based on our approach and confirming the usability of these initializations.
IEEE SMC, 2015

Lambda calculus representation of programs offers a more expressive alternative to traditional S-expressions. In this paper we discuss advantages of this representation coming from the use of reductions (beta and eta) and a way to overcome disadvantages caused by variables occurring in the programs by use of the abstraction elimination algorithm. We discuss the role of those reductions in the process of generating initial population and propose two novel crossover operations based on abstraction elimination capable of handling general form of typed lambda term while being a straight generalization of the standard crossover operation. We compare their performances using the even parity benchmark problem.
GECCO, 2014

Tree based genetic programming (GP) traditionally uses simple S-expressions to represent programs, however more expressive representations, such as lambda calculus, can exhibit better results while being better suited for typed GP. In this paper we present population initialization methods within a framework of GP over simply typed lambda calculus that can be also used in the standard GP approach. Initializations can be parameterized by different search strategies, leading to wide spectrum of methods corresponding to standard ramped half-and-half initialization on one hand, or exhaustive systematic search on the other. A novel geometric strategy is proposed that balances those two approaches. Experiments on well known benchmark problems show that the geometric strategy outperforms the standard generating method in success rate, best fitness value, time consumption and average individual size.
CEC, 2014

All Publications

. Multi-Objective Evolution of Machine Learning Workflows. IEEE SSCI, 2017.

. Combining Top-Down and Bottom-Up Approaches for Automated Discovery of Typed Programs. IEEE SSCI, 2017.

. Algorithm Discovery with Monte-Carlo Search: Controlling the Size. IEEE ICTAI, 2017.

. Automatic Creation of Machine Learning Workflows with Strongly Typed Genetic Programming. International Journal on Artificial Intelligence Tools, Volume 26, Issue 05, 2017.


. Evolving Workflow Graphs Using Typed Genetic Programming. IEEE SSCI, 2015.


. Generating Workflow Graphs Using Typed Genetic Programming. Workshop MetaSel ECMD PKDD 2015, 2015.


Selected Projects

This section is still under development, so the information presented here is very incomplete. Sorry.


A library for Typed Functional Genetic Programming approaching its objective from the functional programming point of view by using type theory.


The most notable application of TFGP library so far is a system for automatic creation of machine learning workflows by means of computational evolution.


Game world and programming language. At the same time. My bachelor’s thesis.

Foolship Trees

Game in which you grow your forest. Work in progress.


A port of TFGP library to Python enriched with Monte Carlo search methods for program genaeration.

Cellular Plaza

Interactive evulution of vissually appealing tile patterns using TFGP and cellular automata. Used as a part of architectonic design.


Predecessor of TFGP project written in Haskell. My master’s thesis.


Runnable code encoded in JSON inspired by lambda calculus and Lisp. Work in progress.