## Genetic Algorithm for Conference Schedule Mining

February 27, 2012 Leave a comment

**Genetic Algorithm for Conference Schedule Mining
**

**Ali Tarhini
**

**Problem Definition:
**

Scheduling sessions, also known as “time tabling” at a large conference is a persistent challenge. The problem states that given a large number of sessions, rooms, time slots and a set of constraints, it is required to mine for the best schedule that covers all sessions within the given time slots while satisfying the required set of constraints. Allocating sessions to specific “time slots” requires advanced computational techniques for finding the best schedule that satisfies most of the constraints. The problem with large conferences is that many sessions will be scheduled to run concurrently; therefore an attendant may run into a problem if two or more of his/her favorite sessions are scheduled to run at the same time. Although there have been efforts to tackle the problem by using “tracks” to categorize sessions, which basically assumes that an attendant will stick to one track where the sessions of a track are scheduled sequentially in time. It turned out that this solution is not efficient in practice because attendants tend to hop from one track to the other rather than following a strict schedule for one track. More advanced approaches reduce the problem to the Job Shop Scheduling problem (JSSP) with multiple precedence constraints which is an optimization problem composed of resources, operations, and constraints. A job consists of a sequence of operations with each operation is part of exactly one job. Each operation is executed on a resource with starting time during a processing time with precedence constraints. A job has a release date, completion time and due date. A resource can execute only one operation at a time. We assume that any successive operations of the same job are going to be processed on different machines. It is desired to find a feasible schedule, which optimizes a set of given performance measures. However, JSSP problems tend to be theoretical and are not feasible to be applied in practice for international conferences because the algorithm does not take into account human interest in a particular set of sessions. For example, JSSP cannot handle attendant’s favorite sessions because this is not a constraint, it’s a training set. In this paper, a genetic algorithm is presented to provide a solution for the conference scheduling problem. In the context of genetic algorithms, a class of approximation algorithms, our “solution” means one among a set of all possible “best” solutions. It may not be the “best possible solution” but is certainly among the best.

**Proposed algorithm:
**

Input: Sessions, Rooms, Timeslots, set of preference sessions.

As any genetic algorithm, this approach utilizes the genetic property “survival of the fittest” which means that as solutions are getting generated, bad solutions are eliminated and good solutions are carried over to the next step. In order to determine what’s good and what’s bad, a Fitness function is used. This function validates a given solution with the set of constraints and list of preference sessions and returns a value indicating how relevant the solution is. The fitness function also takes into consideration the training set of favorite sessions recommended by attendants and tries to optimize the fitness according to how well each solution is close to any of the favorite choices.

A solution is simply a set of associations of a session to room and a timeslot.

The algorithm starts by generating solutions at random, that is, by randomly picking a session and assigning it to a room and a timeslot. The session is then removed from the input and the selected room/timeslot is also removed from the list of remaining rooms/timeslots. This is to eliminate the possibility of assigning the same session twice. Hence, the probability of coming up with solutions that are at least worth considering is optimized. We will denote the set of generated solutions as generation 0; the first generation. Next, each solution in generation 0 is applied to the Fitness function which compares the solution against the set of constraints and returns a value indicating relevance. Once all solutions are evaluated, they are sorted in descending order(assuming lower value indicates better solution) and the top X percentage of the solutions is chosen to be carried on to the next generation. Kill all the remaining solutions. The surviving solutions will undergo two operations in order to compute the next generation. Mutation and Crossover. In mutating a solution, one association of a session to a room/timeslot is selected and the session is changed to a different session that is not already listed in the solution. In crossover, two solutions are picked at random and sessions are exchanged in between. This procedure is repeated over many generations until a solution that meets most of the constraints is met.

**References:
**

**[1] Wayne Smith, Applying Data Mining To Scheduling Courses at a University, School of Information Systems and Technology Claremont Graduate University
**

This study demonstrates the feasibility of applying the principles of data mining. Specifically it uses association rules to evaluate a nonstandard (“aberrant”) timetabling pilot study undertaken in one College at a University. The results indicate that inductive methods are indeed applicable, and that both summary and detailed results can be understood by key decision-makers, and, straightforward, repeatable SQL queries can be used as the chief analytical technique on a recurring basis. In addition, this study was one of the first empirical studies to provide an accurate measure of the discernable, but negligible, scheduling exclusionary effects that may impact course availability and diversity negatively.

**[2] Atif Shahzad Nasser Mebarki Discovering Dispatching Rules For Job Shop Scheduling Problem Through Data Mining
**

A data mining based approach to discover previously unknown priority dispatching rules for job shop scheduling problem is presented. This approach is based upon seeking the knowledge that is assumed to be embedded in the efficient solutions provided by the optimization module built using tabu search. The objective is to discover the scheduling concepts using data mining and to obtain a rule-set capable of approximating the efficient solutions in a dynamic job shop scheduling environment. A data mining based scheduling framework is presented which consists of 3 phases: Search for a set of solutions, Apply data cleaning and for the resulting solutions including aggregation and attribute construction and the finally to model induction and interpretation by applying decision tree induction algorithm.

**[3] Ahmed Hamdi Abu Absa, Sana’a Wafa Al-Sayegh, E-Learning Timetale Generator Using Genetic Algorithms
**

In this paper, the authors explain the details of the implementation of a computer program which employs Genetic Algorithms (GAs) in the quest for an optimal lecture timetable generator. GA theory is covered with emphasis on less fully encoded systems employing nongenetic operators. The field of Automated Timetabling is also explored. A timetable is explained as, essentially, a schedule with constraints placed upon it. The program, written in java, has the special libraries to deal with genetic algorithm which are used for the implementation. In a simplified university timetable problem it consistently evolves constraint violation free timetables. The effects of altered mutation rate and population size are tested. It is seen that the GA could be improved by the further incorporation of repair strategies, and is readily scalable to the complete timetabling problem.

**[4] Branimir Sigl, Marin Golub, Vedran Mornar, Solving Timetable Scheduling Problem
**

**Using Genetic Algorithms**

In this paper a genetic algorithm for solving timetable scheduling problem is described. The algorithm was tested on small and large instances of the problem. Algorithm performance was significantly enhanced with modification of basic genetic operators. Intelligent operators restrain the creation of new conflicts **in **the individual and improve overall algorithm **‘s **behavior.*
*The program uses eliminating selection, which chooses and eliminates bad individuals from the current population, making room for new children that will be born from the remaining individuals. The probability of elimination increases proportionally with the fitness value of the individual.

**As**the remaining individuals are better than the average of the population, it is expected that their children will be better as well. There is some probability (though very small) that eliminating selection deletes the best individual. That would ruin the algorithm efforts and put its work back for some number of generations. Therefore, protection mechanism for best individuals has to be made, so the good genetic material is sustained in population. It is called

*the elitism.*The authors’ choice was to keep just the top one individual. The reproduction operators constitute