Differential Evolution in Python

Posted on December 10, 2017 by Ilya

Introduction

During my PhD, I worked on a variety of global optimization problems that arose when fitting my models to experimental data. I tried various heuristic optimization procedures implemented in pagmo (a great library developed by ESA) and found Differential Evolution particularly efficient for my problems. This heuristic algorithm is extremely simple to describe and easy to implement. That is exactly what this post is about.

Algorithm

Problem:

Given a cost function \(f: \mathbb{R}^n \to \mathbb{R}\), find \(\operatorname{arg\,min}_{\mathbf{x} \in A} f(\mathbf{x})\), where \(A \subseteq \mathbb{R}^n\).

Solution: The idea is loosely inspired by evolution. Generate a population of candidate solution vectors. Then, for each such “parent” vector, generate a new vector using mutation (mixing components of other vectors in the population) and crossover (replacing some components of the “parent” vector with mutated values). This gives a “son” vector. Replace the “parent” with the “son” if the “son” is better. Repeat as long as needed.

  1. Fix \(N_{\text{generations}}, N_{\text{population}} \in \mathbb{N}\), \(F_{\text{mutation}} \in [0, 2]\), \(p_{\text{CR}} \in [0,1]\).
  2. Generate an initial population of \(N_{\text{population}}\) candidate solution vectors \(\{\mathbf{x}_0^{(j)}\ |\ j=1..N_{\text{population}}\}\).
  3. For each \(n \in \{1,...,N_{\text{generations}}\}\), evolve \(\{\mathbf{x}_{n-1}^{(j)}\ |\ j=1..N_{\text{population}}\}\) into \(\{\mathbf{x}_n^{(j)}\ |\ j=1..N_{\text{population}}\}\):
    • for each \(j\):
      • Randomly pick three distinct vectors \(\mathbf{x}_{n-1}^{(k_1)}\), \(\mathbf{x}_{n-1}^{(k_2)}\), \(\mathbf{x}_{n-1}^{(k_3)}\), where the \(k_i\) are drawn from \(\{1,...,N_{\text{population}}\}\setminus\{j\}\) without replacement.
      • Generate \(\mathbf{m} = \mathbf{x}_{n-1}^{(k_1)} + F_{\text{mutation}} \cdot (\mathbf{x}_{n-1}^{(k_2)}-\mathbf{x}_{n-1}^{(k_3)})\).
      • Generate the crossover mask \(\mathbf{u} = (g(u_1), g(u_2), \dots, g(u_n))\), where each \(u_i \sim U(0,1)\) and \(g(x) = 1\) if \(x < p_{\text{CR}}\), \(0\) otherwise. Here \(n\) is the dimension of the search space.
      • Generate \(\mathbf{y} = \mathbf{u} \odot \mathbf{m} + (\mathbf{1}-\mathbf{u}) \odot \mathbf{x}_{n-1}^{(j)}\), where \(\odot\) is the elementwise product.
      • Update \(\mathbf{y}\) by replacing one randomly chosen element with the corresponding element of \(\mathbf{m}\), so that at least one component always comes from the mutated vector.
      • If \(f(\mathbf{y}) < f(\mathbf{x}_{n-1}^{(j)})\), set \(\mathbf{x}_n^{(j)} = \mathbf{y}\); otherwise \(\mathbf{x}_n^{(j)} = \mathbf{x}_{n-1}^{(j)}\).
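
To make the inner loop concrete, here is a minimal NumPy sketch of a single mutation/crossover step for one vector in a 3-dimensional problem (all numbers, including \(F_{\text{mutation}}=0.8\) and \(p_{\text{CR}}=0.5\), are made up for illustration):


import numpy as np

np.random.seed(42)                     # fixed seed so the example is reproducible

F, p_CR = 0.8, 0.5                     # illustrative hyperparameters
x = np.array([1.0, 2.0, 3.0])          # the "parent" vector
a = np.array([0.5, -1.0, 2.0])         # three other population members
b = np.array([2.0, 0.0, -1.5])
c = np.array([-0.5, 1.0, 0.5])

m = a + F * (b - c)                    # mutation
u = np.random.uniform(size=3) < p_CR   # crossover mask
y = np.where(u, m, x)                  # "son": mutant where the mask is set
r = np.random.randint(3)
y[r] = m[r]                            # force at least one mutant component
# y replaces x in the next generation iff f(y) < f(x)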

Code

Let’s get our hands dirty.


import numpy as np
import random
from copy import deepcopy
import os

def np_rnd_seed():
    """ Reseed NumPy's global RNG from OS entropy """
    np.random.seed(hash(os.urandom(100)) % 1000000000)

def removeThenChoose(i, l, n=3):
    """
    Randomly take n elements from the list l,
    excluding its i-th element.
    """
    l = deepcopy(l)
    l.pop(i)
    random.shuffle(l)
    return l[:n]

def de(
        cost,
        bounds,
        mutation=1.2,
        crossoverRate=0.4,
        popsize=10,
        maxiter=100,
        mapper=map # or replace with parallel implementation of the map function
        ):

    def random_individual():
        return np.random.uniform(
                low=bounds[:,0],
                high=bounds[:,1])

    def random_individual_cost():
        x = random_individual()
        return (tuple(x), cost(x))

    def evolve_population(parCost):
        np_rnd_seed()
        actual_popsize = len(parCost)
        if actual_popsize < popsize: # a key collision occurred, so the population shrank
            # add random individuals to keep population size constant
            parCost = dict(
                list(parCost.items())\
                +\
                [ random_individual_cost()
                  for _ in range(popsize-actual_popsize)]
                )

        population = list(map(np.array, parCost.keys()))

        def evolve_individual(i):
            x = population[i]
            n = len(x)
            a, b, c = removeThenChoose(i, population)
            m = a + mutation * (b-c)
            idx_to_modify = np.random.uniform(size=n) < crossoverRate
            # candidate solution ("son" vector)
            y = x.copy()
            y[idx_to_modify] = m[idx_to_modify]
            # ensure at least one component comes from the mutant vector
            r = random.randint(0, n-1)
            y[r] = m[r]
            # compare
            costy = cost(y)
            costx = parCost[tuple(x)] # recall the memoized cost
            if costy < costx:
                return (tuple(y), costy)
            else:
                return (tuple(x), costx)

        new_parCost = dict(mapper(evolve_individual, range(len(population))))
        return new_parCost


    # init random population & assign costs
    parCost = dict([random_individual_cost() for _ in range(popsize)])

    for _ in range(maxiter):
        parCost = evolve_population(parCost)

    return parCost
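
The mapper argument makes it easy to evaluate a whole generation in parallel. Note that evolve_individual is a local closure, so it cannot be pickled for a process pool; a thread-based pool works out of the box, though, and can pay off when the cost function releases the GIL (e.g. when it is dominated by NumPy calls). A sketch, with my_cost and my_bounds standing in for your own problem:


from multiprocessing.dummy import Pool # thread-based drop-in for multiprocessing.Pool

# my_cost and my_bounds are placeholders for your own cost function and bounds
with Pool(4) as pool:
    parCost = de(my_cost, my_bounds, mapper=pool.map)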

Let’s test it on the Ackley function, a standard optimization benchmark. The Ackley function has its global minimum at \(f(0,0)=0\).


def ackley(p):
    """Ackley optimization benchmark"""
    x, y = tuple(p)
    return\
        -20*np.exp(-0.2*np.sqrt(0.5*(x**2+y**2)))\
        -np.exp(0.5*(np.cos(2*np.pi*x) + np.cos(2*np.pi*y)))\
        +np.exp(1)+20
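
# Sanity check: at the global minimum the cost should vanish
# (up to floating-point rounding, so expect ~1e-15 rather than exactly 0)
print(ackley(np.zeros(2)))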

cost = ackley
bounds = np.array(
    [ [-10, 10]
    , [-20, 20]
    ])
parCost = de(cost, bounds, mapper=map, popsize=100)
best = min(parCost.items(), key=lambda t: t[1])
print(best)
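
As a cross-check, SciPy ships its own implementation of this algorithm, scipy.optimize.differential_evolution; it should converge to essentially the same optimum:


from scipy.optimize import differential_evolution

result = differential_evolution(ackley, [(-10, 10), (-20, 20)])
print(result.x, result.fun) # expect x close to (0, 0) and fun close to 0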
