{0, 1}* is the set of all strings consisting of the digits 0 and 1:
Here the ε symbol stands for the zero-length string.
Consider the set containing all the finite subsets of {0, 1}*; that is, the set of all the finite sets whose members are only strings of 0s and 1s:
{ | {}, {ε}, {0}, {ε, 0}, {1}, {ε, 1}, {0, 1}, {ε, 0, 1}, |
{00}, {ε, 00}, {0, 00}, {ε, 0, 00}, | |
{1, 00}, {ε, 1, 00}, {0, 1, 00}, {ε, 0, 1, 00}, ... }. |
Suppose we want to write a program that generates this set,
i.e. a program that enumerates all the finite subsets of {0, 1}*. (Of course,
since there are infinitely many such subsets,
it cannot really store them all explicitly in a list or something of that sort, but
it can in principle generate as many as we wish, or, given an n,
generate the
We welcome solutions in other languages, as well as alternative solutions in languages already covered here. Just mail them to me.
(Author: Uros Jovanovic)
p [] i=[[i]] p (x:y) i=[x++[i]]++(p y i) q (x:y) s=(p s x)++(q y ((p s x)++s)) z []=[] z (x:y)=["0"++x]++["1"++x]++(z y) w=["0","1"]++(z w) g=[[]]++(q w [])
[ToDo: explanation]
def a(): yield "" for s in a(): yield "0" + s; yield "1" + s def b(M): yield [M] for s in a(): if s==M: break for t in b(s): yield t + [M] print [] for s in a(): for t in b(s): print t
Here a generates the set {0, 1}* and b(M) generates all the sets that contain only the string M and zero or more of the strings that a generates before generating M. The last three lines then call b(M) for every string M generated by a.
If we wanted a real Python generator for the finite subsets of {0, 1}*, we could replace the last three lines by a third generator:
def c(): yield [] for s in a(): for t in b(s): yield t
Note: because this program uses generators, it requires Python 2.2 or later. In fact it also requires the from __future__ import generators statement, though eventually this won't be necessary as generators will be available by default.
def c(n): s = "" while n > 0: s = "01"[n%2] + s; n >>= 2 return s def x(): i = 0 while 1: s = c(i); i += 1; yield [c(j + 1)[1:] for j in range(len(s)) if s[-1 - j] == "1"]
Here, c(n) returns the binary representation of the integer n, except for n = 0 where it returns an empty string. Thus, n = 0, 1, 2, 3, 4, 5, 6, ... is mapped into ε, 1, 10, 11, 100, 101, 110, etc. By taking values of n from 1 onwards and removing the first character, we get ε, 0, 1, 00, 01, 10, etc., which is exactly an enumeration of {0, 1}*. We do not need to worry about overflows because Python (from version 2.2 onwards) automatically promotes ints to longs when necessary, and longs allow us to work with arbitrarily large numbers.
In addition, the binary representation of a number can be used to define which strings from {0, 1}* are present in some subset -- each bit corresponds to a string, and the string is included if the corresponding bit is set. This is used by x() to generate the subsets in turn.
The following is a slightly improved version where c(n)
produces the reverse string, allowing us to save a few characters
by using
def c(n): s="" while n>0:s+="01"[n%2];n>>=2 return s def x(i=0): while 1:s=c(i);i+=1;yield[c(j+1)[:-1]for j in range(len(s))if s[j]=="1"]
We might save another character by using /= instead of >>=, but Python will eventually begin converting results of division into floats if the actual mathematical result of the division would not be an integer, and to avoid this we would then have to use the //= operator, which isn't any shorter.
This page is maintained by Janez Brank.
[To the index.]