2000/2001 ACM International Collegiate Programming Contest
University of Ulm Local Contest

# Judges' Comments on the Solutions

Problem A: Anagram Groups

Since anagram groups are classes of an equivalence relation, we pick as representative element of each class the sorted version of a member. Then, it takes logarithmic time to find the representative and update the class, as we read the words. We sort the classes by their size and their smallest member. We take the first 5 and output them in sorted order.

Judges' test data was constructed from a dictionary of english words by taking every other word, adding a few extra words and duplicating the result except for one of the extra words. This procedure leaded to a dictionary of 20079 words. The largest anagram group is of size 10 with 5 unique words.

Rating: Medium

Although the problem is not too difficult, many mistakes can be made when writing a program. Some of them are:
• Too small dimensioned arrays (assuming too few words or groups)
• Sorting equal-sized groups not by their smallest elements but by their (sorted) representatives
• Words that occur multiple times in the input must be counted multiple times but output only once
• Too many groups in the output
• Inefficient handling of the sorted data-structures (leads to TLE)
Reference
Musser, D.R. and Saini, A.
STL Tutorial and Reference Guide: C++ Programming with the Standard Template Library
Chapters 13-15, Pages 201-228, Addison-Wesley, 1996
ISBN 0-201-63398-1

There are three possible ways of solving this problem. The first is by exhaustively searching the space of possible bracelets. Whenever you find an unprocessed bracelet, you generate all equivalent bracelets by rotation and reflection and mark them as processed. Note that the number of possible bracelets is bounded by max { cs | cs <= 32 } = 65536.

The second solution is by bringing every possible bracelet to a normal form. If a bracelet is coded into an integer, and you look at the set of codes for a bracelet and all its equivalent bracelets, you can take the smallest code as a normal form. Now, just count all bracelets that are in normal form.

The third solution is based on the Burnside-Cauchy-Frobenius-Lemma from algebra. The symmetries of a bracelet form the dihedral symmetry group. The number of different bracelets is the arithmetic mean of cl over all symmetry-generating permutations, where l is the number of disjoint cycles of the respective permutation. The number of disjoint cycles is calculated easily for the reflection part, and by means of the greatest common divisor for the rotation part, as well.

Judges' test data includes all 119 legal combinations of c and s.

Rating: Medium

A very tricky error can occur when you don't code the bracelet into an integer, but into a character string. Let's say, you take lower-case characters starting with 'a' for the first possible color. Let's assume further that you use the char data type of C on a computer with ASCII character set. Then for the test case s=1, c=32 you will generate beads of color 'a'+31. Now it depends on the context what the value of this color is. When interpreted as an unsigned char, the value is 128. But when interpreted as a signed char, the value is -128. So your program can generate a wrong answer for this test case, if this value is used in an unintended way. Although rather exotic, this error occured in at least two independant programs. There are no problems when the colors are coded starting with 'A' or '0' or when they are coded into an integer array starting with 0.
References
Gerard, S.N.
C++ Math Class Library: permutations, partitions, calculators & gaming
Pages 308-322, John Wiley & Sons, Inc., 1994
ISBN 0-471-59243-9

Dornhoff, L.L. and Hohn, F.E.
Applied Modern Algebra
Section 5.14: Pólya Enumeration Theory, Pages 242-249, Macmillan, 1978
ISBN 0-02-329980-0

Problem C: Simple Computer

Just read in the memory contents and simulate the machine. Since there is no overflow, the accumulator is operated mod 28 and the program counter mod 25.

Judges' test data contains 3 hand-crafted test cases along with 10 random-generated ones (that are guaranteed to terminate).

Rating: Easy

The only difficulty is to realize that overflows in the registers have to be discarded. For every test case the registers must be initialised to 0.
Reference
Voorronde Nederlands Kampioenschap Programmeren 1994
Probleem A: Interpreter
The Netherlands, 1994
Problem D: Mondriaan's Dream

Assume you could calculate the number of different paintings for a rectangle with c columns and r rows where the first r-1 rows are completely filled and the last row has any of 2c possible patterns. Then, by trying all variations of filling the last row where small rectangles may be spilled into a further row, you can calculate the number of different paintings for a rectangle with r+1 rows where the first r rows are completely filled and the last row again has any pattern.

This straightforwardly leads to a dynamic programming solution. All possible ways of filling a row part of which may already be occupied and spilling into the next row creating a new pattern are genrated by backtracking over a row. Viewing these as transitions from a pattern to another pattern, their number is given by the recursive equation Tc = 2 Tc-1 + Tc-2. Its solution is asymptotically exponential with a base of sqrt(2)+1, which is not a problem for c<=11.

If both h and w are odd, the result is 0. Since the number of paintings is a symmetric function, the number of columns should be chosen as the smaller of the two input numbers whenever possible to improve run-time behaviour substantially.

Judges' test data includes all 121 legal combinations of h and w.

Rating: Hard

Since the size of the painting could be as large as 110, a simple backtracking solution won't do, even not with using 5 hours of contest time to precalculate the results. Once the dynamic programming algorithm is implemented a quick review of the results should reveal that they don't fit into 32-bit ints. There are four ways to solve this problem: try double (which works actually), implement 64-bit arithmetics (only addition is needed), implement arbitrary precision arithmetics, or switch to Java and use BigInteger. A more efficient algebraic solution was not known to the judges.
Reference
ACM North-Western European Regional Contest 1998
Problem G: Mondriaan
s'-Hertogenbosch, 1998
Problem E: Equidistance

First, convert from polar to orthonormal coordinate system. Denote Alice's home a and Bob's home b. The geometric place of all points equidistant from a and b is the intersection of the earth with the plane with normal vector a-b through (a+b)/2. From the angle between normal vector and meeting point you easily calculate the angle between meeting point and the equidistance circle (which lies in aforementioned plane). Just multiply this angle with the earth's radius to get the distance. If Alice and Bob live at identical locations, the geometric place of all equidistant points is the whole earth, the distance therefore is always 0.

Judges' test data is based on a table of locations with coordinates, along with several special locations added by hand. The number of locations is 96. 10 queries, 7 of which included an unknown name were added by hand. 18 of the locations were arranged into 3 groups of 6 locations. For each group all 216 possible queries were posed. 200 random queries involving all 96 locations were finally added thus leading to 858 queries.

Rating: Medium

A common mistake is not to handle the two special cases appropriately: Alice and Bob's home may be at the same location and there can occur locations with unknown coordinates. Another possible error is to take as the geometric place of equidistant points the arithmetic mean of the two home locations (which is one point of the geometric place only).
Reference
ACM University of Ulm Local Contest 1997
Problem G: Globetrotter
Ulm, 1997
Problem F: How many Fibs?

First, you have to precalculate all Fibonacci numbers up to 10100 of which there are less than 500. Note that you have to perform large integer arithmetics. The easiest solution is to use Java with its builtin data type BigInteger. In the other programming languages you will have to represent the large numbers e.g. by storing the digits in integer arrays. You will also have to implement basic arithmetic routines for addition and comparison which come free in Java. After you have a list of enough Fibonacci numbers, you can find the lower and upper bound a and b by performing a linear or binary search. Their difference is the number of Fibs in the given range. Pay attention to the interval boundaries.

Judges' test data was created from some hand-crafted data, along with an exhaustive test of 0<=a<=b<=9. Finally 100 random-generated tests were added. The total number of test-cases is 179.

Rating: Easy

Of course the Fibonacci numbers can be calculated by an explicit formula using floating-point arithmetic. But here, even double precision is not enough. The result is an "off-by-1" error for large numbers near to a Fibonacci number. The same phenomenon occurs for small numbers already if you take <= for < and vice versa. The judges do not know of any solution that does not use large integer arithmetics some way or the other.
Problem G: Phylogenetic Trees Inherited

The first thing to observe is that the different positions in every sequence are independent of each other. This reduces the tree of sequences to a tree of amino acids. At the root of the tree, or for that matter of any sub-tree, there may be many possible amino acids leading to optimal costs. Suppose, you have calculated for two sub-trees Tl and Tr the sets of amino-acids leading to optimal costs Al and Ar. Adjacent sub-trees Tl and Tr have as their father the node T. Now you want to find the set of amino-acids A that you can mark T with, leading to optimal costs for T.

There are two cases: if the intersection of Al with Ar is non-empty, define A as just this intersection, otherwise define A to be the union of Al and Ar. To see why this is true, observe the extra costs you get, when you assemble T from Tl and Tr. In the first case, you have 0 extra costs when you take an amino-acid from the intersection, but 1 or 2 extra costs when you do not. In the second case, you have 1 extra costs when you take an amino-acid from the union, but 2 extra-costs when you do not. Now, you may want to assemble T not from Tl and Tr but from some other sub-optimal trees. As you can easily verify, this leads to sub-optimal costs for T as well.

This reasoning is carried over straightforwardly to an induction proof and leads to a dynamic programming solution. Since the amino-acids are upper-case letters, you can represent sets of amino-acids as ints. The set operations you need are then easily performed as bitwise and respectively or. Whenever you do a union operation, your costs increase by 1.

Another, more straight-forward solution is to calculate for each node of the tree the optimal costs for every amino acid the node can be marked with. This is done by trying every possible combination of amino acids for the two sub-trees, assuming their optimal costs have already been calculated. Since this solution might turn out to be too inefficient, it can be improved upon by observing that a father node always can be marked with either the left or the right son's amino-acid - there is no need to take an amino acid that differs from both.

Judges' test data was constructed from a test-case with a few long sequences, a test-case with many short sequences, a test-case where every possible pair of amino-acids occured, and 100 random-generated test-cases where length and number of sequences are geometrically distributed. The total number of test-cases is 110. Since there may be multiple correct answers for the test cases, a special verification program was written by slightly modifying the judges' solution.

Rating: Hard

References
Durbin, R. ; Eddy, S.R. ; Krogh, A. and Mitchison, G.
Biological Sequence Analysis: probabilistic models of proteins and nucleic acids
Section 7.4: Parsimony, Pages 173-176, Cambridge University Press, 1998
ISBN 0-521-62971-3 (paperback), ISBN 0-521-62041-4 (hardcover)

Setubal, J.C. and Meidanis, J.
Introduction to Computational Molecular Biology
PWS Publishing Company (International Thomson Publishing), Boston, 1997
ISBN 0-534-95262-3

Fitch, W.M.
Toward defining the course of evolution: Minimum change for a specific tree topology
Systematic Zoology, Volume 20, Pages 406-416, 1971
ISSN 0039-7989

Problem H: Hike on a Graph

There are n3 different states the game can be in. These are the nodes of a graph, and there is an edge between two nodes, whenever a state transition is defined (i.e. the corresponding move can be made). Find the shortest path from the initial state to any state with three identical locations. This is most easily accomplished by breadth-first search.

Judges' test data was created from some hand-crafted data, along with ten random-generated test cases. The total number of test cases is 20.

Rating: Medium

Your solution must handle correctly the case where all pieces are on the same location in the beginning already (a path of length 0). Furthermore, the numbering of the nodes starts with 1, not with 0.
Reference
ACM North-Western European Regional Contest 1997
Problem A: Arrows
Delft, 1997