Explaining codejam 1B

coding algo codejam python

The problems for 1B were harder, I felt. I only got a rough idea of how to go about solving it, it took some time to really iron out the details.

As mentioned in the analysis, the questions were deceptively easy. There were many edges cases that were not immediately apparent. I didn’t manage to come to a full solution until after I read the analysis.

Problem A

This was no doubt the easiest among the 3 questions. The key to solving this realizing that there are unique letters in the spelling of the numbers.

After identifying the unique letters, we just have to remove the letters, number by number, store down the digits, then arrange them in ascending order (stated in question) to get our answer.

So how do we know which number goes first? We first determine the set of letters needed to spell the numbers. Then we fill in the blanks and delete accordingly.

Table with the letters https://imgur.com/a/RjrW2

success

def rm(S, word):
    for c in word:
        S.remove(c)
    return S

T = input()
for t in range(T):
    S = list(raw_input().strip())
    nums = []

    while S.count("Z"):
        S = rm(S, "ZERO")
        nums.append(0)
    while S.count("X"):
        S = rm(S, "SIX")
        nums.append(6)
    while S.count("S"):
        S = rm(S, "SEVEN")
        nums.append(7)
    while S.count("W"):
        S = rm(S, "TWO")
        nums.append(2)
    while S.count("V"):
        S = rm(S, "FIVE")
        nums.append(5)
    while S.count("F"):
        S = rm(S, "FOUR")
        nums.append(4)
    while S.count("R"):
        S = rm(S, "THREE")
        nums.append(3)
    while S.count("T"):
        S = rm(S, "EIGHT")
        nums.append(8)
    while S.count("O"):
        S = rm(S, "ONE")
        nums.append(1)
    while S.count("N"):
        S = rm(S, "NINE")
        nums.append(9)

    s = "".join(map(str, sorted(nums)))

    print "Case #%d: %s"%(t+1, s)

Problem B

The second problem is the one about finding the lowest score. On the surface, it looks as though a greedy approach can be used, generating the values one by one from left to right.

But it took some time for me to realize that it is more of a “generate all possible values and find the best answer” kind of question.

To start off, all questions can be categorized into 3 main kinds.

Both same values Problems in this category generally ends up with the answer where both values are the same. One such example will be “?6? ??9”, where the answer will end up to be “069 069”

One big one small One way to minimize the difference is to maximize the smaller value by replacing all the “?” with “9” and minimize the larger value by replacing all “?” with “0”. Such as example will be “7?? 6 ??”, which will give us the values “700 699”.

Transferring to a higher placing The last method of minimizing the difference is to introduce a point of difference in the previous placing , this is done by making the next placing larger. One such example is “?0? ?9?”, which will give us the values “100 099”, where we introduced a value difference in the hundreds place.

The solution So how do we go about generating the values? We will go through the string position by position, trying to implement one of the three methods of solving the problem.

So as we go along the string we will try to do the following:

  • make the two values the same
  • introduce a difference in the value by increasing or decreasing the value at that position
  • introduce a difference in the new placing by changing the unknown values to "0" and "1" then swapping them around.

After that we will store all the possible candidates and find the pairing with the smallest difference.

success

def fill(a, b):
    a, b = "".join(a), "".join(b)

    a = a.replace("?", "0" if a>b else "9")
    b = b.replace("?", "9" if a>b else "0")
    return [a, b]

T = input()
for t in range(T):
    a, b = map(list, raw_input().split())
    N = len(a)  

    f = []
    for i in range(N): 
        sub_a, sub_b = a[:i], b[:i]
        if sub_a <> sub_b: 
            f.append(fill(a, b))
            break

        if a[i]=="?" and b[i]=="?":
            a[i], b[i] = "1", "0"
            f.append(fill(a, b))

            a[i], b[i] = "0", "1"
            f.append(fill(a, b))

            a[i], b[i] = "0", "0"
        elif a[i]=="?" or b[i]=="?":
            c, d = a, b
            if b[i]=="?": c, d = b, a

            if d[i]<"9": c[i] = str(int(d[i])+1)
            f.append(fill(a, b))

            if d[i]>"0": c[i] = str(int(d[i])-1)
            f.append(fill(a, b))

            c[i] = d[i]

        if not (a+b).count("?"):
            f.append(fill(a, b))
            break

    min_val = None
    min_pair = None
    for pair in f:
        val = abs(int(pair[0])-int(pair[1]))
        pair = " ".join(pair)

        if not min_val or val<min_val:
            min_val = val
            min_pair = pair
        elif val==min_val and (pair<min_pair or not min_pair):
            min_pair = pair

    print "Case #%d: %s"%(t+1, min_pair)

Problem C

This last problem is the most deceptive one. I initially thought that a greedy approach would work fine, but there are far too many cases to consider.

After looking through the analysis, and seeing the bipartite graph, I have a rough idea of how to use the data structure to solve the problem.

Before I start, let’s have a short explanation of how a bipartite tree can be used to model the problem. Consider the list of these words:

  • HYDROCARBON COMBUSTION
  • BIOMASS COMBUSTION
  • QUAIL COMBUSTION
  • QUAIL BEHAVIOR
  • QUAIL CONTAMINATION
  • GROUNDWATER CONTAMINATION
  • GROUNDWATER HYDROLOGY

Treating all first words to be of a set and all the second words to be one set, we have the graph as follows, courtesy of codejam analysis.

bipartite graph of words

Flow of thinking Ignore all the lines for the moment and take a step back. Finding the most number of fakes means to use as little titles as possible to cover all the words.

This means that we must use as little lines (or vectors, in graph terms) as possible to go through all the words (or nodes, in graph terms).

So here, we work backwards. We have a list of all the nodes that are present in both set, then eliminate them one by one as we choose the optimal vector that covers the most nodes.

We start off with the most obvious ones - the nodes with only one connection. Obviously that has to be the only word pair that can be legit.

After we run out of nodes with only one vector, we iterate through the leftovers and remove the optimal match (best if both nodes of the vector are not yet covered).

If you prefer a more technical way, you can always refer to the analysis for resources to the link to the full algorithm.