jakstys.lt/content/log/2022/uber-mock-interview-retrospective.md
2022-07-01 13:16:47 +03:00

11 KiB

title date slug
Uber Mock Interview Retrospective 2022-07-01T12:55:00+03:00 uber-mock-interview-retrospective

Like mentioned in [the previous post]({{< ref "log/2022/big-tech-hiring" >}}), I did a public mock coding interview. A reminder what that was:

  • The goal was to explain how some bits of Uber's tech recruiting works.
  • The meetup page had 602 attendees as of writing. We expected quite a few participants in the event.

The mock interview consisted of:

  • Introduction by Uber's EMEA recruiter Courtney Cox.
  • Myself doing a coding challenge with a 50 minute cap:
    • I did not know the exercise upfront.
    • Although my job did not depend on it, the ticking timer and people looking at my work (~260) made it quite stressful.
    • I did not complete the exercise. According to my interviewee, I failed the "phone screen". The good part is that I still get to keep my job. :)
  • Half-hour QA session.

TLDR: Highlights

  • Lots of fun for everyone: myself, the interviewer and the spectators.
  • Folks seemed to be engaged: the chat room was active throughout, and we had more questions than time to answer them.
  • Even though I have been coding Zig for the last few months, I felt like I had a strong enough grip on it; the algorithm was the one that tripped me.

TLDR: Lowlights

Most importantly, I did not complete the exercise. Worse, I did not even come up with the correct algorithm, therefore the interview was an obvious failure. I can re-apply in 6 months though!

Biggest mistake? I did the same mistake that interviewees do all the time: start coding without knowing the full algorithm. This is a recipe for failure. It is always, always better to spend 10-15 minutes with hands off the keyboard and come up with the solution, and only then start coding.

On the same day I figured out the solution and implemented it next morning. You can find it below. If you want to show this off in your favorite programming language, read below in the challenge section.

The Exercise and Solution

I have been coding in Zig since last February (so ~5 months now). Loris Cro keeps telling the me and The Internet that Zig is not suitable for coding challenges. Well, after a couple of months of working with him, I can finally say he is wrong! Even though my colleagues tell me Zig was tripping me (e.g. memory leaks in the unit tests, for which I had to add defer hash_map.deinit()), I think this was due to lack of experience in a particular "coding challenge setting". Next time I will construct an arena and be done with memory management.

Exercise was taken from Cracking the coding interview:

// Each year, the government releases a list of the 10000 most common baby
// names and their frequencies (the number of babies with that name). The only
// problem with this is that some names have multiple spellings. For example,
// "John" and "Jon" are essentially the same name but would be listed
// separately in the list. Given two lists, one of names/frequencies and the
// other of pairs of equivalent names, write an algorithm to print a new list
// of the true frequency of each name. Note that if John and Jon are synonyms,
// and Jon and Johnny are synonyms, then John and Johnny are synonyms. (It is
// both transitive and symmetric.) In the final list, any name can be used as
// the "real" name.
//
// Example:
// Names: John (15), Jon (12), Chris (13), Kris (4), Christopher (19)
// Synonyms: (Jon, John), (John, Johnny), (Chris, Kris), (Chris, Christopher)
// Output: John (27), Kris (36)

My solution

Timeline:

  • 50 minutes during the interview. I almost did not use any of it except for small parsing bits and the unit test.
  • 30 minutes after cycling home right after the meetup: thinking about the problem. At this point I realized this problem reduces to finding disconnected graphs.
  • 2 hours 15 minutes: coding. The result of that is below.

{{< highlight zig "linenos=table" >}} const std = @import("std"); const mem = std.mem; const fmt = std.fmt; const Order = std.math.Order; const Allocator = std.mem.Allocator; const ArrayList = std.ArrayList; const ArrayListUnmanaged = std.ArrayListUnmanaged; const StringHashMap = std.StringHashMap; const PriorityQueue = std.PriorityQueue;

// for priority queue fn lessThan(_: void, a: u32, b: u32) Order { return std.math.order(a, b); }

pub fn solution( allocator: Allocator, names: []const u8, synonyms: []const u8, ) error{OutOfMemory}![]const u8 { var arena1 = std.heap.ArenaAllocator.init(allocator); defer arena1.deinit(); var arena = arena1.allocator();

var name2id = StringHashMap(u32).init(arena);
var pairs = ArrayList([2]u32).init(arena);

// populate name2id and pairs
const total_members = blk: {
    var it = mem.tokenize(u8, synonyms, ", ()");
    var idx: u32 = 0;
    while (true) {
        const left = it.next() orelse break;
        const right = it.next().?;
        var pair: [2]u32 = undefined;
        var i: u2 = 0;
        for (&[_][]const u8{ left, right }) |val| {
            const result = try name2id.getOrPut(val);
            if (!result.found_existing) {
                result.value_ptr.* = idx;
                pair[i] = idx;
                idx += 1;
            } else pair[i] = result.value_ptr.*;

            i += 1;
        }
        try pairs.append(pair);
    }

    // now add all "lone" names that do not have aliases
    var it2 = mem.tokenize(u8, names, "(), 0123456789");
    while (it2.next()) |name| {
        const result = try name2id.getOrPut(name);
        if (!result.found_existing) {
            result.value_ptr.* = idx;
            idx += 1;
        }
    }

    break :blk idx;
};

// create id2name for printing the results
var id2name = try arena.alloc([]const u8, total_members);
{
    var it = name2id.iterator();
    while (it.next()) |val|
        id2name[val.value_ptr.*] = val.key_ptr.*;
}

var graph = try arena.alloc(ArrayListUnmanaged(u32), total_members);
mem.set(ArrayListUnmanaged(u32), graph, ArrayListUnmanaged(u32){});

// populate graph
for (pairs.items) |pair| {
    try graph[pair[0]].append(arena, pair[1]);
    try graph[pair[1]].append(arena, pair[0]);
}

// navigate through graph. This is DFS:
// - "visited" is a list of user ids that we should not go into.
// - "unvisited" is a queue of user ids that we need to visit. This is
//   the driver of the loop: work until this is non-empty.
var visited = try arena.alloc(bool, total_members);
mem.set(bool, visited, false);

// everyone is unvisited now
var unvisited = PriorityQueue(u32, void, lessThan).init(arena, {});
try unvisited.ensureTotalCapacity(total_members);
for (id2name) |_, i|
    try unvisited.add(@intCast(u32, i));

// id2synonym is mapping from userid to synonym_id. It just so conveniently
// happens that the synonym_id points to a user id.
var id2synonym = try arena.alloc(u32, total_members);

// traverse the graph and populate id2synonym
{
    var synonym_id: u32 = 0;
    // scratch is our DFS temporary storage: while traversing the member
    // list, which ones to go to when we're done with the current one?
    var scratch = PriorityQueue(u32, void, lessThan).init(arena, {});

    while (unvisited.removeOrNull()) |i| : (synonym_id += 1) {
        if (visited[i]) continue;

        try scratch.add(i);
        while (scratch.removeOrNull()) |j| {
            visited[j] = true;
            id2synonym[j] = synonym_id;
            for (graph[j].items) |k|
                if (!visited[k])
                    try scratch.add(k);
        }
    }
}

var id2count = try arena.alloc(u32, total_members);
mem.set(u32, id2count, 0);
// calculate id2count from names and id2synonym
{
    var it = mem.tokenize(u8, names, ", ()");
    while (true) {
        const name = it.next() orelse break;
        const id = name2id.get(name).?;
        const count = fmt.parseInt(u32, it.next().?, 10) catch unreachable;
        id2count[id2synonym[id]] += count;
    }
}

var result = ArrayList(u8).init(allocator);
const wr = result.writer();
for (id2count) |count, id| {
    if (count == 0) continue;
    if (id != 0) result.appendSlice(", ") catch unreachable;
    wr.print("{s} ({d})", .{ id2name[id], count }) catch unreachable;
}
return result.toOwnedSlice();

}

const tests = [_]struct { names: []const u8, synonyms: []const u8, want: []const u8, }{ .{ .names = "John (15), Jon (12), Chris (13), Kris (4), Christopher (19), Žvangalas (10)", .synonyms = "(Jon, John), (John, Johnny), (Chris, Kris), (Chris, Christopher)", .want = "Jon (27), Chris (36), Žvangalas (10)", }, .{ .names = "John (15), Jon (12), Chris (13), Kris (4), Christopher (19)", .synonyms = "(Jon, John), (John, Johnny), (Chris, Kris), (Chris, Christopher)", .want = "Jon (27), Chris (36)", }, .{ .names = "John (15), Jon (12), Johnny (5), Johnn (4), Johnathan (3)", .synonyms = "(Jon, John), (Johnn, Johnny), (Johnathan, Jon), (John, Johnny)", .want = "Jon (39)", }, };

const testing = std.testing; test "example" { for (tests) |tt| { const got = try solution(testing.allocator, tt.names, tt.synonyms); defer testing.allocator.free(got); try testing.expectEqualStrings(tt.want, got); } } {{< / highlight >}}

Jakub Konka was watching the interview too! His comment to the solution above is:

  • Your solution looks good to me and I think I'd be oscillating roughly around the same solution too.
  • You've used arena in an interesting way to init string sets: I think I'd use an unmanaged version and initialize on first use.
  • But it's fine either way.

Optional: challenge for you

Inclined to show off your solution in Zig or your favorite programming language? Post it to the comment in meetup page (preferably use a public pastebin to keep the comment size reasonable), and I will paste my favorite ones here with your name. Please include the time it took you to code it. The main criteria is, of course, lines of code. :)

Thanks

Many thanks to Brigita Žemgulytė, Courtney Cox, Ignas Kaziukėnas and Mantas Mikšys for making this happen. I would do it again.