Vatic Labs Code Test Post-Mortem

The Problem Statement

Goals of This Post-Mortem

Specifically, I would like to capture stats to use as metrics for comparison against future attempts.

I would also like to mutate the challenge for future attempts, for improvement purposes.

Further, I would like to consider the four questions, Stephan:

  1. What did we do well, that if we don't discuss we might forget?
  2. What did we learn?
  3. What should we do differently next time?
  4. What still puzzles us?

I wish to answer these four questions about this attempt, as well as about the process I am creating by repeatedly doing these problems and then post-morteming them.

Stats of Interest

Started: Sun Apr 23 08:30:00 EDT 2017

Ended: Mon Apr 24 08:30:00 EDT 2017

Time to complete task: 24 hours

Tests created for this attempt: 4

End-to-end tests (usable in future attempts): 3

Past test results: N/A

Current test results: all passed

Future test results: N/A

Final result:

  • Passed speed constraint
  • Failed system test
    • Meaning unknown
    • Possibly failed memory constraint
    • Possibly failed implementation

Constraints for next attempt:

  • Same constraints as in problem-statement except:
    • Three hours allotted time to complete task
    • Create larger suite of tests in general
      • Test every edge condition
      • Test every specified constraint of final executable
      • Rewrite the end-to-end tests so that they run a vatic_code_test executable
        • This will allow end-to-end tests to run against non-python attempts
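
A rough sketch of what a language-agnostic end-to-end test helper could look like. It assumes the executable reads its input on stdin and writes its answer to stdout, which is my assumption here and not something quoted from the problem statement. The helper name run_case and the stand-in echo command are hypothetical.

```python
import subprocess
import sys


def run_case(command, input_text, timeout_seconds=10):
    """Run `command` (a list of arguments), feed it input_text on stdin,
    and return whatever it wrote to stdout.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    subprocess.TimeoutExpired if the process runs past timeout_seconds.
    """
    completed = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=True,
    )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in for ["./vatic_code_test"]: a tiny process that echoes stdin.
    echo_command = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read())",
    ]
    assert run_case(echo_command, "hello\n") == "hello\n"
    print("end-to-end helper works")
```

Because the helper only sees a command line, the same test cases could be pointed at a Python script, a compiled binary, or anything else that produces a vatic_code_test executable.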

What did we do well, that if we don't discuss we might forget?

Given the options, I decided to tackle the question in the language I knew best, and, if I had time, then port it to the language I didn't know as well.

I believe this was the correct decision.

I set up my environment pretty quickly, and was even using pylint throughout (it quickly caught small errors that might otherwise have wasted my time).

What did we learn?

If I am going to take using different editors and IDEs seriously, I have to:

  • note whenever I want something to be different
  • note bad habits
    • make bad habits impossible

Otherwise, I might as well stick to vim.

Pandas is a cool library that I want to get better at.

Printing to stdout is slow! I've learned this multiple times now and I'm now going to create a list of common speed enemies.
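
For the list of speed enemies, a minimal sketch of the usual workaround: build the output in memory and write it once, instead of calling print() per line. The function names and the two-decimal formatting are hypothetical, chosen only for illustration. How much this helps depends on how stdout is buffered in the environment running the code.

```python
import io
import sys


def format_results(values):
    """Build the whole output in memory and return it as one string."""
    return "".join(f"{value:.2f}\n" for value in values)


def write_results(values, stream=sys.stdout):
    """Emit all results with a single write instead of one print() per line."""
    stream.write(format_results(values))


if __name__ == "__main__":
    # Writing to an in-memory buffer shows the helper works on any stream.
    buffer = io.StringIO()
    write_results([1, 2.5], buffer)
    sys.stdout.write(buffer.getvalue())
```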

Only solve a problem once it is proven to be a problem. Again, learned this multiple times now, but it's a hard habit to break. Specifically, I tried to optimize away rounding error before verifying it would be a problem, and it ended up making my program too slow.

What should we do differently next time?

Git Init

I need to start cultivating a checklist (some people call them recipes and I like that metaphor) for general startup purposes.

I've dabbled in them in the past, and I think it's a good idea to dabble in them again.

But the specific thing to do differently next time is something I surprisingly forgot to do this time. I did not start my project with:

git init

I didn't even consider properly versioning my code until I was almost done with my first implementation.

Even if my instinct is to make quick prototypes that I delete (more on that later), there's nothing wrong with versioning those prototypes. The repo will be deleted just as easily as the code.

Break It Down, Prototype, Start-Over

The problem was well-defined. I was given input, I was given output.

The problem is not particularly complicated, but it does have layers and many steps.

I needed to define the sub-goals, prototype an answer, get something working, and then redo it so it would naturally be a bit cleaner.

The final product ended up being a lot messier than I would like, probably in part because of my failure to do this.

Super Starting Over

Six hours in, I submitted my first code to see what result I would get.

"Time Limit Exceeded"

I panicked. I reread the problem-statement, noticed that I was allowed numpy and pandas, and so I started over with the goal of solving the problem using pandas.

I'm super glad I did this, but I should not do it again. Not like how I did.

Noting the third-party libraries that are explicitly allowed is ideal, and specifically using those libraries on attempts after the first one, for the purpose of getting better at them, is also a good idea.

Using a new library on a problem I hadn't properly finished, when that library was not a requirement, was a mistake. Especially since I already had a solution and I hadn't tested it in any way to see how close to fast-enough it was.

The non-working solution I made in pandas ended up over 10x slower than my original attempt.

All I needed to do to make my original program fast enough was save printing to stdout until the very end, and use floats instead of decimal.Decimal.

Had I made the dataset to test the speed of my code first, and tinkered with things to see if I could speed things up, I would've found this out and had time to further debug my code.
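
A minimal sketch of the kind of speed test that would have helped: generate a synthetic dataset, then time two variants of the same computation on it. The dataset shape (a list of price strings), the sizes, and all function names are hypothetical, not taken from the actual problem. The float-versus-Decimal comparison is just the example variant pair.

```python
import random
import time
from decimal import Decimal


def make_dataset(size, seed=0):
    """Generate `size` pseudo-random price strings such as '123.45'."""
    rng = random.Random(seed)
    return [f"{rng.uniform(1, 1000):.2f}" for _ in range(size)]


def total_with_decimal(rows):
    """Sum the rows using exact decimal arithmetic."""
    return sum(Decimal(row) for row in rows)


def total_with_float(rows):
    """Sum the rows using binary floating point."""
    return sum(float(row) for row in rows)


def time_call(func, rows):
    """Return (result, elapsed seconds) for one call of func(rows)."""
    start = time.perf_counter()
    result = func(rows)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    rows = make_dataset(200_000)
    for func in (total_with_decimal, total_with_float):
        _, elapsed = time_call(func, rows)
        print(f"{func.__name__}: {elapsed:.3f}s")
```

Running something like this before rewriting anything shows which variant is actually slow, and by how much, on data of a realistic size.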

What still puzzles us?

How do I make fast pandas code?

I really want to know. Pandas is a library that could be of significant use to me in future projects, I think.

What test did I fail?

I think it's possible I exceeded the memory limit, but rereading the problem-statement makes me think perhaps I did liquidity wrong.

Things I Should Do in Future Post-Mortems

All right, this post-mortem is taking me long enough. In fact, it's already taken me too long.

Next post-mortem: time-limit of 30 minutes.

I let myself get distracted too often and it wastes a lot of time. I will stop at the 30-minute mark next time. Even if it's in mid-sentence.

If this happens, I'll put a "pencils down" at the end of the post, I guess.

I would also like to make the code I made available.

I'll put it up on github in some way, I think.

"Pencils down."