2017-04-25

Vatic Labs Code Test Post-Mortem

Goals of This Post-Mortem

Specifically, I would like to capture stats to use as metrics for comparison against future attempts.

I would also like to mutate the challenge for future attempts, for improvement purposes.

Further, I would like to consider the four questions, Stephan:

What did we do well, that if we don't discuss we might forget?
What did we learn?
What should we do differently next time?
What still puzzles us?

I wish to answer these four questions about this attempt, as well as about the process I am creating by repeatedly doing these problems and then post-morteming them.

Stats of Interest

Started: Sun Apr 23 08:30:00 EDT 2017

Ended: Mon Apr 24 08:30:00 EDT 2017

Time to complete task: 24 hours

Tests created for this attempt: 4

End-to-end tests (usable in future attempts): 3

Past test results: N/A

Current test results: all passed

Future test results: N/A

Final result:

Passed speed constraint
Failed system test
- Meaning unknown
- Possibly failed memory constraint
- Possibly failed implementation

Constraints for next attempt:

Same constraints as in problem-statement except:
- Three hours allotted time to complete task
- Create larger suite of tests in general
  - Test every edge condition
  - Test every specified constraint of final executable
  - Rewrite end-to-end tests so that they run a vatic_code_test executable
    - This will allow end-to-end tests to run against non-python attempts
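
A minimal sketch of what a language-agnostic end-to-end test could look like. It assumes the executable reads its input on stdin and writes its answer to stdout, which may not match the actual problem-statement. The helper names (run_end_to_end, check_case) are hypothetical.

```python
import subprocess


def run_end_to_end(command, input_text, timeout_seconds=10.0):
    """Feed input_text to the command on stdin and return what it wrote to stdout.

    `command` is an argument list, e.g. ["./vatic_code_test"], so the same
    test works whether the executable was built from Python or anything else.
    """
    completed = subprocess.run(
        command,
        input=input_text,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,  # raises TimeoutExpired if the run is too slow
        check=True,               # raises CalledProcessError on a non-zero exit
    )
    return completed.stdout


def check_case(command, input_text, expected_output):
    """Return True when the executable's output matches line for line."""
    actual = run_end_to_end(command, input_text)
    return actual.splitlines() == expected_output.splitlines()
```

The timeout doubles as a rough check on the speed constraint, though the real limit would have to come from the problem-statement.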

What did we do well, that if we don't discuss we might forget?

Given the options, I decided to tackle the question in the language I knew best, and, if I had time, then port it to the language I didn't know as well.

I believe this was the correct decision.

I set up my environment pretty quickly, and was even using pylint throughout (it quickly caught small errors that might otherwise have wasted my time).

What did we learn?

If I am going to take using different editors and IDEs seriously, I have to:

note whenever I want something to be different
note bad habits
- make bad habits impossible

Otherwise, I might as well stick to vim.

Pandas is a cool library that I want to get better at.

Printing to stdout is slow! I've learned this multiple times now and I'm now going to create a list of common speed enemies.
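
A rough sketch of the difference, assuming the results are a simple sequence of values to print one per line. The function names are made up for illustration, and how much time this saves depends on the environment.

```python
import sys


def emit_line_by_line(values, stream=sys.stdout):
    """One print call per value: many small writes to the stream."""
    for value in values:
        print(value, file=stream)


def emit_once(values, stream=sys.stdout):
    """Build the whole output in memory, then hand it over in a single write."""
    stream.write("".join(f"{value}\n" for value in values))
```

Both functions produce the same text; the second one just pays the cost of talking to the stream once instead of once per line.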

Only solve a problem once it is proven to be a problem. Again, learned this multiple times now, but it's a hard habit to break. Specifically, I tried to optimize away rounding error before verifying it would be a problem, and it ended up making my program too slow.

What should we do differently next time?

Git Init

I need to start cultivating a checklist (some people call them recipes and I like that metaphor) for general startup purposes.

I've dabbled in them in the past, and I think it's a good idea to dabble in them again.

But what I specifically need to do differently next time is something I surprisingly forgot to do this time. I did not start my project with:

git init

I didn't even consider properly versioning my code until I was almost done with my first implementation.

Even if my instinct is to make quick prototypes that I delete (more on that later), there's nothing wrong with versioning those prototypes. The repo will be deleted just as easily as the code.

Break It Down, Prototype, Start-Over

The problem was well-defined. I was given input, I was given output.

The problem is not particularly complicated, but it does have layers and many steps.

I needed to define the sub-goals, prototype an answer, get something working, and then redo it so it would naturally be a bit cleaner.

The final product ended up being a lot messier than I would like, probably in part because of my failure to do this.

Super Starting Over

Six hours in, I submitted my first code to see what result I would get.

"Time Limit Exceeded"

I panicked. I reread the problem-statement, I noticed that I was allowed numpy and pandas, and so I started over with the goal of solving the problem using pandas.

I'm super glad I did this, but I should not do it again. Not like how I did.

Noting the third-party libraries that are explicitly allowed is ideal, and specifically using those libraries on attempts after the first one, for the purpose of getting better at those libraries, is also a good idea.

Using a new library on a problem I hadn't properly finished before when it was not a requirement was incorrect. Especially since I already had a solution and I hadn't tested it in any way to see how close to fast-enough it was.

The non-working solution I made in pandas ended up over 10x slower than my original attempt.

All I needed to do to make my original program fast enough was save printing to stdout until the very end, and use floats instead of decimal.Decimal.
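
To illustrate the float versus decimal.Decimal trade-off, here is a small sketch. The price strings and function names are invented for the example and are not from the actual problem. The idea is to do the arithmetic in floats and only round when formatting the result.

```python
from decimal import Decimal


def total_decimal(price_strings):
    """Exact decimal arithmetic: no rounding error, but noticeably slower."""
    return sum((Decimal(p) for p in price_strings), Decimal("0"))


def total_float(price_strings):
    """Binary floating point: fast, with tiny rounding error in the last digits."""
    return sum(float(p) for p in price_strings)


def format_total(total, places=2):
    """Round once, at output time, instead of guarding every intermediate step."""
    return f"{total:.{places}f}"
```

Whether the float error actually matters depends on the required output precision, which is exactly the thing I should have checked before optimizing it away.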

Had I made the dataset to test the speed of my code first, and tinkered with things to see if I could speed things up, I would've found this out and had time to further debug my code.
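
Something like the following would have been enough for a first speed check. It is only a sketch: the row format ("id,price") is a placeholder and not the real input format, and make_dataset and time_call are hypothetical helpers.

```python
import random
import time


def make_dataset(n_rows, seed=0):
    """Generate a reproducible synthetic input of n_rows lines.

    The "id,price" layout is a stand-in; the real generator would follow
    the input format from the problem-statement.
    """
    rng = random.Random(seed)  # fixed seed so every timing run sees the same data
    return [f"{i},{rng.uniform(1.0, 100.0):.2f}" for i in range(n_rows)]


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

With this in place, each tweak (buffered output, floats instead of Decimal) can be timed against the same large dataset before deciding whether a rewrite is needed.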

What still puzzles us?

How do I make fast pandas code?

I really want to know. Pandas is a library that could be of significant use to me in future projects, I think.

What test did I fail?

I think it's possible I exceeded the memory limit, but rereading the problem-statement makes me think perhaps I did liquidity wrong.

Things I Should Do in Future Post-Mortems

All right, this post-mortem is taking me long enough. In fact, it's already taken me too long.

Next post-mortem: time-limit of 30 minutes.

I let myself get distracted too often and it wastes a lot of time. I will stop at the 30-minute mark next time. Even if it's in mid-sentence.

If this happens, I'll put a "pencils down" at the end of the post, I guess.

I would also like to make the code I made available.

I'll put it up on GitHub in some way, I think.

"Pencils down."