Archive for March, 2010

Building a world-class team: six mistakes I made early in my career

For the past few weeks my primary job at RethinkDB has been to hire world-class software developers (we’re still hiring – see our jobs page). Recruiting great people is a difficult process with at least seven components: sourcing candidates, reviewing resumes, doing technical phone screens, conducting technical interviews, closing candidates, extending offers, and keeping candidates happy once they’ve joined. Each component seems simple in principle, but is very subtle in practice. A testament to this is that all software companies aspire to hire only the best people, but in practice very few companies achieve this goal.

I’ve performed many technical interviews before, but always as a small part of a process designed by someone else. I was very excited to start recruiting for RethinkDB because I could finally design the recruiting process from scratch, and fix many of the issues that were previously outside of my control. There are many amazing articles on how to recruit for software companies. Many of the ideas I borrowed for our process come from the articles written by Joel Spolsky, Steve Yegge, Marc Andreessen, Michael Lopp, and Paul English. I won’t attempt to restate all of their advice, which largely became conventional wisdom anyway. Instead, I will focus on pointing out the mistakes I’ve made early in my career, even after reading everything there is to read about hiring. I hope this will add a little more clarity to the subtleties of the process.

Over time, I hope to write about every component of the recruiting process, but for now I will focus on my favorite part – doing technical interviews. In this blog post I will cover what not to do. Later on I will write a follow up post discussing exactly how we conduct technical interviews and what types of questions we ask.

Informal interviews

Almost every great article on interviewing I’ve ever read insists on one thing: that you should establish a formal interview process and stick to it. Despite reading this over and over again, early in my career I did all of the technical interviews informally. In almost every case I can think of, this turned out to be a disaster.

For an inexperienced person, performing a formal interview is psychologically difficult. Early on, I lacked the confidence to place smarter, older, or more experienced people in front of a white board, and ask them tough technical questions. Eventually I learned that it is much easier to do than it seems. The best people not only welcome a challenge, they require it. All the great people I know would never work for a company that doesn’t perform a tough, formal interview because they recognize the company will mostly end up with bad employees. The first time I resolved to do a formal interview, I cautiously asked a candidate who looked twice my age and had a stellar resume to code a binary search. I was pleasantly surprised when he smiled and gladly walked up to the whiteboard.

The only counterexample I’ve seen is Aaron Swartz’s article on hiring. His main argument is that he was never great at solving problems under the social pressure of the interview, and wouldn’t want to ask others to do it too. I agree with Aaron – I am often unable to solve complex problems during the interview because of the nervous pressure. But I don’t want to hire people as good as me, I want to hire people much better than me. It’s a cliché, but firing an employee is so difficult and expensive, that the last thing to be concerned about is a false negative.

Occasionally, some candidates suggest an alternative to a formal process (such as asking them to write a piece of code over a period of a few days, or hiring them temporarily as contractors). This has never really worked for me. Firstly, I can never feel confident enough in the candidate’s abilities without going through all the technical questions. Secondly, this ends up taking a tremendous amount of time, and if the candidate doesn’t work out (which will happen most of the time), you’ve essentially wasted a ton of your time and theirs.

If you decide to go down the path of informal interviews, I’d caution you to be extremely careful. At the very least, make sure you do it because you honestly think it’s a better way to hire great people, and not because you feel insecure about asking intimidating people tough questions. My opinion is that if you end up hiring great people this way, it will probably be an accident.

Making exceptions for technical friends or ex-coworkers

A special case of the above is hiring technical friends, family, or old coworkers, without a formal interview. Unless you’re absolutely certain, beyond a shadow of a reasonable doubt, that they’re one of the best developers in the world, don’t hire them without an interview. It might be awkward to ask a friend or someone you’ve already worked with for two years to go through a formal interview, but it will be much more awkward to ask them to leave when they don’t work out.

This is one of the reasons I prefer not to meet new candidates in an informal setting before the first interview. It establishes a rapport of friendship, and makes it that much harder to ask them to do a formal interview later. If you’re a beginner and lack the confidence to ask a friend to go through a formal interview, try to meet new candidates in a formal setting to avoid having an awkward conversation later. In retrospect, this seems somewhat obvious, but I’ve seen too many smart people skip a formal interview process because they’ve made friends with the candidate. I’d wager a guess that the overwhelming majority of Silicon Valley startups founded by young, first-time entrepreneurs make this mistake, and most of the time it has disastrous consequences.

Avoiding candidates that seem intimidating

It’s funny how many of the rookie mistakes stem from lack of confidence. Early on, I avoided candidates that seemed intimidating either because of their skills, age, or track record. In retrospect, this is the exact opposite of what I should have been doing! Recently, someone who was a very successful manager at Google gave me great advice – if someone does not intimidate you, don’t hire them. Period.

Building an army of clones

Diversity is very much valued in the U.S. Every company aspires to have diverse employees, but this is much easier said than done because true diversity is completely unobvious. You can go out of your way to hire people of different nationalities, skin colors, and cultural backgrounds, but if they’re all Java-only programmers, it won’t do your company much good. We are naturally inclined to sympathize with people that are similar to us, so the intellectual honesty required not to mistake diversity for lack of intelligence is staggering.

Sometimes it is immediately obvious that a company (or an industry sector) is hiring clones of its current employees because its interview questions resemble a secret handshake for a fraternity. When I was working on Wall Street, everyone always asked what C++ keywords “explicit” and “mutable” do. Every Wall Street software veteran knew the answer, but I’ve never encountered a single codebase there that made use of these keywords.

In many cases, the fact that a company hires only very specialized people is hidden by an unconscious linguistic transformation. Many companies today heavily focus on questions from the field of algorithms, but rename them to “problem solving” questions. The field of algorithm design and analysis is an essential pillar of Computer Science, but it is very important not to focus the vast majority of the interview on algorithms alone. It is only one field of many that need to be probed during an interview. Algorithms involve problem solving, sure, but they also involve a tremendous amount of domain expertise in algorithms. Not being able to solve complex and unobvious algorithmic problems doesn’t necessarily make one a poor problem solver – only a poor algorist.

One example of this is Microsoft – a company that at its heyday had plenty of great coders, but very few people with a great sense of user interface design. This likely happened because early Microsoft employees, all brilliant people, mistook lack of algorithms experience coupled with a great sense of UI, for lack of intelligence. In order to avoid this problem I always ask the candidate to teach me about a field I know nothing about. Anything will do – jazz, physics, color theory, nutrition, computer vision, anything hard. If all our interests intersect, I’m probably hiring a clone of myself. If this happens too often, the interview process is likely overfitting, and needs to be redesigned.

Ignoring everything but intelligence

I’m ashamed to admit that early in my career I was a pretty bad employee. My skills and intelligence weren’t an issue – I never encountered a problem at work that I couldn’t solve. But my attitude was. I’d spend days lingering, and I often made my coworkers linger with me. I’d read technical articles all day long, or complain about the poor quality of the code base or the inflexibility of company policies, or talk for hours about how the whole thing would have been much better if it were written in Lisp.

Well, it was fun to read articles on my employer’s dime, the code base usually was horrendous, the policies were terrible, and the whole thing would have been much better were it written in Lisp. But none of that was constructive, or useful, or helpful to my employer or me. In retrospect, I can’t believe how incompetent I was, despite my skills and (hopefully) intelligence.

Intelligence means nothing without solid work ethic and a killer drive to accomplish useful things. Many young, intelligent people feel a sense of entitlement because the job market is heavily stacked in their favor. Smart people should feel entitled, but if this goes out of reasonable proportions, it can be a very poisonous mindset to be in. This is very difficult to recognize during the interview, so try to pay attention to little hints that might indicate the candidate has raw intelligence that isn’t backed by good old professionalism. Steer clear of such candidates no matter how smart they are, or you will have a much bigger problem on your hands later on.

Letting the job market set the anchor

In his article The Guerrilla Guide to Interviewing, Joel Spolsky says:

At the end of the interview, you must be prepared to make a sharp decision about the candidate. [...] Never say “Maybe, I can’t tell.” If you can’t tell, that means No Hire. It’s really easier than you’d think. Can’t tell? Just say no! If you are on the fence, that means No Hire. Never say, “Well, Hire, I guess, but I’m a little bit concerned about…” That’s a No Hire as well. Mechanically translate all the waffling to “no” and you’ll be all right.

In my experience, this is much harder to do than Joel might lead you to believe, because of a psychological effect sales people refer to as “anchoring”. In sales, naming a price first is called setting the anchor, because this number will affect the psychology of the counterparty for the rest of the negotiations process. If you offer a software license for $50,000 per year, people might try to negotiate you down to $25,000 per year, but it wouldn’t even occur to them to offer you fifty bucks.

During the recruiting process, you’re entering implicit negotiations with the job market, and the first few candidates it throws at you will act as the anchor. Chances are, the first few candidates will be mediocre, so when you see someone who’s a lot better than mediocre (but still nowhere near stellar), you’ll automatically get excited about hiring them.

Over time, the pressure to hire people in order to deliver the product will force you to second guess your standards. You’ll start wondering if the interview questions you’re asking are too hard, or if your standards are impossibly high. And you will hire mediocre employees who will hire more bad people, who eventually will wreck your company. The problem seems easy to avoid if you know about it, but in practice it’s very difficult to fight your own psychology.

So set the anchor first, before you even see the first resume. If you’re building a software startup, recognize that you’re competing with everyone in the world, and can only win if you first employees are the best in the world at what they do. Not the “Northeast division” best. The Olympics best. Theoretically, it’s possible that your standard is unreasonably high, but in practice people almost always have the opposite problem.

I look to Bell Labs for inspiration. At its peak, the folks at Bell Labs developed radio astronomy, the transistor, the laser, information theory, the C programming language, and the UNIX operating system. These are the kinds of people you should be trying to hire. Think Dennis Ritchie before he developed the C language. Think Claude Shannon before he invented information theory. When in doubt, ask yourself: “would this person have been good enough to be hired for a junior position at Bell Labs during its peak?” If the answer isn’t a resounding yes, it’s a No Hire.

Knowing about these mistakes helped us tremendously to design a balanced hiring process for recruiting great candidates. In the next post, I will cover the structure of the actual interviews, our philosophy for picking technical questions, and the exact types of questions we ask every candidate.

Interested in working at RethinkDB? We’re hiring – please see our jobs page for more details.

Thanks to Ben Kudria and John Rizzo for feedback and many great suggestions on how to improve this post.

High scalability: SQL and computational complexity

Interested in working at RethinkDB? We’re hiring – please see our jobs page for more details.

Recently there has been a lot of discussion on fundamental scalability of traditional relational database systems. Many of the blog posts on this topic give a great overview of some of the immediate issues faced by engineers while scaling relational databases, but don’t dissect the problem in a systematic way and with sufficient depth to get to the core issues. I’d like to dedicate a series of blog posts to the problem of scalability and how it pertains to relational databases.

There are a number of aspects of an RDBMS that are relevant to high scalability:

  • Relational data model
  • Constraint enforcement (including referential integrity)
  • SQL and its operational semantics
  • ACID compliance

In addition, every aspect above must be discussed in the context of the following attributes:

  • A specific usage pattern
  • The implementation (a concrete system vs. a theoretically possible ideal)
  • Hardware platform (few expensive machines vs. many cheap machines)

In these series I will deal exclusively with OLTP (realtime) use cases. I will not discuss various forms of analytics, data mining problems, etc. For this post, let’s define realtime as O(k * log N), where k is some small constant that represents a well defined number of queries, and N is the total size of the data (in rows). In other words, all operations of a realtime database must completely evaluate in a logarithmic time relative to the full size of the dataset, and the number of such operations required to make a logically complete business transaction is small and independent of the size of the dataset. I chose this definition because it seems intuitively interesting – other functions that we see in practice are unlikely to satisfy realtime demands of real world systems. In future blog posts I’ll restrict this definition even further to account for constant factors, but for now reasoning in terms of complexity theory is sufficient.

There are two aspects of high scalability I’d like to cover – specific issues and problems relevant today, and what we can expect from data management systems in the future. In order to cover these aspects, I will focus both on concrete systems commonly used in production and on theoretical ideals. For completeness, I will also discuss most issues in terms of horizontal and vertical scaling. Every time I talk about an RDBMS, I will define these contexts explicitly.

Since I’m a big fan of theory of computation and programming language theory, I’ll kick off the series with a discussion on scalability of SQL from the perspective of theory of computation (here I use the acronym ‘SQL’ in its strictest sense and mean the actual query language as defined by the ANSI standard and implemented by most vendors).

From a purely theoretical, computational perspective, the ANSI SQL-92 is equivalent to a primitive recursive language. Most real world SQL implementations are even more expressive, and are Turing-complete. As far as vertical scalability is concerned, SQL is simply too expressive. Even if we restrict ourselves to SQL-92, it is possible to write queries of polynomial, or even exponential complexity – a far cry from a logarithmic requirement we established earlier. This means that according to our definition of real-time, SQL is fundamentally not a vertically scalable language.

What about horizontal scalability? Reasoning about it is more difficult because it involves somewhat esoteric computational classes, and requires additional assumptions. To simplify the reasoning, we make one key assumption. We assume that we can only have a polynomial number of machines (and cores) – this appears to hold true because the number of machines we can manufacture is dwarfed by the astronomical amounts of information we consume. If this holds true, even if we can trivially parallelize each query, an exponential function (the amount of information) divided by a polynomial (number of machines) still dominates the logarithmic function we defined earlier as acceptable. This means that given modern trends, if a given query isn’t scalable vertically, it also isn’t scalable horizontally, which makes SQL fundamentally unscalable, period.

Of course so far we’ve shown what we already know – that it is possible to write SQL queries that will likely never be fast enough to evaluate in practice. At first glace this doesn’t appear very useful – all we have to do is avoid writing such queries and use the subset of SQL that can be evaluated in logarithmic time. Unfortunately from a theoretical (and far too often practical) perspective doing this is impossible.

The culprit is SQL’s lack of operational semantics. Even a simple point query can (and often does) run in O(1) time for hash indexes, O(log N) for tree indexes, or O(N) for a linear scan. For more complicated queries, there are too many edge cases where the optimizer might magically switch from logarithmic to linear execution on a whim, despite having an index available. In practice, these changes result in expensive downtime, and hours of debugging and rearchitecting. For massively scalable realtime systems, this is SQL’s Achilles’ heel – you can’t use a subset of SQL that runs in logarithmic time – some of the time (in practice, far more often than you’d like), you’ll end up writing queries that don’t satisfy your requirements. If we do settle on a declarative query language (and I believe that anything else is a huge step backward) for massively scalable systems, it must have the property that any query you could express in this language is guaranteed to evaluate in logarithmic time.

Of course such a language has a significantly limited purpose. It cannot be used for most analytics problems, and more importantly for realtime systems, cannot be used for realtime problems which involve polynomial islands of data in the exponential universe. Facebook may some day have billions of users, but any given user is unlikely to have more than a thousand friends. In this scenario there are realtime subproblems where linear, and loglinear queries are perfectly acceptable, and our language can’t handle them. This means that for these subproblems, one must use a different system, which may or may not be an acceptable solution. Perhaps a better solution is to design a database system that only allows to run provably logarithmic queries for massive datasets, but relaxes the requirement for smaller subsets of data.

Unfortunately it would be extremely difficult to modify SQL to satisfy this behavior because it isn’t modular – almost all additions are special forms, and it is so far removed from any theoretical model (including relational algebra), that reasoning about it in a rigorous way is extremely difficult for both humans and compilers. My prediction is that systems of the future will use a modular, verifiable, higher order query language capable of enforcing various complexity requirements at compile time, and that it will not look very much like SQL.

Interested in working at RethinkDB? We’re hiring – please see our jobs page for more details.