Step 1: Identify your needs
For any hire, it's wise to start by identifying your needs. If you're unclear on exactly why you need a data scientist, then you probably don't need one, or you're the wrong person to be making this important hiring decision.
It may help to start by asking yourself a few questions:
Why am I looking to hire a data scientist?
What will this person do on a day-to-day basis?
What will this person be expected to deliver?
Step 2: Prioritize
Prioritize the skills & knowledge that are critical to deliver on the expectations defined in step 1. I like to start by making a list of all the qualities an ideal candidate would have. If you don't have an in-depth knowledge of machine learning yourself, you'll definitely want to perform this step with someone who does.
Make sure you don't neglect culture fit and interpersonal characteristics. As with all hiring, you'll rarely find someone who fits all the requirements; finding the right fit is about prioritizing well and making smart tradeoffs. If a candidate is the field's leading expert on an algorithm critical to your success, but is a disrespectful prick, you'll have to decide which matters more and compromise somewhere.
If you're interested in a simple way of doing this quantitatively, I've provided a method below. If you're not, you can skip the next paragraph.
Create a square matrix with these characteristics as both rows and columns. For each column, assume the person has the trait, and for each row, assume the person lacks the trait. Since a person can't both have and lack the same trait, fill the diagonal with zeros. Then, for each pair in the matrix, put a one if you would hire the person who has the column trait but lacks the row trait, and a zero if you wouldn't. When you're done with the whole matrix, sum down each column to get a row vector (the "has trait" vector), and sum across each row to get a column vector (the "does not have trait" vector). Transpose one of the vectors and subtract the "does not have trait" vector from the "has trait" vector. This effectively assigns positive value to a trait if you're likely to hire the person when the trait is present, and penalizes the value of the trait if you'd hire someone despite their lack of it. The highest number in the resulting vector of trait values corresponds to the trait you should prioritize most, while the lowest number corresponds to the trait with the lowest priority. Later, when you assess each candidate on these characteristics (see the next step), you can compare candidates using a single number by taking the dot product of this "trait value" vector with the scores you assigned them for each trait.
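Here is a minimal sketch of the matrix method in plain Python. The three traits, the zeros and ones in the matrix, and the candidate scores are all made up for illustration; the function names trait_values and candidate_score are my own, not part of any library.

```python
def trait_values(traits, would_hire):
    """Compute a priority value for each trait.

    would_hire[i][j] is 1 if you would hire someone who HAS traits[j]
    (the column trait) but LACKS traits[i] (the row trait), else 0.
    The diagonal must be 0.
    """
    n = len(traits)
    # Column sums: how often having the trait is enough to hire.
    has_trait = [sum(would_hire[i][j] for i in range(n)) for j in range(n)]
    # Row sums: how often you'd hire despite the trait being absent.
    lacks_trait = [sum(would_hire[i][j] for j in range(n)) for i in range(n)]
    return {t: has_trait[k] - lacks_trait[k] for k, t in enumerate(traits)}


def candidate_score(values, scores):
    """Dot product of the trait values with one candidate's trait scores."""
    return sum(values[t] * scores[t] for t in values)


# Hypothetical traits and judgments (rows = lacks, columns = has).
traits = ["machine learning", "communication", "image processing"]
would_hire = [
    [0, 0, 0],  # lacks machine learning: never hire
    [1, 0, 0],  # lacks communication: hire only if they have machine learning
    [1, 1, 0],  # lacks image processing: hire if they have either other trait
]

values = trait_values(traits, would_hire)

# Hypothetical interview scores on a 1-5 scale.
alice = {"machine learning": 4, "communication": 3, "image processing": 1}
bob = {"machine learning": 2, "communication": 5, "image processing": 5}

print(values)
print("alice:", candidate_score(values, alice))
print("bob:", candidate_score(values, bob))
```

With these made-up judgments, machine learning gets the highest value and image processing the lowest, so a candidate who is strong in machine learning outscores one who is strong only in image processing.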
It's important to get the prioritization right early on, so that during interviews you can avoid wasting time deciding what's important. Instead, you'll be able to effortlessly apply these heuristics without getting bogged down in analysis paralysis.
To narrow down the list of traits, ask yourself questions like 'What domains of knowledge are relevant to my product?' For example, a solid background in image processing and feature extraction algorithms may matter a lot if you're building an emotion recognition system from real-time video feeds of faces. It may be less relevant if you're a hedge fund trying to predict stock price fluctuations from tweets.
Step 3: Operationalize
For each of the traits, identify a quick, efficient way to assess whether your candidate sufficiently meets the criteria. Ideally, you'll want to use or develop a standardized set of questions or assignments designed to test the ranked list of traits you identified in step 2.
Try to create questions that are clear and concise, but effective. For example, if you want to design a question that will specifically screen out applicants who don't have the "big picture" spatial intuitions underlying the decision boundaries of different classification algorithms, you might have a multiple-choice question with a series of 2D plots requiring the applicant to pick the decision boundary most likely to be generated by an SVM classifier with an RBF kernel. Be sure to tune the depth of the questions to your needs.
Step 4: Automate
Finally, use a pre-screening tool like HackerRank for Work to automate this set of questions and problems. The up-front effort will pay dividends for years down the road in reduced manual work.
For example, automation will help reduce the amount of manual scoring, screening, and awkward interviews with under-qualified candidates you'll need to do. As an added bonus, it may help reduce the contribution of interviewer biases, so that you're hiring the most talented people, and not just the best interviewees.
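To make the idea of automated scoring concrete, here is a rough sketch of grading multiple-choice answers per trait and applying a minimum bar. It does not use any real pre-screening product's API; the answer key, question ids, trait names, and thresholds are all hypothetical.

```python
from collections import defaultdict

# Hypothetical answer key: question id -> (trait tested, correct choice).
ANSWER_KEY = {
    "q1": ("classification intuition", "B"),
    "q2": ("classification intuition", "D"),
    "q3": ("statistics", "A"),
}


def score_by_trait(responses, answer_key=ANSWER_KEY):
    """Return the fraction of correct answers per trait for one candidate."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for question_id, (trait, right_choice) in answer_key.items():
        total[trait] += 1
        # A missing answer counts as incorrect.
        if responses.get(question_id) == right_choice:
            correct[trait] += 1
    return {trait: correct[trait] / total[trait] for trait in total}


def passes_prescreen(trait_scores, minimums):
    """True if the candidate meets the minimum score on every listed trait."""
    return all(trait_scores.get(t, 0.0) >= m for t, m in minimums.items())


# Hypothetical candidate responses and minimum bars.
responses = {"q1": "B", "q2": "A", "q3": "A"}
scores = score_by_trait(responses)
minimums = {"classification intuition": 0.5, "statistics": 0.5}

print(scores)
print("advance to interview:", passes_prescreen(scores, minimums))
```

Because every candidate is graded against the same key and the same thresholds, the per-trait scores can feed directly into whatever prioritization you settled on in step 2.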
Step 5: Source talent
Reach out to university career services or CS departments. Attend career fairs, hackathons, and meetups. Increase your visibility among the most talented candidates. If you're working on cool stuff, the best candidates will gravitate to you, but only if they know who you are. If you're unable to reach talent through traditional avenues, you may want to try sourcing candidates through online competitions like http://www.kaggle.com/ or https://www.hackerrank.com/.