To become a data scientist...

I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:

Statistical knowledge

You need to be able to think "statistically": you need to be able to turn sample data into inferences about the underlying population. I'm not sure how you develop statistical thinking - I did it through a master's and then a PhD in statistics, but that's obviously a big time investment!
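
One way to make "turning sample data into inferences about the underlying population" concrete is a percentile bootstrap interval for a mean. This is only a minimal sketch in Python using the standard library: the function name bootstrap_mean_ci, the simulated data and the choice of 2000 resamples are all assumptions made for illustration, not a recommended recipe.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the population mean (illustrative helper)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample the data with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical data: 200 draws from a population whose true mean is 10.
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(200)]

low, high = bootstrap_mean_ci(sample)
print(f"sample mean: {statistics.fmean(sample):.2f}")
print(f"95% interval for the population mean: ({low:.2f}, {high:.2f})")
```

The point is the shift in what you report: not just the mean of the 200 values you happened to observe, but a range of plausible values for the population they came from.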

I think you need some knowledge of specific statistical/machine learning techniques, but a deep theoretical understanding is not that important. What matters is that you understand the strengths and weaknesses of each technique. The vast majority of data science problems can be solved by a creative assembly of off-the-shelf techniques, and don't require new theory.

I'd recommend developing a familiarity with linear models and their variations (esp. generalised linear models, splines and the lasso). Yes, they are linear, but a linear approximation is a good place to start for many problems. For problems that focus more on prediction than understanding, make sure you're familiar with the most popular ML techniques, e.g. random forests and support vector machines.
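
To give a feel for what the lasso adds to a plain linear model, here is a small sketch in Python that fits both by cyclic coordinate descent. It is a toy under stated assumptions: the helper names, the simulated data, the penalty value lam=0.3 and the decision to skip the intercept (the data are generated with mean zero) are all choices made for this example. In practice you would use an off-the-shelf implementation rather than write your own.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, returning 0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes X and y are roughly centred, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b, since b starts at zero
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that ignores column j's own fit.
            rho = sum(
                X[i][j] * (residual[i] + beta[j] * X[i][j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_scale[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                beta[j] = new_beta
    return beta


# Hypothetical data: five predictors, only the first two affect the response.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]

ols_fit = lasso_coordinate_descent(X, y, lam=0.0)    # no penalty: least squares
lasso_fit = lasso_coordinate_descent(X, y, lam=0.3)  # penalised fit

print("least squares:", [round(b, 2) for b in ols_fit])
print("lasso:        ", [round(b, 2) for b in lasso_fit])
```

With no penalty the fit gives every predictor a small non-zero coefficient. With the penalty, coefficients of predictors that carry no signal are set exactly to zero, at the cost of shrinking the real effects somewhat. That trade-off is the kind of strength and weakness worth knowing for each technique.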

Programming skills

You need to be fluent in either R or Python. There are other options, but none of them has the community that R and Python have, which means you'd spend a lot of time reinventing tools that already exist elsewhere. Obviously, I prefer R and, contrary to what some people claim, it is a well-founded programming language that is well tailored to its domain.

If you use R you want to be conversant with a set of packages that allows you to solve the following practical problems:

My recommendations for starting places are:

You should also invest some time in learning how to be a productive R programmer (e.g. http://adv-r.had.co.nz) and learning how to write packages (http://r-pkgs.had.co.nz). Start by learning the basics of functional programming - this will have the biggest payoff for your productivity in R.

Domain knowledge

This obviously depends on the domain, but as a data scientist you should be able to contribute meaningfully to any project, even if you're not intimately familiar with the specifics. I think this means you should be generally well read (e.g. at the level of New Scientist for the sciences) and an able communicator. A good data scientist will help the real domain experts refine and frame their questions in a helpful way. Unfortunately I don't know of any good resources for learning how to ask questions.