How to Become a Data Scientist?

Developerking
Aug 6, 2020

Data science is one of the most hyped fields of the 21st century.
Every Tom, Dick and Harry you meet probably knows (or has heard) about it. They will definitely know about Andrew Ng’s ML classes on Coursera, and about weather forecasting using R (if they are serious).

Now, I want you to ask yourself: What happens when this field becomes obsolete or congested? What do you do then? What happens if people no longer require a data scientist? (This day will probably never come; I am just putting forward a hypothetical case.)

It took me a long time to write this answer. I have tried to lay out a comprehensive, holistic approach that makes you not just capable enough to be a data scientist, but much more than that.

Step 1: Get your mathematics strong and clear.

The following topics must be learnt with utmost clarity:

  1. Linear Algebra: Start from basic vector spaces and go up to singular value decomposition. Don’t underestimate the Gram-Schmidt orthogonalization process.
  2. Matrix Theory: Learn matrix inversion, transposition and multiplication, determinants, and eigenvalues and eigenvectors.
  3. Calculus: Integrals, differentiation and differential equations.
  4. Numerical Analysis: Numerical methods for solving differential equations and evaluating integrals.
  5. Statistics: Distributions, different kinds of charts, and mean, median and mode (the different methods of finding each and the relation between them).
  6. Probability: Basic probability theory, Bayes’ theorem, likelihood and expectation.
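
To get a feel for what the numerical analysis item covers, here is a minimal sketch in plain Python (no libraries needed) of two classic methods: the trapezoid rule for an integral and the forward Euler method for a differential equation. The function names and the two test problems are chosen purely for illustration; they do not come from any of the courses or books mentioned in this post.

```python
import math


def trapezoid(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with n equal-width trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))  # the two endpoints get half weight
    for i in range(1, n):
        total += f(a + i * h)    # interior points get full weight
    return total * h


def euler(f, t0, y0, t_end, n=1000):
    """Approximate y(t_end) for y' = f(t, y), y(t0) = y0, with n forward Euler steps."""
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        y += h * f(t, y)  # follow the slope at the current point for one step
        t += h
    return y


# Integral of sin(x) on [0, pi]; the exact value is 2.
area = trapezoid(math.sin, 0.0, math.pi)

# y' = -y with y(0) = 1; the exact solution is exp(-t), so y(1) = 1/e.
y_at_1 = euler(lambda t, y: -y, 0.0, 1.0, 1.0)

print(f"integral of sin on [0, pi] ~ {area:.6f} (exact 2)")
print(f"y(1) ~ {y_at_1:.6f} (exact {math.exp(-1):.6f})")
```

Try lowering n and watch both answers drift away from the exact values. That drift is the kind of error a numerical analysis course teaches you to reason about.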

Don’t overdo the math. These suggestions assume that you are already strong in the high school math syllabus.
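
A quick self-check for the statistics and probability items in the list: the sketch below uses Python's built-in statistics module to find the mean, median and mode of a small sample, then applies Bayes' theorem to a made-up screening test. Every number here is hypothetical, and bayes_posterior is just an illustrative helper name.

```python
import statistics

# Hypothetical sample, picked so that mean, median and mode all differ.
data = [2, 3, 3, 5, 7, 10]
mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # average of the two middle values here
mode = statistics.mode(data)      # most frequent value


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) by Bayes' theorem: P(E | H) * P(H) / P(E)."""
    # P(E) sums over both ways the evidence can show up: H true or H false.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical test: 1% base rate, 95% sensitivity, 5% false-positive rate.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"P(condition | positive test) = {posterior:.3f}")
```

If the posterior looks surprisingly low for a test that sounds accurate, work through the formula by hand until it makes sense. That intuition matters more than memorising the theorem.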

Step 2: Start reading blogs on data science:

  1. KDnuggets, a hub for data science, ML and AI related posts
  2. Analytics Vidhya
  3. Anil Batra’s Blog
  4. BzST
  5. Data Science 101
  6. DataTau

For more go here: 90+ blogs on Analytics, Data Science etc.

Step 3: Keep up the learning spirit.

Keep yourself updated. Keep reading research papers, and dig further into the sections that you don’t understand.

Google all the words and phrases you don’t understand. This process can be frustrating and might force you to go “Fuck this shit! I am gonna go study something simpler”. But believe me when I say that if you cross this phase, nothing can ever beat you.

Be a regular visitor of arXiv and read papers on a regular basis. If you are on Android, arXiv is available as an Android app too.

Step 4: Start some serious data science.

(By data science I mean fuzzy logic, ML, neural nets, AI, NLP etc.)

  1. ML by Andrew Ng at Stanford (Please avoid the one at Coursera).
  2. ML by Yaser Abu-Mostafa at Caltech.
  3. ML at NPTEL at IIT-M (For people outside India, NPTEL is an Indian site where students can see lectures by awesome professors from IITs and other institutes of national importance).
  4. Introduction to Data Analytics at Udacity.
  5. Introduction to Deep learning at Udacity by Google.
  6. Introduction to NLP using NLTK in Python.
  7. Introduction to Neural Networks: Book by Simon Haykin. Also learn how to use the neural network toolbox of MATLAB; it comes in really handy for visualising concepts.
  8. Introduction to Fuzzy Logic: Book by Timothy Ross. Same instructions as above.
  9. Learning data science concepts and R (or Python) simultaneously at DataCamp.
  10. CodeMentor: I recently came across this site and fell in love with it. Check it out.

Step 5: The real action starts now.

Now that you are thorough with the basics, dive into making cool stuff with R, Python or any language of your choice.

First and foremost let me give a very useful piece of advice.

DO NOT TAKE UP A HUGE PROJECT AND END UP DOING NOTHING.

Because, let’s face it, you (and I too) don’t have enough patience to sit and face the same dead end a hundred times.

Instead, take up a project and implement one functionality a day. This way you will have a balanced diet for your growing data scientist brain.

  1. Learn NumPy.
  2. Learn Pandas.
  3. Learn Matplotlib, Jupyter and Seaborn.

These are the basic Python modules for data science.
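
A first session with these modules might look like the sketch below. It assumes NumPy and pandas are already installed, and the little hours-versus-score table is made up purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: hours studied vs. exam score for five students.
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, 4.0, 5.0],
    "score": [52.0, 58.0, 65.0, 70.0, 78.0],
})

# pandas: count, mean, std, min, quartiles and max for every column in one call.
summary = df.describe()

# NumPy: least-squares fit of a straight line, score = slope * hours + intercept.
slope, intercept = np.polyfit(df["hours"].to_numpy(), df["score"].to_numpy(), deg=1)

# pandas: store the fitted values and the residuals as new columns.
df["fitted"] = slope * df["hours"] + intercept
df["residual"] = df["score"] - df["fitted"]

print(summary.loc[["mean", "std"]])
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(df)
```

Once this runs, calling df.plot.scatter(x="hours", y="score") in a Jupyter notebook should draw the points through Matplotlib, which makes a natural next thing to try.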

I believe learning R is far easier than learning Python.

R ships with many of the statistics packages you need. The add-on packages below are worth installing as well:

  1. stringr (string manipulation)
  2. Database connection packages: RPostgreSQL, RMySQL, RMongo, RODBC, RSQLite
  3. lubridate (time and date manipulation)
  4. ggplot2 (data visualization)
  5. qcc (statistical quality control and QC charts)
  6. reshape2 (data restructuring)
  7. plyr (data aggregation)
  8. dplyr (data manipulation)

There are many more packages; you may explore them in your own time.

Tips and Tricks:

  • To save yourself the hassle of installing so many modules every time in Python, use WinPython. It has most of the modules pre-installed and you can also use it in portable mode.
  • For R, download the <package name>-release.zip from CRAN and save it in a local directory. However, you will need an internet connection to install it; for that you can simply tether your phone’s data to your PC (if you don’t have Wi-Fi).
  • For downloading any package, simply search “<package name> Python module” or “<package name> R package” on Google.
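
Before downloading anything, it can help to check what your Python installation already has. The sketch below uses only the standard library to look for the Python modules named in this post. The WANTED list and the missing_modules helper are illustrative names, and the pip command it prints is a suggestion for you to run yourself, not something the script executes.

```python
import importlib.util

# Import names of the Python modules mentioned in this post.
WANTED = ["numpy", "pandas", "matplotlib", "seaborn", "jupyter"]


def missing_modules(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(WANTED)
if missing:
    # Only print the command, so running this script never changes your setup.
    print("Missing:", ", ".join(missing))
    print("Try: python -m pip install " + " ".join(missing))
else:
    print("All listed modules are already installed.")
```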

After you are done with the above, and if you still have some juice left, dive into the following:

  1. Data Visualization: d3.js, Tableau, QlikView.
  2. Learn Julia.
  3. Compete at Kaggle.
  4. Learn MongoDB.
