DataHack Hour Revealed – the best way to learn data science through hands on problems!
Machine Learning
Introduction As part of DataFest 2017, we launched a new initiative – DataHack Hour. DataHack Hour was inspired by numerous queries we get related …
Behavioral Analytics : When Psychology collides with analytics
Business Analytics
Introduction Today’s post is going to be very different from all the posts I have published till now. For the past few months, I have …
40 Questions to test a data scientist on Deep Learning [Solution: SkillPower – Deep Learning, DataFest 2017]
Deep Learning
Introduction Deep Learning has made many practical applications of machine learning possible. Deep Learning breaks down tasks in a way that makes all kinds …
Analytics Vidhya turns 4 – A journey from a part-time blog to Top Data Science Knowledge Portal
Analytics Vidhya turns 4 today Analyticsvidhya.com was registered on this day 4 years ago. In these 4 years, what started as a part time …
40 Questions to test a data scientist on Time Series [Solution: SkillPower – Time Series, DataFest 2017]
Business Intelligence
Python
R
Introduction Time Series forecasting & modeling plays an important role in data analysis. Time series analysis is a specialized branch of statistics used extensively …
40 Questions on Probability for data science – [Solution: SkillPower – Probability, DataFest 2017]
Business Analytics
Introduction Probability forms the backbone of many important data science concepts from inferential statistics to Bayesian networks. It would not be wrong to say …
Deep Learning vs. Machine Learning – the essential differences you need to know!
Deep Learning
Machine Learning
Introduction Machine learning and deep learning are all the rage! All of a sudden everyone is talking about them – irrespective of whether they …
Feature Engineering in IoT Age – How to deal with IoT data and create features for machine learning?
Machine Learning
Introduction If you ask any experienced analytics or data science professional, what differentiates a good model from a bad model – chances are that …
Winner’s Approach – Rampaging DataHulk MiniHack, AV DataFest 2017
Machine Learning
Introduction Who are you competing with? While participating in a hackathon, a lot of people think that they are competing against the top data …
Moving beyond frontiers in Data Science – Interview with Mahesh Kumar, Founder & CEO, Tiger Analytics
Introduction This DataFest, we are bringing thought leaders & influencers from industry as part of our interview series – Moving Beyond Frontiers in Data Science section. We …
Natural Language Processing Made Easy – using SpaCy (in Python)
Machine Learning
Python
Introduction Natural Language Processing is one of the principal areas of Artificial Intelligence. NLP plays a critical role in many intelligent applications such as …
AV DataFest 2017 – The Panel discussion, Knowledge Intensive Webinars and Prize details!
Introduction If something is important to you, you will try it even when the odds are against you.  – Elon Musk As we rush …
Measuring Audience Sentiments about Movies using Twitter and Text Analytics
Business Analytics
Introduction The practice of using analytics to measure a movie’s success is not a new phenomenon. Most of these predictive models are based on structured …
Extracting information from reports using Regular Expressions Library in Python
Machine Learning
Python
Introduction Many times it is necessary to extract key information from reports, articles, papers, etc. For example names of companies – prices from financial …
TensorFlow 101: Understanding Tensors and Graphs to get you started in Deep Learning
Deep Learning
Introduction TensorFlow is one of the most popular libraries in Deep Learning. When I started with TensorFlow it felt like an alien language. But …
Beginner’s Guide on Web Scraping in R (using rvest) with hands-on example
Machine Learning
R
Introduction Data and information on the web is growing exponentially. All of us today use Google as our first source of knowledge – be it …
Big Data Learning Path for all Engineers and Data Scientists out there
Big data
Machine Learning
Introduction The field of big data is quite vast and it can be a very daunting task for anyone who starts learning big data …
How I created a package in R & published it on CRAN / GitHub (and you can too)?
Machine Learning
R
Introduction Most popular programming languages have one thing in common – they are all “Open source”. Open source is a decentralised development model which is based …
40 Must know Questions to test a data scientist on Dimensionality Reduction techniques
Machine Learning
Introduction Have you come across a dataset with hundreds of columns and wondered how to build a predictive model on it? Or have come …
AV DataFest 2017 – Out in its Full Glory
AV DataFest 2017 – Here we begin !! This April, the world will see a battle fought by data scientists and data managers across …
How to handle Imbalanced Classification Problems in machine learning?
Machine Learning
R
Introduction If you have spent some time in machine learning and data science, you would have definitely come across imbalanced class distribution. This is …
Introduction to Conditional Probability and Bayes theorem for data science professionals
Business Analytics
Introduction An understanding of probability is a must for a data science professional. Solutions to many data science problems are often probabilistic in nature. Hence, a better …
Celebrating Women’s Day: 33 Women in Data Science from around the World & AV Community
Introduction She believed she could. So, she did. This Women’s Day we are celebrating women power. We are celebrating all those women who …
Introduction to Gradient Descent Algorithm (along with variants) in Machine Learning
Deep Learning
Machine Learning
Python
R
Introduction Optimization is always the ultimate goal whether you are dealing with a real life problem or building a software product. I, as a …
How to read most commonly used file formats in Data Science (using Python)?
Machine Learning
Python
Introduction If you have been part of the data industry, you would know the challenge of working with different data types. Different formats, different compression, …
Introductory guide on Linear Programming for (aspiring) data scientists
Business Analytics
Introduction Optimization is the way of life. We all have finite resources and time and we want to make the most of them. From …
5 More Deep Learning Applications a beginner can build in minutes (using Python)
Deep Learning
Python
Introduction Deep Learning is fundamentally changing everything around us. A lot of people think that you need to be an expert to use the power of …
Interview with Harish Subramanian, Program Director, PGP- Big Data Analytics by GLIM
Big data
Introduction Big data is being generated all around us. Every social media exchange, every digital process, every connected device and machine are generating data …
How to leverage Social Media Analytics for your business?
Business Intelligence
Introduction Conventional media, such as television, radio or newspapers transmits information only in one direction. Users can consume the information which the media offers, …
Brace Yourself – DATAFEST 2017 is coming & Call for AV Volunteers!
The start Big things often have small beginnings. What is common between Richard Branson, Pierre Omidyar, Mark Zuckerberg and Colonel Sanders? They all started small, …
Top 28 Cheat Sheets for Machine Learning, Data Science, Probability, SQL & Big Data
Big data
Machine Learning
Python
R
Introduction Data Science is an ever-growing field, there are numerous tools & techniques to remember. It is not possible for anyone to remember all …
How to build Ensemble Models in machine learning? (with code in R)
Machine Learning
R
Introduction Over the last 12 months, I have been participating in a number of machine learning hackathons on Analytics Vidhya and Kaggle competitions. After …
40 Questions to ask a Data Scientist on Ensemble Modeling Techniques (Skilltest Solution)
Machine Learning
Python
R
Introduction Ensemble modeling is a powerful way to improve the performance of your machine learning models. If you wish to be on the top …
6 Deep Learning Applications a beginner can build in minutes (using Python)
Deep Learning
Python
Introduction Deep Learning has been the most researched and talked about topic in data science recently. And it deserves the attention it gets, as some …
40 Questions to test a Data Scientist on Clustering Techniques (Skill test Solution)
Business Analytics
R
Introduction The idea of creating machines which learn by themselves has been driving humans for decades now. For fulfilling that dream, unsupervised learning and …
40 must know Questions on Base SAS for Analysts (Skill test Solution)
Business Analytics
Introduction SAS probably holds the highest market share in analytics solutions for enterprises. With its good data handling and graphical capabilities, SAS is an …
Basics of Probability for Data Science explained with examples
Business Analytics
Introduction Statistically, the probability of any one of us being here is so small that you’d think the mere fact of existing would keep …
Comprehensive & Practical Inferential Statistics Guide for data science
Business Analytics
Introduction Statistics is one of the key fundamental skills required for data science. Any expert in data science would surely recommend learning / upskilling yourself …
45 Questions to test a data scientist on basics of Deep Learning (along with solution)
Machine Learning
Introduction Back in 2009, deep learning was only an emerging field. Only a few people recognised it as a fruitful area of research. Today, …
Infographic – Learning Plan 2017 for beginners in data science
Infographics
Machine Learning
Python
R
Introduction Through this plan, we aim to remove the confusion in learning data science for beginners. The biggest challenge which beginners face while learning data …
Infographic – Learning Plan 2017 for Transitioners in data science
Infographics
Machine Learning
Python
R
Introduction This plan is for people planning a career shift in analytics and data science this year. Entering a new field can be overwhelming. What to …
Infographic – Learning Plan 2017 for Intermediates in data science
Infographics
Machine Learning
Python
R
Introduction We believe, learning should never stop. This plan is for people with basic knowledge of machine learning or deep learning. You can advance …
Introduction to Structuring Customer complaints explained with examples
Machine Learning
Python
Introduction In the past, if you were not particularly happy with a service or a product, you would go to the service provider or the …
21 Steps to Get Started with Apache Spark using Scala
Machine Learning
Introduction If you ask any industry expert, what language should you learn for big data, they would definitely suggest you to start with Scala. …
Comprehensive Guide on t-SNE algorithm with implementation in R & Python
Machine Learning
Python
R
Introduction Imagine you get a dataset with hundreds of features (variables) and have little understanding about the domain the data belongs to. You are expected …
MyStory: How I became a Data Science Hacker from being a Delivery Head
Stories
It was a hot Sunday afternoon in June 2014. I still remember that day and recalling that day still gives me goosebumps. I was …
Simple Beginner’s guide to Reinforcement Learning & its implementation
Machine Learning
Python
Introduction One of the most fundamental questions for scientists across the globe has been – “How to learn a new skill?”. The desire to …
The most comprehensive Data Science learning plan for 2017
Business Analytics
Machine Learning
Python
R
I joined Analytics Vidhya as an intern last summer. I had no clue what was in store for me. I had been following the …
MyStory: How I became a Data Science Analyst from a Software developer?
Stories
Background Don’t let the noise of others’ opinions drown out your own inner voice. -Steve Jobs To be honest, my inner voice always told …
Sentiment Analysis of Twitter Posts on Chennai Floods using Python
Python
Introduction The best way to learn data science is to do data science. No second thought about it! One of the ways, I do …
Ultimate Guide to Understand & Implement Natural Language Processing (with codes in Python)
Machine Learning
Python
According to industry estimates, only 21% of the available data is present in structured form. Data is being generated as we speak, as we tweet, …
46 Questions on SQL to test a data science professional (Skilltest Solution)
Business Analytics
Introduction If there is one language, every data science professional should know – it is SQL. SQL stands for Structured Query Language. It is …
19 MOOCs on Mathematics & Statistics for Data Science & Machine Learning
Business Analytics
R
Introduction Before creation, God did just pure mathematics. Then he thought it would be a pleasant change to do some applied …
How to create Beautiful, Interactive data visualizations using Plotly in R and Python?
Python
R
Introduction The greatest value of a picture is when it forces us to notice what we never expected to see. ―John Tukey Data visualization …
Welcome 2017 – Are you prepared for a year of data based disruption?
We are in exciting and challenging times. The pace of change in data science industry is increasing by the day. It is difficult to …
Top 35 Articles and Resources from Analytics Vidhya for the year 2016
Machine Learning
Python
R
Introduction Reflection time! Yes – it is that time of the year, when you stand and look back. You take a small pause, soak …
[Announcement] Launching Analytics Vidhya glossary & new revamped Job portal
Machine Learning
Introduction As 2016 comes to a close, we are thinking about one and one thing only- how to make Analytics Vidhya more useful for …
Who is the Superhero of Cricket battlefield? An In-Depth Analysis
Business Analytics
Introduction The cricket battlefield is competitive and challenging. Players have become extremely professional and disciplined about their training. Companies have optimized the weight of …
Artificial Intelligence Demystified
Machine Learning
Introduction Artificial Intelligence has become a very popular term today. There is sure to be at least one article in the newspaper daily on …
30 Top Videos, Tutorials & Courses on Machine Learning & Artificial Intelligence from 2016
Machine Learning
Python
R
Introduction 2016 has been the year of “Machine Learning and Deep Learning”. We have seen the likes of Google, Facebook, Amazon and many more …
Data Pre-Processing: A Crucial Element of Analytics – Driven Embedded Systems
Business Analytics
Business Intelligence
Introduction The goal of the Internet of Things (IoT) is to acquire data from various embedded systems and impart analytical processes on that data …
45 questions to test a Data Scientist on Regression (Skill test – Regression Solution)
Machine Learning
Python
R
Introduction Regression is much more than just linear and logistic regression. It includes many techniques for modeling and analyzing several variables. This skill test …
Cheatsheet: Scikit-Learn & Caret Package for Python & R respectively
Infographics
Machine Learning
Python
R
Introduction For any Python or R practitioner, this article will prove to be a boon. We provide you cheatsheets for the most widely used machine …
Launching Analytics Vidhya Secret Santa – Kick start 2017 with this gift!
The first things which come to my mind, when I think about Christmas are holidays, family time and festivities! Yes – it is that …
Getting ready for AI based gaming agents – Overview of Open Source Reinforcement Learning Platforms
Machine Learning
Python
Introduction We are living in exciting times. We are all set to create an army of smart machines and robots. Creating these machines has …
21 Deep Learning Videos, Tutorials & Courses on Youtube from 2016
Machine Learning
Python
R
Introduction Until a few years back, deep learning was considered of lesser importance as compared to machine learning. The emergence of neural networks & …
Exclusive AMA with Data Scientist – Sebastian Raschka
Machine Learning
Python
Introduction At Analytics Vidhya, we are always in pursuit of providing you learning and networking opportunities. We bring you closer to the best data scientists …
10 Super exciting Data Science / Machine Learning / Artificial Intelligence based startups in India
Business Analytics
Machine Learning
Introduction Data technologies have been around for some time now. But, increase in data generation and availability of servers on the cloud has enabled …
Cheatsheet – Excel Functions & Keyboard Shortcuts
Business Analytics
Introduction What is the most commonly used tool in the data industry? You might have guessed it because of the title of the article – …
Practical guide to implement machine learning with CARET package in R (with practice problem)
Machine Learning
R
Introduction One of the biggest challenges beginners in machine learning face is which algorithms to learn and focus on. In case of R, the …
Analytics Roadshow with UpGrad & IIIT-Bangalore (3 Dec ’16 – 11 Feb ’17)
At Analytics Vidhya, we love evangelizing Analytics and Data Science. One of the biggest reasons behind creating Analytics Vidhya was to address the knowledge …
Medium.com – Top 14 handles & publications to follow for Data Science
Machine Learning
Introduction Medium is an awesome product! The easy interface, no distraction and high readability are some of the drivers of popularity of Medium.  I …
45 questions to test Data Scientists on Tree Based Algorithms (Decision tree, Random Forests, XGBoost)
Machine Learning
Introduction Tree Based algorithms like Random Forest, Decision Tree, and Gradient Boosting are commonly used machine learning algorithms. Tree based algorithms are often used …
21 Reasons why you should NOT become a Data Scientist
Introduction Time for some Friday Fun! In the last few years, the growth of Data Scientists has been following the growth in data. You …
Introduction to Feature Selection methods with an example (or how to select the right variables?)
Machine Learning
Python
R
Introduction One of the best ways I use to learn machine learning, is by benchmarking myself against the best data scientists in competitions. It …
Building a machine learning / deep learning workstation for under $5000
Big data
Machine Learning
Introduction Building a machine learning / deep learning workstation can be difficult and intimidating. There are so many choices out there. Would you go …
In talk with Manvender Singh, CEO – UpX Academy – Taking Data Science certification to new heights
Big data
Business Analytics
Introduction I have been following the startup ecosystem in Data Science education sector very closely for some time now. Recently, I came across UpX Academy, …
MyStory: I became a Data Scientist after 8 years working as a Software Test Engineer
Stories
Background I am Bindhya Rajendran, an Electronics and Communication Engineer, with more than 8 years of experience in Quality assurance and an aspiring Analytics …
25+ websites to find datasets for data science projects
Big data
Business Analytics
Business Intelligence
Machine Learning
Introduction If there is one sentence, which summarizes the essence of learning data science, it is this: The best way to learn data science …
Fine-tuning a Keras model using Theano trained Neural Network & Introduction to Transfer Learning
Machine Learning
Python
Introduction We have seen the in-depth detailed implementation of neural networks in Keras and Theano in the previous articles. I think both the libraries are …
Solutions for Skilltest Machine Learning : Revealed
Machine Learning
Python
R
Introduction Automation and Intelligence have always been a driving force for technological advancements. Techniques like machine learning enable these advancements in every domain possible. …
An Introduction to APIs (Application Programming Interfaces) & 5 APIs a Data Scientist must know!
Machine Learning
Introduction If you are in the tech domain, you will invariably bump into references to something called an “API”. You just can’t skip it – …
Exclusive Interview with Data Scientist – Bishwarup Bhattacharjee (Analytics Vidhya Rank 8)
Machine Learning
Introduction Energy and Persistence conquers all things! …
8 Interesting Data Science Games to break the ice & Monday Blues!
Machine Learning
Python
R
Introduction All of us have been there – coming to office after a hectic weekend trip or a late night binge on Sunday! It …
Tryst with Deep Learning in International Data Science Game 2016
Machine Learning
Stories
Introduction Proof of the pudding lies in the eating. It takes working on the Deep network and witnessing it progressively produce good accuracy, to …
Creating an artificial artist: Color your photos using Neural Networks
Machine Learning
Introduction Art has always transcended eons of human existence. We can see its traces from pre-historic time as the Harappan art in the Indus …
An Introduction to Clustering and different methods of clustering
Business Analytics
Machine Learning
Introduction Have you come across a situation when a Chief Marketing Officer of a company tells you – “Help me understand our customers better …
Investigation on handling Structured & Imbalanced Datasets with Deep Learning
Machine Learning
Python
Introduction While Deep Learning has shown remarkable success in the area of unstructured data like image classification, text analysis and speech recognition, there is …
Complete Study of Factors Contributing to Air Pollution
Business Analytics
Machine Learning
R
Introduction Air pollution is one of the main causes of death in the world. Several cities are on the radar of WHO, which …
18 New Must Read Books for Data Scientists on R and Python
Machine Learning
Pandas
Python
R
Introduction “It’s called reading. It’s how people install new software into their brain” Personally, I haven’t learnt as much from videos & online tutorials …
Winners Approach & Codes from Knocktober : It’s all about Feature Engineering!
Machine Learning
Python
R
Introduction If you don’t challenge yourself, you will never realize what you can become Knocktober – the machine learning competition held last weekend sure made history. …
17 Ultimate Data Science Projects To Boost Your Knowledge and Skills (& can be accessed freely)
Machine Learning
Python
R
Introduction Data science projects offer you a promising way to kick-start your analytics career. Not only do you get to learn data science by applying, you …
Complete Guide on DataFrame Operations in PySpark
Machine Learning
Python
Introduction In my first article, I introduced you to basic concepts of Apache Spark like how it works, different cluster modes in Spark …
Winning Strategies for ML Competitions from Past Winners
Machine Learning
Python
R
Introduction We launched Knocktober last night and we were happy to see the excitement it has created among all the participants. This time we …
16 New Must Watch Tutorials, Courses on Machine Learning
Machine Learning
Python
R
Introduction Most of us fail to acknowledge that Youtube has a massive resource center of machine learning tutorials which are free to access. You no longer …
Creating Interactive data visualization using Shiny App in R (with examples)
Machine Learning
R
Introduction There is magic in graphs. The profile of a curve reveals a whole situation in a flash – history of an epidemic, a …
Tutorial: Optimizing Neural Networks using Keras (with Image recognition case study)
Machine Learning
Python
Introduction In my previous article, I discussed the implementation of neural networks using TensorFlow. Continuing the series of articles on neural network libraries, I have …
Exclusive Interview & AMA with Data Scientist – Rohan Rao (Analytics Vidhya Rank 4)
Machine Learning
Introduction There are several aspects to learning a new technical skill. You obviously need to learn the technical stuff, the applications, the hacks and obviously …
Winner’s Solution from the super competitive “The Ultimate Student Hunt”
Machine Learning
Python
R
Introduction The Ultimate Victory in a competition is derived from the inner satisfaction, of knowing that you have done your best and made most …
Using PySpark to perform Transformations and Actions on RDD
Big data
Machine Learning
Python
Introduction In my previous article, I introduced you to the basics of Apache Spark, different data representations (RDD / DataFrame / Dataset) and basics …
An Introduction to Implementing Neural Networks using TensorFlow
Machine Learning
Python
Introduction If you have been following Data Science / Machine Learning, you just can’t miss the buzz around Deep Learning and Neural Networks. Organizations are …
Most Active Data Scientists, Free Books, Notebooks & Tutorials on Github
Machine Learning
Python
R
Introduction “Who’s your favorite data scientist?” asked the recruiter. None of the candidates could give a satisfactory answer. Maybe they thought becoming a data …
A Beginner’s guide to Shelf Space Optimization using Linear Programming
Business Analytics
Python
Introduction Have you ever wondered why products in a Retail Store are placed in a certain manner? In the world of analytics, where retail …
AI startups are in the money: What are you doing?
Big data
Business Analytics
Machine Learning
Introduction You are either investing in AI or you are not. If you are, you are making a bet which might continue to pay …
Solutions for Skill test: Data Science in Python
Pandas
Python
Introduction Python is gaining ground very quickly among the data science community. We are increasingly moving to an ecosystem, where data scientists are comfortable …
18 Free Exploratory Data Analysis Tools For People who don’t code so well
Big data
Business Analytics
Business Intelligence
Introduction Some of these tools are even better than programming (R, Python, SAS) tools. All of us are born with special talents. It’s just …
This Machine Learning Project on Imbalanced Data Can Add Value to Your Resume
Machine Learning
R
Introduction It takes sheer courage and hard work to become a successful self-taught data scientist or to make a mid career transition. But, with …
Comprehensive Introduction to Apache Spark, RDDs & Dataframes (using PySpark)
Big data
Introduction Industry estimates that we are creating more than 2.5 Quintillion bytes of data every year. Think of it for a moment – 1 Quintillion …
40 Interview Questions asked at Startups in Machine Learning / Data Science
Machine Learning
Introduction Careful! These questions can make you think THRICE! Machine learning and data science are being looked at as the drivers of the next industrial …
Skilltest Statistics II – Solutions
Business Analytics
Introduction Statistics is one of the key ingredients any data scientist must know to have a long successful career in the data science industry. After the …
How to prepare for your first data science hackathon in less than 2 weeks?
Big data
Machine Learning
Hackathons are super fun! The thrill of finding a solution in a time bound, high pressure, competitive situation is addictive. However, if you are …
Manipal Global Academy of Data Science Launches Full Time & Part Time Data Science Program
Big data
Business Analytics
Introduction It is a blessing to see an industry with your passion grow by leaps and bounds. I was probably lucky to get into analytics …
18 Data Science & IoT Startups from Y Combinator School – Summer 2016
Business Analytics
Introduction Y Combinator recently conducted demo day for their Summer session of 2016. As usual, they had some awesome startups, which are bound to impact …
Our new section – Stories and Why I am super excited about them?
I started Analytics Vidhya with an itch to help out as many people as I can – people who needed technical help, people who …
MyStory: I became a Data Scientist after working for 10 years in IT Industry
Stories
Distant Memories (Prologue) Getting into Analytics or Data science stream was never my dream. I got into this by accident. Prior to …
MyStory: How I transitioned to Data Science after 6 years in Data warehousing?
Stories
Background Prior to getting initiated into Data Science, I was working in data intensive Data-warehousing for more than 6 years. In 2013, I had …
A Complete Guide on Getting Started with Deep Learning in Python
Machine Learning
Python
Introduction Deep Learning, a prominent topic in Artificial Intelligence domain, has been in the spotlight for quite some time now. It is especially known …
Full Solution – Skilltest on R for Data Science
R
Introduction R is the most commonly used tool in the analytics industry today. No doubt, Python is catching up quickly. Many companies which were heavily reliant …
Solutions for Skilltest in Statistics Revealed
Business Analytics
Introduction Statistics is one of the founding pillars for a career in data science and business analytics. Unless a person understands the basics of …
10 Real World Applications of Internet of Things (IoT) – Explained in Videos
Big data
Introduction Do you know what separates humans from other living beings? Curiosity. Humans are curious. We question a lot. We are the ones who …
Beginners Guide to Topic Modeling in Python
Business Analytics
Python
Introduction Analytics Industry is all about obtaining the “Information” from the data. With the growing amount of data in recent years, that too mostly …
Bringing Analytics into Indian Film Industry with Back Tracing Algorithm
Business Analytics
Introduction With a turnover of 2.23 Billion USD and overall marketing spend of roughly 50% per film, the film business in India is one …
Industry Insight – Fighting Cyber Fraud with Analytics
Business Analytics
Introduction Dave Palmer, CTO of Darktrace, a global leader in cyber threat defence, believes that technological progress has propelled society into a “golden …
Launch of AV Casino – An Introduction to Probability
Business Analytics
Over the last few months, you would have seen us experimenting with various formats to aid learning and knowledge exchange among our community members. …
Winner’s Secrets Decoded from “The Smart Recruits”
Machine Learning
Introduction Lao Tzu’s philosophy matches our thoughts behind AV hackathons. We believe knowledge can only be useful when it is applied and tested time and …
Practicing Machine Learning Techniques in R with MLR Package
Machine Learning
R
Introduction In R, we often use multiple packages for doing various machine learning tasks. For example: we impute missing values using one package, then build a …
Innovation in Analytics Education: Great Lakes using mentored learning for Online Courses
Business Analytics
Analytics education industry is increasingly becoming a competitive landscape. This is primarily fuelled by the fact that Analytics & Data Science industry is one …
The Evolution and Core Concepts of Deep Learning & Neural Networks
Machine Learning
Introduction With the evolution of neural networks, various tasks which were considered unimaginable can be done conveniently now. Tasks such as image recognition, speech recognition, …
Tutorial – Data Science at Command Line with R & Python (Scikit Learn)
Machine Learning
Python
R
Introduction The thought of doing Data Science at Command Line may possibly cause you to wonder, what new devilry is that? As if, it …
Making Predictions on Test Data after Principal Component Analysis in R
Machine Learning
R
Introduction This is an update of my previous article on Principal Component Analysis in R & Python. After having received several requests on describing …
How to start applying for Analytics / Data Science Masters in the US Universities?
Big data
Business Analytics
Introduction Planning a masters program in data science in US? But, not completely aware of the application process? Or afraid of the application process? …
12 Winning Tips to Clinch Your First Win in Data Science Competitions
Machine Learning
Introduction So, what are you doing this weekend ? We have an amazing opportunity you wouldn’t want to miss (if you are crazy about machine …
20 Challenging Job Interview Puzzles which every analyst should solve at least once
Business Analytics
Introduction In the current scenario, getting your first break into analytics can be difficult. Around 30% of analytics companies (especially the top ones) evaluate candidates on …
Practical Guide on Data Preprocessing in Python using Scikit Learn
Business Analytics
Machine Learning
Python
Introduction This article primarily focuses on data pre-processing techniques in python. Learning algorithms have affinity towards certain data types on which they perform incredibly well. …
Going Deeper into Regression Analysis with Assumptions, Plots & Solutions
Business Analytics
Machine Learning
Introduction “All models are wrong, but some are useful” – George Box. Regression analysis marks the first step in predictive modeling. No doubt, it’s fairly easy to …
3 Must Know Analytical Concepts For Every Professional / Fresher in Analytics
Business Analytics
Introduction The use of analytical methods has gained immediate importance in the last few years. The practice of gaining useful insights from data has helped several …
10 Analytics / Data Science Masters Program by Top Universities in the US
Big data
Business Analytics
Machine Learning
Introduction Doing Post-graduation in the United States of America (USA) is a dream of countless students across the world. Every year, millions of students worldwide appear in examinations like …
Using Platt Scaling and Isotonic Regression to Minimize LogLoss Error in R
Machine Learning
R
Introduction This article is best suited for people who actively (or are aspiring to) participate in data science / machine learning competitions and try …
Tapping Twitter Sentiments: A Complete Case-Study on 2015 Chennai Floods
Business Analytics
Introduction We did this case study as a part of our capstone project at Great Lakes Institute of Management, Chennai. After we presented this study, …
Solving Case study : Optimize the Products Price for an Online Vendor (Level : Hard)
Business Analytics
R
Introduction Solving case studies is a great way to keep your grey cells active. You get to use math, logic and business understanding in …
Learning Path : Step by Step Guide for Beginners to Learn SparkR
Big data
R
Introduction Lately, I’ve been reading the book Data Scientist at Work to draw some inspiration from successful data scientists. Among other things, I found that most …
12 Free Mind Mapping Tools For a Data Scientist To Enhance Structured Thinking
Business Analytics
Introduction Let us start this with a simple exercise, the kind of which every data scientist faces regularly: You have been appointed as a …
Operations analytics case study (level : hard)
Business Analytics
In the previous few articles (beginner, intermediate and queuing theory), we have completed a variety of case studies used in operation analytics. One of which …
Bayesian Statistics explained to Beginners in Simple English
Business Analytics
R
Introduction Bayesian Statistics continues to remain incomprehensible in the ignited minds of many analysts. Being amazed by the incredible power of machine learning, a …
Winners of Mini DataHack (Time Series) – Approach, Codes and Solutions
Business Analytics
Python
R
Introduction It takes sheer commitment and knowledge to build a predictive model in 3 hours. The motive of this competition was to make people …
11 Must Read Books This Summer on Internet of Things (IoT)
Big data
Business Analytics
Introduction Imagine a world where your car texts you saying, ‘You didn’t close the back door properly. Please come and do it before it’s …
9 Challenges on Data Merging and Subsetting in R & Python (for beginners)
Business Analytics
Python
R
Introduction Juggling with multiple data sets is a common task for a data scientist. And, it’s immensely important for a beginner or intermediate to learn this …
Exclusive Python Tutorials & Talks from PyCon 2016 Portland, Oregon
Big data
Business Analytics
Machine Learning
Python
Introduction Working with Python has always been a good experience for me. Not just because of its easy code syntax, but due to its phenomenal community support. …
Getting Started with Big Data Integration using HDFS and DMX-h
Big data
Introduction The data researchers no longer depend only on interviews, surveys, observational studies to collect data. Instead, they have switched to the faster ways of data …
Quick Guide to Build a Recommendation Engine in Python
Machine Learning
Python
Introduction This could help you in building your first project! Be it a fresher or an experienced professional in data science, doing voluntary projects …
8 Reasons Why Analytics / Machine Learning Models Fail To Get Deployed
Big data
Business Analytics
Introduction Don’t be a data scientist whose models fail to get deployed! An epic example of model deployment failure is from Netflix Prize Competition. In …
Learning Path for Developers & IT Professionals to become a Data Scientist
Business Analytics
Introduction This guide is meant to help web developers, software engineers and other IT industry people to transition into analytics / data science industry. Last week, I …
Infographic: 16 Genius Minds Whose Inventions Made Data Science Easier For Us
Big data
Business Analytics
Introduction Did you know that the concept of Regression was invented almost 2 centuries ago ? Neither did I, until I decided to step into the glorious …
A comprehensive beginner’s guide to start ML with Amazon Web Services (AWS)
Big data
Business Analytics
Introduction Learn to connect AWS instance with your laptop / desktop for faster computation! Do you struggle with working on big data (large data sets) …
Solve Interview Case Studies 10x Faster Using Dynamic Programming
Business Analytics
R
Introduction The ability to solve case studies comes with regular practice. Many a time, if you find yourself failing at thinking like a pro, perhaps, it’s just …
Use H2O and data.table to build models on large data sets in R
Machine Learning
R
Introduction Last week, I wrote an introductory article on the package data.table. It was intended to provide you a head start and become familiar with …
Winners Talk: Top 3 Solutions of The Seer’s Accuracy Competition
Machine Learning
R
Introduction Surprises arrive when you expect them least to arrive. The Seer’s Accuracy turned out to be a challenging surprise for data scientists.  So, what changed …
19 Data Science Tools for people who aren’t so good at Programming
Business Analytics
Business Intelligence
Machine Learning
Introduction Programming is an integral part of data science. Among other things, it is considered that a mind which understands programming logic, loops, functions …
data.table() vs data.frame() – Learn to work on large data sets in R
Machine Learning
R
Introduction R users (mostly beginners) struggle helplessly while dealing with large data sets. They get haunted by repetitive warnings, error messages of insufficient memory …
How to predict waiting time using Queuing Theory ?
Business Analytics
R
Introduction Queuing Theory, as the name suggests, is a study of long waiting lines done to predict queue lengths and waiting time. It’s a …
15 Must Read Books for Entrepreneurs in Data Science
Big data
Business Analytics
Introduction The roots of entrepreneurship are old. But, the fruits were never so lucrative as they have been recently. Until 2010, not many of …
It’s our 3rd Birthday – Come & Celebrate
A big Thank You to our community. We just turned 3! On this day, 3 years back – I started Analytics Vidhya. I knew …
Practical Guide to implementing Neural Networks in Python (using Theano)
Machine Learning
Python
Introduction In my last article, I discussed the fundamentals of deep learning, where I explained the basic working of an artificial neural network. If …
Case Study For Freshers (Level : Medium) – Call Center Optimization
Big data
Business Analytics
R
Introduction Last week, I introduced you to a classic problem of operational analytics. If you didn’t get a chance to check it, you can …
A Complete Tutorial on Tree Based Modeling from Scratch (in R & Python)
Machine Learning
Python
R
Introduction Tree based learning algorithms are considered to be one of the best and most used supervised learning methods. Tree based methods empower predictive models …
Operational Analytics Case study For Freshers: Call Center optimization
Business Analytics
Introduction I’ve seen freshers struggling to solve case studies during interviews. Do you also find it difficult? Yes? That’s okay. But, since you now have an …
Deep Learning for Computer Vision – Introduction to Convolution Neural Networks
Machine Learning
Python
Introduction The power of artificial intelligence is beyond our imagination. We all know robots have already reached a testing phase in some of the …
New Case Study for Analytics Interviews: Dawn of Taxi Aggregators
Business Analytics
Introduction Cab aggregator is a new business concept in India. Today we have Ola, Uber, Taxi for Sure etc. which not only compete to …
News: Praxis Business School launches PGP Business Analytics Program in Bangalore
Business Analytics
Introduction Praxis Business School launched its one year full time Business Program in 2011. The Praxis brand was new to this domain. When I …
13 Machine Learning & Data Science Startups from Y Combinator Winter 2016
Business Analytics
Introduction Entrepreneur’s inspiration lies in a business idea! If you’ve been planning to build a product, I’d suggest you check these startups first. Maybe, …
Practical Guide to deal with Imbalanced Classification Problems in R
R
Introduction We have several machine learning algorithms at our disposal for model building. Doing data based prediction is now easier than ever before. Whether …
Exploring Recommendation System (with an implementation model in R)
Business Analytics
R
Introduction How do we make recommendations in our lives ? We do it based on our past experiences. Now imagine, what if we start …
How to perform feature selection (i.e. pick important variables) using Boruta Package in R ?
R
Introduction Variable selection is an important aspect of model building which every analyst must learn. After all, it helps in building predictive models free …
Course Review – Big data and Hadoop Developer Certification Course by Simplilearn
Big data
Business Analytics
Introduction There is no question that the Big Data revolution sweeping through the world of business has made its impact on companies big and …
Practical Guide to Principal Component Analysis (PCA) in R & Python
Python
R
Introduction Too much of anything is good for nothing! What happens when a data set has too many variables? Here are a few possible …
Winning Solutions of DYD Competition – R and XGBoost Ruled
Machine Learning
Python
R
Introduction It’s all about an extra mile one is willing to walk! Winning a data science competition requires 2 things: Persistence and Willingness to try …
Fundamentals of Deep Learning – Starting with Artificial Neural Network
Machine Learning
Introduction Did you know the first neural network was discovered in the early 1950s? Deep Learning (DL) and Neural Networks (NN) are currently driving …
What did you miss ? Complete Solution of Mini Hack Excel
Business Analytics
Introduction Excel is a powerful and easy to use tool for data analysis. Often it is seen that the journey of a data analyst begins …
Complete Solution: How I got in Top 11% of Kaggle Telstra Competition ?
Machine Learning
Python
Introduction Telstra is Australia’s largest telecommunications network. Telstra Network Disruptions (TND) Competition ended on 29th February 2016. This was a recruiting competition. At Analytics Vidhya, I’ve …
10 Questions R Users always ask while using ggplot2 package
Business Analytics
R
Introduction Sometimes numbers do have a beautiful story to share! Visualizing data is crucial in today’s world. Without powerful visualizations, it is almost impossible …
Tutorial on 5 Powerful R Packages used for imputing missing values
R
Introduction Missing values are considered to be the first obstacle in predictive modeling. Hence, it’s important to master the methods to overcome them. Though, some …
Complete Guide to Parameter Tuning in XGBoost (with codes in Python)
Machine Learning
Python
Introduction If things don’t go your way in predictive modeling, use XGBoost. XGBoost algorithm has become the ultimate weapon of many data scientists. It’s a highly …
A Complete Tutorial to learn Data Science in R from Scratch
Business Analytics
Machine Learning
R
Introduction R is a powerful language used widely for data analysis and statistical computing. It was developed in early 90s. Since then, endless efforts …
Guide to Build Better Predictive Models using Segmentation
Business Analytics
Introduction We use linear or logistic regression technique for developing accurate models for predicting an outcome of interest. Often, we create separate models for …
Quick Insights: India Analytics and Big Data Salary Report 2016
Big data
Business Analytics
Business Intelligence
Python
R
SAS
Introduction There are several burning questions which run in the minds of experienced / aspiring analytics professionals. The popular ones are: Which …
India Exclusive: Analytics and Big Data Salary Report 2016
Big data
Business Analytics
Python
R
SAS
Introduction Let us go a few years back (remember the pre-iphone era?), none of us imagined we would live in a world full of …
Complete Guide to Parameter Tuning in Gradient Boosting (GBM) in Python
Machine Learning
Python
Introduction If you have been using GBM as a ‘black box’ till now, maybe it’s time for you to open it and see, …
BML Munjal University launches MBA in Business Analytics to create future leaders!
Business Analytics
The need for data science talent is increasing by the day. While this demand is increasing, the supply of the talent with the right …
7 Important Model Evaluation Error Metrics Everyone should know
Machine Learning
Python
Introduction Predictive Modeling works on constructive feedback principle. You build a model. Get feedback from metrics, make improvements and continue until you achieve a desirable …
Free Must Read Books on Statistics & Mathematics for Data Science
Machine Learning
Python
R
Introduction The selection process of data scientists at Google gives higher priority to candidates with strong background in statistics and mathematics. Not just Google, other top companies (Amazon, …
Advanced Learning Path – Now Learn R with Best Online Resources
Business Analytics
Machine Learning
R
Introduction This good news is only for (future) R Users! If you are new to data science, and keen to begin your career, bookmarking this …
Approach and Solution to break in Top 20 of Big Mart Sales prediction
Machine Learning
Python
Introduction Practice problems or data science projects are one of the best ways to learn data science. You don’t learn data science until you …
Step by step guide to building sentiment analysis model using graphlab
Business Analytics
Python
I have been using GraphLab for quite some time now. The first Kaggle competition I used it for was Click Through Rate (CTR) …
What I learnt about Time Series Analysis in 3 hour Mini DataHack?
Machine Learning
Last weekend, I participated in the Mini DataHack by Analytics Vidhya and I learnt more about Time Series in those 3 hours than I …
A comprehensive beginner’s guide to create a Time Series Forecast (with Codes in Python)
Business Analytics
Machine Learning
Python
Introduction Time Series (referred as TS from now) is considered to be one of the less known skills in the analytics space (Even I …
Mini DataHack and the tactics of the three “Last Man Standing”!
Business Analytics
Machine Learning
Python
Introduction: February started on a high for us. “Last Man Standing” saw more than 1600 Data Scientists compete from all over the world making more …
Launching learning path to master D3.js
Business Intelligence
We launched our learning paths last year and to say the least, they were a runaway hit! The aim of these learning paths is …
How to use Multinomial and Ordinal Logistic Regression in R ?
Business Analytics
R
Introduction Most of us have limited knowledge of regression. Of which, linear and logistic regression are our favorite ones. As an interesting fact, regression has extended capabilities to …
A Complete Tutorial on Ridge and Lasso Regression in Python
Machine Learning
Python
Introduction When we talk about Regression, we often end up discussing Linear and Logistic Regression. But, that’s not the end. Do you know there are …
My AMA & our biggest ever hackathon – less than 24 hours away!
If you are a regular here, you would know that my excitement is going through the roof. If you are asking “Why?”, it is …
Top Certification Courses in SAS, R, Python, Machine Learning, Big Data, Spark ( 2015-16 )
Big data
Business Intelligence
Machine Learning
Python
Qlikview
R
SAS
Introduction What could be more convenient than upgrading skills online ? There are plenty of courses / certifications available to kick-start your career in analytics. …
How to use XGBoost algorithm in R in easy steps
Machine Learning
R
Introduction Did you know using XGBoost algorithm is one of the popular winning recipes of data science competitions? So, what makes it more powerful than …
Improvising Hackathon platform, Blogathon, Profile pages, Points and much more
“Continuous improvement is better than delayed perfection!” – Mark Twain. I can’t stop smiling and deeply appreciating this statement from Mark Twain. He has summarized …
Tutorial – Python List Comprehension With Examples
Python
Introduction List comprehension is a powerful and must-know concept in Python. Yet, this remains one of the most challenging topics for beginners. With this …
[Infographic] 10 Popular TV Shows on Data Science and Artificial Intelligence
Infographics
Machine Learning
Introduction “The development of full artificial intelligence could spell the end of the human race.” – Stephen Hawking. The world is now rapidly moving towards …
A Complete Tutorial to Learn Data Science with Python from Scratch
Machine Learning
Pandas
Python
Introduction It happened a few years back. After working on SAS for more than 5 years, I decided to move out of my comfort zone. Being …
20 Powerful Images which perfectly capture the growth of Data Science
Business Analytics
Machine Learning
Introduction Data can’t make your past better. However, it surely can create an awesome future. In recent years, companies have invested millions of dollars …
A Comprehensive Guide to Data Exploration
Business Analytics
Introduction There are no shortcuts for data exploration. If you are in a state of mind, that machine learning can sail you away from …
The Ultimate Plan to Become a Data Scientist in 2016
Business Analytics
Infographics
Machine Learning
Python
Qlikview
R
Introduction Data Scientist is one of the hottest jobs of this decade. The demand for data scientists is much higher than available candidates (Source). …
3 Tricky Puzzles which most people get Wrong in Job Interviews
Business Analytics
Introduction If you ever go for an interview, prepare well for puzzles and guess estimate questions. They’d surely be thrown at you. Sometimes, the …
AV Blogathon is Live – Inspiring a new breed of Data Scientists
Do you also have the passion to write and inspire? Writing is an art which lets you convey a piece of information without …
12 Useful Pandas Techniques in Python for Data Manipulation
Machine Learning
Pandas
Python
Introduction Python is fast becoming the preferred language for data scientists – and for good reasons. It provides the larger ecosystem of a programming language …
Here comes a year full of knowledge & learning!
Dear AVian, 2015 was a year of growth for us – we transformed from a blog to a community of data scientists. We launched our …
New Year Resolutions for a Data Scientist
Business Analytics
Machine Learning
Python
R
Introduction   New Year is not just replacing your table calendar with a new one or waking up next morning rubbing your eyes. It’s celebrating …
8 Proven Ways for improving the “Accuracy” of a Machine Learning Model
Machine Learning
Introduction Enhancing a model’s performance can be challenging at times. I’m sure, a lot of you would agree with me if you’ve found yourself stuck …
Year in Review: Best of Analytics Vidhya from 2015
Business Analytics
Machine Learning
Introduction People say that 90% of startups fail by the time they reach their year 2! I would like to thank you all that …
Kaggle Solution: What’s Cooking ? (Text Mining Competition)
Machine Learning
R
Introduction Tutorial on Text Mining, XGBoost and Ensemble Modeling in R I came across What’s Cooking competition on Kaggle last week. At first, I was intrigued …
A Complete Tutorial on SAS Macros For Faster Data Manipulation
Business Analytics
SAS
Introduction If you’ve been writing the same lines of code repeatedly in SAS, you can stop now. It shouldn’t be as laborious as you’ve …
SQL commands for Commonly Used Excel Operations
Business Intelligence
Introduction Learning SQL after Excel couldn’t be simpler! I’ve spent more than a decade working on Excel. Yet, there is so much to learn. …
Top Business Analytics Programs in India (2015 – 16)
Big data
Business Analytics
Machine Learning
Introduction December stands out for us for multiple reasons – we are planning for 2016 & reflecting back on the fabulous year 2015 has …
A Complete Tutorial on Time Series Modeling in R
Business Analytics
Introduction ‘Time’ is the most important factor which ensures success in a business. It’s difficult to keep up with the pace of time.  But, technology …
10 Machine Learning Algorithms Explained to an ‘Army Soldier’
Machine Learning
Introduction If you think deep, you’d realize the whole process of predictive modeling is a war. Ruthless war. Don’t you believe me? Consider the …
7 Important Ways to Summarise Data in R
Business Analytics
R
Introduction People remain confused when it comes to summarizing data real quick in R. There are various options. But, which one is the best …
Interview with Dr. Bibek Banerjee – Dean, BRIDGE School of Management
Business Analytics
Introduction Analytics Industry in India is expanding fast. More and more companies are introducing analytics in their core processes. This has resulted in increased demand of analytics professionals. …
Do Faster Data Manipulation using These 7 R Packages
Business Analytics
R
Introduction Data Manipulation is an inevitable phase of predictive modeling. A robust predictive model can’t just be built using machine learning algorithms. But, with an …
10 Ultimate Tips and Tricks on Data Visualization in QlikView
Business Intelligence
Qlikview
Introduction QlikView is a popular and simple to learn tool for data visualization. Its simple interface makes it a favorite among newbies in analytics. I loved it …
Hilarious Jokes & Videos on Statistics and Data Science
Big data
Business Analytics
Introduction  Take it Easy. Data Science is Fun-tastic! If you love data science, you’d find many aspects to it. A month back, I found …
Learn to Build Powerful Machine Learning Models with Amazon Service
Big data
Machine Learning
Introduction After using Azure ML last week, I received multiple emails to publish a tutorial on Amazon’s ML. Thankfully, some of my meetings got …
Important Job Roles in Data Science Industry Today – Who Does What ?
Big data
Business Analytics
Business Intelligence
Infographics
Introduction One evening, I was catching up with a friend over a few drinks – let’s call him Jon (name changed). He seemed determined …
Tutorial – Getting Started with GraphLab For Machine Learning in Python
Machine Learning
Python
Introduction GraphLab came as an unexpected breakthrough on my learning plan. After all, ‘ Good Things Happen When You Expect Them Least To Happen’. …
18 Useful Mobile Apps for Data Scientist / Data Analysts
Business Analytics
Introduction Does your passion lie in Data Science / Analytics ? Currently, data science and machine learning are changing the world. Here’s your chance …
Tutorial – Build a simple Machine Learning Model using AzureML
Big data
Machine Learning
Introduction How difficult is it to build a machine learning model on R or Python? For beginners, it’s a Herculean task. For intermediates and …
8 Ways to deal with Continuous Variables in Predictive Modeling
Business Analytics
R
Introduction Let’s come straight to the point on this one – there are only 2 types of variables you see – Continuous and Discrete. …
Simple Methods to deal with Categorical Variables in Predictive Modeling
Business Analytics
Machine Learning
Python
Introduction Categorical variables are known to hide and mask lots of interesting information in a data set. It’s crucial to learn the methods of …
Secrets from winners of our best ever Data Hackathon!
Business Analytics
Machine Learning
Python
R
Introduction One of the books I read in initial days of my career was titled “What got you Here, Won’t Get You There”. While …
The Machine Learning Times of Year 2015 – A Powerful Growth Story
Business Analytics
Machine Learning
Introduction Machine learning is a core, transformative way by which we’re rethinking everything we’re doing. We’re thoughtfully applying it across all our products, be it …
Getting started with Machine Learning in MS Excel using XLMiner
Business Analytics
Machine Learning
Introduction Machine Learning is nothing but building a ‘machine’ which ‘learns’ from its experience. And, becomes better with experience – just like humans. We also …
Improve Your Model Performance using Cross Validation (in Python and R)
Business Analytics
Python
R
Introduction I have closely monitored the series of Data Hackathons and found an interesting trend (shown below). This trend is based on participant rankings on public …
Lifetime Lessons: 20 Things Every Data Scientist Must Know Today
Business Analytics
Machine Learning
Introduction I’ve spent close to a decade in data science & analytics now. Over this period, I have learnt new ways of working on data …
7 Must Watch Documentaries on Statistics and Machine Learning
Big data
Business Analytics
Machine Learning
Introduction “Soon, our habitat will be invaded by unreal humans. Not only they’ll influence our way of living, but also intervene in our modus …
Nobody Tells You – 5 things Big Data ‘CAN’ and ‘Cannot’ Do
Big data
Business Analytics
Introduction   “Big Data makes us smarter, not wiser.” – Tim Leberecht. The term ‘Big Data’ got introduced in 1940s. Companies around the world have put …
Exclusive Interview with SRK, Sr. Data Scientist, Kaggle Rank 25
Business Analytics
Machine Learning
Introduction It took him just 2 years to secure a rank in Kaggle Top 30 from scratch Mr. Sudalai Rajkumar a.k.a SRK, Sr. Data …
10 Must Watch Movies on Data Science and Machine Learning
Business Analytics
Machine Learning
Introduction Some members of our team (including me) live by just 2 passions in life – Data Science & Movies! For us, slicing and …
Quick Introduction to Boosting Algorithms in Machine Learning
Machine Learning
Python
Introduction Lots of analysts misinterpret the term ‘boosting’ used in data science. Let me provide an interesting explanation of this term. Boosting grants power to machine …
Tips for freshers to crack campus interviews for analytics / data science companies
Business Analytics
Business Intelligence
Introduction I’m blessed that my memories last for long. 1st week of December 2011 was a critical week for me. This was the first time when I had …
Free Resources for Beginners on Deep Learning and Neural Network
Business Analytics
Python
Introduction Machines have already started their march towards artificial intelligence. Deep Learning and Neural Networks are probably the hottest topics in machine learning research today. …
Simple Yet Powerful Excel Tricks for Analyzing Data
Business Analytics
Introduction I’ve always admired the immense power of Excel. This software is not only capable of doing basic data computations, but you can also …
Simple Guide to Logistic Regression in R
Business Analytics
R
Introduction Every machine learning algorithm works best under a given set of conditions. Making sure your algorithm fits the assumptions / requirements ensures superior …
6 Practices to enhance the performance of a Text Classification Model
Business Analytics
Introduction A few months back, I was working on creating a sentiment classifier for Twitter data. After trying the common approaches, I was still …
26 Things I Learned in the Deep Learning Summer School
Business Analytics
Introduction This article was originally published at marekrei.com. In the beginning of August, I got the chance to attend the Deep Learning Summer School …
Must Read Books for Beginners on Big Data, Hadoop and Apache Spark
Big data
Introduction   How many of you would agree / disagree to this statement: Google knows and understands you better than what you yourself do? …
13 Tips to make you awesome in Data Science / Analytics Jobs
Business Analytics
Business Intelligence
Introduction I was fortunate to get early opportunities of working on numerous data science projects. I enjoyed this part the most. Even more, when I realized that my …
Must Read Books for Beginners on Machine Learning and Artificial Intelligence
Business Analytics
Python
R
Introduction Machine Learning has granted incredible power to humans. The power to run tasks in an automated manner, the power to make our lives comfortable, the …
Beginner’s guide to Web Scraping in Python (using BeautifulSoup)
Business Analytics
Python
Introduction The need and importance of extracting data from the web is becoming increasingly loud and clear. Every few weeks, I find myself in …
The D Hack, Ask Us Anything with past hackathon winners and practice problems
Business Analytics
Here is a famous quote from John Keats: “Nothing ever becomes real till it is experienced” – John Keats. While Keats might have said it …
Powerful ‘Trick’ to choose right models in Ensemble Learning
Business Analytics
R
Introduction I hope you’ve followed my previous articles on ensemble modeling. In this article, I’ll share a crucial trick helpful to build models using ensemble …
Job Comparison – Data Scientist vs Data Engineer vs Statistician
Big data
Business Analytics
Infographics
Introduction Data Science is a flourishing industry. Countries and companies around the world are continuously experiencing a rush in the amount of data collected. They …
How Amazon re-invented Data Science at Amazon AWS re:Invent 2015?
Big data
Business Analytics
Introduction The ways of analyzing and visualizing data are changing, and we must embrace this change. Let me begin with an 8 line (quick) story. King …
5 Questions which can teach you Multiple Regression (with R and Python)
Business Analytics
Python
R
Introduction A journey of a thousand miles begins with a single step. In a similar way, the journey of mastering machine learning algorithms begins ideally …
Interview with Data Scientist- Gregory Piatetsky Shapiro, Ph.D, President KDnuggets
Business Analytics
Introduction How do you feel when you get a chance to interview your role model? Someone who has not only gone through the …
Quick Guide to learn Statistics for R Users (with Titanic Data Set)
Business Analytics
R
Introduction People are keen to pursue their career as a data scientist. And why shouldn’t they be? After all, this comes with a pride of …
Understanding basics of Recommendation Engines (with case study)
Business Analytics
Introduction Ever wondered, “what algorithm Google uses to maximize its target ads revenue?”. What about the e-commerce websites which guide you through options such as ‘people who …
8 Productivity hacks for Data Scientists & Business Analysts
Business Analytics
Introduction I was catching up with one of my friends from a past organization. She had always been interested in data science, but was …
News – Full Time / Part Time Big Data and Analytics Program at SP Jain School of Global Management
Big data
Business Analytics
Introduction As you read this, freshly generated terabytes of data would have been collected this second and stored in huge database management systems. With new …
Understanding Support Vector Machine algorithm from examples (along with code)
Business Analytics
Python
Introduction Mastering machine learning algorithms isn’t a myth at all. Most of the beginners start by learning regression. It is simple to learn and …
Cheatsheet – 11 Steps for Data Exploration in R (with codes)
Business Analytics
Infographics
R
Introduction If you wish to build an impeccable predictive model, trust me, neither any programming language nor any machine learning algorithm can award it …
Notes, impressions, experience and excitement from PyCon India 2015, Bengaluru
Business Analytics
Python
If there is one conference I love to follow, it is PyCon! We have been following the PyCon U.S. for last 2 years and …
Building a Logistic Regression model from scratch
Business Analytics
R
Do you understand how logistic regression works? If your answer is yes, I have a challenge for you to solve. Here is an …
Beginner’s guide to Design of Experiments (with case study on banner advertisement)
Business Analytics
Introduction When you visit a supermarket, you might feel overwhelmed with the discounts and free gifts that you get with your purchase. Have you ever …
5 Easy questions on Ensemble Modeling everyone should know
Business Analytics
Machine Learning
Introduction If you’ve ever participated in data science competitions, you must be aware of the pivotal role that ensemble modeling plays. In fact, it is …
Damn Good Hiring Path to get yourself hired as a Data Scientist
Business Analytics
Introduction The race to become a data scientist doesn’t end at just mastering R or Python. In fact, it starts from there. A Data …
Hacks to perform faster Text Mining in R
Big data
Business Analytics
R
Introduction Data science demands versatility. Move away from your regular methods, challenge your ways of working, explore new ways of doing things more efficiently. …
Running scalable Data Science on Cloud with R & Python
Business Analytics
Python
R
Introduction The complexity in data science is increasing by the day. This complexity is driven by three fundamental factors: Increased Data Generation – Look around, …
Build a Predictive Model in 10 Minutes (using Python)
Business Analytics
Python
Introduction I came across this strategic virtue from Sun Tzu recently: What has this to do with a data science blog? This is the …
13 Amazing Applications / Uses of Data Science Today
Big data
Business Analytics
Introduction One of the questions people ask me commonly is: Is Big Data / Data Science really a buzz or a once in a …
Your Guide to Master Hypothesis Testing in Statistics
Business Analytics
Introduction – the difference in mindset I started my career as a MIS professional and then made my way into Business Intelligence (BI) followed by Business …
Perfect way to build a Predictive Model in less than 10 minutes
Big data
Business Analytics
R
Introduction In the last few months, we have started conducting data science hackathons. These hackathons are contests with a well defined data problem, which …
24 Ultimate Data Scientists To Follow in the World Today
Business Analytics
Introduction Having a hero / heroine helps you navigate through the difficult times. You look up to them and then think that the problems you …
Cheatsheet – Python & R codes for common Machine Learning Algorithms
Business Analytics
Infographics
Python
R
Introduction In his famous book – Think and Grow Rich, Napoleon Hill narrates the story of Darby, who after digging for a gold vein for …
6 Easy Steps to Learn Naive Bayes Algorithm (with code in Python)
Business Analytics
Python
Introduction Here’s a situation you’ve got into: You are working on a classification problem and you have generated your set of hypotheses, created features and …
Learn Gradient Boosting Algorithm for better predictions (with codes in R)
Business Analytics
R
Introduction The accuracy of a predictive model can be boosted in two ways: Either by embracing feature engineering or by applying boosting algorithms straight …
How Good is the Executive business analytics program by Jigsaw Academy and MISB Bocconi ?
Business Analytics
Introduction The demand of skilled data science / analytics professionals is surging with every bit of data being collected across the globe. Same is …
Startups bringing analytics and data science closer to you!
Big data
Business Analytics
Introduction: Data Science and analytics are changing every industry as you read this article! If you have been following Analytics Vidhya lately, I am …
News: Spring intake for the Analytics program at Praxis Business School
Business Analytics
Introduction The academic scene in analytics continues to heat up. Freshers / Professionals continue to enter analytics industry with a determined objective of becoming …
My recommendations – SlideShare Presentations on Data Science
Business Analytics
Python
R
Introduction Everyone has their own learning style! If you need close hand holding and guidance – an easy going MOOC is probably the best …
Powerful Guide to learn Random Forest (with codes in R & Python)
Business Analytics
Python
R
Introduction Random Forests is a panacea for all data science problems! Random Forest models have risen significantly in their popularity – and for some real good …
3 Compelling reasons why you must compete in Data Hackathon 3.x
Business Analytics
Introduction Give a man a fish – feed him for a day Teach him how to fish – feed him for a life time! …
Learn to use Forward Selection Techniques for Ensemble Modeling
Big data
Business Analytics
R
Introduction Ensemble methods have the ability to provide much needed robustness and accuracy to both supervised and unsupervised problems. Machine learning is going to evolve …
Infographic: Data Visualization Tools For Data scientists & analysts
Business Analytics
Infographics
Introduction Here is a famous quote on learning: We Learn . . . 10% of what we read 20% of what we hear 30% …
Top Datapreneurs who made data science what it is today
Business Analytics
Introduction I am deeply passionate about 2 fields: Data Science and start-ups. I feel data science is the only way to enable logical decisions …
Capstone Projects – Great Lakes Business Analytics Program
Business Analytics
Uncategorized
Introduction I have been associated with the Great Lakes Business Analytics Program as a visiting faculty for some time now. This is one of …
Interactive Data Visualization using Bokeh (in Python)
Business Analytics
Python
Introduction Recently, I was going through a video from the SciPy 2015 conference, “Building Python Data Apps with Blaze and Bokeh”, held at Austin, Texas, USA. …
R-analyst Cheat sheet: Data Visualization in R
Business Analytics
Infographics
R
Introduction Data visualization has become an integral part of data science work flow. Hence, your main tool needs to have strong capabilities on both …
Big Data / Analytics based startups at Y Combinator, Summer 2015 batch
Big data
Business Analytics
If there is one startup accelerator, the tech world keeps a watch on – it is Y Combinator! The accelerator has produced the likes …
Finding Optimal Weights of Ensemble Learner using Neural Network
Big data
Business Analytics
Introduction Encountering ensemble learning algorithm in winning solutions of data science competitions has become a norm now. The ability to train multiple learners on a set …
List of Machine Learning Certifications and Best Data Science Bootcamps
Big data
Business Analytics
Introduction Everyone has a different style of learning. Hence, there are multiple ways to become a data scientist. You can learn from tutorials, blogs, …
Best way to learn kNN Algorithm using R Programming
Business Analytics
R
Uncategorized
Introduction In this article, I’ll show you the application of kNN (k – nearest neighbor) algorithm using R Programming. But, before we go ahead on …
Data scientist hack to find the right Meetup groups (using Python)
Business Analytics
Python
Introduction Data Scientists are a breed of lazy animals! We detest the practice of doing any repeatable work manually. We cringe at the mere thought …
Ultimate app to find the best Data Science resources
Big data
Business Analytics
Business Intelligence
Have you felt at loss in the jungle of data science resources? Did you try finding a resource only to conclude there are too …
11 things you should know as a Data Scientist
Business Analytics
Background During the meetups we conduct, we get a mix of audience. From complete starters in data science to experts in the field, everyone …
7 Types of Regression Techniques you should know!
Business Analytics
Introduction Linear and Logistic regressions are usually the first algorithms people learn in predictive modeling. Due to their popularity, a lot of analysts even …
News – Great Lakes launches Analytics Program in Bangalore, India
Business Analytics
Analytics training landscape in India is evolving quickly and I have been lucky to be involved in this evolution. Outside of Analytics Vidhya, I …
Beginners Guide to learn about Content Based Recommender Engines
Business Analytics
Introduction One of the most surprising parts about Recommender Systems is, ‘we summon to its suggestions / advice every other day, without even realizing …
Essentials of Machine Learning Algorithms (with Python and R Codes)
Business Analytics
Python
R
Introduction Google’s self-driving cars and robots get a lot of press, but the company’s real future is in machine learning, the technology that enables …
Marketing Analytics: Essentials of Cross-Selling and Upselling (with a case study)
Business Analytics
Introduction Cross selling and Upselling are among the most widely discussed concepts in marketing analytics. Every other day when you visit a supermarket, restaurant …
What is the role of analytics in E-Commerce industry?
Big data
Business Analytics
If you are preparing for an interview into role of analytics, you need to do your ground work to get a basic understanding of …
Get Knowledge from Best Ever Data Science Discussions on Reddit
Big data
Python
R
Introduction While composing this enriching list of data science discussions, I found this awesome ‘poem’ drafted statistically. Ain’t it pretty cool? Dedicated to that …
List of useful packages (libraries) for Data Analysis in R
R
Introduction R offers multiple packages for performing data analysis. Apart from providing an awesome interface for statistical analysis, the next best thing about R is …
Basics of Ensemble Learning Explained in Simple English
Business Analytics
Introduction Ensemble modeling is a powerful way to improve the performance of your model. It usually pays off to apply ensemble learning over and …
Learn Big Data Analytics using Top YouTube Videos, TED Talks & other resources
Big data
Introduction There has been a lot of investment in Big Data by various companies in last few years. This rise in usage of big …
Beginners Guide To Learn Dimension Reduction Techniques
Business Analytics
Introduction Brevity is the soul of wit This powerful quote by William Shakespeare applies well to techniques used in data science & analytics as well. …
Overview of Analytics Industry in India (my notes and views)
Business Analytics
One of the most common questions I get asked is What is your view about Analytics industry in India? and it comes in …
Food for thought: How to Measure Influence in a Network?
Business Analytics
A quick exercise Let’s say you are a customer service executive working for a bank (and the only one for this hypothetical case). You …
Top Data Scientists to Follow & Best Data Science Tutorials on GitHub
Business Analytics
Introduction Twitter started the trend of ‘People to Follow’. This later got replicated by other platforms such as Facebook, Linkedin, Quora and GitHub. This cool …
CheatSheet: Data Exploration using Pandas in Python
Infographics
Introduction If someone asked me to name the 2 most important libraries in Python for data science, I’d probably name “pandas” and “scikit-learn”. …
Learning path for Tableau – visualization tool with awesome execution capabilities!
Business Intelligence
Let us take a look at Gartner’s Magic quadrant for Business Intelligence and Analytics platforms: If there is one company which stands out even …
Must Watch Data Science Videos from SciPy Conference 2015
Business Analytics
Python
Introduction This was in my first year of engineering degree. A hungry, home-food sick student (me) was treated (by a college senior) with a lavish buffet …
Getting into Top 10 in Kaggle Facebook Recruiting Competition
Big data
R
Facebook recently wrapped up its Recruitment competition on Kaggle. This was by far the richest data I have seen on Kaggle. The amount of …
Simple infographic to help you compete in Data Science Competitions!
Business Analytics
Infographics
Introduction There are only 2 possible outcomes for every serious participant in a data science competition – you either win it or you learn …
Comprehensive Guide to Data Visualization in R
Business Analytics
R
Let us look at this chart for a second: This visualization (originally created using Tableau) is a great example of how data visualization can help …
Online Hackathon: Predict the gem of Auxesia for Magazino!
Business Analytics
Just wanted to announce the launch of online hackathon. We have got an exciting problem for our publishing friend Magazino! Brief about Magazino Magazino, …
Getting started with Julia – a high level, high performance language for computing
Business Analytics
Learning new tools and techniques in data science is sort of like running on a treadmill – you have to run continuously to stay on …
My playlist – Top YouTube Videos on Machine Learning, Neural Network & Deep Learning
Business Analytics
Python
Introduction One of the best ways to get better at machine learning and deep learning is to watch a lecture from an expert and …
The guide to quickly learn Cloud Computing in R Programming
Business Analytics
Infographics
R
Introduction Cloud computing is becoming a natural extension for problems / data sets bigger than what laptops and desktops can process. However, for complete …
Must for Data Scientists & Analysts: Brain Training for Analytical Thinking
Business Analytics
Introduction Let’s start this article with a small exercise. Take a pen and paper and write the answer as it comes to your mind. …
Learning Path : Best way to learn Machine Learning in 6 easy steps
Business Analytics
Python
R
After immense popularity of our learning paths on various tools, we are delighted to announce our learning path for machine learning. Needless to say, …
Difference between Machine Learning & Statistical Modeling
Big data
Business Analytics
One of the most common questions asked at various data science forums is: What is the difference between Machine Learning and Statistical modeling? …
Quick Guide: Steps To Perform Text Data Cleaning in Python
Business Analytics
Infographics
Python
Introduction Twitter has become an inevitable channel for brand management. It has compelled brands to become more responsive to their customers. On the other hand, the damage …
Beware – interviewer for analytics job is observing you closely!
Analysts are people with high attention to details! This trait is visible across any endeavor they are involved in. I once went to a …
Kaggle Bike Sharing Demand Prediction – How I got in top 5 percentile of participants?
Business Analytics
R
Introduction There are three types of people who take part in a Kaggle Competition: Type 1: Who are experts in machine learning and their motivation is …
7 most commonly asked questions on Correlation
Business Analytics
Introduction The natural trajectory of learning statistics begins with measures of central tendency followed by correlation, regression to other advanced concepts. Amongst these initial concepts, …
Infographic: Must Read Books in Analytics / Data Science
Business Analytics
Business Intelligence
Infographics
Web Analytics
Drink Coffee, Read Books, Learn More, Be Happy! There are 2 attributes all the members in our team at Analytics Vidhya share: We all …
Getting started with Cloud Computing using R Programming
Big data
R
Introduction Almost any domain / business today is being transformed through SMAC. SMAC is a collective term referring to changes happening in Social, Mobile, …
Use of variables in QlikView to create powerful data stories
Business Intelligence
Qlikview
Introduction An application with good Front-end and poor Back-end is like Beauty without brains. You are awed by it initially, but you get irritated …
What’s the difference between Causality and Correlation?
Business Analytics
Introduction Causation and Correlation are loosely used words in analytics. People tend to use these words interchangeably without knowing the fundamental logic behind them. …
Test your fit as a Data Scientist
Business Analytics
While there has been a lot of buzz lately around the demand for data scientists, there are limited resources which provide a clear answer …
Getting Mongo-ed in NoSQL manager, R & Python
Big data
Python
R
Introduction I started my journey on Analytics Vidhya (AV) as a follower. AV, and now especially the discussion portal, always stay open in one …
Machine Learning basics for a newbie
Business Analytics
Introduction There has been a renewed interest in machine learning in last few years. This revival seems to be driven by strong fundamentals – …
In Conversation with Mr. Stefan Groschupf, Founder and CEO, Datameer
Big data
Introduction With the growing usage of Hadoop, Datameer launched a custom big data analytics application to help people generate insights from data faster than …
Tuning the parameters of your Random Forest model
Business Analytics
Why to tune Machine Learning Algorithms? A month back, I participated in a Kaggle competition called TFI. I started with my first submission at 50th …
Cheat Sheet for Exploratory Data Analysis in Python
Business Analytics
Infographics
Python
Introduction The secret behind creating powerful predictive models is to understand the data really well. Thereby, it is suggested to maneuver the essential steps of data …
Beginners Tutorial for Regular Expressions in Python
Business Analytics
Python
Importance of Regular Expressions In last few years, there has been a dramatic shift in usage of general purpose programming languages for data science …
Hackathon Problem Description: Do you know who’s a Megastar?
Business Analytics
Python
R
SAS
Predict the category of a working professional Welcome to the final stage of this contest. If you have reached till here, we assume either …
The Hackathon Practice Guide by Analytics Vidhya
Business Analytics
Python
R
Introduction Data Hackathons are a platform where you get a chance of intense workout with your knowledge and techniques learnt in analytics. It is a …
Kaggle Competitions: How and where to begin?
Business Analytics
Python
R
Introduction Do I have the necessary skills to take part in Kaggle Competitions? Did …
Data visualization guide for SAS
Business Intelligence
SAS
Introduction A picture is worth a thousand words! In today’s competitive environment, companies want faster decision making process, thus ensuring they stay ahead in the race. …
Cheat sheet: Data Visualisation in Python
Business Analytics
Infographics
Python
Introduction It is said ‘A visually presented data speaks for itself’. Data, served in the right visual form, brings out hidden trends and insights …
All out beginner’s guide to MongoDB
Big data
Business Intelligence
Introduction Necessity is the mother of innovation! This is an old proverb, but it still holds damn good! Last decade has pushed the boundaries …
Why Business Intelligence Should Be a Piece of the Security Puzzle?
Business Intelligence
Introduction Whenever a business faces a security failure, they turn to their logs, security information and event management (SIEM) software, and security software to …
k-Fold Cross Validation made simple
Business Analytics
Python
R
Does your high performing model degrade/perform poorly on an out of time sample? Has your Kaggle Private score come down from your Public score significantly? Not …
Moving into analytics after a break in career? Don’t expect a rosy land!
Business Analytics
Let’s look at Vinita’s (name changed) story: After completing her M.Sc. in Statistics, Vinita worked as a call center executive for more than three years. …
Infographic: Quick Guide on SAS vs R vs Python
Infographics
Introduction One of the perennial points of debate in data science industry has been – “Which is the best tool for the job?“. Traditionally, …
Case study – Building and implementing a predictive model in 3 days
Business Analytics
We launched Analytics Professional salary test last week and got awesome response from our audience. People loved it and shared it across social media …
Launching Analytics Professional Salary Test, India
Business Analytics
I’ll keep it short. We are pleased to launch Analytics professional salary test, India – a test, which you can take and see what is …
In Conversation with Mr. Sanjeev Mishra, CEO, Convergytics
Business Analytics
Introduction It is usually interesting to meet business leaders and hear their views on whether they perceive analytics as a cost center or profit …
Getting smart with Machine Learning – AdaBoost and Gradient Boost
Business Analytics
Introduction Machine Learning algorithms are like solving a Rubik Cube. You grapple at the beginning to figure out the hidden algorithm, but once learnt, some can even …
Infographic – Quick Guide to learn Python for Data Science
Infographics
Python
Introduction A situation has been described below. Has it ever happened to you? I wanted to learn Python for Data Science, so I googled ‘I …
9 popular ways to perform Data Visualization in Python
Python
Introduction The beauty of an art lies in the message it conveys. At times, reality is not what we see or perceive. The endless efforts …
List of amazing talks from New York R Conference 2015
Business Analytics
R
R is undoubtedly the most popular open source data science tool loved by statisticians and analysts across the globe. It provides one of the best …
Interview with Daniel Graham, General Manager, Teradata
Big data
Business Analytics
Business Intelligence
Data is probably the biggest asset for an organization in today’s digital economy. But, with huge data, comes a challenge to mine it quickly …
A Comprehensive guide to Parametric Survival Analysis
Business Analytics
Introduction Survival analysis is one of the less understood and highly applied algorithms used by business analysts. That is a dangerous combination! Not many analysts …
Why Business Analytics Degrees Aren’t Just a Fad
Business Analytics
Introduction Data can be defined as a collection of information, used to derive useful insights for informed decision making. Right from analyzing traffic patterns …
Ultimate resource for understanding & creating data visualization
Business Analytics
Business Intelligence
Introduction There are 3 fundamental changes driving penetration of data science industry: The amount of data generation and storage has become very cheap. Every smart …
Interview with Joe Doliner, Co-Founder and CEO of Pachyderm ( Y Combinator Startup)
Big data
Business Analytics
Introduction There is a reason why analytics, big data & data science are being considered hot fields today. The landscape of tools is changing …
How to create Parametric Survival model that gets right distribution?
Business Analytics
For an updated guide on parametric survival model, visit this post.
Review – Business Analytics Post Graduate Program – Praxis Business School, Kolkata
Business Analytics
Recently, I had to travel to Kolkata for a short trip. While I was going there, I thought, I should visit Praxis Business School. …
Swirl Package – Easy way to learn R, in R
Business Analytics
R
People usually quote a steep learning curve as one of the reasons against R, when comparing R vs. Python. The reality is that people …
Comprehensive guide for Data Exploration in R
Business Analytics
R
Introduction Till now we have already covered detailed tutorials on data exploration using SAS and Python. What is the one piece missing to …
Best Data Science talks from PyCon Montreal 2015
Business Analytics
Python
Introduction Last week, we wrote an article on workshops in PyCon Montreal 2015- Hands on way to learn Python. Workshops are 3 hour long hands-on …
Application of PageRank algorithm to analyze packages in R
Big data
R
Introduction In the previous article, we talked about a crucial algorithm named PageRank, used by most of the search engines to figure out the popular/helpful pages on …
Key Takeaways from Andrew Ng and Adam Coates AMA on Reddit
Business Analytics
‘At Baidu, our goal is to develop hard AI technologies that impact hundreds of millions of users across the world’. – Andrew Ng In …
AVturns2: Let the celebrations begin!
Two years back, on this very day (well…actually night), I posted the first article on Analytics Vidhya. Little did I know, what was in store! …
Effective data exploration / processing using FIRST. & LAST. in SAS PDV
Business Intelligence
SAS
Efficiency in coding differentiates a good coder from a bad coder. While you don’t need to be an awesome coder necessarily to be a …
PyCon Montreal 2015 tutorials – Hands-on way to learn Data Science in Python
Python
Introduction PyCon(s) carry a benevolent motive of helping the Python community worldwide by providing extensive knowledge resources. I started following PyCon conferences from 2013. My first learning …
PageRank explained in simple terms!
Big data
Business Analytics
In my previous article, we talked about information retrieval. We also talked about how machine can read the context from a free text. Let’s talk about …
Ultimate guide for Data Exploration in Python using NumPy, Matplotlib and Pandas
Python
Introduction Exploring data sets and developing deep understanding about the data is one of the most important skills every data scientist should possess. People …
Information Retrieval System explained in simple terms!
Big data
Business Analytics
Introduction While searching for things over internet, I always wondered, what kind of algorithms might be running behind these search engines which provide us with the most relevant …
Internet of Things (IoT) and its impact on data science!
Big data
Eric Schmidt & Jared Cohen, in their book “The New Digital Age” describe typical future morning for a professional like this: There will be no …
Comprehensive guide for Data Exploration in SAS (using Data step and Proc SQL)
Business Analytics
Business Intelligence
SAS
Introduction I would like to extend my sincere gratitude to our readers for their overwhelming response on my previous articles on data exploration. These …
Text Mining hack: Subject Extraction made easy using Google API
Business Analytics
Python
Let’s do a simple exercise. You need to identify the subject and the sentiment in following sentences: Google is the best resource for any kind …
Basics of SQL and RDBMS – must have skills for data science professionals
Business Analytics
Business Intelligence
If you meet 10 people who have been in data science for more than 5 years, chances are that all of them would know …
Big Data / Analytics based startups at Y Combinator, Winter 2015 batch
Big data
Business Analytics
If you have even a mini / micro / nano doubt about how analytics and Big Data are re-shaping this world, you should look …
Importance of actionable insights in analytics (with case from ICC Cricket World Cup)
Business Analytics
Have you ever been to a meeting, where everyone in the room has good stats to share, but no one knows how to use …
Hacking Google Maps to create distance features in your model / applications
Business Analytics
Python
This article is going to be different from the rest of my articles published on Analytics Vidhya – both in terms of content and format. …
How to re-use data models in Qlikview using Binary Load?
Qlikview
What will you learn? After receiving a lot of queries on the use of optimization techniques in QlikView, I am compelled to write this article. The …
Why most data science trainings fail to deliver? How to overcome these failures?
Business Analytics
At the outset, it looks like there is no dearth of data science / analytics trainings available today. Our training listing page probably has more than …
Building additional features & variables through open data sources
Big data
Power of Analytics Recently, while travelling, I met a few people who perceived analytics as a passive industry. They considered it to be a limited …
Feature Engineering: How to transform variables and create new ones?
Business Analytics
One common piece of advice machine learning experts have for beginners is – focus on Feature Engineering. Be it a beginner building his first model or some …
How Apple Watch would re-define Apple’s products in next 3 years?
Big data
By the time I publish this post, the internet and blogosphere would be swamped with Apple Watch written all over it. So, why am …
Framework and Applications of ARIMA time series models
Business Analytics
R
Quick Recap Hopefully, you would have gained useful insights on time series concepts by now. If not, don’t worry! You can quickly glance through …
Thanking all successful women in the world of data analytics
Foreword Women like Lady Ada Lovelace, Marie Curie, Mother Teresa, Indra Nooyi, Oprah Winfrey, Marissa Mayer, Sheryl Sandberg and many others have passionately served their …
2 Simple Hacks to improve information density in your QlikView dashboards
Business Intelligence
Qlikview
Foreword A few months back, I got an opportunity to work on a QlikView project with an insurance company. The objective was to create a sales dashboard to …
Learning R couldn’t get easieR and betteR!
Business Analytics
R
We launched our learning paths in January this year – and they have done phenomenally well. They solve a problem every person wanting to …
Introduction to ARMA Time Series Models – Simplified
Business Analytics
Business Intelligence
R
ARMA models are commonly used for time series modeling. In ARMA model, AR stands for auto-regression and MA stands for moving average. If the sound of these …
How to detect Outliers in your dataset and treat them?
Business Analytics
In the last two articles of this series (data exploration & preparation), we looked at Variable identification, Univariate, Bi-variate analysis and Missing values treatment. In this article, …
How to choose the right data science / analytics / big data training?
Big data
Business Analytics
Business Intelligence
Over the last 2 years, this is the most common query I receive from our readers: Which data science / analytics training should I …
Exploration of Time Series Data in R
Business Analytics
R
This is the second part of the step by step guide to Time Series Modelling. In the first part, we looked at basics of …
7 Steps of Data Exploration & Preparation – Part 2
Business Analytics
Introduction In Part-1 of this series, we looked at the first three steps of Data Exploration & Preparation, namely Variable identification, Univariate and Bivariate analysis. In this …
How to get the most out of Massive Open Online Courses (MOOCs)?
Big data
Business Analytics
I bought my first fitbit recently and I am loving it! While tracking my activities is obviously good, what makes it a great product …
Step by Step guide to learn Time Series Modeling
Business Analytics
R
Introduction Regression Models, both linear and logistic, are an inevitable part of the Analytics industry. Take a flashback & recall, when did you build your last Time …
7 Steps of Data Exploration & Preparation – Part 1
Business Analytics
Introduction I have been a Business Analytics professional for close to three years now. In my initial days, one of my mentors suggested me …
Learning path for Weka – GUI based way to learn Machine Learning
Business Analytics
Did you feel like a lost bird when you started learning Machine learning? Do you learn coding in a language first? Or do you focus …
How to avoid Over-fitting using Regularization?
Business Analytics
Occam’s Razor, a problem-solving principle, states that “Among competing hypotheses, the one with the fewest assumptions should be selected. Other, more complicated solutions may ultimately …
Geo-Searching & Analytics Using AWS Cloud Search
Big data
Business Analytics
One of the common challenges faced by Analytics professionals is geo & radius related questions such as: How to get the cheapest houses in …
Interview with Industry expert – Ajay Ohri, Founder, decisionstats.com
Business Analytics
R
Recently I caught up with Ajay Ohri, founder of decisionstats.com over a cup of coffee. Ajay has been a friend and a mentor to …
Better, faster and more helpful Analytics Vidhya is now live!
If you are reading this article, you would have seen the new look of Analytics Vidhya by now! I can’t tell how much we …
How to create Box-Plot chart in Qlikview?
Business Intelligence
Qlikview
The use of this article is best illustrated by a case study. So let’s dive straight in. Business Situation: Recently, we entered 2015 and …
Introduction to Online Machine Learning : Simplified
Big data
Business Analytics
Python
R
Data is being generated in huge quantities everywhere. Twitter generates 12 + TB of data every day, Facebook generates 25 + TB of data …
Learning path for SAS – from beginner to a Business Analyst
Business Analytics
SAS
This is now becoming a theme! But something we are very excited about and our audience is loving. Those who are late in …
Decision Tree Algorithms – Simplified
Business Analytics
In the last article, we looked at the basics of Decision tree and how it helps in classifications. We also looked at advantages and disadvantages …
Model Performance metrics: How well does my model perform? – Part 2
Business Analytics
The popularity of the last article forces us to publish this article this soon. In the last article, we discussed a few performance metrics …
QlikView learning path – the only resource you need to master QlikView
Business Intelligence
Qlikview
We launched our learning paths last week with Data Science in Python. The Python learning path received an awesome response from not only our audience, …
Decision Tree – Simplified!
Business Analytics
I started working as a business analyst in my previous organisation. I transitioned from a Business Intelligence (BI) Analyst to become a Business Analyst. …
Launch of learning path – Data Science in Python
Business Analytics
Python
We are jumping on our feet right now! We can’t find any other way to express our excitement. We said that 2015 is going …
Model performance metrics: How well does my model perform? – Part 1
Business Analytics
In case you are preparing for an analytics interview, you have hit a jackpot. This blog will give you answers to at least 2 …
Comprehensive Introduction to merging in SAS
Business Analytics
SAS
In my previous article, “Combining data sets in SAS – Simplified”, we discussed three methods to combine data sets – appending, concatenating and Interleaving. …
Image processing and feature extraction using Python
Big data
Business Analytics
Python
No doubt, the above picture looks like one of the in-built desktop backgrounds. All credits to my sister, who clicks weird things which somehow become really tempting …
Scikit-learn in Python – the most important Machine Learning tool I learnt last year!
Business Analytics
Python
This article went through a series of changes! I was initially writing on a different topic (related to analytics). I had almost finished writing it. …
Welcome 2015 with new, better and more helpful Analytics Vidhya
Over the last 12 months, we went from a small, little known blog on analytics to one of the most engaging and helpful community …
Basics of Image Processing in Python
Big data
Business Analytics
Python
Writing today’s article was a fascinating experience for me and would also be for the readers of this blog. What’s so different? Two things: firstly the …
Our top 10 Data Science articles in 2014
Business Analytics
2014 has been a year of growth for us. We now get 10x traffic compared to what we used to get 12 months back. …
Simple Framework to crack a Kaggle problem statement
Big data
Business Analytics
Python
These are exciting times at Kaggle: 5 simultaneous competitions with significant prize value – Santa is definitely out there looking for good data scientists across …
The “caret” Package – One stop solution for building predictive models in R
Business Analytics
R
Predictive Models play an important role in the field of data science and business analytics, and tend to have a significant impact across various …
Combining datasets in SAS – simplified!
Business Analytics
SAS
One of the most common tasks every analyst performs multiple times in a project is combining data sets. There are various ways to combine …
Data Science trends 2015 to help you plan your learning!
Business Analytics
2014 is coming to an end shortly. What a glorious year it has been for technology and data science! Constant change, better products, faster …
NoSQL Databases : Simplified
Big data
Business Intelligence
My father always hesitates while making big-ticket transactions online. He is always scared of the machine making an error. Just imagine that you transfer …
QlikView Section Access for defining data access in your applications
Business Intelligence
Qlikview
New age BI tools like QlikView and Tableau are making it easy to access information on the go. With this ease of access, there …
Top certifications for SAS, R, Python, Machine Learning or Big Data
Big data
Business Analytics
Python
R
SAS
We released our rankings for various long duration analytics programmes in India for 2014 – 15 last week. They were greeted with unparalleled enthusiasm and response …
All you need to know to start a career in analytics
Business Analytics
In the last two years, we have successfully tried creating a community dedicated to sharing best practices across industry and give a kick start to …
How to remove Synthetic Key using Concatenation & Link table in QlikView?
Business Intelligence
Qlikview
In one of my previous articles, we discussed synthetic keys (Synthetic keys in Qlikview – Simplified). We discussed why synthetic keys are generated …
Top 5 Analytics Programs in India (2014 – 15)
Business Analytics
We created our first set of rankings of analytics programmes about 18 months back. We didn’t expect the roaring response those rankings received. The …
Introduction to PIG Latin
Big data
In the previous article, we discussed the Hadoop ecosystem (link). We also spoke about the two most heavily used Hadoop tools, i.e. PIG and …
Comprehensive guide to SAS PROC Format
Business Analytics
Business Intelligence
SAS
I have spent a significant part of my career as a data visualization guy. I am very particular about the formatting and presentation of …
5 things every data science manager should do before leading a team
Big data
Business Analytics
Business Intelligence
I took very different roles and responsibilities while I was doing my corporate data science jobs. They not only gave me a lot of …
Types of database management system and their evolution
Big data
Business Analytics
Business Intelligence
Various studies have revealed that whenever we hear of an object, we retrieve it using an image from our brain. For instance, if I ask …
Tips to prepare an outstanding CV for data science roles
Business Analytics
Business Intelligence
Here is a CV I received for a position of “Research scientist” some time back: Sadly, the person who applied for the CV had …
Hadoop beyond traditional MapReduce – Simplified
Big data
Business Analytics
Business Intelligence
In previous articles on Hadoop, our focus has been on MapReduce routines. MapReduce routines are the basic functional unit of a Hadoop system. Following are …
Steps for effective text data cleaning (with case study using Python)
Big data
Business Analytics
Python
The days when one would get data in tabulated spreadsheets are truly behind us. A moment of silence for the data residing in the …
Synthetic Keys in Qlikview – simplified!
Business Intelligence
Qlikview
Before I discuss Synthetic Keys, let’s look at a typical QV data model (in the diagram on the right). Here, we can see three …
Applications of SAS Proc IML in Analytics
Business Analytics
SAS
In the last two articles on IML, we discussed basics of IML and how to perform various matrix operations using SAS IML. IML makes …
Five data science projects to learn data science
Big data
Business Analytics
Business Intelligence
Nothing beats the learning which happens on the job! Whether it is the challenges you face while collecting the data or cleaning it up, …
A tribute to Sachin! Qlikview dashboard for his glorious test career
Business Intelligence
Qlikview
Sachin has inspired an entire generation of cricketers in India and abroad. He is one of the few “Universal Gods” in India, i.e. people …
Next step in the world of SAS IML
Big data
Business Analytics
SAS
In the last article on IML (here), we introduced you to the world of Matrix language on SAS. We also talked about some …
7 tips to overcome your analytics learning hurdles today
Business Analytics
Business Intelligence
I have been writing and answering queries on career transition into analytics for more than 18 months now. While this experience has been very …
Introduction to PROC IML : Making matrix handling on SAS as easy as R
Big data
Business Analytics
SAS
I have been using SAS for more than 3 years now. When I started using R, I found a few operations extremely easy. R …
Index page to learn everything about Analytics
Big data
Business Analytics
Business Intelligence
Python
Qlikview
R
SAS
Web Analytics
Analytics Vidhya has been a tremendous journey for us. Today, when we look back at the journey we have covered so far – it …
Commonly asked interview puzzles – Part II
Business Analytics
Most analysts love solving and asking puzzles. Some of the best analysts I know have a gleam in their eyes at mention …
How does Artificial Neural Network (ANN) algorithm work? Simplified!
Big data
Business Analytics
In the last article (click here), we briefly talked about the basics of the ANN technique. But before using the technique, an analyst must know, how does …
Introduction to SAS Macros – Functions
Business Analytics
SAS
In the last 2 articles, we looked at the basic concept of SAS Macros and how they become useful to accomplish repetitive tasks easily. We …
Introduction to Artificial Neural Network : Simplified
Big data
Business Analytics
Here is yet another algorithm used by the industry to scare ignorant freshers. The tag line for this algorithm is “It works in a …
Review – Post Graduate Program in Business Analytics from Great Lakes Institute of Management
Business Analytics
Recently, I spent half a day speaking about Insurance Analytics at Great Lakes Institute of Management on their Gurgaon campus. Interacting with students / …
Introduction to k-nearest neighbors : Simplified
Big data
Business Analytics
In my four years in analytics, more than 80% of the models I have built were classification models and just 15-20% were regression models. These ratios …
Introduction to SAS Macros – Conditional & Iterative statements
Business Analytics
Business Intelligence
SAS
In my previous article, we developed a basic understanding of SAS Macros and SAS Macro variables. We also looked at how macros can be used …
Learning path & resources to start your data science (analytics) career today
Business Analytics
R
SAS
Marie said it correctly – the most difficult step in any process is the first step! Recently, we launched a list of various …
Support Vector Machine – Simplified
Business Analytics
The first time I heard the name “Support Vector Machine”, I felt, if the name itself sounds so complicated the formulation of the concept …
Introduction to SAS Macros
Business Analytics
Business Intelligence
SAS
A quick example: Let’s look at the following SAS program: Above SAS code is written to extract policy level details for 09-Sep-14 and let …
An exciting update from us – hopefully a learning aid for you!
One of the most common queries we receive through several forums is: Which is the best training for me? OR What is the right …
The case of lost customer centricity in your analysis!
Business Analytics
Today’s article is different (but very important!). It will not tell you about techniques to perform cutting edge analytics! On the contrary, it is to emphasize …
Tutorial on Web scrapping, text mining and predictive modeling (a.k.a. Solution to AV Author identification challenge)
Big data
Business Analytics
R
Web Analytics
While working late in office, something very strange happened to me. I got a message from Google that I should leave office in next …
Data Munging in Python (using Pandas) – Baby steps in Python
Big data
Business Analytics
Python
Time flies by! I see Jenika (my daughter) running around in the entire house and my office now. She still slips and trips – …
How to automate your Excel models and reporting using dynamic Range?
Business Intelligence
Some time back, we hired a smart analyst in our team (let’s call him Sam). Sam’s role required him to create and maintain …
Commonly asked puzzles in analytics interviews
Business Analytics
Business Intelligence
The earlier you land in the industry which interests you, the better it is for your career. I landed in the analytics industry …
How To Become a Data Scientist (Business Analyst)?
Big data
Business Analytics
Python
SAS
Last week, I shared a framework to help you answer the question, “Should I become a data scientist (or business analyst)?”. For the people, …
Forum mining challenge – Get the right questions!
Big data
Business Analytics
The analytics community is relatively small but very vocal on the web. We subscribe to various LinkedIn groups, Facebook groups and other websites to …
How to implement Incremental Load in QlikView?
Business Intelligence
Qlikview
In my previous article, we discussed “How to use QVDs to make your QlikView application more efficient?”. In this article, we will go one step …
Test your level of expertise with SAS/R/Python
Big data
Business Analytics
Python
R
SAS
Currently R, SAS and Python are the three languages ruling the analytics industry. Expertise in at least one of the three languages is a …
Should I become a data scientist (or a business analyst)?
Big data
Business Analytics
Business Intelligence
One of the common queries I come across repeatedly across several forums is “Should I become a data scientist (or an analyst)?” The query …
Framework to build a niche dictionary for text mining
Big data
Business Analytics
Having the right dictionary is at the heart of any text mining analysis. Dictionary for text mining can be compared to maps while travelling …
Mining YouTube using Python & performing social media analysis (on ALS ice bucket challenge)
Big data
Business Analytics
Python
If you are someone like me, you would have been swamped by the constant feed of people pouring ice buckets over themselves – but …
Understanding and analyzing the hidden structures of unstructured dataset
Big data
Business Analytics
R
The key to using an unstructured data set is to identify the hidden structures in the data set. This enables us to convert it to a …
How to use QVDs (QlikView Data files) to make your Qlikview application efficient?
Business Intelligence
Qlikview
In June 2013, I had been using QlikView for about a year. During those days, I was working on a QlikView project where I …
How to use big data to profit from the stock market?
Big data
Business Analytics
I have to admit I am an avid believer that everyone should invest in the stock market. We are in the age of online discount …
Step by step guide to extract insights from free text (unstructured data)
Big data
Business Analytics
Text Mining is one of the most complex analyses in the analytics industry. The reason for this is that, while doing text mining, …
Market mix modeling – Simplified!
Business Analytics
SAS
The US market spends on average more than $140 billion on marketing alone every year. Given that marketing is such an important component of …
Bring it on! Analytics Vidhya Author identification challenge
Business Analytics
Python
SAS
What is the best form of analytics learning? Applying it to practical problems! This is exactly what led us to create this interesting problem, …
How Big data & Analytics can help Government agencies run better?
Big data
Business Analytics
Over the past decade or so, analytics has undergone a rapid transformation. During its initial stages, analytics was used more as a reactionary measure …
Baby steps in Python – Exploratory analysis in Python (using Pandas)
Business Analytics
Python
In the last 2 posts of this series, we looked at how to install Python with iPython interface and several useful libraries and data …
Training review – A Big Data Course with a ‘Big’ Difference from Jigsaw Academy
Big data
Business Analytics
Big Data has emerged as one of the fastest growing fields in recent times and every business is looking to leverage Big Data to …
Visualizing product relationships in a market Basket analysis
Big data
Business Analytics
R
Last week had been very hectic. I had slogged more than 100 hours to come out with an awesome recommender based on market basket …
Effective Cross Selling using Market Basket Analysis
Business Analytics
Have you come across a hairdresser in a salon offering you a head massage or a hair coloring when you go for …
How to interpret hidden state in Latent Markov Model
Big data
Business Analytics
In some of my previous articles, I have illustrated how a Markov model can be used in real life forecasting problems. As described in these …
Review: Qlik Sense Desktop – Is this the next gen visualization tool you need?
Business Intelligence
Qlikview
QlikTech recently announced a free version of its next-generation data visualization application – Qlik Sense. According to QlikTech, the product delivers a simple drag-and-drop interface …
Solve a business case using simple Markov Chain
Business Analytics
A Markov process fits into many real life scenarios. Any sequence of events that can be approximated by the Markov chain assumption can be predicted using …
Using statistics: How to understand population distributions?
Business Analytics
One of the common queries, which I get on the blog is: I am not a Mathematics / Statistics graduate. Can I still become …
Introduction to Markov chain : simplified!
Big data
Business Analytics
A Markov chain is a simple concept which can explain the most complicated real-time processes. Speech recognition, Text identifiers, Path recognition and many other Artificial intelligence …
Baby steps in Python – Libraries and data structures
Business Analytics
Python
In one of the posts last month, we started taking baby steps in learning Python for data analysis. This post will take you one …
How to use “VLOOKUP()” like functionality in QlikView?
Business Intelligence
Qlikview
Whenever I interact with a Qlikview user, who has migrated from Excel recently – one of the most common queries which comes through is: …
Who is the world cheering for? 2014 FIFA WC winner predicted using Twitter feed (in R)
Big data
Business Analytics
R
Sports are filled with emotions! Cheering of audience, reactions to events on various media channels are some of the factors, which make a huge impact on the …
Definitive guide to prepare for an analytics interview
Big data
Business Analytics
Business Intelligence
Let’s face it! Facing an analytics interview can be daunting at times! I have met a lot of analysts, who are good analysts when …
Using Facebook as an analyst (Hint – using R)
Big data
Business Analytics
R
Facebook has a huge data bank and it allows us to make use of it to some extent. October is a month of celebration in India. We have …
Baby steps in learning Python for data analysis
Business Analytics
Python
Last weekend turned out to be a very special one! My 10 month old daughter took her first baby steps and watching her take …
Comparing a Random Forest to a CART model (Part 2)
Business Analytics
R
Random forest is one of the most commonly used algorithms in Kaggle competitions. Along with good predictive power, Random forest models are pretty simple …
What is deep learning and why is it getting so much attention?
Big data
Business Analytics
A few days back, the content feed reader, which I use, showed 2 out of top 10 articles on deep learning. This is when …
Comparing a CART model to Random Forest (Part 1)
Business Analytics
R
I created my first simple regression model with my father in 8th standard (year: 2002) on MS Excel. Obviously, my contribution in that model was minimal, but …
SAS launches a free version – but, is it good enough?
Business Analytics
SAS
I have spent the entire 7 years of my corporate work experience working on SAS. So, when I heard that SAS launched a free …
Unveiling Analytics Vidhya Apprentice – a programme to graduate with recognition for your knowledge!
Business Analytics
It has been more than a year since we started our journey to change how Analytics knowledge flows in communities. The experience has been …
Introduction to Random forest – Simplified
Big data
Business Analytics
With the increase in computational power, we can now choose algorithms which perform very intensive calculations. One such algorithm is “Random Forest”, which we will discuss …
Must have books for data scientists (or aspiring ones)
Big data
Business Analytics
I am back to one of my favourite topics – books! To double up the excitement, this time the list is for data scientists …
Tricking your elephant to do data manipulations (using MapReduce)
Big data
Business Analytics
Magic is a performing art that entertains audiences by staging tricks or creating illusions of seemingly impossible or supernatural feats using natural means (Source: Wikipedia). If you understand the …
Introduction to MapReduce
Big data
Business Analytics
MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm on a cluster (source: Wikipedia). MapReduce, when coupled with HDFS …
The lack of analytics work experience and how to overcome it?
Business Analytics
Business Intelligence
Let me present the two sides of a debate going on in my mind: One of the most common reasons quoted for rejection of …
What is Hadoop? – Simplified!
Big data
Business Analytics
Scenario 1: Any global bank today has more than 100 million customers doing billions of transactions every month. Scenario 2: Social network websites or eCommerce …
9 best practices for analytics talent management
Business Analytics
Business Intelligence
Interviewing for Analytics positions is fun! I have conducted hundreds of interviews and I still grab most of the opportunities which come my way. …
4 Tricky R interview questions
Business Analytics
R
SAS
The analytics industry in India is currently dominated by SAS. But it would be too optimistic to hope that this remains so in years to …
Planning a late career shift to Analytics / Big data? Better be prepared!
Big data
Business Analytics
Business Intelligence
I feel lucky to be part of the data revolution happening around us. Because of the attention and the focus on Analytics / Big Data, …
Build a word cloud using text mining tools of R
Big data
Business Analytics
R
This is what a word cloud of our entire website looks like! A word cloud is a graphical representation of frequently used words in …
Analytics events in 2014 – India and abroad
Big data
Business Analytics
Business Intelligence
One of the queries I frequently get on my blog is: Which events / conferences are happening in India and are they worth attending? …
Data Visualization: Creating Geo-spatial dashboards in Qlikview
Business Intelligence
Qlikview
In my previous article, we discussed how to use Qlikview for visualization of tabular information. Now, let’s think of a scenario, where we need to …
Simple framework to build a survival analysis model on R
Big data
Business Analytics
R
In the last article, we introduced you to a technique often used in the analytics industry called Survival analysis. We also talked about some …
Training recommendation (Tutorials from PyCon 2014 – USA) and contest update
Big data
Business Analytics
I am usually very selective about attending conferences! It’s not because I don’t like networking or talking to people. It’s because I have a very high …
Is survival analysis the right model for you?
Big data
Business Analytics
I was a post-graduate in Mechanical Engineering when I joined the analytics industry as a fresher. The only background I had in analytics industry was …
We just turned 1!
Big data
Business Analytics
Business Intelligence
Yes, that’s right! The first article on Analytics Vidhya went live exactly a year ago (20th April 2013). In less than a year, Analytics …
The importance of context for an analyst!
Business Analytics
Business Intelligence
We analysts enjoy crisp, objective and to the point conversations. An ideal conversation for us is when we come straight to the point, discuss …
Tricky Base SAS interview questions : Part-II
Big data
Business Analytics
SAS
SAS is the largest market-share holder for advanced analytics. If you are going to work in the analytics industry, it is impossible to escape from the …
8 rules for new age analytics learning!
Big data
Business Analytics
Business Intelligence
Data science has become one of the most dynamic fields. Every alternate month I hear about a start up coming up with next gen …
Probability in action: Could Monty Hall have made more money on the show?
Business Analytics
Monty Hall could have lost 66.7% of the time on the show if contestants consistently followed the best strategy. Could he reduce these losses and …
Freelancing consultant – SAS, India’s leading travel portal
Jobs - Business Intelligence
We are looking for a consultant who would help in setting up & integrating SAS for us. This would be a 2 week role and …
Excitement going up at Analytics Vidhya (and I can’t stop smiling)!
Guess what…we will celebrate our first year anniversary shortly. Last year has just flown by! I still remember the excitement and the anxiety I …
Solving Accuracy vs. Cost using probabilities (with case study)
Business Analytics
Business Intelligence
As a manager, you face cost vs. quality / accuracy trade-offs on a regular basis. This can be in the form of any of …
SAS vs. R (vs. Python) – which tool should I learn?
Big data
Business Analytics
Python
R
SAS
We love comparisons! From Samsung vs. Apple vs. HTC in smartphones; iOS vs. Android vs. Windows in mobile OS to comparing candidates for upcoming …
How analytics help organizations becoming customer centric?
Big data
Business Analytics
Business Intelligence
Whatever industry you work in, you will hear the following line in almost all top management presentations: “Our target for this year …
Part III – Interview with Industry expert, Mr. Srikanth Velamakanni, CEO, Fractal Analytics
Big data
Business Analytics
Business Intelligence
This is the third and concluding part of the interview with Srikanth Velamakanni, Co-founder & CEO, Fractal Analytics. In the first part, Srikanth shared how …
Data Visualization for Tabular Information (with Qlikview case)
Business Intelligence
Qlikview
The BI industry has its roots in MIS and spreadsheets have been the most commonly used MIS tool in the last decade. Due to this, …
Interview with Industry expert, Mr. Srikanth Velamakanni, CEO, Fractal Analytics – Part II
Business Analytics
Business Intelligence
Last week, we released the first part of this interview. In that part, Srikanth had shared his experience with starting up Fractal, the challenges …
Learn Analytics using a business case study : Part III
Business Analytics
Business Intelligence
Data based analytics and intelligence practices typically continue to grow complex over time. This is because we get more data over time, computational power …
Interview with Industry expert, Mr. Srikanth Velamakanni, CEO, Fractal Analytics
Big data
Business Analytics
Business Intelligence
The best part of running this blog has been connecting with some of the best people in the Analytics Industry. I recently got an …
Learn Analytics using a business case study : Part II
Business Analytics
Business Intelligence
The sequel episode of most trilogies is often the most interesting. This is because the first episode builds a foundation of the overall plot, …
Tools for improving structured thinking (for analysts)
Business Analytics
Business Intelligence
There are 4 ingredients required to make an awesome business analyst: passion for Business Analytics, structured thinking, love for statistics and numbers …
Learn Analytics using a business case study : Part I
Business Analytics
Business Intelligence
The best way to learn analytics is through experience and solving case studies. Here, I will present you a complete business model and take you through a …
How to Use AGGR () function in Qlikview?
Business Intelligence
Qlikview
The main purpose behind creating any dashboard is to summarize and aggregate information in a manner that can be communicated visually and clearly. Traditionally, …
Maintaining fearless monk-like attitude while leading Analytics teams
Business Analytics
Business Intelligence
One of my mentors had the following thought written in one of the presentations he was making: “Your attitude, not your aptitude, will determine …
Demystifying LinkedIn using probabilities
Business Analytics
Web Analytics
You will be able to view only 1-2 million profiles out of 18 million profiles on LinkedIn if you are a non-premium …
An analytics interview case study
Business Analytics
The case study is the most important round for any analytics hiring. However, a lot of people feel nervous with the mention of undergoing a …
Update on new year resolutions (with training & reading recommendations)
Business Analytics
One of my mentors used to say: “Any project / initiative that does not get tracked, does not happen.” While this might sound like …
Tips for creating a winning dashboard
Business Intelligence
Qlikview
Recently I came across this article from Software Advice, a website that reviews business intelligence tools, called “Winning Dashboard Creation Tips from the Qlikview Open Data …
How to train your mind for analytical thinking?
Business Analytics
Business Intelligence
I recently started going to the GYM. Quite a big achievement for me to be going to the gym regularly for more than a …
Set Analysis in QlikView – simplified!
Business Intelligence
Qlikview
One of the best practices I follow while preparing any report / dashboard is to provide a lot of context. This typically makes a …
Boon from big data or loss of privacy?
Big data
Business Analytics
Today’s post is going to be different. There is no technical subject matter I am going to talk about. But the article is far …
Framework to build logistic regression model in a rare event population
Business Analytics
SAS
Only 531 out of a population of 50,431 customers closed their savings account in a year, but the dollar value lost because of such …
Starting a big-data analytics practice? Answer these 5 questions first
Big data
Business Analytics
Web Analytics
Are you a business owner who is wondering how Analytics / Big data can help you out? Or are you convinced that data mining …
Customized Reporting in Qlikview
Business Intelligence
Qlikview
As a BI professional, I am used to receiving ad-hoc reporting requirements from business users which need a fast turn-around (sometimes under the name …
Tips to crack a guess estimate (Analytics case study)
Business Analytics
Business Intelligence
After a wait of 3 long hours, it was my turn to enter the interview room. The first question put to me by the …
My resolutions for 2014
Business Analytics
In my last post, I mentioned how 2013 has been a phenomenal year for me. I can’t wait to continue the momentum in 2014. …
Highlights of 2013
Big data
Business Analytics
Business Intelligence
Qlikview
SAS
Web Analytics
2013 has been an outstanding year for me personally. Among other things, there have been 2 key highlights for this year: Becoming a father …
Extracting right variables for your Regression model
Business Analytics
SAS
Getting the right variables in your model and cleaning them can make or break your model. The precision of the model depends on the …
Being paranoid about data accuracy!
Big data
Business Analytics
Business Intelligence
SAS
As the day was coming to a close, I thought of fitting in another meeting. Two analysts in my team had been working for …
How to create waterfall chart in Qlikview?
Business Intelligence
Qlikview
Given that 2013 is coming to a close, a very common question analysts get asked around this time of the year is: “How …
How do banks identify the next best product need of its customer?
Big data
Business Analytics
Web Analytics
Recently, I got a message about a new unknown transaction on my credit card. I raised a dispute against the transaction. Within 10 minutes, my …
Diagnosing residual plots in linear regression models
Big data
Business Analytics
SAS
My first analytics project involved predicting business from each sales agent and coming up with a targeted intervention for each agent. I built my …
Interview with data scientist and top Kaggler, Mr. Steve Donoho
Big data
Business Analytics
It’s our pleasure to introduce top data scientist (as per Kaggle), Mr. Steve Donoho, who has generously agreed to do an exclusive interview for Analytics …
4 tricky SAS questions commonly asked in interview
Business Analytics
SAS
While working extensively on SAS-EG, I lost touch with coding in Base SAS. I had to brush up my Base SAS before appearing …
Review: Tableau 8.1
Business Intelligence
Qlikview
As a Business Analyst, I have been a predictive modeler for most of my career. The majority of this time was spent on SAS along …
Getting your clustering right (Part II)
Big data
Business Analytics
SAS
I was staring at the computer screen for the final clustering result. Finally, I opened the output file and found the first cluster with …
5 Simple manipulations to extract maximum information out of your data
Business Analytics
Business Intelligence
SAS
Web Analytics
How would you distinguish between the best, good and worst analysts in a group of analysts? I would simply give them the same problem and data set …
Getting your clustering right (Part I)
Business Analytics
SAS
Web Analytics
Clustering is one of the toughest modelling techniques. It takes not only sound technical knowledge, but also a good understanding of the business. We have split …
How to find inefficient branches when considering multiple outputs?
Business Analytics
SAS
Recently, I was working on a business problem, which required me to find out inefficient branches of a bank X in North America and …
Questions to ask while designing A/B (or multi-variate) tests
Business Analytics
Web Analytics
Testing / experimentation can help organizations find hidden information or insights about their own customers and business. Sadly, not many organizations realize the amount …
Festive season special: Building models on seasonal data
Business Analytics
Business Intelligence
SAS
Has your model ever failed on out-of-time validation because of seasonality? If yes, then you need to know that one of the …
Analytics training recommendations from last 2 months
Business Analytics
Business Intelligence
Web Analytics
One of the best parts about being in the Analytics industry is the opportunity (and need) to continuously learn new things and upgrade yourself. I …
Trick to enhance power of Regression model
Business Analytics
SAS
We, as analysts, specialize in optimization of already optimized processes. As the optimization gets finer, the opportunity to make the process better gets thinner. One …
Must read books (and blogs) on Web Analytics
Business Analytics
Business Intelligence
Web Analytics
I love reading! By reading something every day before sleeping, I not only continue my learning, but also end my day on a fulfilling …
Four rules for creating insightful and actionable reports (and metrics)
Business Analytics
Business Intelligence
Web Analytics
Here is a typical situation during performance reviews: “All the business leaders and stakeholders are present in a room. A performance report / MIS …
News: Edvancer Eduventures start CBAP course in Mumbai
Business Analytics
SAS
In continuation of our previous articles (here and here), another training institute, Edvancer Eduventures, has started offering a range of analytics courses in India. Currently …
Segmentation of customers for effective implementation of analytical projects
Big data
Business Analytics
Web Analytics
According to a survey conducted by Bloomberg in 2011 (on companies exceeding $100 Mn in revenues), 97% of these companies have embraced Analytics in …
Five habits of highly successful analysts
Big data
Business Analytics
Business Intelligence
I have interacted with various successful analysts over the last 7 years. During these interactions, I found some common habits among them. After observing these …
Taking a new job in Analytics? Ask these 5 questions first!
Business Analytics
Web Analytics
Taking up the right job can accelerate your career. On the other hand, getting into the wrong job can derail you for a couple of …
Must read books on data visualization
Big data
Business Analytics
Business Intelligence
It is not a coincidence that all highly successful analysts have excellent data visualization skills. As a matter of fact, I think data visualization …
How to create a High performance Analytics team?
Big data
Business Analytics
Business Intelligence
Web Analytics
Let’s assume that you are the CEO of an MNC with operations across the globe! Over the last few years, you have heard a lot of …
Upcoming trends in data visualization
Big data
Business Analytics
Business Intelligence
Recently, we were blessed with a baby girl. Among a lot of other things, one thing which keeps mesmerizing me is the continuous change …
A small break to celebrate!
Let me not give it away simply…. Assuming that today’s date is T and T – 1 represents yesterday. Further, if I …
Common myths about a career in Business Analytics: Busted!
Business Analytics
Web Analytics
Some time back, I wrote an article on “How to start a career in Business Analytics?”. The article was well received by people who …
Importance of Segmentation and how to create one?
Business Analytics
Web Analytics
Average is one of the biggest enemies of analysts! Why do I say so? The amount of reporting which happens on averages is astonishingly …
How to create compelling analytical stories using infographics?
Business Analytics
Business Intelligence
Let me go back a few years: After spending slightly more than a year in my previous role, I moved into a new role …
Common data preparation mistakes and how to avoid them?
Business Analytics
Business Intelligence
SAS
A few days back, one of my friends was building a model to predict the propensity of conversion of leads procured through an Online Sales …
How freshers can ace interviews for Business Analytics roles?
Business Analytics
Business Intelligence
Campus interviews can be very competitive, especially so, if you want to secure a job with the best companies. Further, if you are a …
Nine productivity boosting tips for SAS Enterprise Guide Users
Business Analytics
SAS
SAS Enterprise Guide is a versatile tool for everyone from novice analysts to experienced programmers. It has revolutionized the way people use and access SAS …
What is big data and how is big data architecture designed?
Big data
Business Analytics
Web Analytics
Consider the following fact: Let us spend a few seconds to think what information Facebook typically stores about its users. Some of this is: Basic …
Must read books for Analysts (or people interested in Analytics)
Business Analytics
Business Intelligence
Web Analytics
One of the ways I continue my learning is reading. I read for 30 minutes before hitting the bed every day. This not only …
How to become an analytics rockstar?
Business Analytics
Business Intelligence
I still remember the first day in my first job. I walked into the office with high ambitions and little understanding of what it …
How to start a career in Business Analytics?
Business Analytics
Business Intelligence
SAS
Web Analytics
Every time I attend any analytics forum or interact with students, two questions stand out on account of the number of times they are asked: …
How to create Financial models flawlessly?
Business Analytics
Business Intelligence
Qlikview
Recently, I met one of my friends, who works in the strategy team of a bank, over lunch. I felt bad about something which he mentioned …
Advanced analytics certifications in India
Business Analytics
Business Intelligence
One question a lot of MIS professionals face day to day is: “How do I shift my career to work in Advanced Analytics?” A …
Common mistakes analysts make during analysis and how to avoid them?
Business Analytics
Quite often I come across situations where people end up making wrong inferences based on half-baked analysis. Or when people force-fit data …
The art of structured thinking and analyzing
Business Analytics
It took me 3 months to complete my first analytics project. If I had worked on a similar project 6 months into the …
How to implement an analytics solution for a business problem?
Business Analytics
Business Intelligence
During a panel discussion at the Gartner Business Intelligence and Analytics Summit early this year in Barcelona, vendors estimated: 70% of Analytics projects fail to meet …
What is Business Analytics and which tools are used for analysis?
Big data
Business Analytics
Business Intelligence
SAS
Business Analytics has become a catch-all word for anything to do with data. So if you are new to this field and …
Limitations of Pre vs. Post analysis and Importance of testing
Business Analytics
During a recent interview with an analyst working for a big multi-national retail store chain, I asked: How do you check if promotion of …
How to apply web analytics for e-Commerce websites?
Business Analytics
Web Analytics
E-commerce is a dynamic and developing industry and so is web analytics. Mix the two and you can expect a world dynamic to its …
How to identify a good (and bad) Business Analyst?
Business Analytics
Business Intelligence
Over the last 6 years, I have come across hundreds of analysts and have conducted an almost equal number of interviews. Over this time, …
Creating a simple and effective Sales dashboard (with Qlikview) – Part 2
Business Intelligence
Qlikview
In my last post, I discussed how a simple dashboard can provide information effectively. I will continue from where I left and go through …
Creating a simple and effective Sales dashboard (with Qlikview)
Business Intelligence
Qlikview
Appropriate space allocation and compelling visuals which convey business insights in a meaningful manner are key to creating a good dashboard. The dashboard / chart …
Joining / Merging in SAS – alternate approaches (including really efficient ones!)
Business Analytics
SAS
One of the most common operations for any analyst is merging datasets. As per my estimate, an analyst spends at least 10 – …
Basics of Predictive modeling
Business Analytics
Imagine how the world would change when any advertisement you receive is only about a product you are interested in. How beautiful it would …
Welcome to Analytics Vidhya!
Welcome to Analytics Vidhya! For those of you who are wondering what “Analytics Vidhya” is, “Analytics” can be defined as the science of extracting …