Kickstart Your ML Journey: Scoping, Structuring, and Exploring Data (Part 1)


This post covers the following topics:

  • Understand the business problem
  • Set up your working environment and directory layout
  • Gather data (use multithreading for a 2 to 4x speedup)
  • Pre-process data (use vectorization for a 10x speedup)
  • Gain valuable insights through EDA
  • Build interactive visualizations (in Part 2 of this series)
  • Finally use ML to answer questions (in Part 3 of this series)
  • Extras: you will also learn how to modularize the code into independent and reusable components, as well as how to use abstraction.
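
To make the multithreading item above concrete, here is a minimal sketch of fetching several files concurrently with a thread pool. It is an illustration under stated assumptions, not the code used later in this series: `fetch_month` and the `trips_XX.csv` file names are hypothetical stand-ins, and the download is simulated with a short sleep so the example runs without network access. The actual speedup depends on how I/O-bound the real workload is.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_month(month: int) -> str:
    """Hypothetical stand-in for an I/O-bound download of one monthly file."""
    time.sleep(0.05)  # simulated network latency; a real version would issue a request here
    return f"trips_{month:02d}.csv"  # assumed file naming, for illustration only


def gather(months, max_workers: int = 4):
    """Run the fetches concurrently; Executor.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_month, months))


if __name__ == "__main__":
    start = time.perf_counter()
    files = gather(range(1, 9))
    elapsed = time.perf_counter() - start
    print(f"fetched {len(files)} files in {elapsed:.2f}s")
```

Threads help here because each worker spends most of its time waiting on I/O rather than computing; for CPU-bound work in Python, a process pool or vectorized operations are usually the better fit.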

Note: this post is intended for beginner to mid-level data scientists.

Almost all data science and ML projects start with a business problem, so let’s first define the problem we are trying to solve.

Say you work for a taxi service company in NYC and your team is trying to…


