Data Science

A collision between math, stats, programming and the scientific method.

Bella Fried
3 min read · Oct 19, 2021
thedatascientist.com

What is Data Science?

Data Science is a collision between math, statistics, programming, and the scientific method, used to extract insights from massive amounts of data and then present the findings in a clear and meaningful way so organizations can make strategic decisions.

What do Data Scientists do? What is their Scientific Method?

  1. Identify a hypothesis to test
  2. Gather data via web scraping, manual entry, existing systems, etc.
  3. Organize/clean the data and put it into a consistent format to prepare it for analysis
  4. Analyze the data through statistical analysis and machine learning algorithms to adjust for biases and extract insights
  5. Translate findings into actionable solutions
  6. Communicate findings through visuals to business executives
  7. Program data models and deploy them into apps to achieve business goals
  8. Monitor the accuracy of the model continually through scientifically designed experiments
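
Steps 3 and 4 can be sketched in a few lines of Python. This is only a minimal illustration, not a full analysis: the records, the field names (group, minutes), and the helper functions clean and mean_by_group are all made up for the example, and comparing two averages is a stand-in for the statistical analysis a real project would need.

```python
from statistics import mean

# Hypothetical raw records, as they might arrive from step 2 (made-up values).
raw_rows = [
    {"group": " A ", "minutes": "12.5"},
    {"group": "b", "minutes": "8"},
    {"group": "A", "minutes": ""},  # missing value
    {"group": "B", "minutes": "10.0"},
    {"group": "a", "minutes": "11.5"},
]


def clean(rows):
    """Step 3: normalize labels, convert types, and drop incomplete rows."""
    cleaned = []
    for row in rows:
        group = row["group"].strip().upper()
        value = row["minutes"].strip()
        if not group or not value:
            continue  # skip rows with a missing field
        cleaned.append({"group": group, "minutes": float(value)})
    return cleaned


def mean_by_group(rows):
    """Step 4 (heavily simplified): compare the average value per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(row["minutes"])
    return {name: mean(values) for name, values in groups.items()}


rows = clean(raw_rows)
averages = mean_by_group(rows)
print(averages)  # {'A': 12.0, 'B': 9.0}
```

The cleaning step matters as much as the analysis: without normalizing " A ", "a", and "A" to one label, the same group would be counted three separate times.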

What tools are available for Data Scientists?

  • Programming — R, Python
  • Open source notebooks (web apps to write and run code, visualize data, and view the results all in one place) — Jupyter, RStudio, Zeppelin
  • Data processing — Apache Spark, Hadoop
  • Visualization — Tableau, Microsoft Power BI

How are different industries using Data Science?

E-commerce — analyze call center data to understand the customer churn rate and take appropriate action through product refinement.
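
As a small worked example, one common way to define churn rate is the number of customers lost during a period divided by the number of customers at the start of that period. The function name churn_rate and the figures below are assumptions made up for illustration; organizations often use their own variations of this definition.

```python
def churn_rate(customers_at_start, customers_lost):
    """Share of the starting customers who left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


# Made-up figures: 200 customers at the start of the month, 10 of whom left.
rate = churn_rate(customers_at_start=200, customers_lost=10)
print(f"{rate:.1%}")  # 5.0%
```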

Entertainment — analyze customer viewing history to determine new content to produce and generate personalized recommendations.

Financial Services — analyze data to flag suspicious activity and detect fraudulent transactions, identify qualified loan and credit line applicants, and evaluate client portfolios for upselling opportunities.
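
To give a feel for what "flagging suspicious activity" can mean at its very simplest, the sketch below marks transactions whose amount is far from the average, measured in standard deviations (a z-score). This is an illustrative toy, not how production fraud detection works: the amounts, the function name flag_outliers, and the threshold of 2.0 are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def flag_outliers(amounts, threshold=2.0):
    """Return the amounts whose z-score magnitude exceeds the threshold."""
    if len(amounts) < 2:
        return []  # a standard deviation needs at least two values
    avg = mean(amounts)
    spread = stdev(amounts)
    if spread == 0:
        return []  # all amounts identical, nothing stands out
    return [a for a in amounts if abs(a - avg) / spread > threshold]


# Made-up transaction amounts with one unusually large payment.
amounts = [20, 25, 22, 19, 24, 21, 500]
flagged = flag_outliers(amounts)
print(flagged)
```

With only a handful of values, a single extreme amount inflates the standard deviation itself, which is why the threshold here is low; larger datasets and more robust statistics would be needed in practice.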

Healthcare — analyze reported symptoms and medical test results to diagnose diseases earlier and treat them more effectively.

HR — analyze employee and applicant data to identify common characteristics of top performers.

Manufacturing — analyze equipment data to schedule timely maintenance checks and predict machinery malfunctions.

Sales — analyze buying patterns to generate personalized product recommendations and targeted advertising and promotions, and manage inventories.

Sports — analyze athlete performance to strategize game plans.

Transportation — analyze weather and traffic patterns to optimize delivery routes, schedules, and transportation methods.

Travel — analyze customer behavior to optimize pricing.

What makes Data Science challenging?

  1. Finding the right data to analyze amongst the vast amounts available
  2. Identifying bias so findings are not faulty and business decisions are not misguided
  3. Choosing the right analytical tools
  4. Managing deployment with IT
  5. Quantifying business value
  6. Maintaining models

What are the benefits of using Data Science effectively?

  1. More informed decision-making
  2. Refined target audiences
  3. Reduced costs
  4. Sales growth
  5. Increased customer satisfaction
  6. More efficient supply chains and logistics
  7. Stronger cybersecurity protections/reduced fraud
  8. Improved patient outcomes

Why is Data Science so crucial?

Thanks to modern technology, more data is generated and stored than ever before. For example, Facebook users on average upload 10 million pictures every hour. Interpreting these enormous amounts of data can yield transformative benefits for organizations: Data Science can help uncover insights used to make better decisions and create more innovative products and services.
