---
title: detail
localeTitle: Detail
---
## What Is Data Science

Data science is a multidisciplinary field that combines skills in software engineering and statistics with domain experience to support the end-to-end analysis of large and diverse data sets, ultimately uncovering value for an organization and then communicating that value to stakeholders as actionable results.

## Data Scientist

A person who is better at statistics than any software engineer and better at software engineering than any statistician.

## What Skills Do You Need?

* Mathematics - Calculus, Linear Algebra
* Statistics - Hypothesis Testing, Regression
* Programming - SQL, R/Python
* Machine Learning - Supervised and Unsupervised Learning, Model Fitting
* Business/Product Intuition - Interpret and communicate results to a non-technical audience

## Life Cycle

1. Identify or Formulate Problem
2. Data Preparation
3. Data Exploration
4. Transform and Select
5. Build Model
6. Validate Model
7. Deploy Model
8. Evaluate or Monitor Results
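
To make the life cycle steps above more concrete, here is a minimal Python sketch that walks through a few of them (data preparation, exploration, model building, and validation) on synthetic data. Everything in it is an assumption made for illustration: the helper names `make_data`, `split`, `fit_line`, and `rmse` are hypothetical, the data is generated rather than real, and a simple least-squares line stands in for whatever model a real project would call for.

```python
import random
import statistics


def make_data(n=200, seed=0):
    """Step 2 stand-in: synthetic (x, y) pairs where y is roughly 3x + 2 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


def split(xs, ys, test_fraction=0.25):
    """Hold out the tail of the (already random) data for validation."""
    cut = int(len(xs) * (1 - test_fraction))
    return (xs[:cut], ys[:cut]), (xs[cut:], ys[cut:])


def fit_line(xs, ys):
    """Step 5: ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(model, xs, ys):
    """Step 6: root mean squared error of the fitted line on held-out data."""
    slope, intercept = model
    squared = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


xs, ys = make_data()
# Step 3: a quick look at the data before modelling.
print("mean of y:", round(statistics.fmean(ys), 2))

(train_x, train_y), (test_x, test_y) = split(xs, ys)
model = fit_line(train_x, train_y)
error = rmse(model, test_x, test_y)
print("slope, intercept:", model)
print("validation RMSE:", error)
```

Steps 1, 7, and 8 (framing the problem, deploying, and monitoring) depend on the organization and its tooling, so they are left out of this sketch.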