Global Data Science Forum

All about Data Scientist

  • 1.  All about Data Scientist

    Posted Wed February 20, 2019 02:44 AM
    Definitions:

    #1: A role that analyses and interprets complex digital data, especially to assist a business in its decision-making.

    #2: Responsible for collecting, analyzing and interpreting large amounts of data to identify ways to help a business to gain.

    #3: "A data scientist is better at statistics than any software engineer and better at software engineering than any statistician."

    Data science requires knowledge of a number of big data platforms and tools, including Hadoop, Pig, Hive, Spark and MapReduce, and programming languages that include structured query language (SQL), Python, Scala and Perl, as well as statistical computing languages such as R.

    Data scientist vs. data analyst: although many of the skills overlap, there are significant differences.

    • The data analyst role varies depending on the company, but in general these professionals collect data, process that data and perform statistical analysis using standard statistical tools and techniques. Analysts also identify patterns and make correlations in data sets to identify new opportunities for improvements in business processes, products or services.
    • Data scientists are responsible for those tasks and many more. These professionals are equipped to analyze big data using advanced analytics tools and are expected to have the research background to develop new algorithms for specific problems. They may also be tasked with exploring data without a specific problem to solve.

    Data scientist skills

    • Soft skills required for data scientists include intellectual curiosity combined with skepticism and intuition, along with creativity.
    • Interpersonal skills are also a critical part of the role, and many employers want their data scientists to be data storytellers who know how to present data insights to people at all levels of an organization.
    • They also need leadership skills to steer data-driven decision-making processes in an organization.
    • Technical skills typically center on Spark, Hadoop, Hive, Pig, SQL, Neo4j, MySQL, Python, R, Scala, TensorFlow, A/B testing, NLP and anything related to machine learning.
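
    A/B testing appears in the skills list above. As a rough illustration only, here is a minimal sketch of a two-proportion z-test in plain Python. The function name, the conversion counts and the choice of a pooled z-test are assumptions made for this example, not something prescribed in the post.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    Returns (z, p_value), where p_value is two-sided and based on the
    normal approximation with a pooled conversion rate.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("sample sizes must be positive")
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        raise ValueError("no variation in outcomes; the test is undefined")
    # Standard error of the difference under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 100 of 1000 users convert on A, 130 of 1000 on B
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

    A small p-value suggests the difference between the variants is unlikely to be due to chance alone. In practice a library such as SciPy or statsmodels would normally be used instead of hand-rolled code.
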
    Certifications
    For more, refer to my blog - https://mjvreddy-jobopenings.blogspot.com/2018/07/data-scientist.html

    ------------------------------
    MEKALA V REDDY, Business Operations Leader for: IBM Cloud Innovation Labs-AP & India , IBM Cloud Labs-India , Product Ops: PureApp, UrbanCode, Cloud Automation Manager (CAM)
    ------------------------------


  • 2.  RE: All about Data Scientist

    Posted Thu February 21, 2019 03:00 AM
    Hi!

    I would like to point to Drew Conway's Venn Diagram of Data Science:
    The Data Science Venn Diagram
    In my experience, data science is always a collaborative effort between experts in maths/statistics, IT professionals and subject matter experts, who all bring in their own knowledge.
    But keep clear of the danger zone :-)

    ------------------------------
    RUDOLF PAILER
    ------------------------------



  • 3.  RE: All about Data Scientist

    Posted Thu February 28, 2019 04:57 AM
    Data science is a multidisciplinary blend of data inference, algorithm development, and technology in order to solve analytically complex problems.

    At the core is data: troves of raw information, streaming in and stored in enterprise data warehouses. There is much to learn by mining it, and there are advanced capabilities we can build with it. Data science is ultimately about using this data in creative ways to generate business value.

    Business Value
    Data science – discovery of data insight
    This aspect of data science is all about uncovering findings from data. Diving in at a granular level to mine and understand complex behaviors, trends, and inferences. It's about surfacing hidden insight that can help enable companies to make smarter business decisions. For example:

    • Netflix mines movie viewing patterns to understand what drives user interest, and uses that to decide which Netflix original series to produce.
    • Target identifies the major customer segments within its base and the unique shopping behaviors within those segments, which helps to guide messaging to different market audiences.
    • Procter & Gamble uses time series models to understand future demand more clearly, which helps plan production levels more optimally.

    How do data scientists mine out insights? It starts with data exploration. When given a challenging question, data scientists become detectives. They investigate leads and try to understand patterns or characteristics within the data. This requires a big dose of analytical creativity.

    Then, as needed, data scientists may apply quantitative techniques to go a level deeper – e.g. inferential models, segmentation analysis, time series forecasting, synthetic control experiments, etc. The intent is to scientifically piece together a forensic view of what the data is really saying.
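
    To make one of these techniques concrete, here is a minimal sketch of time series forecasting using simple exponential smoothing in plain Python. The function names, the demand figures and the smoothing factor are made up for illustration, and real demand forecasting would normally use richer models.

```python
def exponential_smoothing(series, alpha):
    """Return the exponentially smoothed version of `series`.

    Each smoothed value is a weighted average of the current observation
    (weight alpha) and the previous smoothed value (weight 1 - alpha).
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def forecast_next(series, alpha=0.5):
    """One-step-ahead forecast: the last smoothed value."""
    return exponential_smoothing(series, alpha)[-1]


# Hypothetical weekly demand figures
weekly_demand = [100, 120, 110, 130]
print(forecast_next(weekly_demand, alpha=0.5))  # 120.0
```

    A higher alpha makes the forecast react faster to recent changes, while a lower alpha smooths out noise.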

    This data-driven insight is central to providing strategic guidance. In this sense, data scientists act as consultants, guiding business stakeholders on how to act on findings.

    Data science – development of data product
    A "data product" is a technical asset that: (1) utilizes data as input, and (2) processes that data to return algorithmically-generated results. The classic example of a data product is a recommendation engine, which ingests user data, and makes personalized recommendations based on that data. Here are some examples of data products:

    • Amazon's recommendation engines suggest items for you to buy, determined by their algorithms. Netflix recommends movies to you. Spotify recommends music to you.
    • Gmail's spam filter is a data product – an algorithm behind the scenes processes incoming mail and determines whether a message is junk or not.
    • Computer vision used for self-driving cars is also a data product – machine learning algorithms are able to recognize traffic lights, other cars on the road, pedestrians, etc.

    This is different from the "data insights" section above, where the outcome is typically advice that helps an executive make a smarter business decision. In contrast, a data product is technical functionality that encapsulates an algorithm and is designed to integrate directly into core applications. Examples of applications that incorporate a data product behind the scenes include Amazon's homepage, Gmail's inbox, and autonomous driving software.

    Data scientists play a central role in developing data products. This involves building out algorithms, as well as testing, refinement, and technical deployment into production systems. In this sense, data scientists serve as technical developers, building assets that can be leveraged at wide scale.
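
    As a toy illustration of the "data in, algorithmically generated results out" idea, here is a minimal sketch of a recommendation engine based on item co-occurrence. The basket data and function names are hypothetical, and production recommenders such as those mentioned above are far more sophisticated than this.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def recommend(counts, owned, top_n=3):
    """Rank items the user does not own by co-occurrence with owned items."""
    scores = Counter()
    for item in owned:
        for other, n in counts.get(item, {}).items():
            if other not in owned:
                scores[other] += n
    # Highest score first; ties broken alphabetically for stable output
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical purchase history
baskets = [
    ["laptop", "mouse", "keyboard"],
    ["laptop", "mouse"],
    ["laptop", "bag"],
    ["mouse", "mousepad"],
]
counts = build_cooccurrence(baskets)
print(recommend(counts, {"laptop"}))  # ['mouse', 'bag', 'keyboard']
```

    The input is raw user data (the baskets) and the output is a personalized list, which is the pattern the definition above describes.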


    ------------------------------
    rajesh kumar
    ------------------------------



  • 4.  RE: All about Data Scientist

    Posted Tue March 12, 2019 06:13 AM
    A data scientist is a person who analyses and interprets complex or very large data sets, using knowledge and skills in mathematics, statistics and a programming language such as Python or Java. Data scientists also create machine learning tools and processes within the company. People in this role should have a good knowledge of statistical analysis. They are needed in almost every field today because we tend to store all kinds of data, whether in an institution, a factory, a company or a shop. That is why the demand for data science is increasing every day. To become a data scientist, it is very important to have a good grasp of Python, as it is used very frequently in data science projects. A complete data science course covers all the knowledge required to become a data scientist.

    ------------------------------
    Raj Shivakoti
    ------------------------------