Data Science


What are the main libraries used in data science?

  • 1.  What are the main libraries used in data science?

    Posted Fri February 22, 2019 05:57 AM
    What are the main libraries used in data science?

    ------------------------------
    rajesh kumar
    ------------------------------


  • 2.  RE: What are the main libraries used in data science?

    Posted Fri February 22, 2019 07:57 AM
    Hello,
    Data science draws on several technologies. It is a field in which Python, Java, and machine learning are applied to analyse data, and each technology has its own libraries that are used frequently.
    Some of the popular libraries used in data science are:
    1. pandas
    2. SciPy
    3. NumPy
    4. scikit-learn
    Machine learning libraries used in data science:

    ------------------------------
    Raj Shivakoti
    ------------------------------



  • 3.  RE: What are the main libraries used in data science?

    Posted Fri February 22, 2019 09:02 AM
    That's a pretty good list. It's worth pointing out that:

    • The libraries are for very different things. Scikit-learn, for example, is specifically for machine learning, whereas NumPy is used much more generally for scientific computing in Python.
    • The libraries are at different levels of abstraction. Pandas is higher-level than NumPy, and in fact wraps its numeric features. So you can use pandas and be using NumPy under the covers!
    • Data science has lots of different areas--machine learning is just one of them. Pick an area of data science to focus on, like machine learning (ML), and you'll quickly discover the dominant libraries in that area.
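To make the "NumPy under the covers" point concrete, here is a minimal sketch. The table and its column names are made up for illustration; the calls themselves (`to_numpy`, `mean`, `np.log`) are standard pandas and NumPy.

```python
import numpy as np
import pandas as pd

# A small, made-up table; the column names are illustrative only.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0],
    "weight_kg": [50.0, 60.0, 70.0],
})

# A numeric pandas column hands back a NumPy array on request.
heights = df["height_cm"].to_numpy()
print(type(heights))

# The same mean, computed at the pandas level and at the NumPy level.
pandas_mean = df["height_cm"].mean()
numpy_mean = np.mean(heights)
print(pandas_mean, numpy_mean)

# NumPy functions can also be applied directly to a pandas Series;
# the result is still a Series with its index preserved.
log_weights = np.log(df["weight_kg"])
print(type(log_weights))
```

In day-to-day work you would usually stay at the pandas level and only drop down to NumPy when a library expects plain arrays.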

    HTH (Hope that helps)
    I/O

    ------------------------------
    Ian Oeschger
    Software Developer and Architect for IBM Community
    IBM
    910-742-4504
    ------------------------------



  • 4.  RE: What are the main libraries used in data science?

    Posted Mon March 11, 2019 05:18 PM
    For large dataframes, pandas requires a lot of memory (RAM), so it is not recommended.
    Alternative solution: GraphLab
    • GraphLab can substitute for pandas and scikit-learn because it alternates processing between disk and memory, which opens up new possibilities without crashing your system.
    • It charges for commercial use.
    • Site: GraphLab site
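The disk/memory idea can also be approximated in plain pandas by reading a file in chunks. The sketch below is not GraphLab and does not use its API; it only illustrates the general out-of-core pattern with `pandas.read_csv(chunksize=...)`. The in-memory CSV and the `value` column are stand-ins for a real large file.

```python
import io
import pandas as pd

# Stand-in for a large CSV file on disk; a real call would pass a file path.
csv_data = io.StringIO("value\n" + "\n".join(str(i) for i in range(1, 101)))

total = 0
rows = 0
# With chunksize set, read_csv returns an iterator of DataFrames,
# so only 25 rows are held in memory at any one time.
for chunk in pd.read_csv(csv_data, chunksize=25):
    total += chunk["value"].sum()
    rows += len(chunk)

print(rows, total / rows)
```

This works for aggregations that can be accumulated chunk by chunk (sums, counts, means); operations that need the whole table at once, such as a global sort, need a different approach.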
    Good luck @rajesh kumar


    ------------------------------
    Romulo Magalhaes
    ------------------------------



  • 5.  RE: What are the main libraries used in data science?

    Posted Tue March 19, 2019 07:03 AM
    Hello @rajesh kumar, great question. There are main libraries, of course, but it all depends on what insight you want to get from your data. Nevertheless, based on experience and years of practice, these are the main ones I know of.
    They are:

    I hope this helps you.
    Cheers :)

    ------------------------------
    Damilola Omifare
    ------------------------------



  • 6.  RE: What are the main libraries used in data science?

    Posted 12 days ago
    I really like this "curated list of awesome resources for practicing data science using Python", which is organized into categories like "Big Data", "Extraction", "Visualization", and others.

    Here, as an excerpt, is the curated list of "core" Python tools for DS:

    pandas - Data structures built on top of numpy.
    scikit-learn - Core ML library.
    matplotlib - Plotting library.
    animatplot - Animate plots built on matplotlib.
    seaborn - Python data visualization library based on matplotlib.
    pandas_summary - Basic statistics using DataFrameSummary(df).summary().
    pandas_profiling - Descriptive statistics using ProfileReport.
    sklearn_pandas - Helpful DataFrameMapper class.
    janitor - Clean messy column names.
    missingno - Missing data visualization.
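Several of the helpers in this list wrap tasks you can roughly approximate in plain pandas. The sketch below does not use the APIs of janitor, missingno, or pandas_summary; it is a plain-pandas approximation of the same three tasks (cleaning column names, counting missing data, basic statistics) on made-up data.

```python
import pandas as pd

# Made-up data with messy column names and one missing value.
df = pd.DataFrame({
    " First Name": ["Ann", "Bob", None],
    "Age (Years)": [31, 45, 27],
})

# Normalise column names: trim, lowercase, turn runs of other
# characters into underscores, then drop leading/trailing underscores.
df.columns = (
    df.columns.str.strip()
    .str.lower()
    .str.replace(r"[^0-9a-z]+", "_", regex=True)
    .str.strip("_")
)
print(list(df.columns))

# Count missing values per column.
missing = df.isna().sum()
print(missing)

# Basic descriptive statistics for the numeric columns.
summary = df.describe()
print(summary)
```

The dedicated libraries add conveniences on top of this (plots of missingness, richer reports), so this is only a starting point for deciding whether you need them.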

    Full list:
    https://github.com/r0f1/datascience

    ------------------------------
    Ian Oeschger
    Software Developer and Architect for IBM Community
    IBM
    910-742-4504
    ------------------------------