Data Science


what are the main library used in data science

  • 1.  what are the main library used in data science

    Posted Fri February 22, 2019 05:57 AM
    What are the main libraries used in data science?

    ------------------------------
    rajesh kumar
    ------------------------------


  • 2.  RE: what are the main library used in data science

    Posted Fri February 22, 2019 07:57 AM
    Hello,
    As per your question, several technologies are involved in the field of data science. Data science is an open field in which Python, Java, and machine learning are applied to analyse data. Each technology has its own libraries that are used very often in data science.
    Some of the popular Python libraries used in data science are:
    1. pandas
    2. SciPy
    3. NumPy
    4. scikit-learn
    Machine learning libraries used in data science:

    ------------------------------
    Raj Shivakoti
    ------------------------------



  • 3.  RE: what are the main library used in data science

    Posted Fri February 22, 2019 09:02 AM
    That's a pretty good list. It's worth pointing out that:

    • The libraries are for very different things. Scikit-learn, for example, is specifically for machine learning, whereas NumPy is used much more generally for scientific computing in Python.
    • The libraries are at different levels of abstraction. Pandas is higher-level than NumPy and in fact builds on its numeric features, so you can use Pandas and be using NumPy under the covers!
    • Data science has lots of different areas--machine learning is just one of them. Pick an area of data science to focus on, like machine learning (ML), and you'll quickly discover the dominant libraries in that area.
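
    To make the "NumPy under the covers" point concrete, here is a minimal sketch. The column names and values are made up for illustration; it only assumes that pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd

# A small DataFrame with purely numeric columns (made-up values).
df = pd.DataFrame({
    "height_cm": [170.0, 182.5, 165.0],
    "weight_kg": [65.0, 80.0, 58.5],
})

# The numeric contents of the DataFrame can be exposed as a NumPy array.
values = df.to_numpy()
print(type(values))

# The same column means, computed once with NumPy and once with pandas.
col_means_np = np.mean(values, axis=0)
col_means_pd = df.mean()

# Both routes should agree, since pandas delegates numeric work to NumPy.
print(np.allclose(col_means_np, col_means_pd.to_numpy()))
```

    In day-to-day work you would normally stay at the pandas level and drop down to NumPy only when you need an array-oriented routine.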

    HTH (Hope that helps)
    I/O

    ------------------------------
    Ian Oeschger
    Software Developer and Architect for IBM Community
    IBM
    910-742-4504
    ------------------------------



  • 4.  RE: what are the main library used in data science

    Posted Mon March 11, 2019 05:18 PM
    For large DataFrames, Pandas requires a LARGE amount of memory (RAM), so it is not recommended.
    Alternative solution: GraphLab
    • GraphLab can substitute for Pandas and Scikit-Learn because it alternates processing between disk and memory, which opens new possibilities without crashing your system.
    • It charges for commercial use.
    • Site: GraphLab site
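
    The general idea of working through data piece by piece, instead of loading everything into RAM, can also be sketched in pandas itself. This is not GraphLab's API. It is a different, simpler technique (chunked reading with `read_csv`), and the in-memory CSV and its column names are hypothetical stand-ins for a large file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for a large CSV file; in practice this would be a file path.
csv_data = io.StringIO(
    "user_id,amount\n" + "\n".join(f"{i},{i * 2}" for i in range(10))
)

total = 0
rows = 0

# With chunksize set, read_csv yields smaller DataFrames one at a time,
# so only one chunk needs to be held in memory at any moment.
for chunk in pd.read_csv(csv_data, chunksize=4):
    total += int(chunk["amount"].sum())
    rows += len(chunk)

print(rows, total)
```

    This only helps when the computation can be done chunk by chunk (sums, counts, filters); operations that need the whole table at once still need another approach.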
    Good luck @rajesh kumar

    ------------------------------
    Romulo Magalhaes
    ------------------------------



  • 5.  RE: what are the main library used in data science

    Posted Tue March 19, 2019 07:03 AM
    Hello @rajesh kumar, great question. There are main libraries, of course, but it all depends on what insight you want to get from your data. Nevertheless, based on experience and years of practice, there are some main ones I know of.
    They are:

    I hope this helps you.
    Cheers :)

    ------------------------------
    Damilola Omifare
    ------------------------------



  • 6.  RE: what are the main library used in data science

    Posted Thu April 11, 2019 01:36 PM
    I really like this "curated list of awesome resources for practicing data science using Python", which is organized into categories like "Big Data", "Extraction", "Visualization", and others.

    Here, as an excerpt, is the curated list of "core" Python tools for DS:

    pandas - Data structures built on top of numpy.
    scikit-learn - Core ML library.
    matplotlib - Plotting library.
    animatplot - Animate plots built on matplotlib.
    seaborn - Python data visualization library based on matplotlib.
    pandas_summary - Basic statistics using DataFrameSummary(df).summary().
    pandas_profiling - Descriptive statistics using ProfileReport.
    sklearn_pandas - Helpful DataFrameMapper class.
    janitor - Clean messy column names.
    missingno - Missing data visualization.
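
    As a rough illustration of how two of the core tools above fit together, here is a minimal sketch that holds data in pandas and fits a model with scikit-learn. The dataset, column names, and choice of linear regression are assumptions made for the example; they do not come from the curated list.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up data: exam score rises by 2 points per hour studied.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 54, 56, 58, 60],
})

# pandas: quick descriptive statistics for each column.
print(df.describe())

# scikit-learn: fit a linear model on the DataFrame columns.
model = LinearRegression()
model.fit(df[["hours_studied"]], df["exam_score"])

slope = model.coef_[0]
intercept = model.intercept_

# Predict for an unseen value, passed as a DataFrame with the same column name.
new_data = pd.DataFrame({"hours_studied": [6]})
prediction = model.predict(new_data)[0]
print(slope, intercept, prediction)
```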

    Full list:
    https://github.com/r0f1/datascience

    ------------------------------
    Ian Oeschger
    Software Developer and Architect for IBM Community
    IBM
    910-742-4504
    ------------------------------



  • 7.  RE: what are the main library used in data science

    Posted Tue April 30, 2019 05:39 AM
    Hi Rajesh,

    You already have some great lists of packages and libraries from others here (most of them are in Python). If you are interested in R, you can refer to my GitHub reference here - https://github.com/kkm24132/BRUG where I have captured some high-level key packages/libraries in R by category (data manipulation, HTML widgets, graphics for visualisation, machine learning tasks, database management, optimisation, natural language processing, etc.). I hope you find it useful.

    Thanks,
    Kamal

    ------------------------------
    Kamal Mishra
    ------------------------------



  • 8.  RE: what are the main library used in data science

    Posted Tue April 30, 2019 01:23 PM
    Edited by William Roberts Tue April 30, 2019 01:23 PM
    If you're curious about data science in the Python space, one of our contributors, @Paco Nathan, wrote a very rich article (IBM Data Science Community - Master the art of data science.) about the essential libraries and frameworks within the data science space. It's great and touches on many of the same packages mentioned above, plus more for your later discovery.

    ------------------------------
    William Roberts
    ------------------------------



  • 9.  RE: what are the main library used in data science

    Posted Thu May 02, 2019 03:46 PM
    Edited by Tapan Jatakia 👨‍🎓 Thu May 02, 2019 10:03 PM
    The answers provided above by @Ian Oeschger @Raj Shivakoti @Romulo Magalhaes @Kamal Mishra @Damilola Omifare @William Roberts answer the question well. The attached Word document lists many useful libraries from domains such as Machine Learning, Data Science and Artificial Intelligence, which are used not only to run algorithms on CSV records but also to visualize the outcomes graphically. It includes the libraries and packages that are widely used in the following fields:
    1. Computer Vision
    2. Natural Language Processing
    3. Automated Machine Learning
    4. Machine Learning Frameworks and Algorithms
    5. Time Series
    6. Statistical and Scientific Computing
    7. Data Cleaning and Preprocessing
    8. Data Visualization
    9. Neural Networks and Deep Learning
    10. Reinforcement Learning
    11. Code Testing

    The Word document can be found here --> Pkgs&Libs_ML-DS-AI.docx

    I would urge everyone to contribute to this list so that it covers a wide range of the libraries and packages used in ML and AI. It includes packages verified by the Python community as well as some custom packages developed by individual developers, which are available via GitHub.

    ------------------------------
    DCE Tapan Jatakia
    Student & Cyber Practitioner
    DIT University
    Dehradun, Uttarakhand,
    INDIA - 248001.
    +91 9664332984
    ------------------------------