StreamSets

Train ML Model and register experiment in MLflow

    Posted Tue January 25, 2022 09:44 PM

     

    This pipeline ingests data from Amazon S3 and prepares it for training an ML model using a PySpark custom processor. Once the Gradient Boosted model is trained, the model artifacts, features, model accuracy, and other metrics are registered as an experiment in MLflow. (The pipeline runs on a Databricks cluster, which comes bundled with an MLflow server.)
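
    The post does not include the processor code, so the following is only a minimal sketch of what the training and logging step might look like. It assumes a binary classification problem (the post mentions accuracy), a numeric label column, and numeric feature columns. The function names `gbt_params` and `train_and_log`, the column names, the 80/20 split, and the hyperparameter values are all hypothetical and not taken from the original pipeline.

    ```python
    def gbt_params(max_iter=20, max_depth=5):
        """Illustrative hyperparameters, passed to GBTClassifier and logged to MLflow."""
        return {"maxIter": max_iter, "maxDepth": max_depth}


    def train_and_log(df, feature_cols, label_col="label", params=None):
        """Train a gradient-boosted tree classifier on a Spark DataFrame and
        record parameters, accuracy, and the fitted model in an MLflow run."""
        # Imported lazily so this module loads even where Spark/MLflow are absent.
        import mlflow
        import mlflow.spark
        from pyspark.ml import Pipeline
        from pyspark.ml.classification import GBTClassifier
        from pyspark.ml.evaluation import MulticlassClassificationEvaluator
        from pyspark.ml.feature import VectorAssembler

        params = params or gbt_params()

        # Assumed split ratio; the original post does not state one.
        train, test = df.randomSplit([0.8, 0.2], seed=42)

        # Combine the feature columns into the single vector column Spark ML expects.
        assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
        gbt = GBTClassifier(labelCol=label_col, featuresCol="features", **params)

        with mlflow.start_run():
            model = Pipeline(stages=[assembler, gbt]).fit(train)
            predictions = model.transform(test)

            evaluator = MulticlassClassificationEvaluator(
                labelCol=label_col,
                predictionCol="prediction",
                metricName="accuracy",
            )
            accuracy = evaluator.evaluate(predictions)

            # Register hyperparameters, feature list, metric, and model artifact.
            mlflow.log_params(params)
            mlflow.log_param("features", ",".join(feature_cols))
            mlflow.log_metric("accuracy", accuracy)
            mlflow.spark.log_model(model, "model")

        return accuracy
    ```

    Inside a PySpark custom processor, the incoming DataFrame would be passed as `df`; on a Databricks cluster the run would normally be recorded against the workspace's MLflow tracking server without extra configuration, though that depends on how the cluster is set up.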