Generating Finalized Reports with Anovos
This section covers the final execution step, in which the output generated by the previous steps is fetched and arranged into the desired UI layout.
The primary function here is anovos_report, which handles:
- reading the data produced by the data analyzer module and the chart objects produced by the report preprocessing module
- computing additional charts from the available data
- populating the reporting layer using datapane, an open-source Python package
- producing standalone reports for individual sections (Descriptive Statistics, Quality Check, Attribute Associations, Data Drift & Stability, and Time Series Analyzer)
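As a rough illustration of the first step (reading the intermediate data), the sketch below collects CSV tables and JSON chart objects from a folder using only the standard library. It is not how anovos_report is implemented internally: the helper name load_intermediate_outputs is hypothetical, and it assumes that tables carry a .csv extension while any other file that parses as JSON is a chart object.

```python
import csv
import json
from pathlib import Path


def load_intermediate_outputs(master_path):
    """Collect CSV tables and JSON chart objects found directly under master_path.

    Hypothetical helper for illustration only; the file-naming rules used
    here are assumptions, not the conventions Anovos itself follows.
    """
    tables, charts = {}, {}
    for path in sorted(Path(master_path).iterdir()):
        if not path.is_file():
            continue
        if path.suffix == ".csv":
            # Each CSV becomes a list of row dictionaries keyed by its header.
            with path.open(newline="") as fh:
                tables[path.stem] = list(csv.DictReader(fh))
        else:
            # Any other file is kept only if it parses as JSON.
            try:
                with path.open() as fh:
                    charts[path.name] = json.load(fh)
            except (json.JSONDecodeError, UnicodeDecodeError):
                continue
    return tables, charts
```

Calling load_intermediate_outputs on the folder later passed as master_path returns two dictionaries, one for tables and one for chart objects, which is a quick way to check what the earlier steps produced before building the report.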
The function anovos_report accepts the following parameters:
- master_path: Path to the intermediate output, i.e. the JSON chart objects and CSV files (pandas DataFrames)
- id_col: Name of the ID column, specified so that unnecessary analysis is not performed on it
- label_col: Name of the label or target column in the input dataset. By default, label_col is blank.
- corr_threshold: Correlation threshold beyond which attributes are considered redundant
- iv_threshold: Information Value (IV) threshold beyond which attributes are considered significant for modeling
- drift_threshold_model: Threshold beyond which an attribute is flagged as drifted (flag value 1), as measured across the drift metrics specified by the user
- dataDict_path: Path to the data dictionary, which maps each attribute's exact name to its definition. It is used to populate the report for easy reference.
- metricDict_path: The path containing the metric dictionary
- run_type: Whether the execution happens locally or on EMR
- final_report_path: Path where the final report needs to be saved
- output_type: Time granularity of the analysis; one of "Daily", "Hourly", or "Weekly"
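The parameters above can be gathered into a dictionary and checked before the report is generated. The following is a minimal sketch under stated assumptions: every path, column name, and threshold value is illustrative rather than a library default, "local" is assumed to be the run_type value for local execution, the import path anovos.data_report.report_generation is assumed, and validate_report_args and run_report are hypothetical helpers, not part of Anovos.

```python
from numbers import Real

VALID_OUTPUT_TYPES = {"Daily", "Hourly", "Weekly"}
THRESHOLD_KEYS = ("corr_threshold", "iv_threshold", "drift_threshold_model")

# Illustrative values only; adjust every entry to your own project.
report_args = {
    "master_path": "report_stats",                    # hypothetical folder
    "id_col": "id",                                   # hypothetical column
    "label_col": "income",                            # hypothetical column
    "corr_threshold": 0.4,
    "iv_threshold": 0.02,
    "drift_threshold_model": 0.1,
    "dataDict_path": "data/data_dictionary.csv",      # hypothetical file
    "metricDict_path": "data/metric_dictionary.csv",  # hypothetical file
    "run_type": "local",                              # assumed value
    "final_report_path": "report_stats",
    "output_type": "Daily",
}


def validate_report_args(args):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for key in THRESHOLD_KEYS:
        value = args.get(key)
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, Real) or value < 0:
            problems.append(f"{key} must be a non-negative number, got {value!r}")
    output_type = args.get("output_type")
    # Assumption: output_type may be omitted when no time series analysis ran.
    if output_type is not None and output_type not in VALID_OUTPUT_TYPES:
        problems.append(
            f"output_type must be one of {sorted(VALID_OUTPUT_TYPES)}, got {output_type!r}"
        )
    return problems


def run_report(args):
    """Validate the arguments, then hand them to anovos_report."""
    problems = validate_report_args(args)
    if problems:
        raise ValueError("; ".join(problems))
    # Imported lazily so the checks above work without Anovos installed.
    # The module path is an assumption; confirm it for your Anovos version.
    from anovos.data_report.report_generation import anovos_report

    anovos_report(**args)
```

With Anovos installed and the intermediate outputs in place, run_report(report_args) would start the report generation. Keeping the checks in a separate function means a misspelled output_type or a negative threshold is caught before any report building begins.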