General usage and file structure of Python Backend

The Python Backend project provides access to inverter data, performs data processing tasks, and exposes this functionality through a REST API. It connects to a PostgreSQL database for storing and retrieving inverter data.

API

Technologies Used

Project Structure

Overview

app/
├── db/            # Database-related components
│   ├── models.py  # ORM models for database interaction
│   └── utils.py   # Database access functions (data layer logic)
├── endpoints/     # API endpoints and request handling
│   └── solar/     # Endpoints specific to solar data management
├── core/          # Business logic and data processing
├── logs/          # Log files
├── main.py        # Application entry point
└── database.ini   # Configuration file for PostgreSQL connection details
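As a rough sketch, the PostgreSQL connection details in database.ini can be read with the standard-library configparser. The section and key names below are assumptions for illustration; the actual file may use different ones.

```python
# Hypothetical example — the real database.ini may use different section/key names.
from configparser import ConfigParser

SAMPLE_INI = """
[postgresql]
host = localhost
port = 5432
dbname = inverters
user = backend
password = secret
"""

def load_db_config(text: str, section: str = "postgresql") -> dict:
    """Parse PostgreSQL connection details from database.ini content."""
    parser = ConfigParser()
    parser.read_string(text)
    if not parser.has_section(section):
        raise KeyError(f"Section [{section}] not found")
    return dict(parser.items(section))

config = load_db_config(SAMPLE_INI)
print(config["host"], config["dbname"])
```

In practice the parsed dictionary would be passed to the database driver (e.g. as psycopg2 connection keyword arguments) when the application starts.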

Main Project Components

Machine learning

This section covers the machine learning components used for anomaly detection and client onboarding automation.

Technologies Used

Project Structure

app/
└── ml/                                       # Machine learning components
    ├── data/                                 # Training and testing data
    ├── notebooks/                            # Jupyter notebooks for model development
    │   ├── anomaly_detection/                # Anomaly detection notebooks
    │   │   ├── 1 - Data Preprocessing.ipynb    # Prepares raw data for analysis
    │   │   ├── 2 - Feature Engineering.ipynb   # Creates informative features for the model
    │   │   └── 3 - Model Training.ipynb        # Trains and evaluates the anomaly detection model
    │   └── nlp/                              # Natural Language Processing notebooks
    │       └── 1 - Data Preprocessing.ipynb    # Prepares NLP data
    └── models/                               # Trained machine learning models

Notebooks

Anomaly Detection

Machine Learning for Solar Panel Performance Anomaly Detection

  1. Data Preprocessing:
    • This notebook focuses on preparing the raw solar panel production data for further analysis and model training.
  2. Feature Engineering:
    • This notebook builds upon the preprocessed data and focuses on crafting informative features that can be used by the machine learning model to identify anomalies.
  3. Model Training:
    • This notebook uses the engineered features to train and evaluate different models on the task of anomaly detection.
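To illustrate the idea behind these steps, a minimal anomaly check on production readings can be sketched with a z-score rule using only the standard library. This is a simplified stand-in, not the model trained in the notebooks, and the threshold value is an assumption.

```python
# Illustrative sketch only — the notebooks may use different features and models.
from statistics import mean, stdev

def zscore_anomalies(production, threshold=2.0):
    """Return indices of readings whose z-score exceeds the threshold."""
    mu = mean(production)
    sigma = stdev(production)
    if sigma == 0:
        return []
    return [i for i, value in enumerate(production)
            if abs(value - mu) / sigma > threshold]

readings = [5.1, 5.0, 5.2, 4.9, 5.1, 0.2, 5.0, 5.3]  # kWh per interval
print(zscore_anomalies(readings))  # flags the near-zero reading at index 5
```

A real pipeline would first apply the preprocessing and feature engineering steps above (e.g. accounting for irradiance and time of day) before flagging low-production intervals.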

NLP

This section outlines the Natural Language Processing (NLP) pipeline for streamlining client onboarding by automatically extracting key information from solar installation reports using a Named Entity Recognition (NER) model.

  1. Data Preprocessing:
    • This notebook focuses on preparing the raw solar installation reports for efficient NLP processing.
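As a rough sketch of the kind of preprocessing such a notebook performs, raw report text can be normalized and tokenized before being fed to a NER model. The cleaning rules and sample report below are illustrative assumptions, not the notebook's actual code.

```python
# Illustrative sketch only — the actual notebook's preprocessing steps may differ.
import re

def preprocess_report(text: str) -> list[str]:
    """Normalize a raw installation report into clean tokens for NER."""
    text = text.replace("\u00a0", " ")        # replace non-breaking spaces
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    # keep alphanumeric tokens, preserving hyphenated IDs and dates
    return re.findall(r"[A-Za-z0-9-]+", text)

report = "Installed   5kW inverter\nat site A-12 on 2023-06-01."
print(preprocess_report(report))
```

The resulting tokens would then be annotated with entity labels (e.g. capacity, site ID, installation date) to train the NER model that automates onboarding.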