General usage and file structure of Python Backend

The Python Backend project provides functionality for accessing inverter data, performing data processing tasks, and exposing these functionalities through a REST API. It connects seamlessly to a PostgreSQL database for storing and retrieving inverter data.

API

Technologies Used

FastAPI: - A high-performance web framework, FastAPI, is used to build the REST API. It facilitates the definition of endpoints, handling of incoming requests, and returning of responses.
OpenAPI: - Comprehensive API documentation is automatically generated based on the project’s code. This documentation clarifies the data accepted and returned by the API.
SQLAlchemy: An Object-Relational Mapper (ORM), is used to simplify interaction with the PostgreSQL database. It allows for the definition of models representing tables and relationships.

Project Structure

Overview

app/
├── db/           # Database-related components
│  ├── models.py  # ORM models for database interaction
│  └── utils.py   # Database access functions (data layer logic)
├── endpoints/   # API endpoints and request handling
│  └── solar/     # Endpoints specific to solar data management
├── core/          # Business logic and data processing
├── logs/          # Log files
├── main.py        # Application entry point
└── database.ini  # Configuration file for PostgreSQL connection details

Main Project Components

App
- main.py: The application entry point, defines the API using FastAPI and sets up routing for incoming requests.
- logging.conf: Configures logging for debugging and monitoring purposes.
- database.ini: Defines connection details for the PostgreSQL database, ensuring secure access.
DB
- models.py: Defines ORM models and schemas that represent tables and relationships in the PostgreSQL database.
- utils.py: Contains functions for connecting to the database, performing database operations (CRUD), and facilitating communication between the application and the database.
Core: Houses the core business logic of the application. This module implements necessary data processing functionalities, such as calculations, transformations, or other manipulations.
Endpoints
- solar: Contains files dedicated to handling specific API requests. Each endpoint file defines the input and output parameters of the endpoint, orchestrates calls to the appropriate functions in the DB and Core layers, and returns a meaningful response to the client.

Machine learning

This section implements the machine learning components employed for anomaly detection and client onboarding automation

Technologies Used

IPython:: An interactive environment, is used for exploratory data analysis. IPython notebooks are used to experiment with the data, visualize it, and prototype machine learning models before deployment.
scikit-learn: Provides a wide range of machine learning algorithms. In this project, scikit-learn is leveraged for the anomaly detection task. It offers various models suitable for the task.
xgboost: Provides implementation of ensemble models based on the random forest algorithm.
tensorflow: A flexible deep learning framework, is used for the development and training of more sophisticated neural network models.
spacy: Used for client onboarding automation through its named entity recognition (NER) capabilities.

Proyect structure

app/
└───ml/                                  # Machine learning components
│  └───data/                             # Training and testing data
│  └───notebooks/                         # Jupyter notebooks for model development
│  │  └───anomaly_detection/              # Anomaly detection notebooks
│  │  │  ├── 1 - Data Preprocessing.ipynb   # Prepares raw data for analysis
│  │  │  ├── 2 - Feature Engineering.ipynb  # Creates informative features for the model
│  │  │  └── 3 - Model Training.ipynb       # Trains and evaluates the anomaly detection model
│  │  └───nlp/                           # Natural Language Processing notebooks
│  │  │  └── 1 - Data Preprocessing.ipynb   # Prepares NLP data
│  └───models/                            # Trained machine learning models

Notebooks

Anomaly Detection

Machine Learning for Solar Panel Performance Anomaly Detection

Data Preprocessing:
- This notebook focuses on preparing the raw solar panel production data for further analysis and model training.
Feature Engineering:
- This notebook builds upon the preprocessed data and focuses on crafting informative features that can be used by the machine learning model to identify anomalies.
Model Training:
- This notebook utilizes the engineered features to train and evaluate different models in the task of anomaly detection.

NLP

This section outlines the Natural Language Processing (NLP) pipeline for streamlining client onboarding by automatically extracting key information from solar installation reports using a Named Entity Recognition (NER) model.

Data Preprocessing:
- This notebook focuses on preparing the raw solar installation reports for efficient NLP processing.

Renovus Solarec

Renovus Solarec development project selected by UNICEF Venture Fund’s First Cohort for Climate Action.

General usage and file structure of Python Backend

API

Technologies Used

Project Structure

Main Project Components

Machine learning

Technologies Used

Proyect structure

Notebooks

Anomaly Detection

NLP