Collaboration call number 4: Modelling of residential PV data

Residential PV technology

The collaboration call will focus on modelling residential PV data. Two partners from the Serendi-PV project will share data with the partners selected for this collaboration. Other partners from the Serendi-PV project are also expected to contribute to the data analysis.


The call will be closed on the 1st of June 2024.


Join this collaboration call by filling in the participation formulary and sending it with your modelling proposal.


Modelling of residential systems (MyLight150)

The Serendi-PV partners have developed several tools for residential PV systems. These tools are primarily used for detecting faults, assessing the degradation of PV modules, and forecasting PV production profiles.
In this collaboration, the partner MyLight150 offers its PV production data in exchange for a detailed analysis. This analysis could lead to recommendations for the PV system owner, such as:
• Identifying panel degradation over time or defects in one or multiple panels, with suggestions for potential replacement.
• Detecting dust accumulation that could cause power loss, indicating the need for cleaning.
• Forecasting energy production, with suggestions for adapting the consumption curve to increase self-consumption.

The data sharing will follow these modalities:

• A prior discussion about the planned analysis and communication.
• An agreement that any communication regarding the dataset and results must be shared with and approved by MyLight150 before being published or communicated.

Find here a template of the data format.

Modelling of systems (Cythelia)

For this collaboration call, our partner Cythelia offers its PV production data in exchange for detailed modelling. The data sharing will be subject to the following conditions:

• Approval from Cythelia, which will depend on the profile of the requesting organization and the intended purpose.
• Signing of a Non-Disclosure Agreement (NDA).

Find here a template of the data format.


The documents needed to participate in the call are:

  1. Formulary for participating in the call
    1. Formulary.docx (updated 13/11/2023)
  2. The call for collaboration, in which all the requirements are detailed: Collaboration_call_number_4_residential (updated 22/03/2024)

How to participate?

Please send an email to complete the formulary for participating in the call, and you will receive support for completing the rest of the documents.

FAQ (Frequently asked questions)

Please send questions to Simone Vitale. We will post relevant questions and answers below for everyone’s information.

What will be offered by MyLight150 and Cythelia

MyLight150 dataset


• Monitoring period: 2015 to 2021.
• Number of PV systems: 2037.
• Data resolution: 10 minutes.
• Location: most PV systems are located in Belgium and France, with the remainder elsewhere in Europe.
• System specifications: the majority of these systems have a DC nameplate power below 12 kWp.

Example of data
• Resolution: 10 minutes.
• Format: Device_id, Timestamp_UTC, Wh.
• Sample Data (template also available in the CKAN database):

Metadata details
• Included information: Latitude (with reduced precision), Longitude (with reduced precision), kWp, Panel Surface in m², Azimuth in degrees, Tilt in degrees.
• Note: the dataset contains many missing values and some inaccurately declared data.
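
Since production is reported as energy per 10-minute interval and the metadata include the nameplate power, a common first step is to convert each reading to mean power and normalise it by kWp so that systems of different sizes can be compared. The following is a minimal sketch only. It assumes that the Wh value is the energy produced during the interval (not a cumulative counter), which should be confirmed with MyLight150; the function names are illustrative and not part of the dataset.

```python
INTERVAL_MINUTES = 10  # data resolution stated for the MyLight150 dataset


def average_power_w(energy_wh: float, interval_minutes: int = INTERVAL_MINUTES) -> float:
    """Mean power in W over one interval, given the energy in Wh.

    Assumption: energy_wh is the energy produced during that interval.
    """
    return energy_wh * 60 / interval_minutes


def normalised_power(energy_wh: float, kwp: float) -> float:
    """Mean power per unit of nameplate capacity, in W per kWp."""
    if kwp <= 0:
        # The metadata may contain missing or inaccurately declared values.
        raise ValueError("kWp must be a positive number")
    return average_power_w(energy_wh) / kwp
```

For example, a reading of 50 Wh over 10 minutes corresponds to a mean power of 300 W, or 100 W/kWp for a 3 kWp system.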

Data quality and selection criteria
• Data gaps: installations/devices with excessive data gaps during a year have been excluded.
• Operational hours: data is considered only for the hours between 8 am and 7 pm (11 hours of daylight) to avoid excluding systems with nighttime issues.
• Inclusion criteria for shared data:
1. Availability of data from 2015 to 2021.
2. A minimum of 350 days per year with at least one data point.
3. Less than 17% missing data during a year (equating to at least 20,000 data points per year, given that the maximum for 365 days and 11 hours of daylight is 24,090 data points).
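
The inclusion criteria above can be sketched as a simple per-device check. This is an illustration under stated assumptions, not the procedure actually used by MyLight150: the 8 am to 7 pm window is applied to the timestamps as given (the data are in UTC, and the call does not say whether the window refers to local time), days "with at least one data point" are counted inside that window, and duplicate timestamps are assumed to be absent. All names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Iterable

POINTS_PER_HOUR = 6                     # 10-minute resolution
DAY_START_HOUR, DAY_END_HOUR = 8, 19    # 8 am to 7 pm, i.e. 11 hours
# Maximum number of daytime points in a 365-day year: 365 * 11 * 6 = 24,090
MAX_POINTS_PER_YEAR = 365 * (DAY_END_HOUR - DAY_START_HOUR) * POINTS_PER_HOUR
MIN_POINTS_PER_YEAR = 20_000            # threshold stated in the call (about 83% of the maximum)
MIN_DAYS_PER_YEAR = 350
REQUIRED_YEARS = range(2015, 2022)      # 2015 to 2021 inclusive


def yearly_coverage(timestamps: Iterable[datetime]) -> dict:
    """Return {year: (days_with_data, daytime_points)} for one device."""
    days = defaultdict(set)
    points = defaultdict(int)
    for ts in timestamps:
        # Only readings inside the daytime window are counted.
        if DAY_START_HOUR <= ts.hour < DAY_END_HOUR:
            days[ts.year].add(ts.date())
            points[ts.year] += 1
    return {year: (len(days[year]), points[year]) for year in points}


def meets_criteria(timestamps: Iterable[datetime]) -> bool:
    """True if the device satisfies all three criteria in every required year."""
    coverage = yearly_coverage(timestamps)
    for year in REQUIRED_YEARS:
        n_days, n_points = coverage.get(year, (0, 0))
        if n_days < MIN_DAYS_PER_YEAR or n_points < MIN_POINTS_PER_YEAR:
            return False
    return True
```

Checking the point count against the fixed threshold of 20,000, rather than recomputing 17% for each year, follows the wording of the criterion and treats leap years the same as other years.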


Cythelia dataset

• Dataset details: the dataset offers one year of data recorded at 10-minute intervals.
• Composition of the 70 kWp PV Plant:
  • Bifacial modules on sunshades.
  • Bifacial modules on a single-axis tracker with equatorial tracking.
• Data specifics: production data is available at the module level, thanks to the use of micro-inverters.
• Environmental data: includes irradiance (horizontal, plane of array, rear face), ambient and module temperature, wind speed, and wind direction.
• Plant model: a 3D model in Sketchup format.
• Layout: 2D layout available in DXF format.
• Detailed documentation: includes datasheets for modules and micro-inverters, and flash tests of the modules.
• Component location: detailed locations for each module and micro-inverter.
• Sensor placement: locations of sensors are specified.

Data quality
• Production data: there are several gaps in the production data, with a few timestamps missing each day. The extent of these gaps has not yet been quantified.
• Irradiance data: the accuracy of irradiance data may be compromised, as the sensors are not regularly cleaned.
