Data format for individual sites

The first three lines of the returned CSV file are comments starting with a # character and containing important metadata:

  1. Dataset name, version, license and reference.
  2. Human-readable information on the columns and units.
  3. Machine-readable (JSON string) information on the units and simulation parameters.

These three lines are followed by a single row giving the column headers, and then the data.

To load the CSV files into software that does not automatically ignore comment lines starting with a # character, you can usually tell the software to skip the first three lines.
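As a minimal sketch of this approach, using only the Python standard library, the following skips the three comment lines and reads the remainder as a table. The sample text is illustrative rather than output copied from the service: its metadata lines are abbreviated, and the column names in its header row are an assumption taken from the units description.

```python
import csv
import io
from itertools import islice

# Illustrative sample: abbreviated metadata lines, and an assumed header row.
SAMPLE = """\
# Dataset name, version, license and reference
# Units: time in UTC, local_time in Europe/Madrid, electricity in kW
# {"units": {"time": "UTC"}, "params": {}}
time,local_time,electricity
2014-01-01 00:00,2014-01-01 01:00,0
2014-01-01 01:00,2014-01-01 02:00,0
"""


def read_rows(stream, n_comment_lines=3):
    """Skip the leading comment lines, then read the header and data rows."""
    # Consume the metadata lines without interpreting them.
    for _ in islice(stream, n_comment_lines):
        pass
    # DictReader takes the next line (the header row) as the column names.
    return list(csv.DictReader(stream))


rows = read_rows(io.StringIO(SAMPLE))
```

Skipping a fixed number of lines is simple but discards the metadata. The `n_comment_lines` parameter is a hypothetical convenience so the same helper can be reused for files with a different number of comment lines.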

An example CSV file layout is shown below:

# Solar PV (Point API) - 42.812, -8.086 - Version: 1.1 (using GSEE v0.3.1) - License: - Reference:
# Units: time in UTC, local_time in Europe/Madrid, electricity in kW
# {"units": {"time": "UTC", "local_time": "Europe/Madrid", "electricity": "kW"}, "params": {"local_time": true, "lat": "42.8115217450979", "lon": "-8.0859375", "date_from": "2014-01-01", "date_to": "2014-12-31", "dataset": "merra2", "capacity": "1", "system_loss": "0.1", "tracking": "0", "tilt": "35", "azim": "180"}}
time,local_time,electricity
2014-01-01 00:00,2014-01-01 01:00,0


  • time: UTC time stamp giving the beginning of the time period (one hour by default).
  • local_time: Time stamp in the local timezone of the requested location (if local_time was requested).
  • electricity: mean power output averaged over the time period, in kW.
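The sketch below shows one way to keep the machine-readable metadata instead of discarding it, and to convert the columns into typed values. It assumes that the data column is named electricity, as in the units line of the example, and that timestamps use the `YYYY-MM-DD HH:MM` form shown there. The function name `parse_site_csv` and the abbreviated sample are hypothetical.

```python
import csv
import json
from datetime import datetime, timezone


def parse_site_csv(text):
    """Return (metadata, records) from the text of an individual-site CSV."""
    lines = text.splitlines()
    # The third comment line holds the JSON string: drop the leading "#".
    metadata = json.loads(lines[2].lstrip("#").strip())
    records = []
    for row in csv.DictReader(lines[3:]):
        # "time" is documented as UTC, so attach that timezone explicitly.
        start = datetime.strptime(row["time"], "%Y-%m-%d %H:%M")
        records.append({
            "time": start.replace(tzinfo=timezone.utc),
            # Assumed column name, taken from the units line of the example.
            "electricity": float(row["electricity"]),
        })
    return metadata, records


# Abbreviated, illustrative sample (not copied from the service).
sample = "\n".join([
    "# Dataset name, version, license and reference",
    "# Units: time in UTC, local_time in Europe/Madrid, electricity in kW",
    '# {"units": {"time": "UTC", "local_time": "Europe/Madrid", '
    '"electricity": "kW"}, "params": {"capacity": "1", "tilt": "35"}}',
    "time,local_time,electricity",
    "2014-01-01 00:00,2014-01-01 01:00,0",
])
metadata, records = parse_site_csv(sample)
print(metadata["units"]["electricity"], records[0]["time"].isoformat())
```

In the example file the simulation parameters are stored as strings (for instance "capacity": "1"), so numeric values need an explicit conversion such as `float(metadata["params"]["capacity"])`.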

Corrected vs uncorrected data

Currently, data for Europe is bias-corrected as described in Pfenninger and Staffell (2016) and Staffell and Pfenninger (2016).

The filename of a downloaded file indicates whether corrections were applied.

In addition, for API users, the X-Ninja-Corrected HTTP response header indicates whether the returned data was corrected.
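A small helper along the following lines could read that header from a response. This is only a sketch: the format of the header value is not specified here, so the helper returns the raw string and leaves its interpretation to the caller. The function name and the example header values are hypothetical.

```python
def corrected_header(headers):
    """Return the raw X-Ninja-Corrected value, or None if it is absent."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "x-ninja-corrected":
            return value
    return None


# Hypothetical header values, for illustration only.
example_headers = {"Content-Type": "text/csv", "x-ninja-corrected": "true"}
print(corrected_header(example_headers))
```

Any mapping of header names to values can be passed in, for example the `headers` attribute of a response object from an HTTP client library.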

Data format for country-aggregated data

The country-aggregated data currently differs slightly from the individual site data: the CSV files start with two comment lines instead of three, the first giving metadata and the second describing the units. The third line is the header row, followed by the data.

Again, the time column is a UTC time stamp giving the beginning of the time period, and the remaining columns are data averaged over that time period.

An example of country-aggregated data:

"# PV (hourly data, 1985-2016) - ninja_pv_country_DE_merra-2_corrected - Version: 1.1 - License: - Reference:",
"# Units: time in UTC, other columns are capacity factors [0-1].  Bias corrected using national generation data.",
1985-01-01 00:00:00,0
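In the example above the two comment lines are wrapped in double quotes, so they begin with a " character rather than a #, and a parser that only checks the first character of each line may not recognise them as comments. The following sketch, using the Python standard library, lets the csv module remove the quotes first and then separates comments, header and data. The function name `parse_country_csv`, the abbreviated sample and the data column name `value` are hypothetical, since the header row is not shown in the example.

```python
import csv
import io


def parse_country_csv(text):
    """Split country-aggregated CSV text into (comments, header, data rows)."""
    comments, table = [], []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # ignore blank lines
        if not table and row[0].startswith("#"):
            # The csv module has already removed the surrounding quotes.
            comments.append(row[0].lstrip("# "))
        else:
            table.append(row)
    header, data = table[0], table[1:]
    return comments, header, data


# Abbreviated sample; the "value" column name is a placeholder.
sample = (
    '"# PV (hourly data, 1985-2016) - Version: 1.1",\n'
    '"# Units: time in UTC, other columns are capacity factors [0-1].",\n'
    "time,value\n"
    "1985-01-01 00:00:00,0\n"
)
comments, header, data = parse_country_csv(sample)
```

Detecting comment rows by content rather than skipping a fixed count means the same function also copes with a file that has a different number of comment lines.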