### How to interpret the uncertainty fields in ecoinvent?

The lognormal is the most common distribution chosen to describe the uncertainty in ecoinvent. It has the advantage of being defined only for positive values, so a sampled amount never changes sign and credits do not accidentally appear during a Monte Carlo simulation.
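The following minimal sketch illustrates this point with Python's standard library. The parameters (`MU`, `SIGMA`, the comparison normal distribution and the sample size) are illustrative assumptions, not ecoinvent values.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

MU = 0.0      # mean of ln(x); illustrative value
SIGMA = 1.0   # standard deviation of ln(x); illustrative value
N = 10_000    # number of Monte Carlo draws

# Draws from a lognormal distribution: always strictly positive.
lognormal_draws = [random.lognormvariate(MU, SIGMA) for _ in range(N)]

# Draws from a normal distribution with mean 1 and standard deviation 1,
# for comparison: a noticeable share of the draws falls below zero.
normal_draws = [random.gauss(1.0, 1.0) for _ in range(N)]

negative_lognormal = sum(1 for x in lognormal_draws if x < 0)
negative_normal = sum(1 for x in normal_draws if x < 0)

print("negative lognormal draws:", negative_lognormal)
print("negative normal draws:", negative_normal)
```

With a normal distribution, some draws of an input amount would become negative and act as a credit; with the lognormal this cannot happen.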

The lognormal is less intuitive than the normal distribution and often confuses new users. As a primer, we recommend "Log-normal Distributions across the Sciences: Keys and Clues" by Eckhard Limpert et al., BioScience, May 2001, Vol. 51, No. 5.

#### Definition and basic properties of the lognormal distribution

A variable is lognormally distributed when its logarithm is normally distributed. The probability density function (PDF) of the lognormal is:

f(x) = 1 / (x * sigma * sqrt(2 * pi)) * exp(-(ln(x) - mu)^2 / (2 * sigma^2)), for x > 0

where x is the random variable, and mu and sigma are the mean (which equals the median) and standard deviation of the distribution of ln(x) (sometimes called "the underlying normal distribution"). The median and geometric (multiplicative) standard deviation of x, noted mu* and sigma*, can be obtained through the following equations:

mu* = exp(mu)

sigma* = exp(sigma)

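The quantity sigma* is useful for calculating confidence intervals: the interval from mu* / sigma* to mu* × sigma* contains about 68% of the values, and the interval from mu* / sigma*^2 to mu* × sigma*^2 contains about 95%.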
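As a hedged illustration, the sketch below computes the multiplicative intervals around the median and checks their coverage with the error function. The function names and the values of `mu_star` and `sigma_star` are hypothetical, chosen only for the example.

```python
import math

def lognormal_interval(mu_star, sigma_star, k):
    """Return the interval [mu* / sigma*^k, mu* x sigma*^k]."""
    return mu_star / sigma_star**k, mu_star * sigma_star**k

def coverage(k):
    """Probability that ln(x) lies within k standard deviations of mu."""
    return math.erf(k / math.sqrt(2))

mu_star = 10.0     # illustrative median of x
sigma_star = 1.5   # illustrative geometric standard deviation of x

low68, high68 = lognormal_interval(mu_star, sigma_star, 1)
low95, high95 = lognormal_interval(mu_star, sigma_star, 2)

print(f"~{coverage(1):.1%} of values lie in [{low68:.2f}, {high68:.2f}]")
print(f"~{coverage(2):.1%} of values lie in [{low95:.2f}, {high95:.2f}]")
```

Note that the intervals are obtained by dividing and multiplying the median by sigma*, not by subtracting and adding, which is why they are asymmetric around the median.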

In the lognormal distribution, the median corresponds to the geometric mean, and is found at exp(mu). The arithmetic mean is higher than the geometric mean, at exp(mu + sigma^2/2). The mode (the most likely value) is found at a lower value, exp(mu - sigma^2). The larger the standard deviation sigma, the larger the skewness and the further apart those three quantities will be.
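The relationship between these three quantities can be sketched as follows. The helper name `lognormal_location` and the sigma values are assumptions made for the illustration.

```python
import math

def lognormal_location(mu, sigma):
    """Return (mode, median, arithmetic mean) of a lognormal distribution,
    where mu and sigma describe the distribution of ln(x)."""
    mode = math.exp(mu - sigma**2)
    median = math.exp(mu)              # also the geometric mean
    mean = math.exp(mu + sigma**2 / 2)
    return mode, median, mean

# With mu = 0 the median is always 1; increasing sigma pushes the mode
# down and the arithmetic mean up.
for sigma in (0.1, 0.5, 1.0):
    mode, median, mean = lognormal_location(0.0, sigma)
    print(f"sigma={sigma}: mode={mode:.3f}, median={median:.3f}, mean={mean:.3f}")
```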

#### From ecoinvent to the lognormal PDF

Three inputs are necessary from the data provider to determine the parameters of the lognormal distribution: the **deterministic value**, the **basic uncertainty** and the **pedigree matrix**.

Going from the **deterministic value** to mu is straightforward: this value is taken as equal to mu*. In ecoEditor and ecoQuery, mu is called "Arithmetic mean of log-transformed data". The deterministic value is also called "Geometric mean" in those tools.

mu = ln(deterministic value)

Then, the **basic uncertainty** is chosen. This value reflects the fact that even "perfect" data are uncertain: there are fluctuations over time, errors in measurement, etc. Table 10.3 of the data quality guidelines provides values, depending on the type of exchange and process modelled. In ecoEditor and ecoQuery, this value is called "Variance of log-transformed data". The field "Standard deviation (SD95)" is equal to exp((Variance of log-transformed data)^0.5)^2, a value that is not used anywhere in the rest of the calculation.

Then, a score from 1 to 5 is selected for each of the five **pedigree matrix** indicators: reliability, completeness, temporal correlation, geographical correlation and further technological correlation. These scores are transformed into additional uncertainty in order to reflect that the amount of an exchange might come from sources that are not as reliable as primary data collection. The values can be older, from a different technology, from another part of the world, or based on estimates rather than calculation or measurement. Table 10.5 of the data quality guidelines shows the relationship between the pedigree scores and the additional uncertainty.

The basic uncertainty is added to the five additional contributions to the uncertainty. This sum is called "Variance of data with pedigree". Finally, the "CI/2wP, half range of confidence interval" is calculated as

exp((Variance of data with pedigree)^0.5)^2, corresponding to the square of sigma*.
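The steps of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lognormal_parameters` is hypothetical, the deterministic value is assumed to be positive, the half range is assumed to be computed from the variance with pedigree, and the variance values below are placeholders, not entries from Table 10.3 or Table 10.5 of the data quality guidelines.

```python
import math

def lognormal_parameters(deterministic_value, basic_variance, pedigree_variances):
    """Derive lognormal parameters from the three data provider inputs.

    deterministic_value: taken as the median mu* (assumed positive here)
    basic_variance: "Variance of log-transformed data"
    pedigree_variances: additional variances, one per pedigree indicator
    """
    mu = math.log(deterministic_value)
    variance_with_pedigree = basic_variance + sum(pedigree_variances)
    sigma = math.sqrt(variance_with_pedigree)
    sigma_star = math.exp(sigma)
    return {
        "mu": mu,
        "variance_with_pedigree": variance_with_pedigree,
        "sigma": sigma,
        "sigma_star": sigma_star,
        "ci_half_range": sigma_star**2,  # square of sigma*
    }

# Placeholder inputs, for illustration only (not taken from the guidelines).
params = lognormal_parameters(
    deterministic_value=2.0,
    basic_variance=0.0006,
    pedigree_variances=[0.0, 0.0001, 0.0002, 0.0, 0.0008],
)

for name, value in params.items():
    print(f"{name}: {value:.5f}")
```

The returned `mu` and `sigma` are the parameters needed to sample the exchange amount in a Monte Carlo simulation, while `ci_half_range` is the factor by which the median is divided and multiplied to obtain the approximately 95% interval.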

#### A numeric example

Corresponding ecoEditor uncertainty window

Consult the attached Excel file for a detailed example.
