Suppose you want to calculate an expression of the form

where . Such expressions can be difficult to evaluate directly since the exponentials can easily cause overflow errors. In this post, I’ll talk about a clever way to avoid such errors.

If there were no terms in the second sum we could use the log-sum-exp trick. That is, to calculate

we set and use the identity

Since for all , the left hand side of the above equation can be computed without the risk of overflow. To calculate,

we can use the above method to separately calculate

and

The final result we want is

Since $A > B$, the right hand side of the above expression can be evaluated safely and we will have our final answer.

### R code

The R code below defines a function that performs the above procedure

```
# Safely compute log(sum(exp(pos)) - sum(exp(neg)))
# The default value for neg is an empty vector.
logSumExp <- function(pos, neg = c()){
max_pos <- max(pos)
A <- max_pos + log(sum(exp(pos - max_pos)))
# If neg is empty, the calculation is done
if (length(neg) == 0){
return(A)
}
# If neg is non-empty, calculate B.
max_neg <- max(neg)
B <- max_neg + log(sum(exp(neg - max_neg)))
# Check that A is bigger than B
if (A <= B) {
stop("sum(exp(pos)) must be larger than sum(exp(neg))")
}
# log1p() is a built in function that accurately calculates log(1+x) for |x| << 1
return(A + log1p(-exp(B - A)))
}
```

### An example

The above procedure can be used to evaulate

Evaluating this directly would quickly lead to errors since R (and most other programming languages) cannot compute . However, R has the functions `lfactorial()`

and `lchoose()`

which can compute and for large values of and . We can thus put this expression in the general form at the start of this post

The following R code thus us exactly what we want:

```
pos <- sapply(1:200, lfactorial)
neg <- sapply(1:500, function(i){2*lchoose(500, i)})
logSumExp(pos, neg)
# 863.237
```