Lasso on Vertically Partitioned Data

A naive implementation of the lasso on vertically partitioned data exposes the main difficulty that such data pose.

Consider a dataset with response $y$ where the predictor matrix $X$ is vertically partitioned among three sites:

$$X = [X_1, X_2, X_3],$$

where the combined $X$ is $n \times p$, each $X_i$ is $n \times p_i$ for $i = 1, 2, 3$, and $\sum_i p_i = p$.

We wish to fit a lasso model:

$$\min_{\beta} \; \frac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1.$$

For purposes of illustration, we generate a dataset.

set.seed(129)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), n, p)
X1 <- X[, 1:2]
X2 <- X[, 3, drop = FALSE]
X3 <- X[, 4:5]
beta_true <- c(5, 0, 0, 2, 2)
y <- X %*% beta_true + rnorm(n)

The Aggregated Fit

If the data were not vertically partitioned, the fit would be straightforward for a specified $\lambda$, say $\lambda = 2$: we would just solve the primal optimization problem directly.

suppressWarnings(suppressMessages(library(CVXR)))

beta <- Variable(p)
lambda <- 2
p_obj <- sum_squares(y - X %*% beta) + lambda / 2 * p_norm(beta, 1)
p_prob <- Problem(Minimize(p_obj))
p_result <- solve(p_prob)

The resulting value of the primal objective is 109.5436324 and the fitted estimate is

(beta_primal <- p_result$getValue(beta))

##              [,1]
## [1,]  4.937721027
## [2,] -0.007471594
## [3,]  0.094597544
## [4,]  2.065937851
## [5,]  2.130020709

The Lasso Dual

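The dual problem for the lasso, for that $\lambda$, is

$$\min_{u} \; \frac{1}{2}\|y - u\|_2^2 \quad \text{subject to} \quad \|X^T u\|_\infty \le \lambda.$$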

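In the above, $u$ is an $n$-vector of parameters and the constraint is

$$\|X^T u\|_\infty = \|[X_1, X_2, X_3]^T u\|_\infty \le \lambda,$$

where each $X_i$ is $n \times p_i$. It follows that

$$\|X^T u\|_\infty = \max_{i} \|X_i^T u\|_\infty.$$

Thus, if each site provides $\|X_i^T u\|_\infty$ for a given $u$, the constraint can be computed in a distributed fashion by a master performing the optimization.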
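The identity behind this step can be checked numerically. The following is a minimal base R sketch on freshly simulated data, not part of the original analysis; the `*_demo` names are hypothetical and chosen so that nothing defined earlier is overwritten. Each "site" reports a single scalar, and the master takes the maximum.

```r
## Illustrative data only; any draw works because the identity is exact
n_demo <- 100
X_demo <- matrix(rnorm(n_demo * 5), n_demo, 5)
sites_demo <- list(X_demo[, 1:2],
                   X_demo[, 3, drop = FALSE],
                   X_demo[, 4:5])
u_demo <- rnorm(n_demo)   # an arbitrary candidate dual vector

## Each site computes the infinity norm of t(X_i) %*% u from its own columns
site_norms <- vapply(sites_demo,
                     function(Xi) max(abs(crossprod(Xi, u_demo))),
                     numeric(1))

## The master only needs the three scalars
distributed_norm <- max(site_norms)

## Reference value computed with all columns pooled
pooled_norm <- max(abs(crossprod(X_demo, u_demo)))

stopifnot(isTRUE(all.equal(distributed_norm, pooled_norm)))
```

Note that every site still needs the current $u$, so the master must broadcast it at each evaluation of the constraint.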

So the dual is solvable in a distributed fashion.

Indeed, we can compute this to check.

u <- Variable(n)
d_obj <- 0.5 * sum_squares(y - u)
## Aggregated constraint, for reference: p_norm(t(X) %*% u, Inf) <= lambda
## Below, the same constraint is written site by site.
d_constraint <- list(
    max(p_norm(t(X1) %*% u, Inf),
        p_norm(t(X2) %*% u, Inf),
        p_norm(t(X3) %*% u, Inf)) <= lambda
)
    
d_prob <- Problem(Minimize(d_obj), d_constraint)
d_result <- solve(d_prob)
uVal <- d_result$getValue(u)
## Print a few values out of the 100
head(uVal)

##             [,1]
## [1,] -1.05733012
## [2,] -0.32031376
## [3,]  0.57115415
## [4,] -1.28355101
## [5,] -1.97281480
## [6,]  0.09022386

So far, so good.
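The dual stated above can be traced to a standard Lagrangian argument. What follows is a sketch, assuming the primal is scaled as $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$; the auxiliary variable $z$ is introduced here only for the derivation. Write the primal as

$$\min_{\beta, z} \; \frac{1}{2}\|y - z\|_2^2 + \lambda\|\beta\|_1 \quad \text{subject to} \quad z = X\beta,$$

and attach a multiplier $u \in \mathbb{R}^n$ to the constraint:

$$L(\beta, z, u) = \frac{1}{2}\|y - z\|_2^2 + \lambda\|\beta\|_1 + u^T(z - X\beta).$$

Minimizing over $z$ gives $z = y - u$, with value $\frac{1}{2}\|y\|_2^2 - \frac{1}{2}\|y - u\|_2^2$. Minimizing $\lambda\|\beta\|_1 - (X^T u)^T\beta$ over $\beta$ gives $0$ when $\|X^T u\|_\infty \le \lambda$ and $-\infty$ otherwise. Maximizing the resulting dual function over $u$ is therefore equivalent to

$$\min_{u} \; \frac{1}{2}\|y - u\|_2^2 \quad \text{subject to} \quad \|X^T u\|_\infty \le \lambda,$$

and at the optimum the relation $z = y - u$, combined with $z = X\beta$, ties the primal and dual solutions together.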

The Catch

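Now that we have solved the dual problem, we need to recover the solution to the original primal problem. The correspondence between the primal solution $\hat{\beta}$ and the dual solution $\hat{u}$ is:

$$X\hat{\beta} = y - \hat{u}.$$

The solution is, of course,

$$\hat{\beta} = (X^T X)^{-1} X^T (y - \hat{u}).$$

This is where we get killed, because the matrix $X^T X$ involves cross-site terms like $X_1^T X_2$:

$$X^T X = \begin{bmatrix} X_1^T X_1 & X_1^T X_2 & X_1^T X_3 \\ X_2^T X_1 & X_2^T X_2 & X_2^T X_3 \\ X_3^T X_1 & X_3^T X_2 & X_3^T X_3 \end{bmatrix}.$$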
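To make the obstacle concrete, the sketch below assembles the Gram matrix block by block in base R. It uses freshly simulated data and hypothetical `*_demo` and `G*` names, so it does not touch the objects defined earlier. The diagonal blocks need only one site's columns; each off-diagonal block needs raw columns from two different sites, which is exactly what a vertical partition rules out.

```r
## Illustrative data only; not the dataset used in the analysis
X_demo  <- matrix(rnorm(100 * 5), 100, 5)
X1_demo <- X_demo[, 1:2]
X2_demo <- X_demo[, 3, drop = FALSE]
X3_demo <- X_demo[, 4:5]

## Diagonal blocks: each site can form these on its own
G11 <- crossprod(X1_demo)
G22 <- crossprod(X2_demo)
G33 <- crossprod(X3_demo)

## Off-diagonal blocks: each requires columns held by two different sites
G12 <- crossprod(X1_demo, X2_demo)
G13 <- crossprod(X1_demo, X3_demo)
G23 <- crossprod(X2_demo, X3_demo)

## Assemble the full 5 x 5 Gram matrix from its blocks
G_blocks <- rbind(
    cbind(G11,    G12,    G13),
    cbind(t(G12), G22,    G23),
    cbind(t(G13), t(G23), G33)
)

stopifnot(isTRUE(all.equal(G_blocks, crossprod(X_demo))))
```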

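Setting that obstacle aside and forming $X^T X$ with all the data in one place, we can check that the estimate recovered from the dual closely matches the primal fit. The two columns below agree to within about 0.02 rather than exactly. This is consistent with the primal code above adding `lambda / 2 * p_norm(beta, 1)` to an unhalved sum of squares, which corresponds to a penalty of $\lambda/4$ in the $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$ scaling, whereas the dual constraint uses $\lambda$ itself.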

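## Forming t(X) %*% X needs every site's columns in one place
XtX <- t(X) %*% X
beta_dual <- solve(XtX) %*% t(X) %*% (y - uVal)
## Column 1: recovered from the dual; column 2: direct primal fit
cbind(beta_dual, beta_primal)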

##               [,1]         [,2]
## [1,]  4.919405e+00  4.937721027
## [2,] -9.091532e-06 -0.007471594
## [3,]  7.857126e-02  0.094597544
## [4,]  2.047664e+00  2.065937851
## [5,]  2.110122e+00  2.130020709
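The link between primal and dual solutions can also be checked without a convex solver. The sketch below is not the method used above: it swaps in a plain coordinate-descent lasso for the objective $\frac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$, on freshly simulated data, with hypothetical names (`soft_threshold`, `lasso_cd`, and the `*_cd` objects). If the correspondence holds, the residual $u = y - X\hat{\beta}$ should satisfy the dual constraint.

```r
## Soft-thresholding operator used by coordinate descent
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

## Minimal coordinate descent for 0.5 * ||y - X b||^2 + lambda * ||b||_1
lasso_cd <- function(X, y, lambda, n_iter = 500) {
    b <- numeric(ncol(X))
    col_ss <- colSums(X^2)
    r <- as.vector(y - X %*% b)                 # current residual
    for (it in seq_len(n_iter)) {
        for (j in seq_along(b)) {
            ## inner product of column j with the partial residual
            rho <- sum(X[, j] * r) + col_ss[j] * b[j]
            b_new <- soft_threshold(rho, lambda) / col_ss[j]
            r <- r - X[, j] * (b_new - b[j])    # keep the residual in sync
            b[j] <- b_new
        }
    }
    b
}

## Illustrative data only; same shape as the example but a fresh draw
X_cd <- matrix(rnorm(100 * 5), 100, 5)
y_cd <- as.vector(X_cd %*% c(5, 0, 0, 2, 2) + rnorm(100))
lambda_cd <- 2

b_cd <- lasso_cd(X_cd, y_cd, lambda_cd)
u_cd <- y_cd - as.vector(X_cd %*% b_cd)         # candidate dual solution

## Dual feasibility: the infinity norm should not exceed lambda (up to tolerance)
dual_norm_cd <- max(abs(crossprod(X_cd, u_cd)))
stopifnot(dual_norm_cd <= lambda_cd + 1e-6)
```

The fixed iteration count is a simplification; a convergence test on the change in `b` would be the more careful choice.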
