Tag Archives: vector

Outersection of two or more vectors

To get the elements that two or more vectors do not have in common, that is, the values that occur exactly once across all of them:

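outersect <- function(x, y, ...) {
    # Concatenate all the input vectors into one
    big.vec <- c(x, y, ...)
    # Values that occur more than once in the concatenation
    duplicates <- big.vec[duplicated(big.vec)]
    # Keep only the values that occur exactly once
    return(setdiff(big.vec, unique(duplicates)))
}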

> v1 <- c(1, 2, 3)
> v2 <- c(2, 3, 4)
> outersect(v1, v2)
[1] 1 4
> v3 <- c(0, 1, 4, 5)
> outersect(v1, v2, v3)
[1] 0 5
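
One caveat worth noting: outersect counts occurrences in the concatenated vector, so a value repeated inside a single input is dropped even when no other vector contains it. The sketch below shows this behaviour together with one possible variant, here called outersect.dedup (a hypothetical name, not part of the original function), which removes repeats inside each input first.

```r
outersect <- function(x, y, ...) {
    big.vec <- c(x, y, ...)
    duplicates <- big.vec[duplicated(big.vec)]
    return(setdiff(big.vec, unique(duplicates)))
}

# Hypothetical variant: deduplicate each input before comparing them
outersect.dedup <- function(x, y, ...) {
    big.vec <- unlist(lapply(list(x, y, ...), unique))
    duplicates <- big.vec[duplicated(big.vec)]
    return(setdiff(big.vec, unique(duplicates)))
}

w1 <- c(1, 1, 2)
w2 <- c(2, 3)
res.plain <- outersect(w1, w2)        # 3: the value 1 is dropped, it is repeated inside w1
res.dedup <- outersect.dedup(w1, w2)  # 1 3: the value 1 is kept, it appears only in w1
print(res.plain)
print(res.dedup)
```

Which behaviour is preferable depends on whether the inputs are meant to be treated as sets or as plain vectors.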

Outersection of a list of vectors

To get the elements that the vectors in a list do not have in common, that is, the values that occur exactly once across the whole list:

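outersect.list <- function(list.vec) {
    # Flatten the list into a single vector
    big.vec <- unlist(list.vec)
    # Values that occur more than once in the flattened vector
    duplicates <- big.vec[duplicated(big.vec)]
    # Keep only the values that occur exactly once
    return(setdiff(big.vec, unique(duplicates)))
}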

> ll <- list()
> ll$c1 <- c(1, 2, 3)
> ll$c2 <- c(2, 3, 4)
> outersect.list(ll)
[1] 1 4
> ll$c3 <- c(3, 4, 5)
> outersect.list(ll)
[1] 1 5
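
The list version and the variadic version do the same work, so one can be expressed through the other. As a small sketch, do.call can pass the list elements to outersect as separate arguments; unname() is used here so that the elements are matched by position to x, y and ... rather than by their names c1, c2, c3.

```r
outersect <- function(x, y, ...) {
    big.vec <- c(x, y, ...)
    duplicates <- big.vec[duplicated(big.vec)]
    return(setdiff(big.vec, unique(duplicates)))
}

outersect.list <- function(list.vec) {
    big.vec <- unlist(list.vec)
    duplicates <- big.vec[duplicated(big.vec)]
    return(setdiff(big.vec, unique(duplicates)))
}

ll <- list(c1 = c(1, 2, 3), c2 = c(2, 3, 4), c3 = c(3, 4, 5))

res.list <- outersect.list(ll)              # 1 5
res.call <- do.call(outersect, unname(ll))  # 1 5, same result
print(res.list)
print(res.call)
```

Note that this only works for lists with at least two vectors, because outersect requires both x and y.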

A non-exclusive outersection keeps every element that is missing from at least one of the vectors in the list. To build it, compute the intersection of all the vectors and keep the elements that are not in that intersection:

From the post [Intersection in R] we reuse the function that intersects a list of vectors:

intersect.list <- function(list.vec) {
    return(Reduce(intersect, list.vec))
}

Now we can write the function that creates the non-exclusive outersection:

outersect.list.nx <- function(list.vec) {
    # Elements present in every vector of the list
    common <- intersect.list(list.vec)
    big.vec <- unlist(list.vec)
    # Drop the common elements, then remove repeats
    nonex <- big.vec[!(big.vec %in% common)]
    return(unique(nonex))
}
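
No example accompanies this function, so here is a short usage sketch with the same three vectors used in the earlier list examples. Only the value 3 is shared by all three vectors, so it is the only one removed; values such as 2 and 4, which appear in two vectors but not in all of them, are kept.

```r
intersect.list <- function(list.vec) {
    return(Reduce(intersect, list.vec))
}

outersect.list.nx <- function(list.vec) {
    common <- intersect.list(list.vec)
    big.vec <- unlist(list.vec)
    nonex <- big.vec[!(big.vec %in% common)]
    return(unique(nonex))
}

ll <- list(c1 = c(1, 2, 3), c2 = c(2, 3, 4), c3 = c(3, 4, 5))

res.nx <- outersect.list.nx(ll)  # 1 2 4 5
print(res.nx)
```

For comparison, the exclusive outersect.list on the same list returns only 1 and 5.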

Intersection of two vectors

To get the common elements between two vectors we can use the built-in function intersect:

> v1 <- c(1, 2, 3)
> v2 <- c(2, 3, 4)
> intersect(v1, v2)
[1] 2 3
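
intersect treats its arguments as sets: repeated values are returned only once, and the result follows the order of the first argument. The sketch below illustrates this and shows that, for plain vectors like these, the same result can be written with %in% and unique.

```r
x <- c(3, 2, 2, 1)
y <- c(1, 2, 3, 4)

res.intersect <- intersect(x, y)     # 3 2 1: no repeats, order of x
res.manual <- unique(x[x %in% y])    # 3 2 1: same idea written by hand
print(res.intersect)
print(res.manual)
```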

Intersection of a list of vectors

Here we want the elements that are common to all the vectors in the list.

intersect.list <- function(list.vec) {
    return(Reduce(intersect, list.vec))
}

> ll <- list()
> ll$c1 <- c(1, 2, 3)
> ll$c2 <- c(2, 3, 4)
> intersect.list(ll)
[1] 2 3
> ll$c3 <- c(3, 4, 5)
> intersect.list(ll)
[1] 3
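
To see what Reduce is doing here, it may help to write the calls out by hand: the first two vectors are intersected, and that result is intersected with the next vector, and so on. The sketch below compares both forms and uses accumulate = TRUE to expose the intermediate results.

```r
ll <- list(c1 = c(1, 2, 3), c2 = c(2, 3, 4), c3 = c(3, 4, 5))

res.reduce <- Reduce(intersect, ll)
res.manual <- intersect(intersect(ll$c1, ll$c2), ll$c3)
print(res.reduce)  # 3
print(res.manual)  # 3

# Intermediate results: c1, then c1 with c2, then that with c3
steps <- Reduce(intersect, ll, accumulate = TRUE)
print(steps[[2]])  # 2 3
print(steps[[3]])  # 3
```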

Get unique elements of a vector

> v = c("a", "a", "b", "c", "c")
> unique(v)
[1] "a" "b" "c"

Get duplicated elements of a vector

> duplicated(v)
[1] FALSE  TRUE FALSE FALSE  TRUE
> v[duplicated(v)]
[1] "a" "c"
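
duplicated marks only the second and later occurrences of a value, so the first copy of "a" and of "c" is not flagged above. If every copy of a repeated value is wanted, one common approach is to combine a forward scan with a backward scan using the fromLast argument, as in this sketch:

```r
v <- c("a", "a", "b", "c", "c")

# TRUE for every element whose value occurs more than once
repeated <- duplicated(v) | duplicated(v, fromLast = TRUE)

all.copies <- v[repeated]    # "a" "a" "c" "c"
only.once <- v[!repeated]    # "b"
print(all.copies)
print(only.once)
```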

Get the non duplicated elements of a vector

> v[!duplicated(v)]
[1] "a" "b" "c"

Get duplicated elements of a dataframe

> df <- data.frame(c1 = c(rep("A", 3), rep("B", 3), rep("C", 2)), c2 = 1:8)
> df[duplicated(df$c1),]
  c1 c2
2  A  2
3  A  3
5  B  5
6  B  6
8  C  8
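
The example above looks for repeats in a single column. duplicated also accepts a whole data frame, in which case it compares complete rows. The sketch below uses a small data frame, df2, made up for illustration, in which the second row repeats the first.

```r
# df2 is an illustrative data frame, not the df used above
df2 <- data.frame(c1 = c("A", "A", "B"), c2 = c(1, 1, 2))

row.flags <- duplicated(df2)   # FALSE TRUE FALSE
dup.rows <- df2[row.flags, ]   # only row 2, the repeated row
print(row.flags)
print(dup.rows)
```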

Get the non duplicated elements of a dataframe

This keeps the first occurrence of each value of c1 in the dataframe:

> df[!duplicated(df$c1),]
  c1 c2
1  A  1
4  B  4
7  C  7
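
If the last occurrence of each value is wanted instead of the first, the fromLast argument of duplicated reverses the direction of the scan. A minimal sketch with the same data frame:

```r
df <- data.frame(c1 = c(rep("A", 3), rep("B", 3), rep("C", 2)), c2 = 1:8)

# Keep the last row for each value of c1
last.rows <- df[!duplicated(df$c1, fromLast = TRUE), ]
print(last.rows)  # rows 3, 6 and 8
```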