Last week I worked with lots of Microsoft Excel files that I needed in my R scripts, and that is how I discovered the excellent and complete package XLConnect. However, for quick and easy work, it is not the most convenient solution.

So I coded a wrapper around XLConnect, called loadxls, that makes it easy to read and write Excel files.

loadxls

The R package loadxls can be installed from its github repository with:

devtools::install_github("carleshf/loadxls")

It implements only 4 functions: read_all, read_sheet, write_all and write_sheet.

Functions

read_all

read_all(filename, environment = parent.frame(), verbose = TRUE)

This function reads a given Excel file and loads each sheet as a data.frame into the given environment (by default, the calling one). Each created object takes the name of its sheet.

read_sheet

read_sheet(filename, sheetname, varname, environment = parent.frame(), verbose = TRUE)

Instead of the full content of a file, this function loads only a given sheet. If the argument varname is supplied, the object loaded from the sheet takes that name.

write_all

write_all(..., filename, verbose = TRUE)

This function writes all the objects given through ... to a new Excel file, saving each object as a new sheet.

write_sheet

write_sheet(data, sheetname, filename, replace = FALSE, verbose = TRUE)

This function saves a single object to an Excel file, under the given sheet name.

I had a file containing ~1M SNPs identified by their rsid (and their position). I needed to complete this information with the chromosome of each SNP.

Since the list is really large, I read it line by line with scan, combined with a bash command to get the number of lines in the file.

I found this solution:

library("SNPlocs.Hsapiens.dbSNP.20120608")

## Create connection to big file
inputName <- "input_file.gen"
outputName <- "output_file.gen"
inputCon <- file(description = inputName, open = "r")
outputCon <- file(description = outputName, open = "w")

## We need to know the number of lines of the big file
## This will work for GNU/Linux environments
command <- paste("wc -l ", inputName, " | awk '{ print $1 }'", sep="")
nLines <- as.numeric(system(command = command, intern = TRUE))
rm(command)

## Loop over the file connection until end of lines
pb <- txtProgressBar(min = 0, max = nLines, style = 3)
for(ii in 1:nLines) {
  readLine <- scan(file = inputCon, nlines = 1, what = "list", quiet = TRUE)

  ## Get the chr from the SNP
  x <- tryCatch({
    x <- rsidsToGRanges(readLine[[2]])
    as.character(x@seqnames[1])
  }, error = function(e) {
    "---"
  })

  ## Update on list and save on output
  readLine[[1]] <- x
  writeLines(paste(readLine, collapse = " "), outputCon)

  setTxtProgressBar(pb, ii)
}

close(inputCon)
close(outputCon)

This solution is really slow but it gets the SNP’s chromosome and fills it as "---" when the SNP is not found in dbSNP.

WARNING: The SNP’s rsid must be located in the second column of the file. Take a look at readLine[[2]] in the for loop.


I am open to hearing about other ways of doing this.

Using bash and ImageMagick we can crop all the pictures in a folder in a single shot:

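#!/bin/bash
# Usage: ./crop.sh <geometry> <folder>
fix='crop'
it=1
# Quote "$2" so that folders with spaces in their name also work
for file in "$2"/*;
do
    # Skip anything that is not a regular file (e.g. sub-folders)
    if [[ -f "$file" ]];
    then
        output="$2/${fix}${it}.png"
        echo "$file --> $output"
        convert -crop "$1" "$file" "$output"
        it=$((it+1))
    fi
done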

The script, called crop.sh, is run as:

./crop.sh 525x240+675+150 ~/Pictures/

The first argument is the geometry passed to ImageMagick’s crop tool and the second argument is the folder containing the pictures to be cropped.

Today I started my first kata at CodeWars as a sensei. The kata is based on a mobile game called Strata (Android, iOS).

The kata in CodeWars

The following sections correspond to the kata I wrote for CodeWars. I am now working on the solution. When everything required is finished I will write another post.

Introduction

Cross-stitch is a popular form of counted-thread embroidery in which X-shaped stitches in a tiled, raster-like pattern are used to form a picture. Cross-stitch is the oldest form of embroidery and can be found all over the world. Many folk museums show examples of clothing decorated with cross-stitch, especially from continental Europe and Asia.

This kata tries to emulate a certain kind of cross-stitch embroidery. We start from a square canvas where certain tiles are colored: a pattern. The pattern is composed of colored tiles and empty tiles (gray in the following example). We need to embroider the canvas and cover each tile with a vertical strand and a horizontal strand. The tile will be colored with the color of the last strand that covers it. The empty tiles can be embroidered with any color.

Take the next picture as reference:

2x2 Canvas

We start with a 2×2 canvas with a total of 4 tiles (1). We can embroider the first row with a blue strand (2). We can do the same for the second row with a purple strand (3). Embroidering the first column with a blue strand (4) we cover the blue tile in the first row with two strands. Since the strand on the top is blue, this tile is colored as blue. This makes the first tile be colored with the requested color.

Then, if we embroider the second column with a blue strand (5a) we fill the last tile (second row and second column) as blue, but the pattern tells us that it must be colored as purple, so we do not match the pattern. Hence, the strand on the second column needs to be purple (5b).
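
The walkthrough above can be sketched in a few lines of Ruby. This is only an illustration of the "last strand wins" rule, not part of the kata: the function name embroider and the [type, index, color] step format are hypothetical choices made for this sketch.

```ruby
# Paint a size x size canvas strand by strand.
# A later strand covers an earlier one, so the last assignment wins.
# `embroider` and the [type, index, color] format are hypothetical names.
def embroider(size, steps)
  grid = Array.new(size) { Array.new(size) } # nil = tile not covered yet
  steps.each do |type, index, color|
    size.times do |i|
      if type == 'r'
        grid[index][i] = color # a row strand covers every column of that row
      else
        grid[i][index] = color # a column strand covers every row of that column
      end
    end
  end
  grid
end

# Steps (2), (3), (4) and (5b) of the walkthrough
good = embroider(2, [['r', 0, 'b'], ['r', 1, 'p'], ['c', 0, 'b'], ['c', 1, 'p']])
# Same steps, but with the blue strand of (5a) on the second column
bad = embroider(2, [['r', 0, 'b'], ['r', 1, 'p'], ['c', 0, 'b'], ['c', 1, 'b']])

puts good.inspect
puts bad.inspect
```

With the purple strand the last tile ends up purple, while with the blue strand of (5a) it ends up blue and the pattern is not matched.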

Request

What is requested in this kata is to return a solution for a given patterned canvas. Hence, you are requested to implement the function find_solution, which takes two arguments:

  • canvas: This is an array of arrays representing the cross-stitch canvas (an array of rows).
  • colors: This is an array of the colors of the pattern.

Note that the canvas will always be square. An example of canvas and colors is:

canvas = [['b', ' '], [' ', 'p']]
colors = ['b', 'p']

It must be understood as follows: the canvas has two rows and two columns, and it has two colors, blue (b) and purple (p).

The function find_solution must return a valid solution to embroider the canvas and color the given pattern. This solution must follow a specific format detailed here:

  1. It must be an array of hashes
  2. Each hash can have only three keys: :type, :index and :color
  3. The :type key can be 'r' for rows and 'c' for columns (string)
  4. The :index key will be the index of the row or column to be colored (integer). It must be between 0 and canvas.size - 1
  5. The :color key will be used to fill the row or column (string)

Warning: The canvas will be colored following the array’s order

An example of a solution for a 2×2 canvas:

[{:type=>"r", :index=>0, :color=>"b"}, {:type=>"r", :index=>1, :color=>"b"}, {:type=>"c", :index=>0, :color=>"b"}, {:type=>"c", :index=>1, :color=>"b"}]
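
To make the format rules concrete, here is a minimal sketch of a checker. It is not the kata's official test code: the name solves? is hypothetical, and it assumes that each hash must have exactly the three keys and that a tile is satisfied when it is empty in the pattern or when the top strand has the requested color. It does not check whether every row and column has been embroidered.

```ruby
# Hypothetical checker: is `solution` well-formed and does it reproduce `canvas`?
# Assumption: ' ' marks an empty tile, as in the canvas example above.
def solves?(canvas, solution)
  size = canvas.size
  return false unless solution.is_a?(Array)

  grid = Array.new(size) { Array.new(size) } # nil = no strand on this tile yet

  solution.each do |step|
    # Rules 1 and 2: a hash with exactly the keys :type, :index and :color
    return false unless step.is_a?(Hash) && step.keys.size == 3
    return false unless %i[type index color].all? { |key| step.key?(key) }

    type, index, color = step.values_at(:type, :index, :color)
    return false unless %w[r c].include?(type)                               # rule 3
    return false unless index.is_a?(Integer) && index.between?(0, size - 1)  # rule 4
    return false unless color.is_a?(String)                                  # rule 5

    # Apply the strand in array order: the last strand on a tile wins
    size.times do |i|
      if type == 'r'
        grid[index][i] = color
      else
        grid[i][index] = color
      end
    end
  end

  # Colored tiles must show the requested color; empty tiles accept anything
  canvas.each_with_index.all? do |row, r|
    row.each_with_index.all? { |tile, c| tile == ' ' || tile == grid[r][c] }
  end
end

canvas = [['b', ' '], [' ', 'p']]
solution = [
  { type: 'r', index: 0, color: 'b' },
  { type: 'r', index: 1, color: 'p' },
  { type: 'c', index: 0, color: 'b' },
  { type: 'c', index: 1, color: 'p' }
]
puts solves?(canvas, solution)
```

A step whose :index is a string such as "0" is rejected by this sketch, since rule 4 asks for an integer.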

Today I was working on a simple kata from CodeWars that required working with combinations and permutations of the elements of an array.

The number of permutations of the n elements in a set taken in groups of k is given by:

\frac{n!}{(n-k)!}

In this case, the order within the groups matters. If order does not matter, then we are talking about combinations:

\frac{n!}{k!(n-k)!} = \binom{n}{k}
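
As a quick worked example of both formulas, take n = 4 and k = 2 (values chosen only for illustration):

```latex
\frac{4!}{(4-2)!} = \frac{24}{2} = 12
\qquad
\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = \frac{24}{2 \cdot 2} = 6
```

Each of the 6 combinations can be ordered in 2! = 2 ways, which gives back the 12 permutations.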

Ruby's Array class comes with a method (permutation) to get the permutations of its elements given a value k:

[1,2].permutation(0).to_a
=> [[]]
[1,2].permutation(1).to_a
=> [[1], [2]]
[1,2].permutation(2).to_a
=> [[1, 2], [2, 1]]
[1,2].permutation(3).to_a
=> []

In the same way, the Array class comes with a method combination to get the combinations of its elements given a value k:

[1,2].combination(0).to_a
=> [[]]
[1,2].combination(1).to_a
=> [[1], [2]]
[1,2].combination(2).to_a
=> [[1, 2]]
[1,2].combination(3).to_a
=> []

So with these two methods we can get the sets that result from permuting and combining the elements of an array, but not directly compute the number of k-permutations or k-combinations.

To do that I propose:

  1. To add the method factorial to the class Integer (Fixnum in Ruby versions before 2.4).
  2. To add the methods perm_length and comb_length to the class Array.

These two steps can be done as:

class Integer
  # n! computed as the product 1 * 2 * ... * n (0! is 1)
  def factorial
    f = 1
    (1..self).each{|ii| f *= ii }
    return f
  end
end

class Array
  # Number of k-permutations of the elements: n! / (n-k)!
  def perm_length k
    return self.length.factorial / (self.length - k).factorial
  end
  # Number of k-combinations of the elements: n! / (k! (n-k)!)
  def comb_length k
    return self.length.factorial / (k.factorial * (self.length - k).factorial)
  end
end

Something more complex would be needed to use this in a real scenario, such as a check for negative values of k. But for me it is enough :-)
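
As a minimal sketch of such a check, the counts can be written as standalone functions that return 0 when k is negative or larger than n, which agrees with permutation and combination returning an empty list in those cases. The names factorial, perm_count and comb_count are hypothetical helpers for this sketch, not methods of Ruby's core library.

```ruby
# Hypothetical standalone helpers (not part of Ruby's core library)
def factorial(n)
  raise ArgumentError, 'n must be a non-negative integer' unless n.is_a?(Integer) && n >= 0
  (1..n).reduce(1, :*) # an empty range yields the initial value, so 0! is 1
end

# Number of k-permutations of n elements; 0 when k is out of range
def perm_count(n, k)
  return 0 if k < 0 || k > n
  factorial(n) / factorial(n - k)
end

# Number of k-combinations of n elements; 0 when k is out of range
def comb_count(n, k)
  return 0 if k < 0 || k > n
  factorial(n) / (factorial(k) * factorial(n - k))
end

items = [1, 2, 3, 4]
(0..5).each do |k|
  puts "k=#{k}: #{perm_count(items.length, k)} permutations, " \
       "#{comb_count(items.length, k)} combinations"
end
```

Plain functions are used here instead of reopening Integer and Array only to keep the sketch independent of the classes above; the same guards could be placed inside perm_length and comb_length.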

I was working on a kata from CodeWars that asks to implement a series of methods that involve working with arrays. This led me to discover a feature of the Ruby language that impressed me: you can extend the functionality of an existing class! In other words, you can add methods to an existing class.

To extend the Array class with two methods that return the even and the odd numbers within the array, we can do:

class Array

  def even
    self.select{|x| x % 2 == 0}
  end

  def odd
    self.select{|x| x % 2 != 0}
  end

end

And now, any Array object will have both methods:

> [1, 2, 3, 4, 5].even
=> [2, 4]
> [1, 2, 3, 4, 5].odd
=> [1, 3, 5]
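
A related detail, shown in the sketch below, is that reopening a class also affects objects that were created before the new methods were defined. The method names evens and odds are hypothetical and chosen only for this sketch; Integer#even? and Integer#odd? are built-in predicates.

```ruby
list = [1, 2, 3, 4, 5] # created before the new methods exist

class Array
  # Hypothetical names for this sketch, built on Integer#even? and Integer#odd?
  def evens
    select(&:even?)
  end

  def odds
    select(&:odd?)
  end
end

# The already existing object responds to the methods added afterwards
puts list.evens.inspect
puts list.odds.inspect
```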