Tuesday, July 16, 2019

New paper out: An Arduino-based RFID platform for animal research

Since 2017 I've been working with Eli Bridge and J.E. Ruyle (OU Advanced Radar Research Center) to create ETAG, the Electronic Transponder Analysis Gateway.  It will be a data management system for scientists working with radio-frequency identification (RFID) technology for animal studies.  An overview of the hardware side of ETAG with a brief discussion of the software has just been published in Frontiers in Ecology and Evolution.

Tuesday, June 18, 2019

Setting up an RStudio server on Amazon Web Services

I have sometimes used Amazon Web Services to outsource very computationally intensive analyses, such as for my paper last year on species distribution modeling.  Here's a quick guide to setting it up.

I first set up an Amazon Elastic Compute Cloud (EC2) instance with the latest Ubuntu following these instructions.

Then log into the instance over SSH (AWS has very clear and helpful tutorials on this that depend on whether you are logging on from Windows or a Linux operating system) and run the commands below.  The first set adds your rstudio user (the account you will log in with from a web browser), makes its home directory, lets you set the password for that user, and then sets permissions so you can write to this directory.

All of the commands I discuss today you will input into the SSH terminal.  This means you are making changes to the remote computer (the Amazon EC2 instance, i.e., your new RStudio webserver), not your local computer.  Do not input the dollar sign - this represents the prompt that the terminal shows.

$sudo useradd rstudio
$sudo mkdir /home/rstudio
$sudo passwd rstudio
$sudo chmod -R 0777 /home/rstudio

Then update your instance.
$sudo apt-get update
$sudo apt-get upgrade

Here I use nano instead of vi to edit a new sources.list file (you have to put the .d or it won't be saved).  Other instructions I've seen use vi, but I prefer nano as it is simpler for the uninitiated like myself.
$sudo nano /etc/apt/sources.list.d/sources.list

Once you're in nano, add this line to the sources.list file.  You can replace it with your favorite CRAN mirror and whatever version of ubuntu you have.
deb https://cloud.r-project.org/bin/linux/ubuntu xenial/
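
If you would rather not type the release name by hand, the line can be built from the Ubuntu codename.  This is only a sketch: the function name cran_source_line is made up for illustration, and it assumes the mirror uses the same directory layout as the line above (newer Ubuntu releases may need a different suffix, so check the CRAN page for Ubuntu).

```shell
# Hypothetical helper (not part of the original steps): print the
# sources.list line for a given Ubuntu codename.
cran_source_line() {
  codename=$1
  printf 'deb https://cloud.r-project.org/bin/linux/ubuntu %s/\n' "$codename"
}

# On the instance, `lsb_release -cs` prints the codename (for example, xenial).
cran_source_line xenial
```

On the server itself you could call it as cran_source_line "$(lsb_release -cs)" and paste the result into nano.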

Next, add the key to your system.
$sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9

Then update again...
$sudo apt-get update
...and install the latest version of R (next code line).  If you don't do the sources.list and key steps from previous lines, the r-base version will be older than current.
$sudo apt-get install r-base

Now that R is installed, install RStudio Server using the three lines below.  Go to the RStudio website for an updated version name for the .deb file (the code below is current as of 11 April 2017 when I first drafted this post for my own reference; I recommend going to find the most up-to-date filename). 


$sudo apt-get install gdebi-core

$wget https://download2.rstudio.org/rstudio-server-1.0.143-amd64.deb
$sudo gdebi rstudio-server-1.0.143-amd64.deb
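
Since the version number appears in both the wget and gdebi lines, one way to avoid typos is to build the URL from the version.  This is a sketch under a loud assumption: it reuses the 2017 file-naming pattern from the lines above, and newer releases may be hosted under a different path, so confirm the link on the RStudio website first.  The function name rstudio_deb_url is hypothetical.

```shell
# Hypothetical helper: build the download URL for a given RStudio Server
# version, ASSUMING the 2017 naming pattern shown above still applies.
rstudio_deb_url() {
  version=$1
  printf 'https://download2.rstudio.org/rstudio-server-%s-amd64.deb\n' "$version"
}

# Prints the same URL used in the wget line above.
rstudio_deb_url 1.0.143
```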


Then you should be able to go to the IP address for your server (check in the EC2 console in AWS to get the IP address) with :8787 after it, and log in using the username and password you created at the beginning.
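
As a small illustration of the address format, here is a sketch that prints the URL to paste into the browser.  The function name and the example IP address are placeholders.  If the page does not load, one thing to check is whether the instance's security group allows inbound traffic on port 8787.

```shell
# Hypothetical helper: print the address to open in a web browser.
# 8787 is the port RStudio Server listens on by default.
rstudio_url() {
  ip=$1
  printf 'http://%s:8787\n' "$ip"
}

# 203.0.113.10 is a placeholder documentation address, not a real server.
rstudio_url 203.0.113.10
```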








Part of the reason I wanted to use an AWS server is that I had a huge dataset that didn't fit on my laptop.  Now that you can log in to RStudio, attach an Elastic Block Store (EBS) volume for storing your files.  You can think of it like an external hard drive for your Amazon web server.

These instructions are adapted from the AWS tutorial and assume you are using an EBS volume that was created with the instance and is currently empty.
$lsblk
$sudo file -s /dev/xvdb
The answer was "data", which means the volume is empty and has no file system.
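
Because the next step (mkfs) erases whatever is on the volume, it can be worth checking that answer in a script rather than by eye.  A minimal sketch, assuming the output of file -s looks like "/dev/xvdb: data" for an empty volume; the function name device_is_blank is hypothetical.

```shell
# Hypothetical helper: succeed only if the text printed by `file -s`
# reports plain "data", i.e. no file system was detected.
device_is_blank() {
  case "$1" in
    *": data") return 0 ;;
    *) return 1 ;;
  esac
}

# Example with the answer from above; on the server you would pass
# "$(sudo file -s /dev/xvdb)" instead of a literal string.
if device_is_blank "/dev/xvdb: data"; then
  echo "No file system found; safe to format."
fi
```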

$sudo mkfs -t ext4 /dev/xvdb
$sudo mkdir /data
$sudo mount /dev/xvdb /data

$sudo chmod -R 0777 /data
$sudo chmod -R 0777 /data/*
These two lines add write permission for the folder and files.

I rebooted my instance and was terrified to not find the data drive.  However, that was because the drive has to be mounted (the mount step above) after each reboot.  You can change the fstab file to make it mount permanently.  To do this, search for the UUID of your EBS volume after it has been mounted.
$sudo blkid


You should get an answer back something like this:
/dev/xvdb: UUID="ab82e239-b284-4527-922f-b82b6b9ebc8c" TYPE="ext4"

$sudo cp /etc/fstab /etc/fstab.orig
$sudo nano /etc/fstab

Updated to note: Update the fstab file to include the UUID.  (I have forgotten exactly where, but this askubuntu.com answer says how.)
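
For reference, an fstab entry has six fields: device, mount point, file system type, options, dump, and pass.  The sketch below builds such a line from the blkid answer shown above.  The function name fstab_entry is hypothetical, and the options "defaults,nofail 0 2" are an assumption based on common AWS guidance, not something taken from this post, so check them against the AWS documentation before saving the file.

```shell
# Hypothetical helper: turn one line of blkid output plus a mount point
# into an /etc/fstab entry. The options are an ASSUMPTION (see above).
fstab_entry() {
  blkid_line=$1
  mount_point=$2
  # Pull the values out of UUID="..." and TYPE="..." in the blkid line.
  uuid=$(printf '%s\n' "$blkid_line" | sed -n 's/.* UUID="\([^"]*\)".*/\1/p')
  fstype=$(printf '%s\n' "$blkid_line" | sed -n 's/.* TYPE="\([^"]*\)".*/\1/p')
  # Refuse to print a line if either value is missing.
  [ -n "$uuid" ] && [ -n "$fstype" ] || return 1
  printf 'UUID=%s %s %s defaults,nofail 0 2\n' "$uuid" "$mount_point" "$fstype"
}

# Example using the blkid answer from above and the /data mount point.
fstab_entry '/dev/xvdb: UUID="ab82e239-b284-4527-922f-b82b6b9ebc8c" TYPE="ext4"' /data
```

The printed line would go at the end of /etc/fstab.  Running sudo mount -a afterwards is a way to check the file for mistakes before you reboot.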


The EBS volume should now be mounted permanently.

To mount a volume created from scratch (not at the time of the instance creation), follow these steps and mount the volume as before.  If the volume size needs increasing later on (i.e., you didn't predict the size of storage you needed), follow these instructions and then, if you are using a Linux system like Ubuntu, these.

You should now be able to shut down or reboot your RStudio server and still have the EBS volume mounted.

Tuesday, June 04, 2019

New paper out: Complex spatiotemporal variation in processes shaping song variation

I am proud to announce the fourth paper out from my dissertation.  This one is now out as online early in Behaviour.  This paper covers variation in titmouse song between the younger and older hybrid zones based on song recordings and vegetation measurements I took in both Texas and Oklahoma.  To quote from the abstract:

In the recent zone, noise and vegetation structure were correlated with several song characteristics, but in the older zone, these features did not correlate despite similar gradients in song features. Our data, combined with previous studies, suggest that despite overall similarities in characteristics, songs in the older zone may be more shaped by sexual selection, whereas songs in the young zone are shaped by environment. Thus, even within the same species, processes shaping signal structure can vary spatially and temporally.

Tuesday, May 21, 2019

Using Git and Git Bash on a network drive

I've run into two problems lately with using Git with a network drive.

First, I use RStudio with R projects that contain a Git repository.  Some of these projects are stored on a network drive.  This is a shared drive that has a sort of odd file path (as described in this question).  As I wasn't sure how to map a network path (a "UNC path") to a folder (the suggested solution), I poked around to see what else might get it to work.

I ended up running RStudio as administrator.  This allowed me to set the project as having Git (before, when running as non-administrator, this gave a useless error message saying "function error").  Once I set the options so that the project had version control, RStudio created a new .Rproj file, restarted itself, and then the Git tab appeared in RStudio.  When I opened the .Rproj file again I did not need to run as administrator.

Second, I wanted to run Git in a folder located on a network drive by right-clicking in the repository folder and choosing "Git Bash".  This brings up Git Bash, but it says "CMD.EXE was started with the above path as the current directory.  UNC paths are not supported.  Defaulting to Windows directory."  I found that I could cd there, but I had to copy the file path from the Windows file explorer window and then change all of the Windows default back slashes (\) to forward slashes (/).
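
The slash swap can be done inside Git Bash itself rather than by hand.  A minimal sketch: the function name unc_to_forward_slashes and the server, share, and folder names are placeholders.  Note the single quotes around the pasted path, which keep the shell from treating the back slashes as escape characters.

```shell
# Hypothetical helper: replace every back slash in a pasted Windows path
# with a forward slash so the path can be used with cd in Git Bash.
unc_to_forward_slashes() {
  printf '%s\n' "$1" | tr '\\' '/'
}

# Placeholder path; prints //server/share/project
unc_to_forward_slashes '\\server\share\project'

# Usage in Git Bash (commented out because the share above does not exist):
# cd "$(unc_to_forward_slashes '\\server\share\project')"
```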

Tuesday, January 08, 2019

New paper out: Varying dataset resolution alters predictive accuracy of spatially explicit ensemble models for avian species distribution

I'm pleased to announce that the first paper from my postdoc with Eli Bridge has been published (open access!) in Ecology and Evolution.  We compared spatially explicit species distribution models to those without spatial averaging and found that accuracy varied by species and data resolution.  Code and data are available on datadryad.org.  I'd love to hear from anyone who tries out the code on their own studies, as it seems this area needs more work to determine better practices for modeling.

As a librarian here at OU now, I would like to highlight that the OU Libraries' Open Access Subvention Fund helped us publish this by covering the article processing charges (aka page charges).  If you're at OU, do check out the fund eligibility, and if you're not at OU, please check your library to see if they do the same.