Databases Assignment 1

  1. ​a. Create a new MS Access database named assignment2_yourname

    ​b. Create a table named SurveysByHand with the following fields: SampleID, Plot, Year, Species, Mass, HindFoot, Tag. Choose the types appropriately for the following data. Make ID the primary key.

    Table 1. Five data points from the Portal LTREB long-term rodent sampling database

    | ID    | plot | year | species | weight | hindfoot |    tag |
    | ----- | ---- | ---- | ------- | ------ | -------- | ------ |
    | 21012 |    4 | 1993 |      DM |     42 |       36 | 1EA0F9 |
    | 22012 |    4 | 1995 |      DM |     31 |       37 | 0D373C |
    | 23012 |   17 | 1996 |      DM |     25 |       37 | 64C6CC |
    | 24012 |   21 | 1996 |      PP |     26 |       22 | 1F511A |
    | 25012 |   22 | 1997 |      DM |     53 |       35 |   2624 |
    

    ​c. Take a minute and think about why we shouldn’t use tag as the primary key. When considering this question you need to know a simple fact about mammal trapping. Sometimes individuals escape at some point during processing (they may be small, but boy can they bite!) before all information can be collected or before the individual can be tagged.

    ​d. Enter the data above into your table

    ​e. Save and close the table. Congratulations - you’ve created your first relational database!

  2. Import Data in MS Access

    Hand entering data is great if you’re collecting your own data and need to enter it yourself, but it’s a pretty terrible way to use already available data, especially if it’s more than a few dozen lines long. This example will walk you through how to get data that already exists into Access.

    ​a. Download the main table for the full Portal LTREB mammal survey database from Ecological Archives (it’s kind of large so it might take a few seconds). This database is published as a Data Paper on Ecological Archives (http://esapubs.org/archive/default.htm), which is generally a great place to look for ecology data.

    b.Create a new database by selecting a file name and location and clicking Create

    ​c. Click on the External Data tab in Access

    ​d. Select Text File since this is a text file, you’ll notice that you can also import from Excel and from other Access databases

    ​e. Click on Browse and navigate to where you saved the data file and select it

    ​f. Make sure that Import the source data into a new table in the current database is selected and click OK

    ​g. Since the data are comma delimited, click Next

    ​h. Make sure the Comma is selected as the delimiter, check the First Row Contains Field Names option, and click Next

    ​i. Check to make sure that the types for each field are reasonable. Access only checks this first few rows of data to determine these types, so if type of data in a column changes further down this can can cause import errors. If it was me I’d take are careful look hft and wgt (which are weights and hindfoot lenths and should therefore be numbers) and at tag, either by opening the file in Excel or by checking the portion of the table from the Databases 1 problem to see what might be an issue. When you’re finished click Next

    ​j. Select Choose my own primary key, choose recordID, and click Next

    ​k. Name the table that you are importing into Surveys and click Finish

  3. For this and many of the following problems you will create queries that retrieve the relevant information from the Portal small mammal survey data that you imported in the Databases 2 problem. You may need to know more about the database than you currently do in order to answer the questions. For example, you may need to know what species is associated with the two character species ID or you may need to know the units for the individual’s weight. This type of information associated with data is called metadata and the metadata for this dataset is available online at Ecological Archives.

    1. Open the table with the Portal Survey Data that you created in the Databases 2 problem. Click the button at the bottom that looks like >| to jump to the end of the table. How many records are there? If you open the raw data file in Excel how many lines are there? Do they match (keeping in mind that one row in the raw data file is the headers)? If not then there was an error in your import. You should always do some basic santity checks like this one when working with computers to make sure that they are doing what you think they are doing.
    2. Write a query that displays all of the records for all of the fields in the table. Save it as All Survey Data.
    3. We want to generate data for an analysis of body size differences between males and females of each species. We have decided that we can ignore the information related to when and where the individuals were trapped. Create a query that returns all of the necessary information, but nothing else. Save this as Size Differences Among Sexes Data.
  4. A population biologist (Dr. Undómiel) who studies the population dynamics of Dipodomys spectabilis would like to use some data from Portal, but she doesn’t know how to work with large datasets. Being the kind and benevolent person that you are, write a query to extract the data that she needs. She wants only the data for her species of interest, when each individual was trapped, and what sex it was. She doesn’t care about where it was trapped within the site because she is going to analyze the entire site as a whole and she doesn’t care about the size of the individuals. Obviously she doesn’t need the species codes because you’re only providing her with the data for one species, and besides since she isn’t actually looking at the database itself the two character abbreviation would probably be confusing. Save this query as Spectabilis Population Data.

    Scroll through the results of your query. Do you notice anything that might be an issue for the scientist to whom you are providing this data? [you should] Think about what you should do in this situtation…

    You decide that to avoid invoking her wrath, you’ll send her a short e-mail* requesting clarification regarding what she would like you to do regarding this complexity. Dr. Undómiel e-mails you back and asks that you create two additional queries so that she can decided what to do about this issue later. She would like you to add a query to the same data as above, but only for cases where the sex is known to be male, and an additional query with the same data, but only where the sex is known to be female. Save these as Spectabilis Population Data Males and Spectabilis Population Data Females.

    *Short for elven-mail

  5. The graduate students that work at the Portal site are hanging out late one evening drinking… soda pop… and they decide it would be an epically awesome idea to put together a list of the 100 largest rodents ever sampled at the site. Since you’re the resident “computer genius” they text you, and since you’re up late working and this sounds like a lot more fun than the homework you’re working on (which isn’t really saying much, if you know what I’m saying) you decide you’ll make the list for them.

    The rules that the Portal students have come up with (and they did spend a sort of disturbingly long time coming up with these rules; I guess you just had to be there) are:

    1. Data needed include the species ID, year of capture, and the mass. These columns should be presented in this order.
    2. Individuals should be sorted in descending order with respect to mass.
    3. In the event of ties (individuals with the same mass) individuals should be sorted first by hind foot length (descending) and then by the year of capture (ascending).

    Start by writing a query that returns the relevant fields sorted in the appropriate manner. Then, to limit the number of value to the top 100:

    1. Open the query in Design View
    2. Open the Property Sheet
    3. Set Top Values to 100

    Save the final query as 100 Largest Individuals.

  6. Write a query that returns a list of the years that are present in the Portal dataset (with no duplicates). Save it as Years Sampled.

  7. Write a query that returns a list of dates on which individuals of the species Dipodomys spectabilis (indicated by the DS species code) were trapped (with no duplicates). Sort the list in chronological order (from oldest to newest). Save it as Dates With Dipodomys Spectabilis.

  8. Write a query that returns the number of individuals of all species combined captured in each year, sorted chronologically. Include the year in the output. Save it as Total Abundance By Year.

  9. Write a query that returns the number of individuals of each species captured in each year and the total biomass of those individuals (the sum of the wgt column). Include the year and species ID in the output. Sort the result chronologically by year and then alphabetically by species. Save as Mass-Abundance Data.