Assignment 2 for Intro

Write a series of print statements that produce the following output (include a blank line between each answer):

a. Post hoc ergo propter hoc.

b. What's up with scientists using all of this snooty latin?

c. atgcatgcatgcatgcatgcatgcatgcatgcatgcatgcatgcatgcatgcatgcatgc
[Note: this is 'atgc' repeated 15 times, so there is a much easier way to do this than typing out the entire string.]

d. Darwin's "On the origin of species" is a seminal work in biology.

e. Active learning must be a pretty sweet gig for professors.
I mean, seriously, all this guy does is sit there and answer questions.
What do I need to do to get a job like that?
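The following is a minimal sketch of the printing techniques these answers rely on, assuming Python 3 (in Python 2, print is a statement rather than a function). The strings and variable names are made up for illustration and are deliberately not the assignment answers.

```python
# Illustrative strings only; these are not the assignment answers.
motif = "ab" * 3                     # repetition: builds "ababab" without retyping
quoted = 'She said "it\'s fine".'    # double quotes inside single quotes, escaped apostrophe
multi = "line one\nline two"         # \n starts a new line within one string

print(motif)
print()          # an empty print() call produces a blank line
print(quoted)
print()
print(multi)
```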

Use string methods to print the following. Leave a blank line between each answer. Start with the string provided and then use a method to perform the task described.

  • 'species' in all capital letters
  • 'gcagtctgaggattccaccttctacctgggagagaggacatactatatcgcagcagtggaggtggaatgg' with all of the occurrences of 'a' replaced with 'A'
  • "    Thank goodness it's Friday" without the leading white space (i.e., without the spaces before 'Thank')
  • The number of a's in 'gccgatgtacatggaatatacttttcaggaaacacatatctgtggagagg'. You'll need a method that is not described in your book to do this easily. Look it up by typing help(str). [Hint: it starts with a 'c']
  • The length of this DNA sequence: 'gccgatgtacatggaatatacttttcaggaaacacatatctgtggagagg' [Hint: this doesn't use a string method; use len(), the general function for determining the length of things]
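As a rough illustration of how string methods and len() are called, here is a short sketch assuming Python 3. The example strings and variable names are invented for this sketch, so the same calls still need to be applied to the strings given in the tasks above.

```python
# Made-up example strings; apply the same ideas to the assignment strings.
word = "genus"
seq = "ttgacca"
padded = "   hello"

upper_word = word.upper()          # methods are called on the string itself
swapped = seq.replace("t", "T")    # replaces every 't' with 'T'
stripped = padded.lstrip()         # removes leading whitespace only
length = len(seq)                  # len() is a function, not a method

print(upper_word)
print()
print(swapped)
print()
print(stripped)
print()
print(length)
```

Strings are immutable, so each method returns a new string and leaves the original unchanged.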

For the DNA sequence below, determine the following properties and print them to the screen. (You can cut and paste the sequence into your code. It is a lot longer than you can see on the screen, but just select the whole thing; when you paste it into Python you'll see what it looks like.)

dna = """ttcacctatgaatggactgtccccaaagaagtaggacccactaatgcagatcctgtgtgtctagctaagatgtattattctgctgtggatcccactaaagatatattcactgggcttattgggccaatgaaaatatgcaagaaaggaagtttacatgcaaatgggagacagaaagatgtagacaaggaattctatttgtttcctacagtatttgatgagaatgagagtttactcctggaagataatattagaatgtttacaactgcacctgatcaggtggataaggaagatgaagactttcaggaatctaataaaatgcactccatgaatggattcatgtatgggaatcagccgggtctcactatgtgcaaaggagattcggtcgtgtggtacttattcagcgccggaaatgaggccgatgtacatggaatatacttttcaggaaacacatatctgtggagaggagaacggagagacacagcaaacctcttccctcaaacaagtcttacgctccacatgtggcctgacacagaggggacttttaatgttgaatgccttacaactgatcattacacaggcggcatgaagcaaaaatatactgtgaaccaatgcaggcggcagtctgaggattccaccttctacctgggagagaggacatactatatcgcagcagtggaggtggaatgggattattccccacaaagggagtgggattaggagctgcatcatttacaagagcagaatgtttcaaatgcatttttagataagggagagttttacataggctcaaagtacaagaaagttgtgtatcggcagtatactgatagcacattccgtgttccagtggagagaaaagctgaagaagaacatctgggaattctaggtccacaacttcatgcagatgttggagacaaagtcaaaattatctttaaaaacatggccacaaggccctactcaatacatgcccatggggtacaaacagagagttctacagttactccaacattaccaggtaaactctcacttacgtatggaaaatcccagaaagatctggagctggaacagaggattctgcttgtattccatgggcttattattcaactgtggatcaagttaaggacctctacagtggattaattggccccctgattgtttgtcgaagaccttacttgaaagtattcaatcccagaaggaagctggaatttgcccttctgtttctagtttttgatgagaatgaatcttggtacttagatgacaacatcaaaacatactctgatcaccccgagaaagtaaacaaagatgatgaggaattcatagaaagcaataaaatgcatgctattaatggaagaatgtttggaaacct"""

  • How many occurrences of 'gagg' are there in the sequence?
  • What is the starting position of the first occurrence of 'atta'? [report the actual base pair position as a human would understand it]
  • How long is the sequence?
  • What is the GC content of the sequence? The GC content is the percentage of bases that are either G or C (as a percentage of total base pairs). Print the result as "The GC content of this sequence is XX.XX%", where XX.XX is the actual GC content. Do this using formatted printing (Ch. 3.5 in your book).
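The calculations above can be sketched on a short, made-up sequence, assuming Python 3 and the printf-style `%` formatting operator (the book's Ch. 3.5 may present formatted printing differently). The sequence and variable names here are hypothetical; substitute the `dna` string for the real answers.

```python
# Short invented sequence so the arithmetic is easy to check by hand.
seq = "atggccatta"

# count() tallies non-overlapping occurrences of a substring.
n_gg = seq.count("gg")

# find() returns a 0-based index, so add 1 for a human-readable position.
first_tta = seq.find("tta") + 1

# GC content: G and C bases as a percentage of all bases.
gc = 100.0 * (seq.count("g") + seq.count("c")) / len(seq)

# %.2f prints two decimal places; %% prints a literal percent sign.
message = "The GC content of this sequence is %.2f%%" % gc

print(n_gg)
print(first_tta)
print(len(seq))
print(message)
```

Note that find() returns -1 when the substring is absent, so adding 1 would report position 0 in that case.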

The length of an organism is typically strongly correlated with its body mass. This is useful because it allows us to estimate the mass of an organism even if we only know its length. This relationship generally takes the form Mass (kg) = a * Length (m)^b, where the parameters a and b vary among groups. Write a script that prompts the user for the following pieces of information:

  • Genus name
  • Species name
  • The length of the species

and then estimates the mass of the organism using the equation above. The script should print out the result using formatted printing as:

Genus species is length meters long and weighs approximately mass kg.

where Genus, species, length, and mass are replaced with the appropriate values. As is standard practice, the first letter (and only the first letter) of the genus name should be capitalized, and the species name should appear in all lowercase letters (regardless of what the user inputs).

This allometric approach is regularly used to estimate the mass of dinosaurs since we cannot typically weigh something that is only preserved as bones. I'll be testing your script using the length of a Spinosaurus (Spinosaurus aegyptiacus), which is 16 m long based on its reassembled skeleton. So, use the values of a and b for Theropoda (the appropriate dinosaur clade): a has been estimated as 0.73 and b has been estimated as 3.63 (Seebacher 2001). Spinosaurus is a predator that is bigger, and therefore, by definition, cooler, than that stupid Tyrannosaurus that everyone likes so much.
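One possible shape for the calculation and the output line is sketched below, assuming Python 3. The function names `estimate_mass` and `describe` are hypothetical, and the two-decimal formatting is an assumption, since the assignment does not specify a number of decimal places. The values are passed in as arguments here so the sketch runs without prompting; in the actual script they would come from the user via input() (raw_input() in Python 2), with the length converted using float().

```python
def estimate_mass(length_m, a=0.73, b=3.63):
    """Mass (kg) = a * Length (m)^b; defaults are the Theropoda values."""
    return a * length_m ** b


def describe(genus, species, length_m):
    """Build the output sentence with the required capitalization."""
    mass = estimate_mass(length_m)
    # capitalize() uppercases the first letter and lowercases the rest;
    # lower() forces the species name to all lowercase.
    return "%s %s is %.2f meters long and weighs approximately %.2f kg." % (
        genus.capitalize(), species.lower(), length_m, mass)


# Deliberately messy capitalization to show the clean-up step.
print(describe("sPINOSAURUS", "AEGYPTIACUS", 16.0))
```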