Homework 6: SQL

Update: answers posted below each problem.

Due: Thursday Nov 8, 11:59PM.

To submit: Send an email to me at jal2016@email.vccs.edu with subject CSC 110 HW6 with your work in the body of an email, or attached in a plain text file. Note: no Microsoft Word documents will be accepted for this or future assignments.

This assignment will use the example SQLite databases linked in this post. Include the SQL statements you used to find the answer.

Using the UN database, answer the following questions. The questions don't state this explicitly, but assume they always mean "given the available data" (since some countries have data missing in some years).

  1. What was the largest total population in the data set? Which country?

    select country, male + female as total from population
      order by total desc limit 1;
    

    country

    total

    China

    1312978855

  2. Which 3 countries had the largest male population in 1999?

    select country, male from population
      where year = 1999
      order by male desc limit 3;
    

    country

    male

    China

    643160000

    India

    516309000

    United States of America

    136802873

  3. Explain what the following query does. Can you explain why Montserrat shows up in the results? How many rows appear to indicate bad data?

    select first.year, first.male + first.female as pop_orig,
        next.male + next.female as pop_next_year, first.country,
        1.0 * (next.male + next.female) / (first.male + first.female) - 1.0 as percent_change
      from population first join population next
      on first.year = next.year - 1 and first.country = next.country
      order by percent_change limit 10;
    

    Here are the results:

    year

    pop_orig

    pop_next_year

    country

    percent_change

    2000

    66528578

    1741308

    Egypt

    -0.973826165351077

    2000

    25051543

    2481300

    Iraq

    -0.900952208812048

    1997

    22046000

    2270200

    Iraq

    -0.897024403519913

    1997

    6094

    3595

    Montserrat

    -0.410075484082704

    2000

    10130661

    7503433

    Serbia

    -0.259334311946674

    1995

    10195681

    7622185

    Serbia

    -0.252410407897226

    2001

    10651650

    8113868

    Serbia and

    -0.238252477315721

    1995

    10324

    7867

    Montserrat

    -0.23798915149167

    1996

    7867

    6094

    Montserrat

    -0.225371806279395

    1990

    9568745

    7595636

    Serbia

    -0.206203530348024

    The join operation is linking the same table to itself, but lining up rows that differ in year by 1. This is allowing us to compute a year-to-year population change. So we're finding the countries and years where a country lost the most population from one year to the next.

    The top three results are most likely errors. However, Serbia and Montserrat have serious explanations for the decreases. In 1995, a volcano began erupting on Montserrat that forced evacuations. Conflict in Serbia can explain some decreases (although it also probably caused difficulty in gathering population estimates).

Use the movie database to answer the next set of questions:

  1. How many users listed an occupation of "student"?

    select count(*) as num_students from users
      where occupation = 'student';
    

    num_students

     

    196

     
  2. What are all the movie titles that were classified as both romance and sci-fi?

    select title from movies where genre_romance = 1 and genre_sci_fi = 1;
    

    title

     

    Star Wars (1977)

     

    Empire Strikes Back, The (1980)

     

    Return of the Jedi (1983)

     

    Nutty Professor, The (1996)

     

    Kid in King Arthur's Court, A (1995)

     
  3. What movies are listed with titles that start with the word "Big"?

    select title from movies where title like 'Big%';
    

    title

     

    Big Night (1996)

     

    Big Sleep, The (1946)

     

    Big Squeeze, The (1996)

     

    Big Lebowski, The (1998)

     

    Big Blue, The (Grand bleu, Le) (1988)

     

    Big Green, The (1995)

     

    Big Bully (1996)

     

    Big Bang Theory, The (1994)

     

    Big One, The (1997)

     
  4. What was the most common rating (number) given?

    select rating, count(*) as count from ratings group by rating order by count desc;
    

    rating

    count

    4

    34174

    3

    27145

    5

    21201

    2

    11370

    1

    6110

  5. Which movie did lawyers rate the most times? Hint: the template below is one possible solution, where you have to fill in the ??? sections.

    select ???
      from ratings natural join users natural join movies
      where ???
      group by movie_id
      order by ???
      limit ???;
    
    select title, count(*) as rating_count
      from ratings natural join users natural join movies
      where occupation = 'lawyer'
      group by movie_id
      order by rating_count desc
      limit 1;
    

    title

    rating_count

    Full Monty

    9

Use the Chinook database to answer the next set:

  1. Which 5 artists have the most albums listed?

    select Name, count(*) as album_count from Album natural join Artist
      group by ArtistID
      order by album_count desc limit 5;
    

    Name

    album_count

    Iron Maiden

    21

    Led Zeppeli

    14

    Deep Purple

    11

    Metallica

    10

    U2

    10

  2. Which customer has ordered the largest total in dollars?

    select CustomerId, FirstName, LastName, count(*) as orders, sum(Total) as total
      from Customer natural join Invoice
      group by CustomerId
      order by total desc limit 1;
    

    CustomerId

    FirstName

    LastName

    orders

    total

    6

    Helena

    HolĂ˝

    7

    49.62