Update: answers posted below each problem.
Due: Thursday Nov 8, 11:59PM.
To submit: Send an email to me at jal2016@email.vccs.edu with subject CSC 110 HW6 with your work in the body of an email, or attached in a plain text file. Note: no Microsoft Word documents will be accepted for this or future assignments.
This assignment will use the example SQLite databases linked in this post. Include the SQL statements you used to find the answer.
Using the UN database, answer the following questions. The questions don't state this explicitly, but assume they always mean "given the available data" (since some countries have data missing in some years).
What was the largest total population in the data set? Which country?
select country, male + female as total from population order by total desc limit 1;
country
total
China
1312978855
Which 3 countries had the largest male population in 1999?
select country, male from population where year = 1999 order by male desc limit 3;
country
male
China
643160000
India
516309000
United States of America
136802873
Explain what the following query does. Can you explain why Montserrat shows up in the results? How many rows appear to indicate bad data?
select first.year, first.male + first.female as pop_orig, next.male + next.female as pop_next_year, first.country, 1.0 * (next.male + next.female) / (first.male + first.female) - 1.0 as percent_change from population first join population next on first.year = next.year - 1 and first.country = next.country order by percent_change limit 10;
Here are the results:
year
pop_orig
pop_next_year
country
percent_change
2000
66528578
1741308
Egypt
-0.973826165351077
2000
25051543
2481300
Iraq
-0.900952208812048
1997
22046000
2270200
Iraq
-0.897024403519913
1997
6094
3595
Montserrat
-0.410075484082704
2000
10130661
7503433
Serbia
-0.259334311946674
1995
10195681
7622185
Serbia
-0.252410407897226
2001
10651650
8113868
Serbia and
-0.238252477315721
1995
10324
7867
Montserrat
-0.23798915149167
1996
7867
6094
Montserrat
-0.225371806279395
1990
9568745
7595636
Serbia
-0.206203530348024
The join operation is linking the same table to itself, but lining up rows that differ in year by 1. This is allowing us to compute a year-to-year population change. So we're finding the countries and years where a country lost the most population from one year to the next.
The top three results are most likely errors. However, Serbia and Montserrat have serious explanations for the decreases. In 1995, a volcano began erupting on Montserrat that forced evacuations. Conflict in Serbia can explain some decreases (although it also probably caused difficulty in gathering population estimates).
Use the movie database to answer the next set of questions:
How many users listed an occupation of "student"?
select count(*) as num_students from users where occupation = 'student';
num_students
196
What are all the movie titles that were classified as both romance and sci-fi?
select title from movies where genre_romance = 1 and genre_sci_fi = 1;
title
Star Wars (1977)
Empire Strikes Back, The (1980)
Return of the Jedi (1983)
Nutty Professor, The (1996)
Kid in King Arthur's Court, A (1995)
What movies are listed with titles that start with the word "Big"?
select title from movies where title like 'Big%';
title
Big Night (1996)
Big Sleep, The (1946)
Big Squeeze, The (1996)
Big Lebowski, The (1998)
Big Blue, The (Grand bleu, Le) (1988)
Big Green, The (1995)
Big Bully (1996)
Big Bang Theory, The (1994)
Big One, The (1997)
What was the most common rating (number) given?
select rating, count(*) as count from ratings group by rating order by count desc;
rating
count
4
34174
3
27145
5
21201
2
11370
1
6110
Which movie did lawyers rate the most times? Hint: the template below is one possible solution, where you have to fill in the ??? sections.
select ??? from ratings natural join users natural join movies where ??? group by movie_id order by ??? limit ???;
select title, count(*) as rating_count from ratings natural join users natural join movies where occupation = 'lawyer' group by movie_id order by rating_count desc limit 1;
title
rating_count
Full Monty
9
Use the Chinook database to answer the next set:
Which 5 artists have the most albums listed?
select Name, count(*) as album_count from Album natural join Artist group by ArtistID order by album_count desc limit 5;
Name
album_count
Iron Maiden
21
Led Zeppeli
14
Deep Purple
11
Metallica
10
U2
10
Which customer has ordered the largest total in dollars?
select CustomerId, FirstName, LastName, count(*) as orders, sum(Total) as total from Customer natural join Invoice group by CustomerId order by total desc limit 1;
CustomerId
FirstName
LastName
orders
total
6
Helena
HolĂ˝
7
49.62