Lab02_WriteUp
Contents
Question 1
For this question we will use SQL to extract data out of the indiv_sample_nyc
table. The schema for this table is given below.
Question 1a
Let’s start by looking at 2016 election contributions made by Donald Trump, who was a NY resident during that year. Write a SQL statement that will return the cmte_id, transaction_amt, and name for every contribution made by any donor with “DONALD” and “TRUMP” in their name.
Hint: You’ll need to use WHERE name LIKE '%TRUMP%' AND name LIKE '%DONALD%'
.
|
|
Question 1b
If we look at the list above, it appears that some donations were not by Donald Trump himself, but instead by an entity called “DONALD J TRUMP FOR PRESIDENT INC”. If we look at the list of donors, we see that our query only seems to have picked up one such anomalous name.
|
|
Question 1c
As a review of some of the key ideas from CS61A/CS88, let’s refine our query above.
We see that many of the contributions above were to the same committee. Create a new SQL query that returns the total amount that Donald Trump contributed to each committee. Your table should have three columns cmte_id
, total_amount
, and num_donations
, where total_amount
is the total amount contributed to that committee, and num_donations
is the total number of donations. Your table should be sorted in decreased order of total_amount
.
Hint: Use GROUP BY
as covered in CS61A and CS88.
Hint: Use COUNT(*)
as covered in CS61A and CS88.
Hint: Use SUM()
as covered in CS61A and CS88.
Hint: Use ORDER BY
as covered in CS61A and CS88.
Hint: Use DESC
as covered in CS61A and CS88.
Hint: Your WHERE
clause should be the same as question 1b.
|
|
Question 1d
Now we’d like to know the name of each committee, i.e. we want to know the name of the identity that Donald Trump was making contributions to. Modify your query from question 1c so that it also shows the name of each committee. That is, your table should have four columns cmte_id
, total_amount
, num_donations
, and cmte_nm
.
The names of the committees are given in the cmte_nm
column of the comm
table.
Hint: You’ll need to join the indiv_sample_nyc
and comm
tables.
Hint: One approach is a nested select statement (a.k.a. subquery) as discussed in lecture.
|
|
Question 2
Question 2a
Let’s select transactions in the indiv_sample_nyc
table with a cmte_id
(committee id) that ends in 5. Then, groupby the transactions in the indiv_sample_nyc
table with the cmte_id
column, and count how many transactions and how much total amount of contribution there are for each cmte_id
. Select the top 5 committees ordered by the transaction count.
Your output should look similar to:
committee_id | total_amount | count | |
---|---|---|---|
0 | C00575795 | 25099091 | 110657 |
1 | C00000935 | 6989835 | 36237 |
2 | C00035675 | 316019 | 3417 |
3 | C00040535 | 492567 | 3277 |
4 | C00012245 | 499659 | 2128 |
|
|
Question 2b
When you printed the schema for the tables, you may have noticed that both the cand
and comm
tables have a cand_id
column. Let’s try joining these two tables on this column.
List the first 5 candidate names (cand_name
) in reverse lexicographic order by cand_name
, along with their corresponding committee names. Only select rows that have a matching cand_id
in both tables.
Your output should look similar to:
cand_name | cmte_nm | |
---|---|---|
0 | ZUTLER, DANIEL PAUL MR | CITIZENS TO ELECT DANIEL P ZUTLER FOR PRESIDENT |
1 | ZUMWALT, JAMES | ZUMWALT FOR CONGRESS |
2 | ZUKOWSKI, ANDREW GEORGE | ZUKOWSKI FOR CONGRESS |
3 | ZUCCOLO, JOE | JOE ZUCCOLO FOR CONGRESS |
4 | ZORN, ROBERT ERWIN | CONSTITUTIONAL COMMITTEE |
|
|
Question 2c
As in the previous part, list the first 5 candidate names (cand_name
) by reverse lexicographic order and their corresponding committee names (cmte_nm
). This time, include all candidate names. If there is no matching cand_id
in the comm
table, then cmte_nm
should be NULL.
Your output should look similar to:
cand_name | cmte_nm | |
---|---|---|
0 | ZUTLER, DANIEL PAUL MR | CITIZENS TO ELECT DANIEL P ZUTLER FOR PRESIDENT |
1 | ZUMWALT, JAMES | ZUMWALT FOR CONGRESS |
2 | ZUKOWSKI, ANDREW GEORGE | ZUKOWSKI FOR CONGRESS |
3 | ZUCCOLO, JOE | JOE ZUCCOLO FOR CONGRESS |
4 | ZORNOW, TODD MR | None |
|
|