SQL - Practicing Queries with Baseball Trivia
Utilizing the Lahman Baseball Database.
Using this existing site, I’ll be practicing creating queries to answer the trivia questions listed out below.
Q: Who were the two Cy Young Award winners in 2013?
Methodology:
Using the awardsplayers table we can select the awardID and yearID to come up with the Cy Young Award and the year 2013. Then we can join the people table to get the players names.
SELECT
p.nameFIRST, p.nameLAST, a.playerID, a.awardID, a.yearID
FROM
people p
JOIN
awardsplayers a ON p.playerID = a.playerID
WHERE
a.yearID = 2013
AND a.awardID = 'Cy Young Award';Q: Miguel Cabrera became the third player with 350 home runs with the Tigers. Who are the other two?
Methodology: We can join the people table with the batting and teams table to get a list of players that have batted for the Tigers. We can then use the HAVING clause because we are aggregating in order to filter down to having 350 or more home runs while batting for the Tigers.
SELECT
p.playerID AS Player_ID,
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
t.name AS Team_Name,
SUM(b.HR) AS Sum_Home_Runs
FROM
people p
JOIN
batting b ON p.playerID = b.playerID
JOIN
teams t ON b.teamID = t.teamID
AND b.yearID = t.yearID
GROUP BY Player_ID , First_Name , Last_Name , Team_Name
HAVING t.name = 'Detroit Tigers'
AND Sum_Home_Runs >= 350
ORDER BY Sum_Home_Runs DESC;Q: Who is the most recent Royals player to win the AL Rookie of the Year Award?
Methodology: For this question we can join the people table with the batting, teams and awardsPlayers tables to get a list of players that have batted for the Kansas City Royals the same year that the award was given to them. We can then use our WHERE clause to narrow down the results for the teamID ‘KCA’ and awardID ‘Rookie of the Year’.
SELECT
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
t.name AS Team_Name,
ap.yearID AS Year
FROM
people p
JOIN
batting b ON p.playerID = b.playerID
JOIN
teams t ON b.teamID = t.teamID
AND b.yearID = t.yearID
JOIN
awardsplayers ap ON b.playerID = ap.playerID
AND b.yearID = ap.yearID
WHERE
t.teamID = 'KCA'
AND ap.awardID = 'Rookie of the Year'
ORDER BY ap.yearID DESC;Q: Who is the only player born in the United Kingdom to hit more than 100 career home runs?
Methodology: For this question, we can use the people table to query players born in the UK and then use the batting table to use a WHERE clause that gives us players with more than 100 career home runs.
SELECT
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
p.birthCountry AS Birth_Country,
SUM(b.HR) AS Sum_Home_Runs
FROM
people p
JOIN
batting b ON p.playerID = b.playerID
GROUP BY p.playerID
HAVING sum_home_runs > 100
AND p.birthCountry = 'United Kingdom';Q: Who has the most 30-homer seasons in Orioles history?
Methodology: For this question, we will first use a sub query to get a list of all players that have batted for the Orioles, along with the year and count of home runs that are over 30. Then we can have an outer query that will give us the player information and then COUNT the number of seasons (years) that each player has had with the sub query criterea.
SELECT
orioles_batting.playerID,
orioles_batting.First_Name,
orioles_batting.Last_Name,
COUNT(orioles_batting.year_ID) AS Seasons
FROM
(SELECT
p.playerID AS playerID,
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
b.yearID AS Year_ID,
b.teamID AS Team_ID,
b.HR AS Home_Runs
FROM
batting b
JOIN people p ON b.playerID = p.playerID
WHERE
HR >= 30 AND teamID = 'BAL') AS orioles_batting
GROUP BY orioles_batting.playerID
ORDER BY Seasons DESC;Q: Who are the two previous Brewers to win Rookie of the Year?
Methodology: For this question, we can join the people, batting, teams and awardsplayers tables together to look for the commonalities of players that have batted for teams with names like “brewer” and have also won Rookie of the Year awards in the same year.
SELECT
b.playerID, p.nameFIRST, p.nameLAST, ap.yearID
FROM
people p
JOIN
batting b ON p.playerID = b.playerID
JOIN
teams t ON b.teamID = t.teamID
JOIN
awardsplayers ap ON b.playerID = ap.playerID
AND b.yearID = ap.yearID
WHERE
t.name LIKE '%brewer%'
AND awardID = 'Rookie of the Year'
GROUP BY b.playerID , ap.yearID
ORDER BY ap.yearID DESC;Q: Who holds the record for most hits in a single postseason?
Methodology: We can join the people table and the battingPost table in order to get a list of players and their total hits in the post season. We can then group by the playerID and yearID in order to get a sum total for each year.
SELECT
p.nameFIRST, p.nameLAST, bp.yearID, SUM(bp.H) AS total_hits
FROM
people p
JOIN
battingpost bp ON p.playerID = bp.playerID
GROUP BY p.playerID , bp.yearID
ORDER BY total_hits DESC;Q: Which two sluggers share the record for most home runs in a Division Series, with five?
Methodology: We can join the people and battingPost tables together to get a SUM of home runs for each year for rounds that are like ‘%DS%’ (Division Series) and order it by desc for total home runs.
SELECT
bp.round AS Round,
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
bp.yearID AS Year,
SUM(bp.HR) AS Total_HR
FROM
people p
JOIN
battingpost bp ON p.playerID = bp.playerID
GROUP BY bp.round , p.playerID , bp.yearID
HAVING bp.round LIKE '%DS%'
ORDER BY total_hr DESC;Q: Who was the first Yankees pitcher to win a Cy Young Award?
Methodology: We can join the people, awardsPlayers, appearances, and teams tables to get a list of players that pitched for the Yankees and won the Cy Young Award in the same year.
SELECT
p.playerID AS playerID,
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
ap.yearID AS Year,
ap.awardID AS Award
FROM
people p
JOIN
awardsplayers ap ON p.playerID = ap.playerID
JOIN
appearances a ON a.playerID = ap.playerID
AND a.yearID = ap.yearID
JOIN
teams t ON t.teamID = a.teamID
AND t.yearID = a.yearID
WHERE
t.name = 'New York Yankees'
AND awardID = 'Cy Young Award'
AND a.g_P >= 1
ORDER BY ap.yearID ASC;Q: Josh Donaldson became the second Blue Jays player to win an MVP Award in 2015. Who was the first?
Methodology: To find this answer, we are able to join the people, awardsPlayers, and appearances tables together to get data on players that have won awards during the same year they have played for a certain team. We can then use a WHERE clause to narrow the results down to the MVP award and only players that have played for Toronto Blue Jays in the same year they have won the award. Then, we just order by the yearID in ascending order.
SELECT
p.playerID AS player_id,
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
ap.yearID AS Year,
ap.awardID AS Award
FROM
people p
JOIN
awardsplayers ap ON p.playerID = ap.playerID
JOIN
appearances a ON ap.playerID = a.playerID
AND ap.yearID = a.yearID
WHERE
ap.awardID = 'Most Valuable Player'
AND a.teamID = 'TOR'
ORDER BY ap.yearID ASC;Q: Who was the first shortstop to be named World Series MVP?
Methodology: To answer this question, we are able to join the people, awardsPlayers and fieldingPost tables together to get data on the players who have won awards and reference the positions they played during the same year in the post season.
Then, we can narrow the results down by players who have won World Series MVP awards and played short stop during the post season in the same year.
SELECT
p.playerID AS player_id,
p.nameFIRST AS First_Name,
p.nameLAST AS Last_Name,
ap.awardID AS Award,
ap.yearID AS Year
FROM
people p
JOIN
awardsplayers ap ON p.playerID = ap.playerID
JOIN
fieldingpost fp ON ap.playerID = fp.playerID
AND ap.yearID = fp.yearID
WHERE
ap.awardID = 'World Series MVP'
AND fp.POS = 'SS'
GROUP BY p.playerID , ap.yearID
ORDER BY ap.yearID ASC;











