Ever wondered if pitchers have figured out how to strike out the best players just by walking them often? Some people believe that if a good hitter is walked often enough, he will start to lose his ability to hit well.
This issue came up a few seasons ago when Bryce Harper was slumping while drawing a higher number of walks. Recently, with Judge slumping, I wondered if one of the most walked players in MLB was going through the same thing.
Using the number of walks and the number of strikeouts for the top players in each league each season, I tried to find the correlation between the two. This all leads to the question: Does a higher number of walks lead to a higher number of strikeouts?
2016- Mike Trout (116 W, 137 SO)
Paul Goldschmidt (110 W, 150 SO)
2015- Joey Votto (143 W, 135 SO)
Jose Bautista (110 W, 106 SO)
2014- Carlos Santana (113 W, 124 SO)
Matt Carpenter (95 W, 111 SO)
2013- Joey Votto (135 W, 138 SO)
Mike Trout (110 W, 136 SO)
2012- Adam Dunn (105 W, 222 SO)
Dan Uggla (94 W, 168 SO)
2011- Jose Bautista (132 W, 111 SO)
Joey Votto (110 W, 129 SO)
This is called a small negative association, where r is negative but sits between 0 and -0.3. The actual r for the graph was -0.246, so it fits the bill. This means that an increase in one area tends to go with a slight decrease in the other, and vice versa. So players with more walks actually tended to have slightly fewer strikeouts, not more. I guess we were wrong on this one.
The simulation checks how often an association this small shows up by chance alone, and it turns out to be very common. Therefore, we cannot conclude that an increased number of walks creates an increased number of strikeouts. But since pitchers like to do that to the best players, let them keep thinking that it works.
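For anyone who wants to check the math at home, here is a rough sketch in Python using the walks and strikeouts listed above. The function names, the 1,000 shuffles, and the seed are my own choices for illustration, not the tool used for the graph, so treat the shuffle result as a ballpark.

```python
import random
from statistics import mean

# Walks and strikeouts for the twelve player-seasons listed above.
walks = [116, 110, 143, 110, 113, 95, 135, 110, 105, 94, 132, 110]
strikeouts = [137, 150, 135, 106, 124, 111, 138, 136, 222, 168, 111, 129]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def shuffle_share(xs, ys, trials=1000, seed=0):
    """Share of shuffles whose r is at least as far from 0 as the real r.

    Shuffling the strikeouts breaks any real link to the walks, so this
    shows how often chance alone produces an association this strong.
    The trial count and seed are assumptions for this sketch.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials


r = pearson_r(walks, strikeouts)
share = shuffle_share(walks, strikeouts)
print(f"r = {r:.3f}")
print(f"share of shuffles at least this extreme: {share:.2f}")
```

The r that comes out rounds to the -0.246 quoted above. If the share is well above 5%, an association this small is nothing special.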
Hope that this debunked some previous hunches about walks!
The Dodgers have made July their month by going 18-3, a fantastic .857 winning percentage, and hold the best record in baseball at 72-31. This month has been historic for the already historic franchise so let's go over some stats:
On Wednesday they reached 40 games over .500 at the earliest point in franchise history.
Even when they are not dominating in a game, they are an MLB-best 29-31 when trailing.
Their win on Monday ranks them fourth for most wins through 100 games in the Expansion Era, which dates back to 1961 (all from ESPN Stats and Info).
So, it's time for us to put their month to the test and see if they were streaky in July or not. Let's go over the rules again...
What qualifies as streaky? Well, since there are fewer than 50 games, we will see how many times a streak of 3 or more wins occurred. We will do this with a simulation that reshuffles the results and counts the streaks of 3+ each time, with each dot standing for one trial. If a lot of the dots look like what the team actually did, then they are not streaky.
So, the magic number to be streaky is to have 5% or less of the dots match their result, because that means that what they are doing is truly special.
If it is above 5%, then they are not streaky and their game results are independent of each other. That means every game is a new chance to win and does not build off the day before.
Results from all of July up until last night's win are used for the simulation below that consisted of 200 trials.
Longest Streak= 11 wins
Streaks of 3+= 2
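Here is a minimal sketch of how a streak simulation like this could work in Python. It is not the exact tool I used. The function names and the seed are made up for illustration, and comparing with "2 or fewer" streaks is an assumption about which direction counts as unusual.

```python
import random


def count_streaks(results, min_len=3):
    """Count runs of consecutive wins ('W') that are at least min_len long."""
    streaks = 0
    run = 0
    for game in results:
        if game == "W":
            run += 1
        else:
            if run >= min_len:
                streaks += 1
            run = 0
    if run >= min_len:  # a streak still alive at the end of the list
        streaks += 1
    return streaks


def simulate_streak_counts(wins, losses, min_len=3, trials=200, seed=0):
    """Shuffle the wins and losses and count the streaks in each shuffle."""
    rng = random.Random(seed)
    games = ["W"] * wins + ["L"] * losses
    counts = []
    for _ in range(trials):
        rng.shuffle(games)
        counts.append(count_streaks(games, min_len))
    return counts


# July record from this post: 18 wins, 3 losses, and 2 streaks of 3+.
counts = simulate_streak_counts(wins=18, losses=3)
observed = 2
# Assumption: fewer, longer streaks is the "streaky" direction.
share = sum(c <= observed for c in counts) / len(counts)
print(f"share of shuffles with {observed} or fewer streaks of 3+: {share:.2f}")
```

Each shuffled list plays the role of one dot on the plot. If the share comes out above 5%, the team does not count as streaky under the rule above.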
They may not be classified as 'streaky' here but they still have all those wins in July and you cannot take those away from them. Let's see if they can continue their winning ways as the month fades out into August.
With all the increased home run activity going on in baseball right now, might as well check to see which league is superior. The obvious answer would be the American League due to Aaron Judge winning the Home Run Derby but there is the chance that both leagues may be more similar than we thought.
By comparing the means of each league in different categories, we will find out who is the better league (or maybe they're equal!).
How This Works: I looked up the Home Runs per Game and Average True Distance for both leagues on hittrackeronline.com to serve as the data. The difference between the two league means was used as the test statistic, which laid the groundwork for what we were looking for. If a difference at least that large occurred in more than 5% of the 400 trials, we treat the abilities in each league as the same (yay for equality!). If it occurred in less than 5%, then one league has a better ability than the other.
Difference in HRs per Game
AL Average: 2.61 HRs
NL Average: 2.43 HRs
Test Statistic/Difference: 0.18
Verdict: Teams in the AL have the same ability as teams in the NL and the increased home runs per game is due to random chance.
Difference in Average True Distance
AL Average: 401.0
NL Average: 400.6
Test Statistic/Difference: 0.4
Verdict: Teams in the AL have the same ability as teams in the NL in terms of hitting longer home runs and the slightly increased performance is due to random chance.
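If you want to try a difference-in-means test like these two yourself, here is a small sketch in Python. The per-team numbers below are placeholders I made up so the code runs. They are NOT the real figures from hittrackeronline.com, and the function name and seed are also just for illustration.

```python
import random
from statistics import mean


def diff_in_means_share(group_a, group_b, trials=400, seed=0):
    """Share of shuffles with a mean difference at least as big as the real one.

    All values are pooled, reshuffled into two groups of the original
    sizes, and the difference in means is recomputed each time.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:size_a]) - mean(pooled[size_a:]))
        if diff >= observed:
            hits += 1
    return hits / trials


# PLACEHOLDER values only (hypothetical home runs per game for a few teams).
al_rates = [2.9, 2.4, 2.7, 2.5]
nl_rates = [2.3, 2.6, 2.4, 2.5]

share = diff_in_means_share(al_rates, nl_rates)
print(f"share of trials with a difference this large: {share:.2f}")
```

A share above 0.05 lines up with the "same ability" verdict, while a share below 0.05 would point to one league being better.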
Looks like no one is super special yet, but hey! There's still more to come this season regarding home runs! Make sure to stay tuned and we will see you soon!
Note: Trials were conducted on Thursday night so the stats may be slightly different now.
Ever wondered if the first half could easily tell you who the winner of the World Series is? Find out today if there IS actually a correlation between the first half and the second. By using the split records from the past ten winners, we will see if the first half correlates to second half success and also compare today's top teams to yesterday's.
First Half Statistics*
Average Winning Percentage- .579
Median Winning Percentage- .585
Fewest Wins- 46 by San Francisco Giants in 2012 (.535 Winning %)
Most Wins- 58 by Boston Red Sox in 2013 (.598 Winning %)
2016: Chicago Cubs (53-35)
2015: Kansas City Royals (52-34)
2014: San Francisco Giants (53-43)
2013: Boston Red Sox (58-39)
2012: San Francisco Giants (46-40)
2011: St. Louis Cardinals (49-43)
2010: San Francisco Giants (47-41)
2009: New York Yankees (51-37)
2008: Philadelphia Phillies (48-33)
2007: Boston Red Sox (53-34)
*Compiled from past 10 World Series Winners
Second Half Records
2016: Chicago Cubs (50-23)
2015: Kansas City Royals (43-33)
2014: San Francisco Giants (35-31)
2013: Boston Red Sox (39-26)
2012: San Francisco Giants (48-28)
2011: St. Louis Cardinals (41-29)
2010: San Francisco Giants (45-29)
2009: New York Yankees (52-22)
2008: Philadelphia Phillies (44-37)
2007: Boston Red Sox (43-32)
Data Plot & Results
From what we see here, there is not a direct correlation between the first and second half. That means that doing well in the first half does not mean you will do well in the second and win the World Series. All that really matters is for your team to outlast the postseason and the grueling schedule.
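For the curious, the correlation can be checked straight from the two lists of records above. This is only a sketch in Python: the helper names are mine, and it works from winning percentages, which is my assumption about what the plot used.

```python
from statistics import mean

# (wins, losses) for the past ten World Series winners, 2016 back to 2007,
# copied from the two lists above.
first_half = [(53, 35), (52, 34), (53, 43), (58, 39), (46, 40),
              (49, 43), (47, 41), (51, 37), (48, 33), (53, 34)]
second_half = [(50, 23), (43, 33), (35, 31), (39, 26), (48, 28),
               (41, 29), (45, 29), (52, 22), (44, 37), (43, 32)]


def win_pct(record):
    """Turn a (wins, losses) pair into a winning percentage."""
    wins, losses = record
    return wins / (wins + losses)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


first_pct = [win_pct(rec) for rec in first_half]
second_pct = [win_pct(rec) for rec in second_half]
r = pearson_r(first_pct, second_pct)
print(f"first-half vs second-half winning percentage: r = {r:.3f}")
```

An r close to 0 means the first half tells you very little about the second half.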
Past to Present
If Houston or Los Angeles wins it all, then it would be the first time in ten years that a team with 60 or more wins before the All-Star break won the World Series.
The team closest to the average winning percentage would have to be Boston at .568 and if history repeats itself, Boston could bring back another one.
None of the division-leading teams finished the first half below San Francisco's low mark of 46 wins, even though Cleveland came close with 47 wins. Missed it by that much.
But if Minnesota, Atlanta, Chicago, or Texas won the World Series, they could break the record for the fewest first half wins.
Hope this helped you understand more about first half success and how it does not relate to the second half but can be helpful.
The All-Star break is fast approaching, so it's about time for us to get prepared for the always fun Home Run Derby. This week we will look over confidence intervals and see which player has the highest confidence to hit the most home runs. All of this is based on this season's numbers, so the career numbers do not count (works better for the rookies).
Remember that a 95% confidence interval is a range that we are 95% confident contains a player's true ability. There is a margin of error and this is not the know-all for the winner. I will make my prediction on Monday when we take a deeper look at who is participating.
20.5 HRs to 25.6 HRs
Home Runs: 23
11.24 HRs to 14.76 HRs
Home Runs: 13
22.18 HRs to 27.8 HRs
Home Runs: 25
17.71 HRs to 22.29 HRs
Home Runs: 20
21.02 HRs to 26.9 HRs
Home Runs: 24
16.14 HRs to 19.85 HRs
Home Runs: 18
25.65 HRs to 32.34 HRs
Home Runs: 29
16.7 HRs to 21.3 HRs
Home Runs: 19
Looks like Aaron Judge is the one to beat here with the highest confidence interval. Fear not, there is more to be tested before Judge can be considered the clear choice as the winner. Stay tuned and enjoy these stats because more are coming!
Happy Canada Day to everyone! For today's 150th celebration, I will show some cool stats from some of Canada's sports icons. I am so happy for this lovely country because it has contributed so much to sports, especially hockey. It also has done a lot for other sports and has successful women's teams in soccer, basketball, curling, hockey (of course!), and many more.
Today's special celebration will include THE icon for hockey and for Canada and one of the best active soccer players in the world.
O Canada, thank you for your contributions to the world of sports and beyond!
Wayne Gretzky's Moving Average
What better way to celebrate the Great White North than to show just how great the Great One is? A moving average is the average of an athlete's performances over a window of time that includes the seasons just before and after the one being measured. This is perfect for Gretzky because he played so long that he gave us more years to work with, and it also shows how consistently great his production was.
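To make the idea concrete, here is a tiny Python sketch of a centered moving average. The three-season window is an assumption (the graph may use a different one), and the point totals are placeholders, not Gretzky's real numbers.

```python
def centered_moving_average(values, window=3):
    """Average each value with its neighbors on both sides.

    The window must be odd so every average is centered on one season.
    Seasons too close to either end to fill a full window are skipped.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    averages = []
    for i in range(half, len(values) - half):
        chunk = values[i - half : i + half + 1]
        averages.append(sum(chunk) / window)
    return averages


# PLACEHOLDER point totals, NOT Gretzky's real numbers.
season_points = [10, 20, 30, 40, 50]
print(centered_moving_average(season_points))  # [20.0, 30.0, 40.0]
```

Because each point blends in the seasons around it, one injury-shortened year can drag down the average for its neighbors too.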
Things to Notice-
Gretzky had his best numbers at age 24, which was a Cup-winning year for him.
His lowest point was at age 33 with 81 points (good for an average player, but he's not average in the least) but he was able to rebound by 46.3 points the next year.
One reason for that could be the back injury that limited him to 45 games during the 1992-93 season (age 32). That could definitely mess with the numbers for the average.
Overall, he finished with 82 average points, which is not a bad way to finish. He set the bar so high that we may never see this kind of talent again.
Christine Sinclair's Time Plot
The active leader in international goals scored and second on the all-time list (165) is one of the many reasons for Canada's recent successes in soccer on the international stage. Sinclair's 17 seasons of playing internationally are shown below in a time plot, which shows how she did in each year (no averages here!).
Things to Notice-
Sinclair had her best year ever in 2012 when she scored 23 goals and Canada won their bronze medal at the London Games.
Fifteen goals definitely makes for a strong debut for Sinclair, who came up at age 16 and scored three goals in her first tournament (the most of any player in the tournament).
Sinclair's goals and leadership have helped Canada reach their best finishes ever at world tournaments, including their fourth place finish at the 2003 World Cup.
Enjoy the rest of the day because it's a good one!
The National League West is looking like the National League Best with three teams fighting for that first place spot and the other two looking good for a wild card. It definitely looks like these teams are going on long win streaks, but our investigation will see if they are actually streaky teams.
How to find out if they are streaky or not: Put the order of wins and losses in a column (I use a simulation for this) and then see how many streaks of 5 or more wins are in there, since each team has played more than 50 games.
In the simulation, if over 5% of the dots suggest they could have a high number of streaks of 5 or more wins, then they are not streaky.
Streaky?: Nope. They are way above the percentage for being streaky.
Streaky?: Nope. They are also above 5% for streaks of 5 wins or longer.
Now that the Stanley Cup has wrapped up, it's time to start focusing more on baseball and doing investigations for that. Today's investigation will deal with z-scores, which measure how many standard deviations a performance is above or below the average (think the more, the merrier).
Since the All Star Game is coming up, it would make sense to see who is the best for at least one of the positions. With that being said, let's explore the tight race for shortstops in both leagues. It was hard for me to choose the top shortstop in the AL and the voters might feel the same way as well. Also, the NL race is close for shortstop while other races see people taking huge leads.
How this works:
The top five shortstops in each league will have their totals put in a list, which gives us the group's average and standard deviation. For each player, we subtract the group average from his actual number. After that, we divide by the standard deviation to see how different he is from the rest. The higher, the better!
American League Top Five- *actual average in parentheses
Carlos Correa, Houston (.293): .125
Francisco Lindor, Cleveland (.254): -0.85
Didi Gregorius, New York (.339): 1.275
Xander Bogaerts, Boston (.324): 0.9
Troy Tulowitzki, Toronto (.234): -1.35
Standard Deviation= .04
National League Top Five-
Zack Cozart, Cincinnati (.324): 1.2
Corey Seager, Los Angeles (.281): .125
Addison Russell, Chicago (.215): -1.525
Trea Turner, Washington (.268): -0.2
Chris Owings, Arizona (.295): 0.475
Standard Deviation= .04
American League Top Five- *RBIs in parentheses
Correa (41): 1.57
Lindor (27): .085
Gregorius (25): -0.127
Bogaerts (23): -0.34
Tulowitzki (15): -1.191
Standard Deviation= 9.4
National League Top Five-
Cozart (33): 0.4
Seager (31): 0.036
Russell (23): -1.4
Turner (29): -0.327
Owings (38): 1.31
Standard Deviation= 5.5
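The z-scores above can be reproduced with a few lines of Python. This sketch uses the AL RBI totals from the list above. Using the sample standard deviation (dividing by n - 1) is my assumption, and it lands on the 9.4 shown, so the results agree with the listed z-scores up to rounding.

```python
from statistics import mean, stdev

# AL shortstop RBI totals as listed above.
al_rbi = {
    "Correa": 41,
    "Lindor": 27,
    "Gregorius": 25,
    "Bogaerts": 23,
    "Tulowitzki": 15,
}


def z_scores(totals):
    """Return each player's distance from the group average, in standard deviations."""
    values = list(totals.values())
    avg = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    return {name: (value - avg) / sd for name, value in totals.items()}


z = z_scores(al_rbi)
for name, score in z.items():
    print(f"{name}: {score:.3f}")
```

A positive z-score means the player is above the group average and a negative one means he is below it.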
Obviously, popularity is a huge factor that gets some people in the top five who are not doing their best right now, which brought down the average by a lot and helped others more than it should.
Also, Gregorius looked great in the average part but since he's only had 165 at-bats, he should not be highly considered for the All Star Game just yet. It was good to add the RBI part because that shows how much catching up he needs to do.
After doing this, I still stand by my decision on voting for Cozart in the National League but I need to keep my eye on Owings because he is putting up some good numbers that should not be overlooked. It's nice to see a loyal fanbase promote someone who deserves recognition.
I hope this helped with your decision and have a great time voting!
For the (maybe?) final edition of the stats for the Stanley Cup, we will go over whether either team plays better at home than away and whether they happen to be streaky at all. This is definitely important because neither team has been able to win on the road so far in the series, and whoever wins a road game wins the Stanley Cup (unless the Penguins lose in Nashville and then win in Pittsburgh).
Home Winning Percentage: 90%
Away Winning Percentage: 45.4%
Home Ice Advantage: None found
Streaks of Three or More: 1
In the simulation, they were not found to be a streaky team because they were predicted to have streaks less than or higher than 3 about 51 times out of 100.
Home Winning Percentage: 76.9%
Away Winning Percentage: 45.4%
Home Ice Advantage: None found
Streaks of Three or More: 3
In the simulation, they were not found to be a streaky team because they were predicted to have that amount of streaks 55 times out of 100.
You would think that a 90% winning percentage at home would count as a home ice advantage, but the p-value was not small enough for that. Oh well. Pittsburgh has a chance to lower it tomorrow!
Ever wondered how great your team is offensively based on how much effort they put in? Well, with correlation, you can figure that out! This week explores the correlation between shots and goals in the entire playoffs for both Stanley Cup Teams. This definitely tells you about their differing types of play and how they have played their way to the highest level.
The data looks stronger here because the Penguins are able to score a higher number of goals on a lower number of shots, compared to Nashville. The leading scorer on the Penguins did not even have the highest number of goals. This suggests that Pittsburgh is the stronger team on offense.
The data here looks slightly weaker only because the points are more spread out. Nashville takes a lot of shots in games, especially in this current series where they are taking control of the games. The only problem is that they do not have high goal totals for their shots, which suggests that Nashville is great defensively and able to get the job done in low-scoring games.
Hi, I'm Jenna and I'm a sports fan! I've been avidly watching sports since 2011 because I found that by watching sports, I would be able to communicate with my dad and brother better. Ever since I got into sports, I've been able to enjoy myself more when I go to sporting events with my family.