pysketball package
Submodules
pysketball.nba_boxplot module
pysketball.nba_boxplot.nba_boxplot(dataset, stats, position=None, teams=None)
Creates a boxplot with the categorical variable of interest on the y-axis and the stat of interest on the x-axis. Only one of the position or teams arguments may be used as the categorical variable, and the stats argument is required.
Parameters: - dataset (pandas.DataFrame) – the DataFrame returned by nba_scraper, or one read back from a csv file that was saved earlier
- stats (str) – the statistic of interest, for example Points, 3_Pointers or Turnovers
- teams (list) – list of team names to compare, for example ["ORL", "UTAH", "LAC", "MIN", "BOS"]
- position (str) – pass "POS" to compare statistics by position
Returns: - display (altair boxplot) – the boxplot visualization
Examples
>>> import pandas as pd
>>> from pysketball.nba_boxplot import nba_boxplot
>>> d = {"POS": ["C", "FOR", "PO", "FOR", "C"],
...      "Team": ["ORL", "UTAH", "LAC", "MIN", "BOS"],
...      "GP": [3, 5, 5, 2, 1]}
>>> nba_2018 = pd.DataFrame(data=d)
>>> nba_boxplot(nba_2018, position="POS", teams=None, stats="GP")
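The rule that only one of position or teams selects the categorical variable can be sketched as a small check. The helper name pick_category is hypothetical and not part of pysketball, the "Team" column name is taken from the example data, and the sketch assumes that one of the two arguments must be supplied; the real function may validate its arguments differently.

```python
def pick_category(position=None, teams=None):
    """Return the column that would serve as the categorical axis.

    Hypothetical helper, not part of pysketball. It mirrors the documented
    rule that only one of `position` or `teams` may be given.
    """
    if position is not None and teams is not None:
        raise ValueError("pass either position or teams, not both")
    if position is not None:
        return position              # e.g. "POS"
    if teams is not None:
        return "Team"                # column name assumed from the example data
    # Assumption: the documentation does not say what happens when both are None.
    raise ValueError("one of position or teams is required")


print(pick_category(position="POS"))
print(pick_category(teams=["ORL", "BOS"]))
```

Passing both arguments raises a ValueError in this sketch, which makes the conflict visible before any plotting is attempted.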
pysketball.nba_ranking module
pysketball.nba_ranking.nba_ranking(data, column, by, top=10, ascending=True, fun='mean')
Generates a ranking and a visualization based on a column of a dataset.
Parameters: - data (pandas.DataFrame) – the DataFrame from which the ranking is calculated
- column (str) – name of the column to rank
- by (str) – name of the column to rank by
- top (int) – number of elements in the ranking. Default is 10.
- ascending (bool) – True to rank in ascending order, False otherwise. Default is True.
- fun ({'mean', 'sum'}, default = 'mean') – function applied to the ranking variable
Returns: - ranking (pandas.DataFrame) – the ranking table
- display (altair barplot) – a visualization of the ranking
Examples
>>> import pandas as pd
>>> from pysketball import nba_ranking
>>> diction = {'A': [1, 2, 3, 6], 'B': [2, 1, 4, 6], 'C': ["A", "B", "A", "C"]}
>>> data = pd.DataFrame(diction)
>>> nba_ranking(data, 'C', 'B', top=2, ascending=False, fun='mean')
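The ranking step can be approximated with plain pandas. This is a minimal sketch that assumes the function groups the rows by column, aggregates by with fun, sorts the result and keeps the first top rows; the name rank_sketch is hypothetical, the Altair chart is omitted, and the actual implementation may differ.

```python
import pandas as pd


def rank_sketch(data, column, by, top=10, ascending=True, fun="mean"):
    """Aggregate `by` within each value of `column`, sort, keep `top` rows."""
    if fun not in ("mean", "sum"):
        raise ValueError("fun must be 'mean' or 'sum'")
    grouped = data.groupby(column)[by].agg(fun)        # one value per group
    ranked = grouped.sort_values(ascending=ascending)  # order the groups
    return ranked.head(top).reset_index()              # back to a DataFrame


data = pd.DataFrame({"A": [1, 2, 3, 6],
                     "B": [2, 1, 4, 6],
                     "C": ["A", "B", "A", "C"]})
ranking = rank_sketch(data, "C", "B", top=2, ascending=False, fun="mean")
print(ranking)
```

With the example data, group "C" has a mean of 6 and group "A" a mean of 3, so those two groups make up the top 2 in descending order.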
pysketball.nba_scraper module
pysketball.nba_scraper.nba_scraper(season_year, season_type='regular', csv_path=None)
Scrapes NBA data from ESPN and returns a pandas DataFrame. The user can specify the year of the season (2016, 2017, etc.) and the season type (regular or postseason). If csv_path is given, the scraped data is also written to a csv file at that path.
Parameters: - season_year (int) – the year of the NBA season of interest
- season_type (string) – the NBA season type, either "regular" or "postseason". Default is "regular".
- csv_path (string) – path at which to store the scraped csv file, ending with ".csv". Default is None.
Returns: pandas.DataFrame – scraped data in DataFrame format
Examples
>>> from pysketball.nba_scraper import nba_scraper
>>> # Scrape the 2018/19 regular season, return a DataFrame and store it as "nba_2018.csv"
>>> nba_scraper(season_year=2018, season_type="regular", csv_path="nba_2018.csv")
>>>
>>> # Scrape the 2017/18 postseason and return a DataFrame without storing a csv file
>>> nba_scraper(season_year=2017, season_type="postseason", csv_path=None)
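A csv file written through csv_path can be read back later with pandas and passed to the other functions in the package. The sketch below uses a small stand-in DataFrame instead of a real scrape, since scraping needs network access; the column names come from the boxplot example, and writing without an index column is an assumption about how the file is stored.

```python
import os
import tempfile

import pandas as pd

# Stand-in for a scraped table; the real columns come from ESPN.
scraped = pd.DataFrame({"POS": ["C", "PG"],
                        "Team": ["ORL", "BOS"],
                        "GP": [3, 1]})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "nba_2018.csv")
    scraped.to_csv(csv_path, index=False)  # assumption: no index column is written
    reloaded = pd.read_csv(csv_path)       # DataFrame ready for the other functions

print(reloaded)
```

If the stored file does contain an index column, it would show up as an extra unnamed column after reloading and may need to be dropped.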
pysketball.nba_team_stats module
pysketball.nba_team_stats.nba_team_stats(nba_data, stats_filter=None, teams_filter=None, positions_filter=None)
Generate summary stats for NBA players.
The function provides descriptive team statistics of NBA data. Users can specify the statistics of interest (3PA, 3PM, etc.) along with the teams of interest (GS, HOU, etc.). If positions of interest (C, PG, etc.) are specified, the returned dictionary holds descriptive statistics for those positions within the selected teams.
Parameters: - nba_data (pandas.DataFrame) – a pandas DataFrame with overall statistics for a particular NBA season
- stats_filter (list) – a list of column names for which summary stats are required
- teams_filter (list) – a list of team names for which summary stats are required
- positions_filter (list) – a list of positions for which summary stats are required
Returns: dict – the summarised stats, keyed by the names in stats_filter; each value is a pandas.DataFrame holding the summary stats for that stat
Return type: dict of str to pandas.DataFrame
Examples
>>> from pysketball import nba_team_stats
>>> nba_team_stats.nba_team_stats(nba_data, stats_filter=['GP', '3PM', 'FT%'])
>>> nba_team_stats.nba_team_stats(nba_data, stats_filter=['GP', '3PM', 'FT%'], teams_filter=['UTAH', 'PHX', 'DET'])
>>> nba_team_stats.nba_team_stats(nba_data, stats_filter=['GP', '3PM', 'FT%'], teams_filter=['UTAH', 'PHX', 'DET'], positions_filter=['C', 'PG'])
{'GP':            count  mean       std   min    25%   50%    75%   max
Team POS
DET  C       2.0  73.5  7.778175  68.0  70.75  73.5  76.25  79.0
     PG      1.0  82.0       NaN  82.0  82.00  82.0  82.00  82.0
PHX  C       1.0  71.0       NaN  71.0  71.00  71.0  71.00  71.0
UTAH C       1.0  81.0       NaN  81.0  81.00  81.0  81.00  81.0
     PG      1.0  68.0       NaN  68.0  68.00  68.0  68.00  68.0,
 '3PM':           count  mean       std  min    25%   50%    75%  max
Team POS
DET  C       2.0  0.05  0.070711  0.0  0.025  0.05  0.075  0.1
     PG      1.0  2.10       NaN  2.1  2.100  2.10  2.100  2.1
PHX  C       1.0  0.00       NaN  0.0  0.000  0.00  0.000  0.0
UTAH C       1.0  0.00       NaN  0.0  0.000  0.00  0.000  0.0
     PG      1.0  1.20       NaN  1.2  1.200  1.20  1.200  1.2,
 'FT%':           count  mean       std   min   25%   50%   75%   max
Team POS
DET  C       2.0  68.6  13.57645  59.0  63.8  68.6  73.4  78.2
     PG      1.0  86.4       NaN  86.4  86.4  86.4  86.4  86.4
PHX  C       1.0  74.6       NaN  74.6  74.6  74.6  74.6  74.6
UTAH C       1.0  63.6       NaN  63.6  63.6  63.6  63.6  63.6
     PG      1.0  85.5       NaN  85.5  85.5  85.5  85.5  85.5}
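The shape of the output above (count, mean, std, min, quartiles and max per team and position) matches what pandas produces with groupby followed by describe. The sketch below reproduces that shape on a small toy table under that assumption; team_stats_sketch is a hypothetical name, the values are illustrative rather than real NBA data, and the handling of stats_filter=None is left out because the documentation does not describe it.

```python
import pandas as pd

# Toy table for illustration only; these are not real NBA figures.
nba_data = pd.DataFrame({
    "Team": ["DET", "DET", "PHX", "UTAH"],
    "POS": ["C", "PG", "C", "PG"],
    "GP": [68, 82, 71, 68],
    "3PM": [0.0, 2.1, 0.0, 1.2],
})


def team_stats_sketch(data, stats_filter, teams_filter=None, positions_filter=None):
    """Return {stat: describe() table}, grouped by team and, if given, position."""
    if teams_filter is not None:
        data = data[data["Team"].isin(teams_filter)]
    keys = ["Team"]
    if positions_filter is not None:
        data = data[data["POS"].isin(positions_filter)]
        keys.append("POS")
    # One summary DataFrame per requested statistic.
    return {stat: data.groupby(keys)[stat].describe() for stat in stats_filter}


summary = team_stats_sketch(nba_data, ["GP", "3PM"],
                            teams_filter=["DET", "UTAH"],
                            positions_filter=["C", "PG"])
print(summary["GP"])
```

Groups that contain a single player report NaN for std, which is why NaN appears in the std column of the documented output.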