bdill's Curated Datasets

ID | Dataset Name | Rows | KB | File Notes | Google Sheets | Pastebin.com | Self Hosted | Source
1 | US annual inflation rates by year | 96 | 4 | 1929-2022. GDP and Fed Funds rate | Google Sheet | Pastebin - Raw | csv file | thebalance.com
2 | US federal minimum wage by year | 85 | 1 | 1938-2022 | Google Sheet | Pastebin - Raw | csv file | dol.gov
3 | US federal govt parties by year | 166 | 11 | 1857-2022 | Google Sheet | Pastebin - Raw | csv file | house.gov
4 | US Supreme Court roster | 121 | 25 | All SCOTUS justices through Ketanji Brown Jackson | Google Sheet | Pastebin - Raw | csv file | Wikipedia
5 | State level stats | 51 | 4 | # counties, cities, ZIP codes, districts, schools, population | Google Sheet | Pastebin - Raw | csv file | various
6 | Census county population 2010-2020 | 3143 | 356 | Population estimates by county for years 2010-2020 | Google Sheet | Pastebin - Raw | csv file | US Census Bureau
7 | Census race by county 2020 | 3221 | 426 | Race and ethnicity count and % for each county | Google Sheet | Pastebin - Raw | csv file | US Census Bureau
8 | Countries List ISO 3166 | 249 | 9 | List of countries as of 2021 | Google Sheet | Pastebin - Raw | csv file | Wikipedia
9 | US Presidential election by county 2020 | 3155 | 256 | US county level votes for POTUS 2020 | Google Sheet | Pastebin - Raw | csv file | harvard.edu
10 | US Presidential election by state 2020 | 51 | 4 | US state level votes for POTUS 2020 | Google Sheet | Pastebin - Raw | csv file | harvard.edu
11 | World Airport Codes | 57421 | 6087 | List of all airport codes, city, name, lat/long (2019) | Google Sheet | | csv file | datahub.io
12 | Electoral College vote state allotment by year | 52 | 8 | EC vote allotment for each state for all POTUS elections | Google Sheet | Pastebin - Raw | csv file | Wikipedia
13 | GDP by country 1960-2021 | 266 | 127 | GDP (current US$) 1960-2020 | Google Sheet | Pastebin - Raw | csv file | World Bank
14 | US Recessions since 1860 | 33 | 14 | Start/End dates | Google Sheet | Pastebin - Raw | csv file | Wikipedia
15 | Billboard Hot 100 | 334687 | 18083 | 1958-08-04 to 2022-09-24 (too big for Pastebin). Google Sheet has summary data | Google Sheet | | csv file | Billboard Hot 100
16 | US Congresses | 120 | 3 | Start/end dates for each 2 year Congress | Google Sheet | Pastebin - Raw | csv file | Wikipedia
17 | US Congress Bioguide | 51542 | 4762 | Every term served by every Rep & Senator | Google Sheet | | csv file | congress.gov


Motivation

The Internet has so many good datasets out there, but common issues make them cumbersome to use right away. Many tables combine two or more things into a single column, which is OK for a visual but bad for data manipulation. I break each of these items out into its own column so they can be easily sorted and/or filtered.
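As a rough illustration of breaking a combined column apart, here is a minimal Python sketch. The column names (`name`, `years`), the sample rows, and the `split_years` helper are all made up for this example; it is not the actual cleanup process used for the files above.

```python
import csv
import io

# Hypothetical input: one column holds two values ("start-end"), which reads
# fine in a visual table but cannot be sorted or filtered by either part.
RAW = """name,years
Example A,1990-1994
Example B,1985-1989
"""

def split_years(rows):
    """Yield rows with the combined 'years' column broken into two integer columns."""
    for row in rows:
        start, end = row["years"].split("-", 1)
        yield {"name": row["name"], "start_year": int(start), "end_year": int(end)}

rows = list(split_years(csv.DictReader(io.StringIO(RAW))))
rows.sort(key=lambda r: r["start_year"])  # sorting now works on a single field
for r in rows:
    print(r)
```

Once the two values live in their own columns, sorting by start year or filtering by end year is a one-liner in a spreadsheet or a script.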

The US Census Bureau has a vast number of excellent data sets, but accessing them is daunting for a rookie, and there are so many columns/variables that it's hard to know where to start. Meanwhile, Wikipedia has lots of good reference data, but it is often formatted in a way that makes getting it into a spreadsheet cumbersome at best. Over the years I have scrubbed these data sets into clean and concise tables that are accessible to the novice user. Unless otherwise noted, the self-hosted and pastebin.com files are comma-separated (*.csv).
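Because the files are plain comma-separated text, any CSV reader should handle them. The sketch below uses Python's standard `csv` module on an in-memory stand-in; the column names (`year`, `value`), the sample values, and the `load_rows` helper are assumptions for illustration and do not come from any of the datasets listed above.

```python
import csv
import io

# Stand-in for the contents of a downloaded file; the column names and
# values here are made up for illustration, not taken from any dataset above.
SAMPLE_CSV = """year,value
2019,1.5
2020,2.5
2021,3.5
"""

def load_rows(text):
    """Parse comma-separated text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))

rows = load_rows(SAMPLE_CSV)
recent = [r for r in rows if int(r["year"]) >= 2020]  # filter on one column
print(len(rows), len(recent))
```

For a file saved to disk, the same `csv.DictReader` can be given a file object opened with `open(path, newline="")` instead of the in-memory string.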

If you have suggestions, please reply to this announcement tweet.

If you get value from any of these data sets, give me a shout-out on Twitter @bdill or on LinkedIn. If I really saved you some time, you can tip me a few $ on Venmo @wbdill.



Useful Sources

US Census Bureau - Decennial census, ACS survey, population estimates, etc.

World Bank - global data by country

Our World in Data

datahub.io - many datasets from 2019



All The Things (uncurated)

bdill's public data folder on Google Drive (Google Sheets)

bdill's public data files on pastebin.com

bdill's random R scripts on github.com

bdill's random SQL scripts on github.com


Browse some of my photos on YouPic.com