bdill's Curated Datasets



Motivation

The Internet has so many good datasets out there, but common issues make them cumbersome to use right away. Many tables combine two or more things into a single column, which is OK for a visual but bad for data manipulation. I break each of these items out into its own column so they can be easily sorted and/or filtered.
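To illustrate the kind of cleanup described above, here is a minimal Python sketch of breaking a combined column into separate ones. It is not the actual process used for the files on this page. The "Location" column, the sample rows, and the population figures are all made up for illustration, and the sketch assumes the state always follows the last comma.

```python
import csv
import io

# Made-up sample data: "Location" packs city and state into one column.
# The population figures are placeholders, not real statistics.
raw = (
    "Location,Population\n"
    '"Springfield, IL",1000\n'
    '"Dallas, TX",3000\n'
    '"Austin, TX",2000\n'
)


def split_location(rows):
    """Yield rows with the combined Location column split into City and State."""
    for row in rows:
        # Split on the last comma only, so a city name containing a comma survives.
        city, state = (part.strip() for part in row["Location"].rsplit(",", 1))
        yield {"City": city, "State": state, "Population": int(row["Population"])}


tidy = list(split_location(csv.DictReader(io.StringIO(raw))))

# With separate columns, sorting and filtering become one-liners.
tidy.sort(key=lambda r: (r["State"], r["City"]))
texas_only = [r for r in tidy if r["State"] == "TX"]

# Write the cleaned table back out as comma-separated text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["City", "State", "Population"])
writer.writeheader()
writer.writerows(tidy)
print(out.getvalue())
```

Once city and state live in their own columns, sorting by state or filtering to a single state no longer requires parsing text on every query.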

The US Census Bureau has vast amounts of excellent data, but accessing it is daunting for a rookie, and there are so many columns/variables that it's hard to know where to start. Meanwhile, Wikipedia has lots of good reference data, but it is often formatted in a way that makes getting it into a spreadsheet cumbersome at best. Over the years I have scrubbed these data sets into clean and concise tables that are accessible to the novice user. Unless otherwise noted, the self-hosted and pastebin.com files are comma-separated (*.csv).

If you have suggestions, please reply to this announcement tweet.

If you get value from any of these data sets, give me a shout-out on Twitter @bdill or on LinkedIn. If I really saved you some time, you can tip me a few $ on Venmo @wbdill.



Useful Sources

US Census Bureau - Decennial census, ACS survey, population estimates, etc.

World Bank - global data by country

Our World in Data

datahub.io - many datasets from 2019



All The Things (uncurated)

bdill's public data folder on Google Drive (Google Sheets)

bdill's public data files on pastebin.com

bdill's random R scripts on github.com

bdill's random SQL scripts on github.com


Browse some of my photos on YouPic.com