Audrey Leung, Brian Carlo, Carlos Lasa, Stephanie Snipes
INFO 247 | Information Visualization Final Project
Exploring the menu collection of the New York Public Library
With approximately 17,500 transcribed menus dating from the 1850s to the present, The New York Public Library’s restaurant menu collection is one of the largest in the world, used by historians, chefs, novelists and everyday food enthusiasts.
The menus contain specific information about dishes, their prices, and where each dish appears on the menu page. From this, we can piece together the stories these items tell about how various dishes evolved over time.
Through our exploration of this extensive collection, we learned that the data includes several menus from special occasions such as Thanksgiving, Christmas, and birthdays, and that a single menu item such as 'fried sweet potatoes' can be listed in hundreds of different ways.
As such, we began by cleaning the data and clustering common dishes together to gain a better understanding of overall trends.
In total, here are the numbers of menus, dishes, and years that the NYPL collection contains.
International from US to Germany
Also known as "menu items," including items such as beverages
Though not every year had menus and some years had more menus than others
How can we efficiently use the menu collection?
From 1.3 million dishes spanning menus around the globe, we narrowed the scope of our project to about 90,000 menu items in the U.S. We chose roughly the top 13,000 menus and 90,000 dishes based on "popularity," or how frequently they appeared. Then, we grouped those dishes into 25 clusters based on their common name. Finally, we placed them into food groups analogous to the food pyramid. Below, you can explore the tree map that demonstrates our clustering methodology.
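The narrowing and grouping steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the keyword list, the function names, and the rule of matching on a single shared word are all assumptions made for the example, and the real 25 clusters are not listed here.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional

# Hypothetical keywords standing in for the project's 25 common-name clusters.
CLUSTER_KEYWORDS = ["potatoes", "coffee", "chicken"]


def normalize(name: str) -> str:
    """Lowercase a dish name, drop punctuation, and collapse whitespace."""
    name = name.lower()
    name = re.sub(r"[^a-z\s]", " ", name)
    return re.sub(r"\s+", " ", name).strip()


def assign_cluster(name: str) -> Optional[str]:
    """Return the first cluster keyword found in the dish name, if any."""
    words = normalize(name).split()
    for keyword in CLUSTER_KEYWORDS:
        if keyword in words:
            return keyword
    return None


def top_dishes(names: Iterable[str], n: int) -> List[str]:
    """Rank normalized dish names by how often they appear; keep the top n."""
    counts = Counter(normalize(x) for x in names)
    return [dish for dish, _ in counts.most_common(n)]
```

Under these assumptions, "Fried Sweet Potatoes." and "Sweet potatoes, fried" both land in the hypothetical "potatoes" cluster, which is the kind of variation in listing that the cleaning step has to absorb.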
How did the average prices for each dish cluster change over time?
The graphs below chart the average price of dishes in each cluster (measured as the 50th percentile, disregarding outliers) over time. These prices are contrasted with inflation, as measured by the Consumer Price Index, from 1904 to the present day. In general, dish prices have risen with inflation, but there is volatility during certain periods and variation across the clusters.
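One way to compute this kind of comparison is sketched below. It assumes each record is a hypothetical (cluster, year, price) tuple, takes the median as the 50th percentile, and rescales both the price series and a CPI series to a common base year so they can be drawn on the same axis. The function names and the rescaling step are assumptions for illustration, not a description of how the charts were actually built.

```python
from collections import defaultdict
from statistics import median


def median_price_by_year(records):
    """Map (cluster, year) to the median price of that cluster in that year."""
    grouped = defaultdict(list)
    for cluster, year, price in records:
        grouped[(cluster, year)].append(price)
    return {key: median(prices) for key, prices in grouped.items()}


def index_to_base(series, base_year):
    """Rescale a {year: value} series so that the base year equals 1.0."""
    base = series[base_year]
    return {year: value / base for year, value in series.items()}
```

Using the median means a single unusually expensive listing in a given year does not pull the yearly figure upward the way a mean would. Indexing both series to the same base year shows whether a cluster's prices grew faster or slower than the CPI.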
How did dishes rise and fall on the menu pages themselves?
One exciting feature of the NYPL dataset is that each menu item is tagged with an (X,Y) coordinate indicating its spatial placement on a menu page. As you read a menu from top to bottom, you generally see starters on top, then main courses or special dishes, then sides, and finally desserts and beverages. Vertical position is therefore also a proxy for how special or trendy a dish might be. We wondered whether the Top 25 menu items shifted roles from starter to main course or from main course to after-dinner fare, so we plotted the change in Y-coordinate for each dish over time.
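A rough sketch of this kind of positional analysis follows. It assumes hypothetical (dish, year, y) records in which y is a vertical position on a common scale across menus. The scale, the direction of the axis, and the function names are all assumptions for the example rather than details taken from the dataset.

```python
from collections import defaultdict
from statistics import mean


def mean_y_by_year(records):
    """Map (dish, year) to the average Y-coordinate of that dish in that year."""
    grouped = defaultdict(list)
    for dish, year, y in records:
        grouped[(dish, year)].append(y)
    return {key: mean(ys) for key, ys in grouped.items()}


def y_shift(averages, dish):
    """Difference in average Y between a dish's last and first recorded years."""
    years = sorted(year for d, year in averages if d == dish)
    return averages[(dish, years[-1])] - averages[(dish, years[0])]
```

A nonzero shift suggests a dish moved up or down the page over time. Which direction counts as "toward the top" depends on how the Y axis is oriented in the data.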
Special thanks to the New York Public Library for providing us with an incredible (and fun!) resource to work with over the past month. We would also like to thank scholar Trevor Munoz, whose work with the NYPL dataset helped to inspire our clustering methods. Finally, an extra helping of dessert for Victor Yee, whose guidance during our data crunching was invaluable.