Is there anything better than a passion project?! I often get tossed random data projects from my older brother since he knows I’m a nerd and data is in fact my hobby. These projects often stem from a harebrained idea to get rich quick or to make more money in some side hustle, but they follow much the same process a consultant would go through to start up a project at a client: requirements gathering, timeline decisions, and the actual development of technical deliverables.
The most recent of his side hustles that he sent my way involves Pokémon cards. These little cards are near and dear to my heart from our childhood when my brother, neighbor, and I would sit on top of the transformer box in between our yards and play Pokémon, Yu-Gi-Oh, and later Magic: The Gathering. With that background in mind, you bet my answer to his project request was “heck yes.”
When starting a hobby data project, I use the same general steps I would use for a client project: do the leg work to determine how I will complete the work, how long it will take, and what the end user’s ultimate goal or use case is.
Do the leg work
Identify which methods, sources, and technologies will help me deliver the best possible end product to the end user.
The leg work for this project was determining whether I would have to use a web-scraping Python script or whether there was an API available for all the Pokémon pricing information. I found that the site TCGPlayer.com was a great option, as it carries so many different cards and has an API you can sign up for! I liked that it covers different kinds of cards, so if my user moved on to another kind of playing card, like baseball cards, I would already have a good framework to build on. Looking at the API, I determined all the calls I would have to make to join the data together for the necessary information. For this project, we wanted the prices, card image URLs, high and low sale prices, and all cards in the Evolving Skies group. I also noticed that one of the calls was paginated, meaning the products are broken up across multiple pages. My best non-technical way to think of this is as a regular book: if the book holds 100 words per page and has 355 words total, you need 4 pages, with the first 3 holding 100 words each and the last holding 55.
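To make the pagination math concrete, here is a quick sketch; the variable names are mine, purely for illustration:

```python
import math

TOTAL_ITEMS = 355  # total "words in the book" (or products returned by the API)
PAGE_LIMIT = 100   # "words per page" (or results per call)

num_pages = math.ceil(TOTAL_ITEMS / PAGE_LIMIT)             # 4 pages
offsets = [page * PAGE_LIMIT for page in range(num_pages)]  # [0, 100, 200, 300]
# The first three calls return 100 results each; the fourth returns the remaining 55.
```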
Next was understanding how my user would utilize this data. My options were reading directly from a Snowflake table I could create in my own personal Snowflake account, or creating a Power BI or Tableau visual. We settled on the Tableau visual because he does not come from a technical background (though with the new dashboard additions in Snowsight, that is the route I would take today). I also had to figure out a way to refresh the data source in Tableau, so I settled on Google Sheets, since Tableau Public can connect to that data source and update itself automatically!
Leg work: ✅. Let’s dive into the code a bit.
Import code packages
I’d like to start this code breakdown with a disclaimer: I am by no means a Python wizard like some individuals I have had the pleasure of working with, so I realize there are certainly places I could improve my code. My first step was importing my packages, as every good coder does. Note: I was playing around with Snowflake’s Python connector at the time, which is why that import is still in my list. Items prefixed with “gs” are for the Google Sheets output process.
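For reference, here is a minimal sketch of what that import list might look like, inferred from the description above rather than copied from my actual script:

```python
import requests             # calling the TCGplayer API
import pandas as pd         # building and joining data frames
import gspread              # the "gs" piece: writing output to Google Sheets
import snowflake.connector  # leftover from my Snowflake experiments
```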
Set variables
Next, I set my variables with my API ID and my public and private keys. Although I did not do it for this project, I recommend a more secure approach: storing these in a secrets file and referencing that file to pass them in. From there, I start working with my API connection and making the calls needed to compile my information.
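Here is a hedged sketch of that setup, assuming TCGplayer’s client-credentials token flow (verify the endpoint against the current API docs); the environment variable names are hypothetical:

```python
import os

import requests

# Keys pulled from environment variables rather than hardcoded in the script,
# in the spirit of the more secure approach mentioned above.
PUBLIC_KEY = os.environ["TCG_PUBLIC_KEY"]
PRIVATE_KEY = os.environ["TCG_PRIVATE_KEY"]

# TCGplayer's client-credentials flow exchanges the keys for a bearer token.
token_resp = requests.post(
    "https://api.tcgplayer.com/token",
    data={
        "grant_type": "client_credentials",
        "client_id": PUBLIC_KEY,
        "client_secret": PRIVATE_KEY,
    },
)
token_resp.raise_for_status()
HEADERS = {"Authorization": f"Bearer {token_resp.json()['access_token']}"}
```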
Locate all needed objects
Now we’re onto the meat and potatoes of my code. This is where I take a multi-step approach to find all the objects I want for my reporting. I start by calling for the entire product listing, which gives me all the cards under the group ID for the Evolving Skies card group. I hacked this a little by finding the groupID beforehand so I could simply hardcode the ID into my call. The neat thing here, and the part that flexed my Python skills a bit, was the pagination, with the offset defined by the range and limit values. I then loop through the products (cards) to find the pricing information per product. I left all my comments and work visible in this post so you can see how messy development can be and that there are different pathways you can attempt to reach your result, which is more than okay! The best way to learn is to fumble around sometimes. In an actual production instance, I recommend cleaning your code and adding meaningful comments so others can track what is going on and troubleshoot where needed.
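Here is a rough sketch of that multi-step approach, continuing from the token setup above. The endpoint paths, the totalItems and productId field names, and the group ID value are assumptions based on my reading of the TCGplayer catalog and pricing docs, so double-check them before relying on this:

```python
GROUP_ID = 2906  # Evolving Skies group ID, found beforehand (value is illustrative)
LIMIT = 100      # products returned per page

# One small call up front to learn how many products the group has in total.
first = requests.get(
    "https://api.tcgplayer.com/catalog/products",
    headers=HEADERS,
    params={"groupId": GROUP_ID, "offset": 0, "limit": 1},
)
first.raise_for_status()
total_items = first.json()["totalItems"]

# Page through the full product listing: offset steps defined by range and limit.
products = []
for offset in range(0, total_items, LIMIT):
    resp = requests.get(
        "https://api.tcgplayer.com/catalog/products",
        headers=HEADERS,
        params={"groupId": GROUP_ID, "offset": offset, "limit": LIMIT},
    )
    resp.raise_for_status()
    products.extend(resp.json()["results"])

# Loop through the products (cards) to pull the pricing information per product.
prices = []
for product in products:
    resp = requests.get(
        f"https://api.tcgplayer.com/pricing/product/{product['productId']}",
        headers=HEADERS,
    )
    resp.raise_for_status()
    prices.extend(resp.json()["results"])
```

If the pricing endpoint accepts a comma-separated list of product IDs, batching those calls would cut the request count considerably.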
Write to data frames
The final step in my code was to write to my data frames, join my products and pricing to create one table, and then write out to my Google Sheet. I used the append method so I can track pricing daily.
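A sketch of that final step, assuming the products and prices lists from the previous sketch; the credentials filename and sheet name are placeholders:

```python
from datetime import date

products_df = pd.DataFrame(products)
prices_df = pd.DataFrame(prices)

# Join products to pricing on productId to create one table.
combined = products_df.merge(prices_df, on="productId", how="inner")
combined["price_date"] = date.today().isoformat()  # stamp rows for daily tracking

# Append today's rows to the Google Sheet that feeds the Tableau Public viz.
gc = gspread.service_account(filename="gs_credentials.json")  # hypothetical key file
worksheet = gc.open("Pokemon Card Prices").sheet1             # hypothetical sheet name
worksheet.append_rows(combined.astype(str).values.tolist())
```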
Create Tableau visual
With my code created and scheduled using Task Scheduler, I created my basic Tableau visual, which can be seen on my Tableau Public page or in the image below. The most interesting thing I found in my visual was that the card image shown changes based on the card you select. So, for example, I can view Altaria by clicking on its name in the table next to the image!
I hope this nerdy passion project gave you some ideas for your own side projects, as well as a good framework for your professional projects! I always recommend to newcomers that if they’re looking to learn a tool or process better, they should explore and hack around in it (which you can see was the case for me in my Python comments). Feel free to reach out to me on LinkedIn with any code suggestions, as I’m continually learning and evolving my methods!