SAVED data framework
A Story
Scene: a Zoom call, early in the day for Mary, late for Steve. Mary has a blurred background; Steve has chosen the Golden Gate Bridge, though he lives in Tulsa, OK.
Mary : “Steve, did you save the data?”
Steve : “Sure, yeah, I saved it.”
Mary : “Hmmmm…. did you save it?”
Steve : “Um, yeah, it’s on disk.”
Mary : “If I wanted to recreate this entire analysis, could I do it from your documentation?”
Steve : “Uh… no.”
Mary : “I’m having thoughts right now.”
Steve : “But what can I do?”
Background
When you have a complicated analysis, or even a simple pipeline, it helps to have a clean way of knowing which basic elements are needed to make it run. Like a pilot’s preflight checklist, this helps you make sure your data is truly saved before you move on.
Goals
The SAVED framework gives a quick way to make sure you’re ready to go. Filling out all the relevant information will ensure that other users (including yourself in the future) will praise rather than vilify you.
Framework
S - Source
Where is the data coming from? Be specific and include any relevant filters or caveats. If there are multiple sources, give them short, simple names, but describe each one as precisely as possible.
A - Analysis
What did you do with the data? Did you summarize it? Perform some sort of data science thing?
V - Validate
Nothing can ruin a good day like finding out the data you depend on was misunderstood or poorly assembled upstream. Go on the defensive and test the data before shipping it to your friends and neighbors.
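To make “test the data” concrete, here is a minimal sketch of a defensive check in plain Python. The function name `validate_rows`, the column names, and the sample rows are all made up for illustration; the two checks shown (missing values and duplicate keys) are only examples of the kind of test you might run, not a complete list.

```python
def validate_rows(rows, required_columns, key_column):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    if not rows:
        return ["dataset is empty"]
    problems = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # A required column counts as missing if it is absent, None, or an empty string.
        missing = [c for c in required_columns if row.get(c) in (None, "")]
        if missing:
            problems.append(f"row {i}: missing {', '.join(missing)}")
        # The key column is assumed to identify a row uniquely.
        key = row.get(key_column)
        if key in seen_keys:
            problems.append(f"row {i}: duplicate {key_column} {key!r}")
        seen_keys.add(key)
    return problems


# Hypothetical sample data with one missing amount and one repeated order_id.
rows = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 5.00},
]
problems = validate_rows(rows, required_columns=["order_id", "amount"], key_column="order_id")
for p in problems:
    print(p)
```

Writing the checks as a function that returns a list of problems, rather than stopping at the first failure, means the full list can be pasted straight into your notes for the Validate step.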
E - Export
Magic has happened - you’ve found something amazing that you want to share with the world. Well? What is it? A table? A chart? A Zip file? Where can I find it in the future?
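One way to answer “What is it?” and “Where can I find it?” at the moment of export is to write a small manifest file next to the output. The sketch below assumes a CSV export; the function name `export_with_manifest`, the manifest fields, and the sample row are hypothetical choices for illustration, not a prescribed format.

```python
import csv
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def export_with_manifest(rows, fieldnames, out_dir, name):
    """Write rows to <name>.csv plus a <name>.manifest.json describing the file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    data_path = out_dir / f"{name}.csv"
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    manifest = {
        "file": data_path.name,
        "format": "csv",
        "rows": len(rows),
        # The checksum lets a future reader confirm the file has not changed.
        "sha256": hashlib.sha256(data_path.read_bytes()).hexdigest(),
        "exported_at": datetime.now(timezone.utc).isoformat(),
    }
    manifest_path = out_dir / f"{name}.manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return manifest


# Usage with a throwaway directory and a made-up summary row.
with tempfile.TemporaryDirectory() as tmp:
    manifest = export_with_manifest(
        [{"region": "west", "total": 120}], ["region", "total"], tmp, "sales_summary"
    )
    print(manifest["file"], manifest["rows"])
```

In real use you would point `out_dir` at wherever your exports live, and that location is what belongs in the Export section.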
D - Documentation
All of the above is prep for your documentation. You do documentation, right? And you do it as you go? Sure! This makes it easier.
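If you like to keep the five answers next to your code, the framework can be sketched as a small record that reports which sections are still blank. This is only one possible shape: the class name `SavedRecord`, its methods, and every example value below are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class SavedRecord:
    source: str
    analysis: str
    validate: str
    export: str
    documentation: str

    def missing(self):
        """Names of the sections that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self):
        """Plain-text summary with one headed section per letter."""
        lines = []
        for f in fields(self):
            lines.append(f"{f.name[0].upper()} - {f.name.capitalize()}")
            lines.append(getattr(self, f.name).strip() or "(not filled in)")
        return "\n".join(lines)


# Made-up answers; Export has been left blank on purpose.
record = SavedRecord(
    source="orders table, test accounts excluded",
    analysis="monthly revenue totals by region",
    validate="checked for duplicate order IDs and missing amounts",
    export="",
    documentation="README in the project folder",
)
print(record.missing())
print(record.render())
```

Checking `missing()` before you share results is a cheap way to notice that a section was skipped.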
Conclusion
Years ago, it was common to lose data if you didn’t save regularly. It was normal to hit the Save button before you went to the bathroom, when you finished for the evening, or after a few minutes of working on a document. Things happened, and you wanted to be ready.
Then Word and other tools introduced Auto Save - the (now) obvious idea was that it was no longer expensive for the software to hit the save button for you. Sure, bad things still happened, but statistically you were less likely to lose something important.
With the migration to the cloud, the save button has all but disappeared. Type a letter, make an alteration, paste a picture - your file is saved continuously. Have problems with your computer? Not a problem: everything up to the last second was saved.
The loss of the save button has removed an important concept that’s relevant in the world of data analysis and engineering. You need to be able to show your work.
By following the framework, you can be sure to retain the key elements of your work, making it easier to understand your flow and saving your bacon in the future.
You’re making better sense of the world with key data - be ready to show how you got there. Use SAVED.