Benefits of Skepticism: Big Data

 

Big data is the future of design. Big data is the future of marketing. Big data is empirical. Big data is going to make up for the fallibility of the human mind.

Though there are a lot of potential applications for utilizing this data, it is important to look at it for what it is: a big pile of information we’re still trying to figure out how to sort.

Big data is the term for large data sets, “typically consisting of billions or trillions of records, that are so vast and complex that they require new and powerful computational resources to process.”[1] Often, this data is accumulated through computational processes: algorithms, machine learning, etc.

It feels significant because it is an enormous amount of information that seems to sidestep the need for research methods and the design of a research study. If you can just access and process the data, you have your research right there.

There are, however, many problems with this usage of big data, and with our utopian view of it. Big data, much like other kinds of data sets, can be incredibly biased. It can be misinterpreted. Its quality varies. It is not as sound or as reliable as we would like it to be.

 

There are four things we need to consider when talking about big data:

 

1. The cum hoc ergo propter hoc fallacy

Latin for “with this, therefore because of this.” The phrase names a logical fallacy about correlation. If two variables are correlated, we are often tempted to assume that one caused the other. The vast majority of assumptions made using big data are based on correlation: that one thing causes the other, or is in some way related to the other.

For example:

“A big data analysis might reveal, for instance, that from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two. Likewise, from 1998 to 2007 the number of new cases of autism diagnosed was extremely well correlated with sales of organic food (both went up sharply), but identifying the correlation won’t by itself tell us whether diet has anything to do with autism.”[1]

Just because two variables are correlated does not necessarily mean one caused the other. Though that is the case some of the time, it is important to understand that it is not the case all of the time.

We used Google Correlate, a tool built on the method behind Google Flu Trends, and looked up a few random words. Google Correlate “finds search patterns which correspond with real-world trends,” according to its site. Out of curiosity, we looked up “robots,” which correlates with a variety of things, but our favorite is the phrase “being a girl.”

So, at some point in 2005, a bunch of people were Googling the phrase “robots” and the phrase “being a girl.” If we were to make a cum hoc ergo propter hoc assumption about these variables, we would say that there is something about robots that is like being a girl. Or that being a girl caused us to think about robots.

It’s important to look beyond your data to add context. In early March 2005, the wonderfully whimsical children’s movie (starring the voices of Ewan McGregor, Mel Brooks, and Robin Williams) Robots hit theaters. Also, around the same time, GAP aired a commercial featuring Sarah Jessica Parker, singing a song called “I Enjoy Being a Girl.”

It’s a silly example, but it is something to consider. We can’t assume that correlation is causality, especially not without research and context outside of the dataset.
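
To see how little it takes to produce a strong correlation, here is a minimal Python sketch. The two series are made up for illustration (they are not the figures from the examples above), and the helper names pearson and trending_series are our own. Each series is just a downward trend plus its own random noise; nothing connects them.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def trending_series(start, step, noise, length, rng):
    """A straight-line trend plus independent random noise (hypothetical data)."""
    return [start + step * i + rng.uniform(-noise, noise) for i in range(length)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Two invented yearly series that share nothing except a downward drift.
series_a = trending_series(start=100.0, step=-8.0, noise=3.0, length=6, rng=rng)
series_b = trending_series(start=60.0, step=-5.0, noise=2.0, length=6, rng=rng)

r = pearson(series_a, series_b)
print(f"correlation between two unrelated series: {r:.2f}")
```

Any two quantities that drift in the same direction over the same period will score close to 1 here, which is why a high correlation on its own says nothing about cause.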

 

2. Recency Bias

If 90% of the world’s data was created in the last few years, we have an inherent recency bias in our data.

Recency bias is “the tendency to assume that future events will closely resemble recent experience. It’s a version of what is also known as the availability heuristic: the tendency to base your thinking disproportionately on whatever comes most easily to mind. It’s also a universal psychological attribute.”

The present moment is always the largest dataset, having a greater influence on our research than anything in the past. Thus, if we’re looking at big data for something predictive, something to tell us how things will be in the future, we need to know what is significant in our present data. We need to wash away what isn’t significant. We also need to include the past. We cannot determine our future based on what has happened in the last couple of years alone.
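
A small arithmetic sketch in Python shows how this plays out. The yearly record counts and values below are invented for illustration; the only assumption is that the volume of data doubles each year, so the most recent year outweighs everything before it.

```python
# Hypothetical data: how many records were collected each year,
# and the average value observed in that year's records.
records_per_year = {2010: 100, 2011: 200, 2012: 400, 2013: 800, 2014: 1600}
mean_value_per_year = {2010: 10.0, 2011: 10.0, 2012: 10.0, 2013: 10.0, 2014: 20.0}

total_records = sum(records_per_year.values())

# Pooling every record together weights each year by its volume.
pooled_mean = sum(
    records_per_year[year] * mean_value_per_year[year] for year in records_per_year
) / total_records

# Averaging the yearly means gives each year an equal say.
yearly_mean = sum(mean_value_per_year.values()) / len(mean_value_per_year)

# Share of all records that come from the latest year alone.
latest_share = records_per_year[2014] / total_records

print(f"pooled mean: {pooled_mean:.2f}")
print(f"mean of yearly means: {yearly_mean:.2f}")
print(f"share of records from 2014: {latest_share:.0%}")
```

Four of the five years agree on a value of 10, yet the pooled figure lands much closer to the single most recent year, because that year supplies more than half of the records.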

 

3. Confirmation Bias

Another very human psychological attribute that affects our data is confirmation bias. Confirmation bias is the “seeking or interpreting of evidence in ways that are partial to existing beliefs, expectations, or a hypothesis in hand.” This, much like recency bias, is a universal psychological characteristic. This is something everyone does, whether they are aware of it or not.

“Once we have formed a view, we embrace information that confirms that view while ignoring, or rejecting, information that casts doubt on it. Confirmation bias suggests that we don’t perceive circumstances objectively. We pick out those bits of data that make us feel good because they confirm our prejudices. Thus, we may become prisoners of our assumptions.”

The issue here is that we are coming to the data with questions. Because big data is far too large for it to yield one result, like a designed research study might, we approach the data with a question. That question, presumably, has an answer. We, as people, have an assumption of what that answer is going to be, and tend to look for data that confirms our assumptions.

This is just a truth of human psychology. We all naturally create linkages between the things we want to believe and what evidence exists that would confirm those beliefs. However, our general inability to critically think about data, especially when it’s giving us the answer we want, becomes problematic.

 

4. Data Quality

Data, in the past, was a result of research. Now, the majority of our data comes from private companies who are collecting it without a designed study or a specific goal. It is simply being dug up and piled somewhere. Because of this, it’s hard to tell what data we’re missing. We don’t have a good sense of what we have, let alone what the gaps in the information are.

Research is designed for a reason: to work toward an empirical and well-rounded set of data, to know where it comes from, to be aware of its accuracies and faults. When we use random data, we don’t account for what we’re missing or for what its faults are.

For example:

“…consider the Twitter data generated by Hurricane Sandy, more than 20 million tweets between October 27 and November 1. A fascinating study combining Sandy-related Twitter and Foursquare data produced some expected findings (grocery shopping peaks the night before the storm) and some surprising ones (nightlife picked up the day after — presumably when cabin fever strikes). But these data don’t represent the whole picture. The greatest number of tweets about Sandy came from Manhattan. This makes sense given the city’s high level of smartphone ownership and Twitter use, but it creates the illusion that Manhattan was the hub of the disaster. Very few messages originated from more severely affected locations, such as Breezy Point, Coney Island and Rockaway. As extended power blackouts drained batteries and limited cellular access, even fewer tweets came from the worst hit areas. In fact, there was much more going on outside the privileged, urban experience of Sandy that Twitter data failed to convey, especially in aggregate. We can think of this as a “signal problem”: Data are assumed to accurately reflect the social world, but there are significant gaps, with little or no signal coming from particular communities.” See more here.

Big data is not sorted through or thought about critically. Because it’s simply gathered and stockpiled, it’s full of holes, inaccuracies, and misleading correlations. We must learn to read and scrutinize our data thoroughly. It is a form of literacy we have not developed because our society has an overarching belief that computation is somehow beyond human fallibility. We forget that the data is curated by algorithms we wrote, and is made up of our own information.

 

Overall, this is not to say big data isn’t a valuable resource. The potential for its application in a variety of fields is significant. We do, however, need to develop these literacies. We need to be skeptical about our data, where it comes from, and what it’s telling us.

– – – – –

[1] Gary Marcus & Ernest Davis, The New York Times (correlation examples quoted in section 1).

[1] Dictionary.com (definition of “big data” quoted in the introduction).



Project Highlight: Orsden Web Experience

Orsden is a direct-to-consumer ski apparel startup, aiming to bring style and performance back to the slopes. Sara Segall, a former marketer at Revlon, founded Orsden. Their apparel has been featured in Sports Illustrated, and the company recently wrapped up a Kickstarter campaign to expand its collection.

Orsden’s brand was designed by Anton Anger. The logo features a bear silhouette, which calls out to the meaning of the brand’s name. Orsden comes from the French for snow bear: ours de neige.

 

We came into the project to design and develop a web experience for Orsden. We needed to create a site that aesthetically balances the quality of their skiwear with the affordability of the direct-to-consumer model. This is a tension that a lot of companies have to work around (Warby Parker, Tuft & Needle), and the key is to create a unique experience that is informative for the user. Because they were only selling two products at the time, we created an experience that felt full while leaving room to grow.

Orsden positions itself as a lifestyle brand: its apparel is not just for the slopes but also for après. The photography, custom shot in Chile, favors lifestyle imagery over the industry standard of aggressive sports photography. We also developed a snow tracker that allows the user to see where it is snowing around the world. Jon Arvizu created a custom illustration of a snow bear for the About page.

Originally, we considered building the website on Shopify. However, we soon realized Shopify wouldn’t allow us the flexibility we needed for custom product pages and a blog. We used Craft to build the custom pages and integrated Shopify for the cart and inventory management. Craft allowed for a tailored experience that is simple to use and fast for the user. We also optimized the high-resolution images and the code for speed. The site is responsive, and thus accessible from all devices.

We wrote the copy for the site, focusing on the lifestyle aesthetic here as well. The main headline “For Those Who Don’t Hibernate” ties together the ours de neige imagery while exciting the user with some initial energy. The copy combines the technical and the experiential. It teaches the user about the research and development of the apparel while celebrating the Alpine lifestyle.



Project Highlight: Properties by JADA

As a husband-wife-son operation, JADA renovates historic homes to create unique living spaces for people interested in Phoenix’s central corridor.

We developed a brand and web experience for JADA, creating a cohesive aesthetic that represents their style of renovation.

Because they restore historic homes in Phoenix, we needed to create a brand that has a historic appearance while still catering to a contemporary audience. Most of Phoenix’s historic districts hold homes built between 1915 and 1950. We created a brand that featured early mid-century traits. Their logo is built with shapes and geometric forms that show the letters of the brand name. The mark is contained and easily expanded to other applications.

Their color palette is strong and modern without being overtly masculine. The photography and colors blend the feminine aspects of the brand with the masculine, creating an aesthetic that is accessible for everyone.

The web experience is simple. They are a relatively straightforward company with a simple goal: creating homes they would want to live in. Their website is mimetic of this model. It is clean and straightforward, offering the information you need to know about them up front. Unlike many other companies in their industry, their web experience is not overtly elegant or hyper-modern. It is simple and content-forward.

We developed the site on Craft, which allows for a clean development process. It also provides an easy control system for JADA to access and adjust the content on their site in the future.

The photography is a critical piece of JADA’s web experience. Because what they do is so visual, their potential consumers want to see the homes they restore. The photography defines their aesthetic and the work they put into the homes they restore.



Project Highlight: My Birthday Playlist

My Birthday Playlist collects the #1 Billboard hits from your birthday every year since you were born. It populates a timeline of all the songs and allows you to export your playlist to Spotify.

We developed this little web application as a holiday project after returning to the studio the first week of January. The idea was to develop something that fostered a sense of nostalgia, something that would encapsulate your life in a simple playlist.


Development:

It seems simple, but logistically, creating My Birthday Playlist proved more complicated than we had anticipated. First, Billboard doesn’t offer an API, so we had to use a service called Apifier to crawl their site. This provided the data set we needed, but it wasn’t formatted in a usable way. We wrote a Ruby script that took in a file, parsed it, and put it into a meaningful format, which was then put through a JS script that grouped the tracks by year. So, we had the data (finally), but then we had to work through it.
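
The parse-then-group step can be sketched roughly as follows. This is not the original Ruby and JS code, which isn’t shown here; it is a single Python stand-in for both steps. The CSV layout, the column names, the placeholder songs, and the function names parse_rows and group_by_year are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical crawl output: one row per chart week with the #1 song.
# The real crawl format is not described, so this layout is an assumption.
RAW = """chart_date,title,artist
1990-03-17,Song A,Artist A
1991-03-16,Song B,Artist B
1991-03-23,Song C,Artist C
"""


def parse_rows(text):
    """Turn raw CSV text into a list of track dictionaries with real dates."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "chart_date": date.fromisoformat(row["chart_date"]),
            "title": row["title"].strip(),
            "artist": row["artist"].strip(),
        }
        for row in reader
    ]


def group_by_year(tracks):
    """Group parsed tracks by the year of their chart date."""
    grouped = defaultdict(list)
    for track in tracks:
        grouped[track["chart_date"].year].append(track)
    return dict(grouped)


tracks = parse_rows(RAW)
by_year = group_by_year(tracks)

for year in sorted(by_year):
    titles = ", ".join(track["title"] for track in by_year[year])
    print(year, titles)
```

Grouping by year up front means that looking up the songs for a given birthday only has to search within one year’s entries at a time.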

The lifeblood of the app was the ability to integrate the playlist with Spotify. We created a JavaScript function that looks through each song and queries Spotify’s API. Because Spotify’s API is amazing, it was easy to set up the authentication feature.
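
As a rough illustration of the per-song lookup, here is a Python sketch (not the JavaScript function described above) that builds one search request URL per song against the Spotify Web API search endpoint. The exact query the app sent is not shown in this post, so the query shape, the placeholder songs, and the helper name build_search_url are assumptions. The sketch only builds URLs; actually sending them requires an access token obtained through Spotify’s authorization flow.

```python
from urllib.parse import urlencode

# Spotify Web API search endpoint.
SEARCH_ENDPOINT = "https://api.spotify.com/v1/search"


def build_search_url(title, artist):
    """Build a search URL asking for the single best track match."""
    # The track: and artist: field filters narrow the search to one song.
    query = f"track:{title} artist:{artist}"
    params = {"q": query, "type": "track", "limit": 1}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Placeholder songs standing in for the grouped chart data.
songs = [
    {"title": "Song A", "artist": "Artist A"},
    {"title": "Song B", "artist": "Artist B"},
]

urls = [build_search_url(song["title"], song["artist"]) for song in songs]

for url in urls:
    print(url)
```

Asking for a single result per song keeps the responses small; a song that returns no match would need to be skipped or retried with a looser query.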


Design:

We designed a minimalistic interface that would not only be simple to use but also focus its energy on the content. With simple text and functionality, we used the bright colors as the significant visual element of the site. We created an original experience through the type, livening up a relatively straightforward input field.

The visual aesthetic, influenced by Spotify’s use of bright colors and gradients, emphasizes the Spotify playlist integration.



An Interview with David Hildreth and Silas Kyler


David Hildreth and Silas Kyler are the creators of Felled, a documentary film about urban lumberjacking. In conjunction with the movie, they created a book called The Art and Craft of Wood.

Could you tell us a bit about how Felled came to fruition?

David: Felled is the story of saving a dead neighborhood tree that was headed for the landfill and giving it new life and purpose. It’s about finding worth and beauty in something that everyone else sees as trash. Silas and I worked together on a bunch of different video projects over the years and talked about making a documentary together. After a big monsoon storm Silas told me that he thought he’d stumbled upon a good story for a short documentary. Silas and his friend James found a big tree that came down in the storm and started on a journey to turn the tree into lumber for furniture. I think Silas just had a good intuition for the topic and realized that he was meeting really interesting people along the way. It turns out that when a tree comes down in someone’s yard the wood generally just goes to the landfill. That’s a huge waste of resources that could otherwise become beautiful furniture or art. So we started shooting the process and after a while it was pretty clear that we could make a feature length piece about the whole issue. We happened upon a subculture of people who are fundamentally offended by that waste and were doing great work to turn trash into something more meaningful.

Is there a community of urban lumberjacks in Phoenix? How did you access this community?

Silas: Yes! Discovering that people were doing this work around the Valley was one of the main things that pulled David and me into this story initially. Starting out, I really wasn’t sure if recycling urban trees was a thing in Phoenix or not, but after doing some searching, mainly through Craigslist, I got in touch with a couple of local sawyers. As it turns out, there are a bunch of mills in the Valley that use local wood, all working pretty much autonomously. Through the course of producing this film I would say the community aspect has begun to grow, which is exciting. It seems like Facebook has really become a major word-of-mouth marketing medium for these businesses, and I’ve seen more and more of these businesses connect. More importantly, word is getting out about what they’re doing.

What has the experience of making the film been like? What goes into making a feature length film?

David: I was involved with a few other features before, but this is the first I have had a hand in directing. It’s an enormous amount of work. We’ve essentially spent all of our free time for the last 2.5 years engaged in shooting, editing, or promoting the film in some way. We had a preview screening a month or so ago and someone asked how much time had been spent on the film. I answered that it was essentially every single evening and weekend since July ’14, and the audience had a good laugh. Obviously that’s an exaggeration, but some days I look back and it feels like that’s exactly how it’s been. Since we started production we’ve both changed jobs, we’ve both had a kid (my first, Silas’ third), and I don’t even know how many times I’ve gone into the office the next day on just a few hours of sleep. I think that’s just the reality of documentary filmmaking. It’ll be interesting to see what documentary filmmaking looks like in 10 years, but from what I see, the vast majority of documentaries will be a side hustle like Felled. The economics of filmmaking just work out that way. We’re incredibly lucky to have had a lot of help from friends and family.

How did the book become a part of this project?

David: We put out the first trailer for Felled last year and had some great success with it being shared all over Facebook. From that came a lot of interesting opportunities including a book deal with Quarto Publishing. It’s certainly not the first woodworking book on the market, but urban lumber brings an interesting twist on traditional woodworking. It’s exciting to take what we’ve learned about urban lumber and give people a step-by-step way of engaging with this wasted resource. We’d love for the book and the film to inspire people to look at the trees in their neighborhoods differently. Trees aren’t just a good way to clean the air and provide shade. When they die, they could be your next dinner table. Across the country we’re seeing people take this overlooked resource and turn it into something beautiful.

How did you go about writing it?

Silas: Starting out, the task of writing The Art and Craft of Wood was pretty intimidating to me. I had never planned to go write a book and I had never considered how to approach such a project. It felt like I was entering a big, unknown world. Thankfully, the book follows a pretty standard how-to format, so the structure was straightforward when we got down to it. David always accuses me of being a linear thinker, which is totally true. Being able to build the actual projects in the book and take a lot of photographs along the way gave us a nice linear workflow and foundation to build upon, which pleased my brain. Our editor, Jess, has been great through this whole process as well, and working through all of this stuff with her has been an incredible benefit.

Anything that has been deemed worthy of being published, especially in a physical book, carries a certain authority, which is something that blows my mind about this project. Being a self-trained woodworker, I never held myself up as some sort of expert on the topic. As I dug into the technical instructions in the book, it made me question every procedure I would describe, which led to a lot of extra research to make sure I wasn’t committing some kind of craftsmanship malpractice.

Have you had any unexpected things happen throughout the course of this project?

David: We are blown away and very grateful for the way Felled has been received. The people we tell about the project are excited and want to know more. Most people have never thought about what happens to the tree in their yard when it dies and everyone can recognize that sending that tree to the landfill is a waste. It’s refreshing to be able to make a positive film and maybe along the way inspire people to build some furniture and art that they can pass down through their families for generations to come.

You’ve been working on this project for quite a while. How did it influence you?

Silas: I’ve never worked this long on a single creative project before and I think it’s really unique how spending several years of my life on Felled has changed the way I see the trees in my city. Before I started this project I had never heard of the term “urban forestry,” but now I find myself going to urban forestry conferences and talking to urban foresters with ease. This process has taught me that, in many ways, modern society sees urban and suburban areas as the place where we consume resources, leaving the extraction of resources, or the production of goods, to places outside our cities, or even our country. As cities have grown and people have spread further and further, it is apparent that a consumer relationship with our world isn’t very sustainable, and seeing what’s around us as part of a holistic system is really important. This can start with something as simple as a tree, and understanding that it is part of something larger. I guess I’m just describing sustainability, but learning how to use a fallen tree, and discovering its place in a larger system, just made that concept a lot more tangible to me.