Applying Natural Selection to Location Based Data

Posted in Blog

Jonathan Lenaghan
By Jonathan Lenaghan

Economy and efficiency are key to communication in the complex world of location-based data. There's no need to use 100 words when 10 will do, just like there's no need to analyze 1,000 mobile ad requests with poor location quality when 10 high-quality requests are at your disposal. While having more data produces better location data models and more reliable predictions, this is only true if that data is cleansed, verified and of the highest quality.

Having just enough quality data maximizes predictions, inferences and understanding. This is especially important when analyzing location data from ad request logs, which are notoriously loaded with noise, fraud and general misrepresentation. Darwin is an evolutionary pipeline that allows PlaceIQ to rigorously evaluate the quality of ever-changing location data.

Darwin embodies the broad skills and diverse backgrounds of our Data Science team. From a sociological perspective, a lot of time is spent thinking about what defines human behavior. On the physics side, there’s an obsession with measurement and quantification. Then there are computer scientists who build scalable machine learning algorithms to infer human behavior.

PlaceIQ ingests over 100 billion ad requests each month, so it's imperative to know which of these requests are of high enough quality to be relied on as signals of human movement and behavior. The questions that enabled the development of a high-scale pipeline to compute metrics across many terabytes of data were:

  • What do human movement patterns look like?
  • How do they change throughout the day?
  • When are they sparse and where do they cluster?

When it comes to data analytics: “garbage in” simply leads to “garbage out”. There’s no magical machine that can transform poor quality data into golden nuggets. Therefore it’s crucial for any location data provider to be able to measure the quality of location data.

In the world of location-based targeting, quality is defined by the hyperlocality of a data set.

In this case, “hyperlocal” refers to the distance between a determined location and the true location. By nature, GPS technology will always have some level of error, be it 100 meters or 10 meters. And the reality is that some publishers accurately report location up to a certain digit, then assign arbitrary digits at the end to make their data sets appear hyperlocal. Darwin distinguishes these publishers from those who truly provide hyperlocal data by matching the information gain from each lat/long digit with the associated human behavior. Yes, human behavior is random, but not in the way computer-generated digits are.
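
To make the idea concrete, here is a minimal sketch of one way to measure the information carried by each decimal place. It assumes nothing about Darwin's actual implementation; it simply computes the Shannon entropy of each trailing-digit position across a publisher's coordinates. Padded digits look either constant or indistinguishable from a uniform random generator, while digits tied to genuine movement usually land in between.

```python
import math
from collections import Counter

def digit_entropy_by_position(coords, n_decimals=6):
    """Shannon entropy (bits) of each decimal digit position across a set of
    latitude or longitude values. Illustrative only -- not Darwin itself."""
    # Render every value with a fixed number of decimals, then read the digits.
    fracs = [f"{abs(v):.{n_decimals}f}".split(".")[1] for v in coords]
    entropies = []
    for pos in range(n_decimals):
        counts = Counter(frac[pos] for frac in fracs)
        total = len(fracs)
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies

# A uniform random generator yields ~3.32 bits per digit position; constant
# zero-padding yields 0 bits. Digits driven by real movement typically fall
# in between, and the profile shifts sharply where a publisher starts padding.
```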

Once it’s confirmed that a given partner satisfies hyperlocal needs, filtration is complete, right? Wrong. PlaceIQ is delivering location-based targeting, so “high quality” data must also have users that display normal human behavior. Humans generally have dwells – locations we tend to be around consistently. Adults generally have home, work and leisure dwells. With that in mind, the origin of ad requests, based on user location, should cluster to some degree. PlaceIQ refers to this notion as “clusterability”. If the clusterability of a data set is low due to fraud, where devices appear to be around the world within short time spans, then those mobile devices must be removed from PlaceIQ’s analytics pipeline, in order to achieve a higher quality data set.
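One simple proxy for the fraud signal described above is an impossible-travel check: if consecutive requests from the same device imply a speed no human could achieve, the device is suspect. The sketch below illustrates the idea; the 1,000 km/h threshold is an assumption for illustration, not a PlaceIQ parameter.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/long points in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_plausible_device(pings, max_speed_kmh=1000.0):
    """pings: list of (timestamp_seconds, lat, lon) tuples, sorted by time.
    Returns False if any consecutive pair implies an impossible travel speed."""
    for (t1, la1, lo1), (t2, la2, lo2) in zip(pings, pings[1:]):
        hours = max((t2 - t1) / 3600.0, 1e-6)  # guard against zero elapsed time
        if haversine_km(la1, lo1, la2, lo2) / hours > max_speed_kmh:
            return False
    return True
```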

By reducing the sources of “garbage” Darwin ensures that only the fittest data survives. This pipeline filter has dramatically improved the quality of PlaceIQ’s data and offers a maximized understanding of human movement and behavior.

Why Place And Context Are Crucial For Understanding Consumer Behavior

Posted in Blog

Jessica Yiu
By Jessica Yiu

Originally published in AdExchanger. “Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media. Today’s column is written by Jonathan Lenaghan, head of data science at PlaceIQ.

The golden rule in real estate is location, location, location. The same mantra has been adopted by the world of digital marketing, albeit with a slightly different meaning, because human behavior, including consumer habits and preferences, is fundamentally shaped by places and contexts.

Put simply, what you buy is predicted by not just who you are, but where you’ve been.

Places are integral to our identities. Even as children, we gain an intuitive understanding of the places that are important to us, such as our home, school, neighborhood park or favorite ice cream shop. As we mature, our connection to significant locations only deepens as we develop enduring associations, more sophisticated representations, and enriched attributes of those places.
Subconsciously, we create place-based mental schemas, keeping track of where we’ve been, how far we are from home and which events occur where.

Layering Information

To make sense of all these different facets of location-based information, they can be organized into “layers” of information that are gathered and filtered through our daily experiences. Suppose you’ve been invited to attend a baseball game at Yankee Stadium but are also expected to attend a meeting in Hoboken, New Jersey, that is scheduled shortly after the game. Your decision on whether to attend the game, as well as what time to leave if you do, is predicated upon your knowledge of the distance between the two venues, the most efficient form of transportation during the time of day, and a contingency plan if something goes awry.

Place-based information goes beyond strictly location to include temporality, modes of transport and patterns of population flow, among other pieces of “layering” information. A place-based understanding of consumer behavior should similarly integrate multiple “layers” of information from different sources and synthesize them into useful models.

A location-focused method for analyzing consumer behavior, which goes beyond a naïve or basic approach to location, should address several objectives:

  • Create Clear And Concise Location Categories

Since we are dealing with billions of data points across space and time, it’s important to efficiently map those disparate data points into clear and comprehensive location taxonomies. Each location category must be clearly defined and its constituent elements should share a distinct set of characteristics. For instance, both Motel 6 and the Four Seasons fall under the hotel category but they obviously cater to very different clienteles. In order to differentiate between disparate customer bases, it is important to create salient subcategories – in this case, subcategories for the budget vs. luxury traveller.
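As a toy illustration of such a taxonomy, the mapping below assigns each brand a category and subcategory. The entries and structure are hypothetical, not PlaceIQ's actual taxonomy.

```python
# Toy location taxonomy: each brand maps to a (category, subcategory) pair.
# Entries are illustrative only.
LOCATION_TAXONOMY = {
    "Motel 6":      ("hotel", "budget"),
    "Four Seasons": ("hotel", "luxury"),
}

def classify(brand):
    """Return (category, subcategory), defaulting to an 'unclassified' bucket."""
    return LOCATION_TAXONOMY.get(brand, ("unclassified", "unclassified"))
```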

  • Balance Between Robustness And Precision

Achieving high accuracy sometimes comes at the price of generating stable and robust results. Finding the right balance between these two competing dimensions is a delicate but important task. A key consideration is the unit of measurement for location analysis: too large and the richness of the data is clouded; too small and the results reflect the specifications of the statistical model rather than empirical reality. In other words, hyperlocal data are not necessarily desirable, particularly if they introduce too much noise into the analysis or render it computationally infeasible.

  • Account For Time And Population Flow

Places often serve multiple functions, particularly depending on the time of day, day of the week or month of the year. To contextualize locations in an array of temporal dimensions, you must model the “ebbs and flows” of human activity. Moreover, the flow of people through time and space undergoes cyclical and non-cyclical patterns. For instance, the kinds of people who wander into Central Park and their activities vary drastically over the course of time: early morning joggers vs. elderly strollers in the afternoon, family outings in the park on weekends vs. shady characters after midnight. Say a jogging company wanted to leverage location data to target joggers for its next campaign. Knowing who goes to the park at what time would help the company optimize its advertising budget, a fruitful starting point for the campaign.
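A minimal sketch of that temporal contextualization might bucket visit timestamps into dayparts and day types. The cut-off hours below are illustrative assumptions, not a PlaceIQ model.

```python
from datetime import datetime

def daypart(ts):
    """Map a timestamp to a coarse daypart bucket (illustrative cut-offs)."""
    hour = ts.hour
    if 5 <= hour < 9:
        return "early_morning"
    if 9 <= hour < 17:
        return "daytime"
    if 17 <= hour < 22:
        return "evening"
    return "late_night"

def daypart_profile(visit_timestamps):
    """Count visits to a place by daypart and by weekday vs. weekend."""
    profile = {}
    for ts in visit_timestamps:
        key = (daypart(ts), "weekend" if ts.weekday() >= 5 else "weekday")
        profile[key] = profile.get(key, 0) + 1
    return profile

# e.g. daypart_profile([datetime(2014, 7, 12, 7, 30), datetime(2014, 7, 12, 23, 15)])
```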

  • Consider Heterogeneity In The Same Location Categories

Related to the point above about the interactions between time, population flow and location, it’s important to remember that not all locations are created equal. In addition to accounting for temporal variation, the same types of locations also vary along a wide range of other dimensions, from climate and geography to more abstract factors, such as cultural idiosyncrasies or social development. Imagine targeting an ad campaign to soccer fans in Brazil during the World Cup. Conceivably, the strategy would be different from an ad campaign targeting MLB fans in the US Midwest. Even though both events take place at large stadiums, linguistic differences and culturally distinct rituals associated with soccer vs. baseball fans make these events markedly different experiences. For instance, while hot dogs and Cracker Jack are the noshes of choice at a baseball game, it’s common practice for Italian soccer fans to eat meatball subs – preferably made by their mothers – before the game. The same categories of locations can therefore record very distinct population footprints.

  • Create A Holistic Profile Of The Consumer Journey

The ultimate goal of leveraging location data is to better understand the entire consumer journey. In order to do this effectively, multiple data sources should be “layered” on top of the locational data to create the most holistic and in-depth profiles of your targeted consumer base. Suppose Starbucks is interested in knowing the demographics of its customers, such as where they live and work. While those are useful pieces of locational information, advanced methodologies can provide even more useful information, such as places that the consumer visited that day before going to Starbucks and his or her favorite TV shows. And, quite often, analyses involving the crisscrossing of different streams of information can reveal surprising insights.

The insights produced by a location-focused analysis of consumer behavior can be both revealing and fruitful. However, as with any type of analysis, there are challenges and caveats. An effective analysis leverages, integrates and layers multiple, highly diverse data sources, while being able to interpret and make sense of the results.

Ultimately, the goal is to fully understand the consumer journey through the prism of location.
Follow PlaceIQ (@PlaceIQ) and AdExchanger (@adexchanger) on Twitter.

26 members, 12 questions, 1 buyer’s guide to mobile location data

Posted in Audience Series

Caity Noonan
By Caity Noonan

It’s not every day that you join your competitors to give advice to media buyers you’ll likely compete for down the road. But the Interactive Advertising Bureau’s (IAB) commitment to sharing best practices and thought leadership led to the creation of the IAB Location Data Working Group: 26 expert companies united in furthering the mobile advertising ecosystem. PlaceIQ welcomed the opportunity to participate and is excited to empower media buyers to better recognize high-quality, accurate location data.

The result of this industry-wide collaboration was a 12-question guide that advertisers, agencies and marketers can reference when exploring location-based digital marketing and selecting a mobile ad provider. Each question digs deep into the components that dictate the speed, accuracy and ROI of mobile ad delivery within a campaign. The 12 questions are split evenly between two categories: “place data” and “device data.”

The place data questions in the Buyer’s Guide revolve around the source, precision, and verification process of fixed locations in the physical world – such as a baseball field. While the status quo across the ad tech space is to rely on basic geo-fencing and licensed map data to locate these places, we want to pinpoint locations, not just get in the vicinity. So our internal cartography team draws polygons by hand on real-world locations – such as your neighborhood grocery store. That way all data sets on our 100-meter by 100-meter grid, which spans the United States, are truly targeted.

The device data questions are similar in their investigation of the type of location data, filtration methods and data accuracy, but different in that they refer to the location of a user’s devices – which may be at a baseball diamond now, and at a grocery store two hours later. For this half of the questions, we rely on the work of our data science team and an analytics pipeline that rigorously evaluates the quality of location data within ad requests.

Because we know that opt-in data is the highest quality location information available, it is the only type of location data that PlaceIQ utilizes. To achieve this, we take data filtration further and selectively remove bad data points that don’t reflect normal human behavior, are clearly computer generated, or are artifacts of location data infrastructure. For you visual learners, here’s what we’re talking about:

In this example, it doesn’t take a data scientist to recognize that these data points are unnaturally dispersed (left) and hyper-clustered (right). These data points are far too organized to reflect the randomness of human movement patterns, and therefore must be removed.
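One simple check for the hyper-clustered case, sketched below under illustrative assumptions, is to flag exact coordinate pairs that account for an implausibly large share of a feed's requests, a common signature of default or centroid coordinates.

```python
from collections import Counter

def suspicious_centroids(points, share_threshold=0.05):
    """Flag exact lat/long pairs that account for an implausibly large share
    of a feed's requests. points: iterable of (lat, lon) tuples.
    The 5% threshold is illustrative, not a PlaceIQ parameter."""
    counts = Counter(points)
    total = sum(counts.values())
    return {pt: n for pt, n in counts.items() if n / total >= share_threshold}
```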

We’re proud to have helped put together the IAB’s Mobile Location Data Buyer’s Guide, and to further the mobile ad tech ecosystem by educating current and future media buyers. While filtration and verification aren’t the sexiest parts of what we do, not asking these questions could be detrimental to your mobile ad campaign.

To download the guide, please click here. And if you missed our panel at today’s IAB Mobile Road Show in Chicago, we’ll also be presenting at the NYC stop on the Road Show, on August 21.

Location Precision: The Good, The Bad And The Ugly

Posted in Blog, News, PlaceIQ

Jonathan Lenaghan
By Jonathan Lenaghan

Originally published in AdExchanger. “Data-Driven Thinking” is written by members of the media community and contains fresh ideas on the digital revolution in media. Today’s column is written by Jonathan Lenaghan, head of data science at PlaceIQ.

There’s been a lot of talk lately about “precision” and “accuracy” in the world of mobile marketing. But it’s becoming clear that the pressure to pinpoint high-quality mobile location data is blurring the line between what is good and bad, and realistic and unrealistic.
Here is a little context to help set the record straight.

While it’s true that the degree of precision determines the speed and relevance of mobile ad delivery, in some cases, bigger is not necessarily better. And when it comes to hyperlocal, smaller is not necessarily better, either. This means that some location points, which may sound hyperlocal in theory, can actually mislead or confuse mobile marketing clients.

Latitude And Longitude

Consider the digits or decimals to which location signals are gathered. It works like this: Latitude and longitude (lat/long) figures represent the point on a map where a location signal is picked up. The decimal places of that lat/long figure indicate the size of the fence surrounding that data point. Two decimals equate to a 1.1 km x 1.1 km fence, or roughly two-thirds of a mile. Three decimals equate to a 100 m x 100 m tile, which is roughly the size of a city block. Four decimals zoom in to 10 meters, which is where the line is drawn between good and bad data.

While a location data company is capable of capturing five, nine or even 12 decimal places, that doesn’t mean it should. Five decimals home in on a 1 m x 1 m space, at 8 decimals (1 mm x 1 mm) you’re looking at an ant, and when you reach 12, it’s plain ugly. Twelve decimals gets you accuracy somewhere between a micrometer and a nanometer, which has no applicability in mobile advertising. It is (almost literally) splitting hairs.
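For reference, the rough mapping from decimal places to ground distance follows directly from the fact that one degree of latitude spans about 111 km. The sketch below shows the arithmetic and a simple rounding to the four-decimal resolution discussed below; it is an illustration, not production code.

```python
def decimal_precision_meters(decimals):
    """Approximate north-south extent represented by the given number of
    decimal places: one degree of latitude spans roughly 111 km, and each
    extra decimal place divides that by ten."""
    return 111_000.0 / (10 ** decimals)

def round_latlong(lat, lon, decimals=4):
    """Round a coordinate pair to four decimal places by default -- roughly
    the 10 m resolution treated here as the practical limit."""
    return (round(lat, decimals), round(lon, decimals))

for d in (2, 3, 4, 5, 8, 12):
    print(f"{d} decimals is roughly {decimal_precision_meters(d):g} meters")
```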

Don’t aim for these hyperlocal levels of accuracy. They are not only ludicrous from a location technology point of view, they are also computationally infeasible to execute given the latency requirements found in today’s real-time bidding environments, not to mention the lack of consumer willingness to wait for application or page loads on their mobile devices.

Think about the time it takes the GPS on your smartphone to determine your location or find your destination. If a marketer is trying to determine the location of a smartphone on the move, which has a GPS accuracy of 3 meters at best, then 10 meters, or four decimal places in lat/long, is as accurate as you can get.

Four decimal places is usually the sweet spot. Precise location signals can be dependably gathered at ad tech speed within 100 m x 100 m, which requires no more than four decimal places. While 100 meters may place a device in a few locations (if the area is dense), applying time of day to the tile greatly improves the precision. And for a majority of less dense areas, 100 meters produces data that is quite unique and accurate.

The Right Strategy

When trying to determine whether the money you put into a mobile ad campaign is buying good, bad or ugly data, you’ll want to make sure that the strategy being used to gather location data is precise enough to target devices in specific stores, rather than a shopping complex or general area.

The strategy should also be designed to accurately determine behavior trends in desired users. Underpinning the strategy should be logic that can be consistently applied across the United States.

Mobile advertising is about using best-case accuracy to serve low-latency, highly relevant marketing messages to targeted audiences. Don’t get lost in the numbers.

Follow PlaceIQ (@PlaceIQ) and AdExchanger (@adexchanger) on Twitter.

Scaling Culture: Why it Should be Your Startup’s Hiring Focus

Posted in Blog

Matt Novick
By Matt Novick

From unlimited vacations and catered meals to video games and a beer on tap – you name it, startups have it. But successful startups also have to meet the demands of a growing business by rapidly hiring more people. And while growth is a good problem to have, executives need to make sure employees are first and foremost satisfied with the work they do and the culture they’ve built, rather than just the perks.

Matt Novick, our Chief Financial Officer, discusses the challenges of scaling a successful startup in the super competitive advertising technology industry. Here’s what he had to say:

Q: What critical challenges does a scaling business face?

A: Maintaining a thriving culture may not seem like the most obvious answer, but it’s a crucial piece of the puzzle. As PlaceIQ grows, it’s important to continually emphasize why people come to work here in the first place. There’s value not only in an enjoyable workplace, but in a culture that drives business outcomes and innovation. With that in mind, we got our employees more invested in the hiring process by asking them to craft new PlaceIQ values that define the culture and the right type of person to work here. These have successfully been rolled out company-wide. Employees live the experience every day, so they ought to be responsible for driving the culture forward. Our employee base has grown at more than 100 percent year-over-year and our client base is growing at 300 percent. This rapid growth requires a unified and productive team, so scaling the culture has never been more important.

Q: Describe your office environment in one word.

A: Relaxed. Our open office inspires innovation and fosters collaboration among and across teams. Put simply, if employees aren’t comfortable, business results won’t meet expectations. While an entirely flexible workspace may not be the right fit for every company, employees need to take advantage of the freedom to work in ways that boost creativity and output. In our office, lounges and kitchens allow for a change of scenery or spontaneous conversation with members of other teams. Optional standing desks keep our employees from sitting hunched over at a desk all day. Then there’s our open floor plan, which encourages efficient conversation and the exchange of ideas.

Q: Is collaboration across several teams and 100+ employees really possible?

A: Yes! Our employees really are a family. The social interactions stem directly from our inclusive hiring process and lead to an atmosphere unlike any we’ve witnessed. Employees who spend time with one another outside of work develop relationships that go beyond work done in the office. This professional and personal relationship has the invaluable effect of boosting productivity and retaining employees. While the phrase “work hard, play hard” may be overused, this mindset often gets our teams through work-related challenges and onto team-building or celebratory events.

Q: What’s different about your hiring process?

A: We involve each employee in the hiring process. We’ve found that great people build great culture, and not the other way around. To attract great people, members of the PlaceIQ family judge which candidates would be the best additions to the company. While job requirements vary across positions, there are key traits our employees look for in every candidate to determine if they are “A Players.” We’ve found that empowering employees to select people they relate to drives our culture to grow organically along with the rest of the company.

Q: What else can startups do to maintain their culture while growing fast?

A: Celebrate the smallest wins and largest milestones together. At four years old, we know the recent doubling of PlaceIQ’s workforce and five quarters of surpassed goals can be greatly attributed to the employee retention our culture generates. Dedicated and satisfied employees who feel a direct connection to positive results tend to build better and longer-lasting relationships with clients – not to mention create better products that drive new business.

To learn more about PlaceIQ’s culture, check out our careers page and follow us on Twitter, Facebook, and LinkedIn.

iOS 8 Update: A Privacy Win for the Location Data Ecosystem

Posted in Audience Series, Blog, News

PlaceIQ’s Audience Series sets out to highlight the importance of segments in the advertising world. As the pioneer of mobile’s application to location intelligence, and leaders in the mobile audience field, PlaceIQ has the knowledge you need. Key audience experts from each PIQ department — from engineering, to data science, to sales — will tackle topics to give a 360-degree view on this vast, ever-changing industry.

Drew Breunig
By Drew Breunig

At their developer conference during the first week of June, Apple announced many changes coming in iOS 8. One of the changes regarded how a device’s MAC address is communicated to WiFi access points.

PlaceIQ doesn’t use MAC addresses to identify devices, and we welcome this change because it protects consumers and reduces potential privacy risks. As it currently stands, MAC address tracking via WiFi is not an opt-in experience and cannot be opted out from.

If you’d like to understand more, including how this works, whom it affects, and why PlaceIQ welcomes the change, read on. Fair warning: it might get a bit nerdy.

So What’s a MAC Address?

A MAC address is a hardware-based identification number, provided by any device that connects to a network. Hardware-based identifiers are read-only, meaning they can never be changed. They are written to the physical network chip in each device. When a device connects to a router or WiFi network, the device is identified by its MAC address for the duration of its connection. This allows the right traffic to be sent to and from your phone, PC, or TV regardless of how many devices are connected.

When your device is in your pocket or purse, it is regularly looking for a known WiFi network to join. This way, when you pull out your phone at home or at work it’s already connected and ready to go. But as your phone searches for WiFi, it is broadcasting its MAC address to WiFi access points within range. It’s part of the handshake devices engage in to recognize each other.

Recently, a few companies have developed WiFi hubs that remember the MAC addresses they see. They log your device as it scans for a hub, whether or not you join the WiFi access point. These companies have installed these logging WiFi hubs in many places, allowing them to compare visitors as they move from place to place, without their knowledge.

Even if people were informed that their devices were being monitored, the only way to prevent this type of tracking is to turn off WiFi completely. That’s a rather extreme step.

Finally, there’s a difficulty with hardware-based identifiers. The mobile advertising industry, including big players like Google and Apple, has worked hard to move away from hardware-based identifiers as much as possible. Software-based identifiers, like Apple’s IDFA, can be reset by users or blocked entirely. Hardware identifiers will not change for the life of the device. If there is a data leak and a malicious source obtains a hardware-based device identifier, the only way to ensure you will not be affected is to buy a new device.

A Privacy Challenge

So Apple was faced with a challenge: their users’ devices were being logged without their knowledge, without their consent, all while using a hardware-based identifier. Apple’s adherence to standard network practices – broadcasting MAC addresses to WiFi hubs – created an environment where this situation could occur. So Apple made moves to change that standard practice.

Starting in iOS 8, iPhones, iPads, and iPod Touches will broadcast random MAC addresses. In Apple’s words, “The MAC address for WiFi scans may not always be the device’s (universal) address.” Companies that log MAC addresses won’t be able to connect individual visits to a single device. They’ll know someone is there, but not where else they’ve gone.
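
The distinction the quote draws between a randomized address and the device's "universal" address is visible in the address itself: by convention, locally administered MAC addresses (the kind randomization produces) set a specific bit in the first octet, while burned-in, vendor-assigned addresses leave it clear. The sketch below checks that bit; it is an illustration of the convention, not code from Apple or PlaceIQ.

```python
def is_locally_administered(mac):
    """True if the MAC address has the locally-administered bit set (the
    second least significant bit of the first octet) -- the marker used by
    randomized addresses, as opposed to the universal, burned-in address."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)

print(is_locally_administered("02:00:00:aa:bb:cc"))  # True: locally administered
print(is_locally_administered("a4:5e:60:aa:bb:cc"))  # False: vendor-assigned (universal)
```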

Some have suggested that this move is a play to get more people using Apple’s own iBeacon API. This may be true. But iBeacons are much more user friendly. To see a company’s iBeacons, users must install an associated application and grant it the appropriate location permissions. Applications that use iBeacons are opt-in and users are always able to opt-out by managing their location permissions in their device settings.

The Right Move

iOS has a history of protecting user privacy and providing access controls. In fact, this isn’t their first big MAC address change. Last year they blocked applications from accessing the MAC address. And this wasn’t their only update to location privacy this year: with iOS 8, Apple is introducing much more explicit background location access controls.

Overall, I believe Apple’s decision to randomize MAC addresses is a win both for users and the location data ecosystem. They provide a managed space where developers can innovate without overstepping user expectations.

As a growing number of applications use location in more diverse ways than ever before, they can now do so in an environment where users still retain control.

Clients, Crew Enjoy PlaceIQ Pirate Party

Posted in Blog, Events, PlaceIQ

Our Pirate Costume Party brought together the PlaceIQ Crew and some of our favorite clients and friends at the Lightship Frying Pan last week. Our own Eric DeLange performed with his band, and many enjoyed the infamous PIQful Dead Man’s Chest cocktail. Thanks to all who donned their sea legs!
You can see more photos from the night on our Facebook page. Don’t forget to “like” us!

Machines Won’t Take Your Ad Tech Job

Posted in Audience Series, Blog, PlaceIQ

PlaceIQ’s Audience Series sets out to highlight the importance of segments in the advertising world. As the pioneer of mobile’s application to location intelligence, and leaders in the mobile audience field, PlaceIQ has the knowledge you need. Key audience experts from each PIQ department — from engineering, to data science, to sales — will tackle topics to give a 360-degree view on this vast, ever-changing industry. The following article ran in AdExchanger’s Data-Driven Thinking series.

Jonathan Lenaghan
By Jonathan Lenaghan

Warnings of the coming Skynet-ization of digital advertising are becoming increasingly common. But rest assured, the near-term future of our industry is not going to be filled with self-aware, artificially intelligent machines that will replace all humans currently employed at ad tech companies.

However, digital advertising does seem to be on the cusp of a significant transformation, in the form of a rapid emergence of platform-centric ecosystems. The number of ad tech companies announcing the launch of a new platform seems to grow daily. These systems will be significantly more feature-rich than the real-time bidding or large Hadoop-based back-end platforms that have defined the industry for the past several years.

To unlock real value, though, these platforms need to enable business analysts, data scientists, campaign managers and an entire host of operations personnel. Many ad tech businesses rely on ingesting, processing and analyzing hundreds of terabytes of data coming from varied and disparate sources. Traditionally, large teams of engineers toting extensive experience within the Hadoop ecosystem were necessary to get actionable insights.

The next generation of platforms will still perform these functions, but aggregations, algorithms, internal languages and interactive visualization layers will empower this larger family of end users to better define and segment audiences, optimize campaigns based upon industry-specific KPIs or slice and pivot campaign data along many new dimensions. These platforms will no longer be under the exclusive purview of data teams but will be pushed deeper into organizations to those with perhaps less technical experience in big data but with deeper domain experience.

Checkmate

If the focus in the past decade has been to capture and process enormous amounts of data, the next step is to design platforms that strip away this complexity and scale and seamlessly incorporate the expertise of analysts and operations personnel. The emerging platform ecosystem will augment the intelligence of analysts and enable them to effortlessly make business decisions.

Chess is an excellent example of algorithms augmenting the intelligence of a human being. Ever since Deep Blue beat Garry Kasparov in 1997, computers and algorithms have been able to beat the best human grandmasters. The most formidable chess-playing system, however, is a combination of top chess programs and human grandmasters. The sum of algorithms and human beings is greater than either individually.

Algorithmic Black Boxes

The term “platform” has a tendency to evoke images of the purely algorithmic black boxes that dominate the high-frequency equity-trading world. Similarly, bidding on ad inventory will always be algorithmic with little to no direct human interaction. Targeting and serving a digital ad needs to happen in a matter of milliseconds, and so it makes sense that on the surface much of the digital advertising ecosystem is loosely modeled after equity markets.

Algorithmic black boxes, however, are really only successful when they exploit time and capacity scales with a very narrow and specific purpose, such as optimally bidding on ad inventory with sub-millisecond latencies or taking advantage of tiny price discrepancies across multiple stock exchanges. The coming era in which an analyst or data scientist will be replaced by a black box is a long way off.

The R Project for Statistical Computing and other statistical packages, for example, have been around for many years but as a general population, we do not seem any better at understanding statistical concepts. Even companies that specialize in black box optimization utilize teams of analysts and data scientists to identify and implement optimization strategies. Those that do not rely on human intuition and experience, I conjecture, are doing a lot of optimization towards click and impression fraud.

The Power Of Deep Domain Expertise

To be sure, I am no Luddite. I have spent my career with machines, models and algorithms, and the greatest business leverage is found by combining the analyst with the algorithm. There is a common saying among data scientists that bigger data beats better algorithms. I posit that deep domain understanding beats both.

Given the choice between doubling my data size, spending a few months investigating more sophisticated algorithms, or incorporating the work and expertise of a knowledgeable analyst into my platform, I’ll take the human being. Rules and heuristics defined by experts have more utility and can be implemented more quickly and efficiently than building fully automated systems that learn a domain.

This model has been very successful for companies like Palantir or Quid and is the core strength of the Consumer Insights Platform that PlaceIQ is developing. The Palantir platform works “at the intersection of data, technology and human expertise” to yield actionable results for governments, as well as businesses. Quid, likewise, has built a platform to ingest large amounts of unstructured data to provide analysts with a means of interrogating complex relationships. In these platforms, data and algorithms are used to leverage human experience and intuition.

At the end of the day, the role of domain expertise will tend to outweigh both the sophistication of methodologies and access to more data. The next “Rise of the Machines” will aid analysts and managers, rather than replace them.

CRO’s Amazing Ride with PlaceIQ: Stepping Back, but Not Away!

Posted in Blog, News, PlaceIQ

Duncan McCall
By Duncan McCall

As we start another busy week here at PlaceIQ, I have some personnel news I wanted to share. After a truly amazing stint building out and leading our sales team, Tony Nethercutt, our fearless CRO, will be stepping back from a full-time role.

At the start of Q3, Tony will shift his role to that of CRO Emeritus and continue to be an employee and advisor to the company, while becoming less involved with the day-to-day operations. He will continue to help us recruit talent and advise the sales teams, and, of course, attend our various company events – no doubt with cowbell in hand!

We were very lucky to convince Tony to join us at PlaceIQ, as he was already thinking hard about stepping back before he came aboard. As we all know, Tony has had a truly incredible, hard-charging career with companies such as Yahoo, AdMob, YouTube and others – and has achieved many of the objectives he set out to here at PlaceIQ. He has more than earned the opportunity to slow down a bit and take some time to figure out what’s next for him and his family.

In Tony’s own words:

What an amazing ride I’ve been on with you at PlaceIQ—I couldn’t have made it up if I tried. As Q3 begins, I will shift my role to that of CRO Emeritus at PlaceIQ (there is a funny story about this title…just ask me…and I will tell you). I will continue to be an employee and an advisor to the company, but less involved with the day-to-day. I’ll also continue to help recruit new sales management, regional sellers and advise our sales team (thank you, and everyone else at the company, for 5 straight quarters of exceeding goal… you done yourself proud).

I’m very grateful on so many levels for everything we’ve been able to accomplish together at this company and I plan on giving back—big time. Most who know me know that I could never bring myself to use the “R” word. So I won’t. I’ve said many times in the past that PlaceIQ would be my last stop, but it’s not the end. I’m going to continue to enjoy the best of what we’re building here (thank you Exec team for encouraging me to stay involved), my digital investments and advisory positions, have more direct charity involvement, and also take the opportunity to experience more of “life” in general.

You can all still easily reach me at tony@placeiq.com. I may need your help to break me into the next phase of my life—my wife is pretty sure that I’ll struggle without the constant email. ;) I know that there are many more great things in the future for PlaceIQ and its people – and I’m so honored to have been a part of its growth and thankful that I will continue to be! I’ll be with you at the finish line, wherever and whenever that may be.



We are extremely fortunate to have had Tony’s talents applied here, and we are also very lucky to be in such a strong position as we grow past 115-odd people, continue to beat our revenue targets, and roll out market-defining new products. There’s never an ideal time to step back, but going out on a high note is an incredible place to be!

Remember, Tony is not going away — he will continue to be involved in the company and at our events, where I am confident that he will continue to tear up the dance floor (in pirate garb or not!) and bring his incredible energy and enthusiasm to the proceedings. Still, we will certainly see him less frequently… so I am sure you’ll all join me in ringing the cowbells long and hard and thanking Tony so much for everything he has done for us to date, and will continue to do in the future!

Thanks,
Duncan

Relative Consistency, or What Gödel Might Have to Say About ‘Big Data’

Posted in Audience Series, Blog

PlaceIQ’s Audience Series sets out to highlight the importance of segments in the advertising world. As the pioneer of mobile’s application to location intelligence, and leaders in the mobile audience field, PlaceIQ has the knowledge you need. Key audience experts from each PIQ department — from engineering, to data science, to sales — will tackle topics to give a 360-degree view on this vast, ever-changing industry.

Susan Zhang
By Susan Zhang

In the early 1900s, David Hilbert set out to prove the consistency of mathematics by reducing it to a formal language of axioms from which all mathematical statements could be deduced.

Hilbert believed that derived statements would be consistent with one another. There would be no method of derivation by which we could obtain, from the same set of axioms, “1 + 1 = 2” in one case and “1 + 1 ≠ 2” in another.

One hundred years later, we sit on more data points about human behavior than ever. “Data-driven” is the go-to phrase for making decisions using statistical inference and complex computations. In digital marketing, utilizing these data points can help drive consumer outreach, illustrate trends in consumer behavior, and shed light on patterns that would have otherwise gone unnoticed.

The ways in which we choose to “utilize” this data can vary tremendously. How, then, can we choose the “best” model?

In order to determine which method yields better results, some metric is needed against which error can be minimized. Unfortunately, these “truth sets” or “true values” are not necessarily present or obvious.

Take, for example, the task of describing everyday human behaviors.

Do people who shop at one grocery store also frequent the nearby fast food chain? Do people with a higher income behave differently than the unemployed? In each case, the point of the investigation is to determine the “truth set” – what people are actually doing, how they should be classified, and what this classification implies about the state of the world. Sure, we can create our own target sets with pre-defined socio-economic biases, but then our algorithms would merely strive to confirm such biases within the entire population, not develop them independently from the raw data itself.

In 1931, Kurt Gödel published two incompleteness theorems establishing the impossibility of Hilbert’s claims. His second incompleteness theorem can be paraphrased:

Given a set of axioms and all statements derived from these axioms, there cannot exist a statement within this set that proves the consistency of this system. If such a statement exists, then the system is inconsistent.

You can almost think of this like defining a word in the dictionary using the word itself: the self-referential nature negates the explanation.

The idea behind Gödel’s second incompleteness theorem closely mimics the limitations seen in the task of defining human behavior. We need some “truth set” to base an algorithm upon, but at the same time, any method used to obtain an audience’s true behavior, which simultaneously proves its own consistency, would be in violation of Gödel’s theorem.

While there may not be a method of deriving the absolute state of the world and knowing its degree of consistency, there is a way we can build ourselves up, layer by layer, using relative consistency.

Let’s start with describing people who own cars. Suppose we have a data set where 10% of the population consists of 18- to 23-year-olds. Our car ownership algorithm determines that 2% of all car owners are 18 to 23 years old.

This makes sense since young adults may be less capable of buying a car than older adults. The 2% number, when compared to the 10% number, appears to be accurate. But if the algorithm determined that 80% of all car owners are 18 to 23 years old, we would have a problem. The 80% number, when compared to the 10% number, does not appear to be anywhere near accurate.

In this case, the inconsistency in the results points to a potentially flawed algorithm or a corrupted input data set that is not representative of the true population. A check for the relative consistency of the results would tell us where a problem might exist, and prevent further iteration on a flawed algorithm or data set.
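As a minimal sketch of such a relative consistency check, one can compare a segment's share of the predicted audience against its share of the input population. The 3x lift ceiling below is an illustrative assumption, not a PlaceIQ parameter.

```python
def consistent_with_baseline(baseline_share, predicted_share, max_lift=3.0):
    """Relative-consistency check: flag a segment whose share of the predicted
    audience exceeds its share of the overall population by more than max_lift.
    Thresholds are illustrative only."""
    return predicted_share <= max_lift * baseline_share

# 10% of the input population is 18 to 23 years old.
print(consistent_with_baseline(0.10, 0.02))  # True:  2% of car owners is plausible
print(consistent_with_baseline(0.10, 0.80))  # False: 80% of car owners is suspect
```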

Like the processes of quality assurance in a manufacturing plant and ongoing maintenance for the structural base of a skyscraper, these consistency checks are fundamental to the iterative process of extracting meaning from big data. While we rely on complex algorithms to augment human intelligence and intuition, we must also question the integrity of the algorithms themselves to ensure that inconsistencies are rooted out as early as possible.

Gödel’s theorems may only be applicable in a particularly esoteric branch of mathematics, but they still illustrate a lesson that we can all benefit from: it is better to iterate with relative consistency than to settle for inconsistent systems.